[Xen-devel] [Patch 0/6] Xen vMCE implementation


These patches implement the new Xen vMCE model. Among them:
Patch 1: An intermediate patch that prepares for the new vMCE model.
It removes the mci_ctl array and keeps MCi_CTL at all 1's.
Patch 2: An intermediate patch that prepares for the new vMCE model.
It removes mcg_ctl, disables MCG_CTL_P, and sets the number of banks to 2.

Patch 3: vMCE emulation
This patch provides virtual MCE support to the guest. It emulates a simple and
clean MCE MSR interface, faking capabilities where the guest needs them and
masking those it does not:
1. Providing a well-defined MCG_CAP to the guest, filtering out unnecessary
capabilities and exposing only those the guest needs;
2. Disabling MCG_CTL to avoid model-specific behavior;
3. Fixing MCi_CTL at all 1's for the guest to avoid model-specific behavior;
4. Enabling the CMCI capability, but never actually injecting CMCI into the
guest, to prevent the guest from polling;
5. Masking the MSCOD field of MCi_STATUS to avoid model-specific behavior;
6. Keeping natural semantics by using per-vcpu instead of per-domain variables;
7. Using bank1 and reserving bank0 to work around the 'bank0 quirk' of some
very old processors;
8. Cleaning up vMCE# injection logic that is shared by Intel and AMD but is
useless under the new vMCE implementation;
9. Staying compatible with the old Xen version that was backported to SLES11
SP2, so that a guest using the old vMCE is not blocked when migrating to the
new vMCE;
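
As a rough illustration of items 1, 3 and 5 (this is a sketch, not the code
from the patch): the bit positions below follow the Intel SDM, but every
identifier is a hypothetical name chosen for this example, and the exact set
of capabilities exposed (SER_P and TES_P in addition to CMCI_P) is an
assumption, since the list above only names CMCI explicitly.

```c
#include <stdint.h>

/* IA32_MCG_CAP layout per the Intel SDM. */
#define MCG_CAP_COUNT_MASK 0xffULL        /* bits 7:0, number of banks */
#define MCG_CTL_P          (1ULL << 8)    /* MCG_CTL present */
#define MCG_CMCI_P         (1ULL << 10)   /* CMCI supported */
#define MCG_TES_P          (1ULL << 11)   /* threshold-based error status */
#define MCG_SER_P          (1ULL << 24)   /* software error recovery */

/* Hypothetical name; the value 2 is the bank count set by Patch 2. */
#define GUEST_BANK_NUM     2

/* MSCOD (model-specific error code) is bits 31:16 of IA32_MCi_STATUS. */
#define MCI_STATUS_MSCOD_MASK (0xffffULL << 16)

/* Item 1: a fixed guest MCG_CAP. MCG_CTL_P is deliberately left clear
 * (item 2). Which caps to expose is an assumption of this sketch. */
static uint64_t guest_mcg_cap(void)
{
    return MCG_SER_P | MCG_TES_P | MCG_CMCI_P | GUEST_BANK_NUM;
}

/* Item 3: guest reads of MCi_CTL always see all 1's. */
static uint64_t guest_mci_ctl(void)
{
    return ~0ULL;
}

/* Item 5: hide the model-specific error code from the guest while
 * leaving the architectural fields of MCi_STATUS untouched. */
static uint64_t guest_mci_status(uint64_t host_status)
{
    return host_status & ~MCI_STATUS_MSCOD_MASK;
}
```

With fixed values like these, the guest sees the same MCE interface whatever
the host CPU model is.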

Patch 4: vMCE injection
While testing MCE with a Win8 guest, we found a bug: no matter which SRAO/SRAR
error Xen injects into the Win8 guest, the guest always reboots. The root
cause is that the current Xen vMCE logic injects vMCE# only into vcpu0, which
is not correct for Intel MCE (on Intel architecture, the hardware delivers
MCE# to all CPUs).
This patch fixes the vMCE injection bug by injecting vMCE# into all vcpus.
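
The change in behavior can be sketched as below. This is only an illustration
under simplified assumptions: struct vcpu_sketch and inject_vmce_all() are
hypothetical names, and a plain flag stands in for whatever pending-trap
mechanism the hypervisor really uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-vcpu state of a guest. */
struct vcpu_sketch {
    bool mce_pending;   /* set when a vMCE# is queued for this vcpu */
};

/* Broadcast model: mark vMCE# pending on every vcpu of the guest,
 * mirroring how Intel hardware delivers MCE# to all CPUs.
 * Returns the number of vcpus that were marked. */
static size_t inject_vmce_all(struct vcpu_sketch *vcpus, size_t nr_vcpus)
{
    size_t i;

    assert(vcpus != NULL || nr_vcpus == 0);

    for (i = 0; i < nr_vcpus; i++)
        vcpus[i].mce_pending = true;

    return nr_vcpus;
}
```

The old behavior corresponds to setting the flag on vcpus[0] only.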

Patch 5: vMCE save and restore
This patch provides vMCE save/restore across migration.
1. MCG_CAP is well-defined. However, to allow for future capability
extensions, we keep the save/restore logic that Jan implemented in c/s 24887;
2. MCi_CTL2 is initialized by the guest OS at boot, so it must be saved and
restored; otherwise the guest would be surprised;
3. Other MSRs do not need save/restore, since they are either error-related
and pointless to save/restore, or identical across all vMCE platforms;
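
A minimal sketch of what ends up in the per-vcpu migration record under the
three rules above. The struct and function names are hypothetical and the
layout is not the real Xen save format; the point is only that MCG_CAP and
MCi_CTL2 travel with the guest while error-related state does not.

```c
#include <stdint.h>
#include <string.h>

#define GUEST_BANK_NUM 2   /* hypothetical name; bank count from Patch 2 */

/* Hypothetical run-time vMCE state of one vcpu. */
struct vmce_vcpu_state {
    uint64_t mcg_cap;
    uint64_t mci_ctl2[GUEST_BANK_NUM];
    uint64_t mci_status[GUEST_BANK_NUM];  /* error-related, not migrated */
};

/* Hypothetical migration record: only what rules 1 and 2 call for. */
struct vmce_save_record {
    uint64_t caps;
    uint64_t mci_ctl2[GUEST_BANK_NUM];
};

static void vmce_save(const struct vmce_vcpu_state *s,
                      struct vmce_save_record *r)
{
    r->caps = s->mcg_cap;
    memcpy(r->mci_ctl2, s->mci_ctl2, sizeof(r->mci_ctl2));
}

static void vmce_restore(struct vmce_vcpu_state *s,
                         const struct vmce_save_record *r)
{
    /* Everything outside the record starts from a clean state. */
    memset(s, 0, sizeof(*s));
    s->mcg_cap = r->caps;
    memcpy(s->mci_ctl2, r->mci_ctl2, sizeof(s->mci_ctl2));
}
```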

Patch 6: Cleanup guest vMCE check
This patch simplifies the vMCE logic by removing the guest vMCE check, since
the hypervisor should be agnostic to the guest.
With the guest vMCE check, the hypervisor actively kills the guest when the
guest's vMCE handling is not ready. Without the check, the hypervisor always
injects vMCE into the guest: if the guest is ready, it handles the vMCE
itself; if the guest is not ready, it kills itself.

These patches have been tested with Linux and Windows 8 guests and worked fine.

Since Intel and AMD share a common vMCE interface to the guest, these patches
mainly live in vmce.c (except mci_ctl2, which is in Intel-specific code
because these MSRs are supported by Intel only). AMD code can update them
according to AMD-specific requirements.

These patches do not handle the corner case of an MCE occurring during live
migration. We have drafted a patch for that corner case but have hit an issue
(it seems to be caused by the Xen live migration logic itself). We are
debugging it now and will post the patch once it is fixed.

