xen-devel

[Xen-devel] [RFC] RAS(Part II)--MCA enabling in XEN

To: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>, Christoph Egger <Christoph.Egger@xxxxxxx>, "Frank.Vanderlinden@xxxxxxx" <Frank.Vanderlinden@xxxxxxx>, Gavin Maltby <Gavin.Maltby@xxxxxxx>, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx>
Subject: [Xen-devel] [RFC] RAS(Part II)--MCA enabling in XEN
From: "Ke, Liping" <liping.ke@xxxxxxxxx>
Date: Mon, 16 Feb 2009 13:35:14 +0800
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>

Hi all,

These patches enable MCA in XEN. We are sending them as an RFC first, to
collect feedback for refinement before the final patch. A description
document (MCA_desc.txt) is also attached for your reference.
 
Some implementation notes:
1) When an error happens, if it is fatal (pcc = 1) or cannot be recovered
    (pcc = 0, but no good recovery method exists), we reset the machine
    immediately to avoid losing the logs in DOM0. Most MCA MSRs are sticky,
    so after reboot the MCA polling mechanism will send a vIRQ to DOM0 for
    logging.
2) When MCE# happens, all CPUs enter MCA context. The first CPU to read and
    clear the error MSR bank becomes the owner of this MCE#. The necessary
    locks/synchronization help determine the owner and select the most
    severe error.
3) For convenience, we select the most offending CPU to do most of the
    processing and recovery work.
4) When MCE# happens, we do three jobs:
    a. Send a vIRQ to DOM0 for logging
    b. Send vMCE# to the impacted guest (currently injected only into an
       impacted DOM0)
    c. Guest vMCE MSR virtualization
5) Some further improvements/additions might be made if needed:
    a) Impacted-domain judgement algorithm.
    b) vMCE# injection is currently controlled by centralized data
        (vmce_data), and the injection algorithm is a bit complex. We could
        change it to an algorithm based on per-domain data if you prefer.
        Notes for understanding:
        1) If several banks impact one domain and those banks belong to the
            same pCPU, the vMCE# is injected only once.
        2) If more than one bank impacts one domain and the error banks
            belong to different pCPUs, it is injected nr_num(pCPU) times.
        3) We use centralized data [two arrays in vmce_data: impact_domid
            and the impact_cpus map] to represent the injection algorithm.
            The two array items (idx, impact_domid) and (idx, impact_cpus)
            combine into one item (idx, impact_domid, impact_cpus). This
            item records the impacted domain id and the error pCPU map
            (the CPUs on which UC errors impacting this domain were found).
            From it we can decide how to inject the vMCE
            (domid, impact_times[nr_pCPUs]).
        4) Although the data structure is ready, we currently inject vMCE#
            only into DOM0.
    c) Connection with recovery actions (cpu/memory online/offline).
    d) More refinement and testing for HVM might be done when needed.
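
To make the injection-count notes in 5b concrete, here is a minimal
standalone sketch. It is not taken from the patches. It assumes the count
for a domain is the number of distinct pCPUs on which a UC error impacting
that domain was found. All names in it (vmce_data_sketch,
MAX_IMPACTED_DOMAINS, NO_DOMAIN, vmce_init, vmce_record,
vmce_injection_count) are hypothetical, and a plain 64-bit mask stands in
for the real CPU map.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_IMPACTED_DOMAINS 8        /* hypothetical table size */
#define NO_DOMAIN            0xFFFFu  /* hypothetical "slot unused" marker */

/* Illustrative stand-in for vmce_data: slot idx pairs a domain id with
 * the set of pCPUs that reported a UC error impacting that domain. */
struct vmce_data_sketch {
    uint16_t impact_domid[MAX_IMPACTED_DOMAINS];
    uint64_t impact_cpus[MAX_IMPACTED_DOMAINS];  /* bit n set = pCPU n */
};

static void vmce_init(struct vmce_data_sketch *v)
{
    assert(v);
    for (int i = 0; i < MAX_IMPACTED_DOMAINS; i++) {
        v->impact_domid[i] = NO_DOMAIN;
        v->impact_cpus[i] = 0;
    }
}

/* Record that an error bank on pCPU `cpu` impacts domain `domid`.
 * Several banks on the same pCPU set the same bit, so they collapse
 * into one entry. Returns 0 on success, -1 on bad input or full table. */
static int vmce_record(struct vmce_data_sketch *v, uint16_t domid,
                       unsigned int cpu)
{
    int free_idx = -1;

    assert(v);
    if (cpu >= 64 || domid == NO_DOMAIN)
        return -1;

    for (int i = 0; i < MAX_IMPACTED_DOMAINS; i++) {
        if (v->impact_domid[i] == domid) {
            v->impact_cpus[i] |= UINT64_C(1) << cpu;
            return 0;
        }
        if (v->impact_domid[i] == NO_DOMAIN && free_idx < 0)
            free_idx = i;
    }
    if (free_idx < 0)
        return -1;

    v->impact_domid[free_idx] = domid;
    v->impact_cpus[free_idx] = UINT64_C(1) << cpu;
    return 0;
}

/* Number of vMCE# injections for `domid` under the assumption above:
 * one per distinct pCPU recorded for that domain. */
static unsigned int vmce_injection_count(const struct vmce_data_sketch *v,
                                         uint16_t domid)
{
    assert(v);
    for (int i = 0; i < MAX_IMPACTED_DOMAINS; i++) {
        if (v->impact_domid[i] == domid) {
            uint64_t mask = v->impact_cpus[i];
            unsigned int n = 0;
            while (mask) {          /* clear the lowest set bit each pass */
                mask &= mask - 1;
                n++;
            }
            return n;
        }
    }
    return 0;
}
```

With this sketch, recording two banks on pCPU 1 for DOM0 gives a count of
one (note 1), and recording a further bank on pCPU 3 raises it to two
(note 2).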
 
Patch Description:
1. basic_mca_support: Enables MCA support in XEN.
2. vmsr_virtualization: Adds guest MCE# MSR read/write virtualization
    support in XEN.
3. mce_dom0: In cooperation with XEN, DOM0 adds vIRQ and vMCE# handlers.
    It translates the XEN log into DOM0 and reuses the Linux kernel and
    MCELOG mechanisms and the MCE handler. This is mainly a demonstration
    patch.
 
About Testing:
We did some internal testing and the results were fine.
 
Any feedback is welcome, and thanks a lot for your help! :-)
Regards,
Criping

Attachment: MCA_desc.txt
Description: MCA_desc.txt

Attachment: basic_mca_support.patch
Description: basic_mca_support.patch

Attachment: vmsr_virtualization.patch
Description: vmsr_virtualization.patch

Attachment: mce_dom0.patch
Description: mce_dom0.patch
