[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] Xen/MCE: adjust for future new vMCE model



On 07/05/12 11:20, Liu, Jinsong wrote:

> Jan Beulich wrote:
>>>>> On 05.07.12 at 05:30, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx>
>>>>> wrote: @@ -68,7 +74,7 @@ 
>>>
>>>  int vmce_restore_vcpu(struct vcpu *v, uint64_t caps)  {
>>> -    if ( caps & ~g_mcg_cap & ~MCG_CAP_COUNT & ~MCG_CTL_P )
>>> +    if ( caps & ~g_mcg_cap & ~MCG_CAP_COUNT )
>>>      {
>>>          dprintk(XENLOG_G_ERR, "%s restore: unsupported MCA
>>>                  capabilities" " %#" PRIx64 " for d%d:v%u
>>>          (supported: %#Lx)\n", @@ -77,6 +83,12 @@ return -EPERM;
>>>      }
>>>
>>> +    if ( (caps & MCG_CAP_COUNT) != GUEST_BANK_NUM ) +    {
>>> +        dprintk(XENLOG_G_ERR, "Bank number inconsistency\n"); +    
>>> return -EPERM; +    }
>>> +
>>>      v->arch.mcg_cap = caps;
>>>      return 0;
>>>  }
>>
>> These two changes, as pointed out before, need some further
>> thought and tweaking: As I said earlier, we are already shipping
>> the code in its prior form, so outright rejecting MCG_CTL_P set
>> and non-default bank counts is not a proper solution. Warning
>> about them being in an unsupported state is certainly acceptable.
>>
>> And I think the guest visible MCG_CAP value also shouldn't
>> change across migration, so tolerating (but ignoring) a higher
>> bank count seems necessary. Not sure what to do when the
>> bank count is lower (0 or 1) - for 1, all communication to the
>> guest should probably go through bank 0, while 0 should
>> probably disable vMCE  for that vCPU.
>>
>> Further I just noticed that you don't touch fill_vmsr_data() at
>> all (sending patches created with diff's -p option or equivalent
>> helps spotting where individual changes belong), yet that
>> function uses the hardware bank number to fill the struct
>> bank_entry. With the intended concept, the "bank" member
>> of that structure can probably go away altogether.
>>
>> Jan
> 
> 
> Seems things become a little bit chaos, let's jump out from details,

> make agreement first about what's the general purpose of this
> middle-work patch.

> 
> ============
> 1. current xen vmce status
>   1.1) current xen vmce has 2 kinds of bugs:
>     * functional bug: it actually doesn't work correctly for guest;
>     * migration bug: partly solved by c/s 24887;
>   1.2) c/s 24887 not in formal xen release version, it's a temporary fix (but 
> unfortunately has been backported to SELS11 sp2)
> ============
> 2. target of this middle-work patch
>   2.1) it's not used to address functional bug
>   2.3) it does minimal work, just in order not to bring trouble/dirty to 
> future new vMCE, so it would
>     * remove MCG_CTL --> otherwise new vMCE have to add useless MCG_CTL_P and 
> MCG_CTL, w/ potential model specific issue;
>     * stick all 1's to MCi_CTL --> to avoid semantic issue;
>     * set MCG_CTL_P=0 and 2 banks at MCG_CAP --> otherwise dirty code at new 
> vMCE, like bank number issue;
> ============
> 
> I think in this middle-work patch, we don't need constrained by c/s 24887:
> 1. c/s 24887 not in formal xen release
> 2. the benefit of keep compatible w/ 24887 is minor:
>   * currently all xen version have migration bug, 3.x -> 4.0 -> 4.1 -> 4.2
>   * keep compatible w/ 24887 only benefit one case --> migration from SLES11 
> sp2 to xen 4.2 (for both SLES11 sp2 and xen 4.2, vMCE actually doesn't 
> functionally work fine to guest)
> 3. the price/hurt to future vMCE is high (to keep compatible w/ 24887)
> 
> Considering above, I prefer an outright cleanup in xen4.2,

> not constrained by c/s 24887 and not bring trouble/dirty to future
> vMCE.

> 


This all is fine from AMD side.

The point I want to bring up is this:

In xen-unstable vMCE is Intel only *but*
I have a set of patches which will unify a lot of logic
so that vMCE, page-offlining, etc. is also used on AMD

I want to make sure that the vMCE interface works with both
Intel and AMD to avoid yet another vMCE interface change
(and all discussion around this) after the end of the
feature freeze. Another vMCE interface change will affect
both Intel and AMD. So I think this is also in Intel's
interest to have a sane vMCE interface that fits for
both sides.

Christoph


-- 

---to satisfy European Law for business letters:
Advanced Micro Devices GmbH
Einsteinring 24, 85689 Dornach b. Muenchen
Geschaeftsfuehrer: Alberto Bozzo, Andrew Bowd
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.