[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v12 1/9] x86: add generic resource (e.g. MSR) access hypercall

To: "Andrew Cooper" <andrew.cooper3@xxxxxxxxxx>
From: "Jan Beulich" <JBeulich@xxxxxxxx>
Date: Mon, 28 Jul 2014 11:01:56 +0100
Cc: "keir@xxxxxxx" <keir@xxxxxxx>, "Ian.Campbell@xxxxxxxxxx" <Ian.Campbell@xxxxxxxxxx>, "stefano.stabellini@xxxxxxxxxxxxx" <stefano.stabellini@xxxxxxxxxxxxx>, "George.Dunlap@xxxxxxxxxxxxx" <George.Dunlap@xxxxxxxxxxxxx>, "Ian.Jackson@xxxxxxxxxxxxx" <Ian.Jackson@xxxxxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxx>, Dongxiao Xu <dongxiao.xu@xxxxxxxxx>, "dgdegra@xxxxxxxxxxxxx" <dgdegra@xxxxxxxxxxxxx>
Delivery-date: Mon, 28 Jul 2014 10:02:10 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

>>> On 23.07.14 at 11:09, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 23/07/14 08:45, Jan Beulich wrote:
>>>>> On 15.07.14 at 12:00, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> Hmm - I am very sorry for pushing at continue_hypercall_on_cpu(), but
>>> looking at the code, there is no possible way it can be made to work in
>>> its current form.  It will BUG() at the second attempt to nest, making
>>> it inappropriate to use.
>> Not sure why - look at microcode.c's use of it, which also makes
>> use of it more than once for a single hypercall.
> 
> Eugh.  That highlights several more issues.
> 
> Any two continue_hypercall_on_cpu()'s which target the same cpu may fall
> over the BUG_ON(info->nest != 0).  A microcode update and this proposed
> hypercall run a very real risk of intersecting.

I don't think so: info is this_cpu(continue_info) or a freshly allocated
one, but never per_cpu(continue_info, cpu).

> The vcpu_pause_nosync(curr); in continue_hypercall_on_cpu() is bogus
> given the do_microcode_update() example, as it will be pausing the wrong
> vcpu.

And again no: continue_hypercall_tasklet_handler() sets its CPU's
continue_info to the respective struct migrate_info instance (and
only for the duration of the callout). Consequently a nesting call to
continue_hypercall_on_cpu() won't come through the if() branch
leading to said vcpu_pause_nosync().

> As a result, the vcpu which made the hypercall will be unpaused after
> the first continue_hypercall_on_cpu() returns, which is before the
> hypercall has actually completed.
> 
> Irrespective of that, copying data back to guest context from the
> continue_hypercall_tasklet is in the wrong vcpu context, and likely even
> the wrong domain context.

Yes, that's an issue that would need addressing - the microcode use
simply doesn't need to do any copying (nor do any of the other current
ones afaict).

> Finally, we have seen in the past that using AMD SVM with mismatched
> microcode patch levels can result in hardware lockup.  The only safe way
> to apply microcode in these circumstances would be to rendezvous in Xen
> and update every core at once, ensuring that no SVM operations are being
> performed on sibling processors.

But even that doesn't guarantee all CPUs to get updated at _exactly_
the same moment. Such a problem needs to be eliminated in hardware,
or there would need to be an extremely precise list of operations that
are permitted while microcode on different cores is out of sync.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

References:
- Re: [Xen-devel] [PATCH v12 1/9] x86: add generic resource (e.g. MSR) access hypercall
  - From: Xu, Dongxiao
- Re: [Xen-devel] [PATCH v12 1/9] x86: add generic resource (e.g. MSR) access hypercall
  - From: Andrew Cooper
- Re: [Xen-devel] [PATCH v12 1/9] x86: add generic resource (e.g. MSR) access hypercall
  - From: Jan Beulich
- Re: [Xen-devel] [PATCH v12 1/9] x86: add generic resource (e.g. MSR) access hypercall
  - From: Andrew Cooper

Prev by Date: Re: [Xen-devel] [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain
Next by Date: Re: [Xen-devel] [PATCH v2] fix Remus failover regression
Previous by thread: Re: [Xen-devel] [PATCH v12 1/9] x86: add generic resource (e.g. MSR) access hypercall
Next by thread: [Xen-devel] [xen-unstable test] 29319: tolerable FAIL - PUSHED
Index(es):
- Date
- Thread

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.