[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v12 1/9] x86: add generic resource (e.g. MSR) access hypercall



>>> On 23.07.14 at 11:09, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 23/07/14 08:45, Jan Beulich wrote:
>>>>> On 15.07.14 at 12:00, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> Hmm - I am very sorry for pushing at continue_hypercall_on_cpu(), but
>>> looking at the code, there is no possible way it can be made to work in
>>> its current form.  It will BUG() at the second attempt to nest, making
>>> it inappropriate to use.
>> Not sure why - look at microcode.c's use of it, which also makes
>> use of it more than once for a single hypercall.
> 
> Eugh.  That highlights several more issues.
> 
> Any two continue_hypercall_on_cpu()'s which target the same cpu may fall
> over the BUG_ON(info->nest != 0).  A microcode update and this proposed
> hypercall run a very real risk of intersecting.

I don't think so: info is this_cpu(continue_info) or a freshly allocated
one, but never per_cpu(continue_info, cpu).

> The vcpu_pause_nosync(curr); in continue_hypercall_on_cpu() is bogus
> given the do_microcode_update() example, as it will be pausing the wrong
> vcpu.

And again no: continue_hypercall_tasklet_handler() sets its CPU's
continue_info to the respective struct migrate_info instance (and
only for the duration of the callout). Consequently a nesting call to
continue_hypercall_on_cpu() won't come through the if() branch
leading to said vcpu_pause_nosync().

> As a result, the vcpu which made the hypercall will be unpaused after
> the first continue_hypercall_on_cpu() returns, which is before the
> hypercall has actually completed.
> 
> Irrespective of that, copying data back to guest context from the
> continue_hypercall_tasklet is in the wrong vcpu context, and likely even
> the wrong domain context.

Yes, that's an issue that would need addressing - the microcode use
simply doesn't need to do any copying (nor do any of the other current
ones afaict).

> Finally, we have seen in the past that using AMD SVM with mismatched
> microcode patch levels can result in hardware lockup.  The only safe way
> to apply microcode in these circumstances would be to rendezvous in Xen
> and update every core at once, ensuring that no SVM operations are being
> performed on sibling processors.

But even that doesn't guarantee all CPUs to get updated at _exactly_
the same moment. Such a problem needs to be eliminated in hardware,
or there would need to be an extremely precise list of operations that
are permitted while microcode on different cores is out of sync.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.