[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Question about VPID during MOV-TO-CR3



On Fri, Oct 7, 2016 at 9:32 AM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>>>> On 04.10.16 at 17:06, <tim@xxxxxxx> wrote:
>> At 08:29 -0600 on 04 Oct (1475569774), Jan Beulich wrote:
>>> >>> On 04.10.16 at 16:12, <tamas.lengyel@xxxxxxxxxxxx> wrote:
>>> > yes, I understand that is the case when you do need to flush a guest.
>>> > And yes, there seem to be paths that require to bump the tag of a
>>> > specific guest for certain events (mov-to-cr4 with paging mode changes
>>> > for example). What I'm poking at it here is that we invalidate the
>>> > guest TLBs for _all_ guests very frequently. I can't find an
>>> > explanation for why _that_ is required. AFAIK having the TLB tag
>>> > guarantees that no other guest or Xen will have a chance to bump into
>>> > stale entries given no guests or Xen share a TLB tag with each other.
>>> > So the only time I see that we would have to flush all guest TLBs is
>>> > when the tag overflows and we start from 1 again. What am I missing
>>> > here?
>>>
>>> Oh, I see - this indeed looks to be quite a bit more flushing than is
>>> desirable. So the question, as you did put it already, is why it got
>>> done that way in the first place. At the very least it would look like
>>> more control would need to be given to the callers of both
>>> write_cr3() and flush_area_local(). Tim?
>>
>> IIRC:
>>  - Remote TLB flushes are used for safety, e.g. to be sure that no
>>    guest has a mapping of a page before its type or owner changes.
>>    The callers rely on _all_ mappings of the page being gone after
>>    the remote flush.  The simplest way to do that is to flush all tags.
>
> Ah, of course. And that means that no matter that Tamas observed
> no breakage with some of the flushing removed, it can't be dropped
> altogether.
>
>>  - We believed that on the then-current hardware, and with the
>>    scheduling timeslice we had, there wasn't an awful lot of
>>    benefit to keeping the tags of descheduled VMs around.
>>  - Although it might sometimes be safe to leave some tags unflushed,
>>    it wasn't clear exactly when that would be.  E.g. I don't think
>>    that whether the tag is 'current' is a very useful test -- either
>>    the tag might contain dangerous mappings or it might not.
>>
>> Since there are cases where we already mask TLB flushes by domain
>> (usign the dirty-cpumask) I can see that we might pass that domain ID
>> to the remote CPU and drop only that domain's tags.
>>
>> And for HAP guests it may be possible to distinguish between "guest"
>> flushes (e.g. emulating guest CR3 writes) and "hypervisor" flushes
>> (e.g. after grant/p2m ops), and target "guest" flushes at particular
>> VCPUs.
>
> Right. Question is whether there are any such operations
> occurring frequently enough that optimizing this would make
> sense. I don't see HVM code paths leading to write_cr3(), and
> I don't think there are a whole lot leading to flush_area_local().
> Did you gain any insight in this regard, Tamas?

There are a ton of calls to flush_area_local, and a good chunk of them
with the idle vCPU being the active one when it is called. As for
write_cr3, there are also a lot of calls there. When I added some
debug output to observe just how many dom0 would take almost an hour
to boot and the serial line would just be spammed with that printk. So
even if there no HVM paths leading there, others paths definitely do
that affect HVM guests by making all of them take on a new tag next
time they are scheduled.

Tamas

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.