
Re: [PATCH] x86/PV: make post-migration page state consistent



On 11.09.2020 13:55, Andrew Cooper wrote:
> On 11/09/2020 11:34, Jan Beulich wrote:
>> When a page table page gets de-validated, its type reference count drops
>> to zero (and PGT_validated gets cleared), but its type remains intact.
>> XEN_DOMCTL_getpageframeinfo3, therefore, so far reported prior usage for
>> such pages. An intermediate write to such a page via e.g.
>> MMU_NORMAL_PT_UPDATE, however, would transition the page's type to
>> PGT_writable_page, thus altering what XEN_DOMCTL_getpageframeinfo3 would
>> return. In libxc the decision which pages to normalize / localize
>> depends solely on the type returned from the domctl. As a result without
>> further precautions the guest won't be able to tell whether such a page
>> has had its (apparent) PTE entries transitioned to the new MFNs.
> 
> I'm afraid I don't follow what the problem is.
> 
> Yes - unvalidated pages probably ought to be consistently NOTAB, so this
> is probably a good change, but I don't see how it impacts the migration
> logic.

It's not the migration logic itself that's impacted, but the state
of guest pages after migration. I'm afraid I can only try to expand
on the original description.

Case 1: Once an Ln page has been unvalidated, due to the described
behavior the migration code in libxc will normalize and then localize
it. Therefore the guest could go and directly try to use it as a
page table again. This should work as long as all of the entries in
the page can still be successfully validated (i.e. unless the guest
itself has made changes to the state of other pages).

Case 2: Once an Ln page has been unvalidated, the guest for whatever
reason still writes to it through e.g. MMU_NORMAL_PT_UPDATE. Prior
to migration, and provided the new entry can be validated (and no
other referenced page has changed state), the page can still be
converted to a proper page table one again. If, however, migration
occurs in between, the page now won't get normalized and then
localized. The MFNs in it are unlikely to make sense anymore, and
hence an attempt to make the page a page table again is likely to
fail (or if it doesn't fail the result is unlikely to be what's
intended).

Since there's no way to make case 2 "work", the only choice is to
make case 1 behave like case 2, in order for the behavior to be
predictable / consistent.
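
To illustrate the two cases, here is a toy model in Python. It is only a
sketch and not Xen or libxc code: the Page class, reported_type(),
migrate(), the report_unvalidated_as_notab flag and the MFN values are all
hypothetical, and normalize plus localize are collapsed into a single
old-MFN to new-MFN lookup. It assumes the change amounts to reporting
unvalidated page table pages as NOTAB.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the types the domctl reports.
NOTAB, L1TAB = "NOTAB", "L1TAB"


@dataclass
class Page:
    type: str        # type recorded for the page
    validated: bool  # whether the page is currently validated
    entries: list    # MFNs held in the page's (apparent) PTEs


def reported_type(page, report_unvalidated_as_notab):
    """Type the toy 'domctl' reports for a page (assumed behaviour)."""
    if (page.type == L1TAB and not page.validated
            and report_unvalidated_as_notab):
        return NOTAB
    return page.type


def migrate(page, old_to_new, report_unvalidated_as_notab):
    """Return the page's entries as seen on the destination host."""
    if reported_type(page, report_unvalidated_as_notab) == L1TAB:
        # Normalize (MFN -> PFN) and localize (PFN -> new MFN), collapsed.
        return [old_to_new[mfn] for mfn in page.entries]
    # Pages reported as NOTAB are copied verbatim.
    return list(page.entries)


old_to_new = {100: 900, 101: 901}

# Case 1: de-validated, type left intact.
case1 = Page(L1TAB, False, [100, 101])
# Case 2: de-validated, then written to, so the type became writable.
case2 = Page(NOTAB, False, [100, 101])

for flag in (False, True):
    print(flag,
          migrate(case1, old_to_new, flag),
          migrate(case2, old_to_new, flag))
```

With the flag off, case 1 comes out translated while case 2 keeps the old
MFNs. With the flag on, both keep the old MFNs, so the guest sees the same
outcome whichever path it took.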

> We already have to cope with a page really changing types in parallel
> with the normalise/localise logic (that was a "fun" one to debug), which
> is why errors in that logic are specifically not fatal while the guest
> is live - the frame gets re-marked as dirty, and deferred until the next
> round.
> 
> Errors encountered after the VM has been paused are fatal.
> 
> However, at no point, even with an unvalidated pagetable type, can the
> contents of the page be anything other than legal PTEs.  (I think)

Correct, because in order to write to the page one either has to
make it a page table one again (and then write through a hypercall
or, for L1, through PTWR), or MMU_NORMAL_PT_UPDATE would first
convert the page to a writable one.

Jan



 

