
Re: [Xen-devel] [PATCH v5][XSA-97] x86/paging: make log-dirty operations preemptible



On 15/09/2014 13:54, Jan Beulich wrote:
> On 15.09.14 at 09:50, <andrew.cooper3@xxxxxxxxxx> wrote:
>> It is indeed migration v2, which is necessary in XenServer given our
>> recent switch from a 32-bit dom0 to a 64-bit one.  The counts are only
>> used for logging and debugging purposes; all movement of pages is based
>> on the bits in the bitmap alone.  In particular, the dirty count is used
>> as the basis of the statistics for the present iteration of migration.
>> While getting it wrong is not the end of the world, it would certainly
>> be preferable for the count to be accurate.
>>
>> As for the memory corruption, XenRT usually tests pairs of VMs at a time
>> (32-bit and 64-bit variants), with all operations as back-to-back as
>> possible.  Therefore, it is highly likely that a continued operation on
>> one domain intersects with other paging operations on another.
> But there's nothing I can see where domains would have a way
> of getting mismatched.  It is in particular this one
>
> (XEN) [ 7832.953068] mm.c:827:d0v0 pg_owner 100 l1e_owner 100, but
> real_pg_owner 99
>
> which puzzles me: Assuming Dom99 was the original one, how
> would Dom100 get hold of any of Dom99's pages (IOW why would
> Dom0 map one of Dom99's pages into Dom100)?  The patch doesn't
> alter any of the page refcounting after all.  Nor does your v2
> migration series, I would think.

In this case, dom99 was migrating to dom100. The failure was part of verifying dom100v0's cr3 at the point of loading vcpu state, so Xen was in the process of pinning pagetables.

There were no errors during pagetable normalisation, so dom99's PTEs were all correct, and there were no errors restoring any of dom100's memory, so Xen had fully allocated frames for dom100's memory during the populate_physmap() hypercalls.

During pagetable normalisation, dom99's pfns in the stream are converted to dom100's mfns as per the newly created p2m from the populate_physmap() allocations. Then during dom100's cr3 validation, it finds a dom99 PTE and complains.

Therefore, a frame Xen handed back to the toolstack as part of allocating dom100's memory still belonged to dom99.
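
To illustrate the step I am describing (a minimal sketch only; the flat pfn-to-mfn array and the function name are mine, not the actual migration v2 code): every PTE in the stream carries one of dom99's pfns, and on restore the frame number gets rewritten with the mfn newly allocated for dom100.  If one of those "new" mfns in fact still belongs to dom99, the later ownership check during pagetable validation trips exactly as in the log above.

/* Minimal sketch: rewrite a 64-bit PTE's frame number on restore.
 * The flat pfn_to_mfn[] array (built from the populate_physmap()
 * results) and the function name are illustrative only. */
#include <stdint.h>

#define PAGE_SHIFT      12
/* Bits 0-11 (flags) and 52-63 (software bits/NX) are preserved;
 * bits 12-51 hold the frame number. */
#define PTE_FLAGS_MASK  0xfff0000000000fffULL

static uint64_t rewrite_pte_frame(const uint64_t *pfn_to_mfn, uint64_t pte)
{
    uint64_t pfn = (pte & ~PTE_FLAGS_MASK) >> PAGE_SHIFT;
    uint64_t mfn = pfn_to_mfn[pfn];

    /* Keep the flag bits, substitute the newly allocated frame. */
    return (pte & PTE_FLAGS_MASK) | (mfn << PAGE_SHIFT);
}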


> In general I understand you - as much as I - suspect that we're
> losing one or more bits from the dirty bitmap (too many being set
> wouldn't do any harm other than affecting performance afaict),
> but that scenario doesn't seem to fit with your observations.

I would agree - given that only the log-dirty handling has changed, it is not obvious how the patch could be causing corruption like this.

I think I will need to debug this issue properly, but I won't be in a position to do that until next week.


>> The results (now they have run fully) are 10 tests each: 10 passes
>> without this patch, and 10 failures in similar ways with the patch,
>> spread across a randomly selected set of hardware.
>
> I was meanwhile considering the call to
> d->arch.paging.log_dirty.clean_dirty_bitmap() getting made only
> in the final success exit case to be a problem (with the paging lock
> dropped perhaps multiple times in between), but I'm pretty certain
> it isn't: newly dirtied pages would get accounted correctly in the
> bitmap no matter whether they're in the range already processed
> or in the remainder, and ones already having been p2m_ram_rw
> would have no problem if further writes to them happen while we
> do continuations.  The only thing potentially suffering here seems
> to be efficiency: we might return a few pages to p2m_ram_logdirty
> without strict need (but that issue existed before already; we're
> just widening the window).
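
(For anyone following along, a toy model of the chunked copy-and-clear with continuations is below; the names and sizes are made up and this is not the actual paging code, but it shows why only the final successful pass matters for flipping pages back to p2m_ram_logdirty.)

/* Toy model: the dirty bitmap is copied out and cleared one chunk per
 * pass, with the paging lock dropped (and a continuation scheduled)
 * between passes.  Illustrative only. */
#include <stdbool.h>
#include <string.h>

#define CHUNK  64     /* words handled per pass */
#define TOTAL  1024   /* total bitmap words     */

struct toy_domain {
    unsigned long dirty[TOTAL]; /* dirty bitmap as maintained by Xen       */
    unsigned long cursor;       /* progress preserved across continuations */
};

/* Returns true if more work remains, i.e. a continuation is needed. */
static bool clean_one_chunk(struct toy_domain *d, unsigned long *out)
{
    memcpy(out + d->cursor, d->dirty + d->cursor, CHUNK * sizeof(*out));
    memset(d->dirty + d->cursor, 0, CHUNK * sizeof(*out));

    d->cursor += CHUNK;
    if ( d->cursor < TOTAL )
        return true;  /* lock dropped here; a newly dirtied page simply
                       * re-sets its bit, wherever it falls */

    /* Final success exit: only now would pages be returned to
     * p2m_ram_logdirty (clean_dirty_bitmap() in the real code). */
    d->cursor = 0;
    return false;
}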

It will defer the notification of a page being dirtied until the subsequent CLEAN/PEEK operation, but I believe it's all fine.  The final CLEAN operation happens after pausing the domain, so there will be no further activity (other than from the backends, which are compensated for).
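
To spell out the sequence I have in mind (a simplified sketch; struct ctx and the helper names are stand-ins rather than the real libxc functions): each CLEAN both fetches and resets the bitmap, so a deferred dirty bit is simply picked up by the next CLEAN, and the final CLEAN only runs once the domain is paused.

/* Simplified shape of the save-side loop; everything here is a
 * stand-in, not the actual migration v2 code. */
struct ctx;

int get_and_clean_dirty_bitmap(struct ctx *c); /* XEN_DOMCTL_SHADOW_OP_CLEAN */
int send_dirty_pages(struct ctx *c);           /* driven by the bitmap bits;
                                                * the dirty count is only logged */
int below_dirty_threshold(struct ctx *c);
int pause_domain(struct ctx *c);

#define MAX_ITERS 5

int send_memory_live(struct ctx *c)
{
    int iter, rc;

    for ( iter = 0; iter < MAX_ITERS; ++iter )
    {
        if ( (rc = get_and_clean_dirty_bitmap(c)) ||
             (rc = send_dirty_pages(c)) )
            return rc;

        if ( below_dirty_threshold(c) )
            break;
    }

    if ( (rc = pause_domain(c)) )     /* no further guest writes */
        return rc;

    /* Final CLEAN with the domain paused: only backend activity can
     * still dirty pages, and that is compensated for separately. */
    if ( (rc = get_and_clean_dirty_bitmap(c)) )
        return rc;

    return send_dirty_pages(c);
}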

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

