Actually CCing Patrick this time...
Tim.
At 10:22 +0000 on 24 Nov (1290594122), Tim Deegan wrote:
> Hi,
>
> At 21:01 +0000 on 23 Nov (1290546118), Olaf Hering wrote:
> > what is the purpose of the mfn_to_gfn() check in
> > guest_physmap_add_entry()?
>
> It's intended to stop a VM aliasing two PFNs to the same MFN.
>
> > This function gets a fresh mfn and its gfn passed to update the global
> > p2m state. But before doing that, it checks wether that fresh mfn maps
> > still to some gfn. If it does, it checks if that gfn maps to any mfn. If
> > it doesnt, Xen crashes with an assert.
> >
> > Now, if that mfn was part of an old guest, should that old guest cleanup
> > all of its mfns and update the machine_to_phys_mapping[]? Appearently
> > that rarely happens.
> > And if there is still some random gfn number in that array, the function
> > tries to see what happens with this number in the current guests
> > context. IF that number happens to be a gfn in paged-out state, there
> > will be no mfn. So the ASSERT triggers.
>
> The assert is guarded by p2m_is_ram(), which ought to imply that there
> was actual RAM behind it. There are other places in the code where
> p2m_is_ram() is used to check that the associated mfn is valid
> (e.g. when loading CR3).
>
> I think that adding the paging types (in particular p2m_ram_paged) to
> P2M_RAM_TYPES is a mistake, unless gfn_to_mfn() guarantees to get the
> pfn into a state where it's backed by an MFN before it returns (which it
> probably can't).
>
> Another option would be to audit all callers of p2m_is_ram() and check
> whether they can handle paged-out PFNs (which I though had already been
> done but seems not to be). Keir's VCPU yield patch might be useful in
> some of those places too.
>
> Cc'ing Patrick for comments.
>
> > I would guess that if guest_physmap_add_entry() gets a page with type
> > p2m_ram_rw, nothing else can own that page. Is that right?
> > If so, this ASSERT or most of the loop can be removed.
>
> The loop shouldn't be removed without some more thought about aliasing
> PFNs, and I think that removing the ASSERT plasters over a deeper
> problem.
>
> The BUG_ON() just above it, on the other hand, is not correct, and I'll
> remove it.
>
> Cheers,
>
> Tim.
>
> --
> Tim Deegan <Tim.Deegan@xxxxxxxxxx>
> Principal Software Engineer, Xen Platform Team
> Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)
> # HG changeset patch
> # User Tim Deegan <Tim.Deegan@xxxxxxxxxx>
> # Date 1290594003 0
> # Node ID 79b71c77907b80772ee8cba0c5bbf8e444e61226
> # Parent e5c4e925e1bd15baeadc0817dcceb5fff54b8a74
> x86/mm: remove incorrect BUG_ON.
> This BUG_ON tests a property of an effectively random PFN in the guest,
> and is explicitly _not_ seeing the MFN that's known to be owned.
>
> Signed-off-by: Tim Deegan <Tim.Deegan@xxxxxxxxxx>
>
> diff -r e5c4e925e1bd -r 79b71c77907b xen/arch/x86/mm/p2m.c
> --- a/xen/arch/x86/mm/p2m.c Tue Nov 23 19:36:14 2010 +0000
> +++ b/xen/arch/x86/mm/p2m.c Wed Nov 24 10:20:03 2010 +0000
> @@ -2380,9 +2380,6 @@ guest_physmap_add_entry(struct p2m_domai
> P2M_DEBUG("aliased! mfn=%#lx, old gfn=%#lx, new gfn=%#lx\n",
> mfn + i, ogfn, gfn + i);
> omfn = gfn_to_mfn_query(p2m, ogfn, &ot);
> - /* If we get here, we know the local domain owns the page,
> - so it can't have been grant mapped in. */
> - BUG_ON( p2m_is_grant(ot) );
> if ( p2m_is_ram(ot) )
> {
> ASSERT(mfn_valid(omfn));
--
Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|