[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH V6 4/4] x86/altp2m: fix display frozen when switching to a new view early


  • To: Jan Beulich <JBeulich@xxxxxxxx>, Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx>
  • From: George Dunlap <george.dunlap@xxxxxxxxxx>
  • Date: Fri, 16 Nov 2018 12:03:07 +0000
  • Autocrypt: addr=george.dunlap@xxxxxxxxxx; prefer-encrypt=mutual; keydata= xsFNBFPqG+MBEACwPYTQpHepyshcufo0dVmqxDo917iWPslB8lauFxVf4WZtGvQSsKStHJSj 92Qkxp4CH2DwudI8qpVbnWCXsZxodDWac9c3PordLwz5/XL41LevEoM3NWRm5TNgJ3ckPA+J K5OfSK04QtmwSHFP3G/SXDJpGs+oDJgASta2AOl9vPV+t3xG6xyfa2NMGn9wmEvvVMD44Z7R W3RhZPn/NEZ5gaJhIUMgTChGwwWDOX0YPY19vcy5fT4bTIxvoZsLOkLSGoZb/jHIzkAAznug Q7PPeZJ1kXpbW9EHHaUHiCD9C87dMyty0N3TmWfp0VvBCaw32yFtM9jUgB7UVneoZUMUKeHA fgIXhJ7I7JFmw3J0PjGLxCLHf2Q5JOD8jeEXpdxugqF7B/fWYYmyIgwKutiGZeoPhl9c/7RE Bf6f9Qv4AtQoJwtLw6+5pDXsTD5q/GwhPjt7ohF7aQZTMMHhZuS52/izKhDzIufl6uiqUBge 0lqG+/ViLKwCkxHDREuSUTtfjRc9/AoAt2V2HOfgKORSCjFC1eI0+8UMxlfdq2z1AAchinU0 eSkRpX2An3CPEjgGFmu2Je4a/R/Kd6nGU8AFaE8ta0oq5BSFDRYdcKchw4TSxetkG6iUtqOO ZFS7VAdF00eqFJNQpi6IUQryhnrOByw+zSobqlOPUO7XC5fjnwARAQABzSRHZW9yZ2UgVy4g RHVubGFwIDxkdW5sYXBnQHVtaWNoLmVkdT7CwYAEEwEKACoCGwMFCwkIBwMFFQoJCAsFFgID AQACHgECF4ACGQEFAlpk2IEFCQo9I54ACgkQpjY8MQWQtG1A1BAAnc0oX3+M/jyv4j/ESJTO U2JhuWUWV6NFuzU10pUmMqpgQtiVEVU2QbCvTcZS1U/S6bqAUoiWQreDMSSgGH3a3BmRNi8n HKtarJqyK81aERM2HrjYkC1ZlRYG+jS8oWzzQrCQiTwn3eFLJrHjqowTbwahoiMw/nJ+OrZO /VXLfNeaxA5GF6emwgbpshwaUtESQ/MC5hFAFmUBZKAxp9CXG2ZhTP6ROV4fwhpnHaz8z+BT NQz8YwA4gkmFJbDUA9I0Cm9D/EZscrCGMeaVvcyldbMhWS+aH8nbqv6brhgbJEQS22eKCZDD J/ng5ea25QnS0fqu3bMrH39tDqeh7rVnt8Yu/YgOwc3XmgzmAhIDyzSinYEWJ1FkOVpIbGl9 uR6seRsfJmUK84KCScjkBhMKTOixWgNEQ/zTcLUsfTh6KQdLTn083Q5aFxWOIal2hiy9UyqR VQydowXy4Xx58rqvZjuYzdGDdAUlZ+D2O3Jp28ez5SikA/ZaaoGI9S1VWvQsQdzNfD2D+xfL qfd9yv7gko9eTJzv5zFr2MedtRb/nCrMTnvLkwNX4abB5+19JGneeRU4jy7yDYAhUXcI/waS /hHioT9MOjMh+DoLCgeZJYaOcgQdORY/IclLiLq4yFnG+4Ocft8igp79dbYYHkAkmC9te/2x Kq9nEd0Hg288EO/OwE0EVFq6vQEIAO2idItaUEplEemV2Q9mBA8YmtgckdLmaE0uzdDWL9To 1PL+qdNe7tBXKOfkKI7v32fe0nB4aecRlQJOZMWQRQ0+KLyXdJyHkq9221sHzcxsdcGs7X3c 17ep9zASq+wIYqAdZvr7pN9a3nVHZ4W7bzezuNDAvn4EpOf/o0RsWNyDlT6KECs1DuzOdRqD oOMJfYmtx9hMzqBoTdr6U20/KgnC/dmWWcJAUZXaAFp+3NYRCkk7k939VaUpoY519CeLrymd Vdke66KCiWBQXMkgtMGvGk5gLQLy4H3KXvpXoDrYKgysy7jeOccxI8owoiOdtbfM8TTDyWPR Ygjzb9LApA8AEQEAAcLBZQQYAQoADwIbDAUCWmTXMwUJB+tP9gAKCRCmNjwxBZC0bb+2D/9h jn1k5WcRHlu19WGuH6q0Kgm1LRT7PnnSz904igHNElMB5a7wRjw5kdNwU3sRm2nnmHeOJH8k Yj2Hn1QgX5SqQsysWTHWOEseGeoXydx9zZZkt3oQJM+9NV1VjK0bOXwqhiQyEUWz5/9l467F S/k4FJ5CHNRumvhLa0l2HEEu5pxq463HQZHDt4YE/9Y74eXOnYCB4nrYxQD/GSXEZvWryEWr eDoaFqzq1TKtzHhFgQG7yFUEepxLRUUtYsEpT6Rks2l4LCqG3hVD0URFIiTyuxJx3VC2Ta4L H3hxQtiaIpuXqq2D4z63h6vCx2wxfZc/WRHGbr4NAlB81l35Q/UHyMocVuYLj0llF0rwU4Aj iKZ5qWNSEdvEpL43fTvZYxQhDCjQTKbb38omu5P4kOf1HT7s+kmQKRtiLBlqHzK17D4K/180 ADw7a3gnmr5RumcZP3NGSSZA6jP5vNqQpNu4gqrPFWNQKQcW8HBiYFgq6SoLQQWbRxJDHvTR YJ2ms7oCe870gh4D1wFFqTLeyXiVqjddENGNaP8ZlCDw6EU82N8Bn5LXKjR1GWo2UK3CjrkH pTt3YYZvrhS2MO2EYEcWjyu6LALF/lS6z6LKeQZ+t9AdQUcILlrx9IxqXv6GvAoBLJY1jjGB q+/kRPrWXpoaQn7FXWGfMqU+NkY9enyrlw==
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Kevin Tian <kevin.tian@xxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 16 Nov 2018 12:03:47 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Openpgp: preference=signencrypt

On 11/16/18 10:08 AM, Jan Beulich wrote:
>>>> On 15.11.18 at 18:54, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>> On 11/15/18 7:34 PM, George Dunlap wrote:
>>>> On Nov 14, 2018, at 8:40 PM, Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx> 
>>>> wrote:
>>>> @@ -1440,6 +1443,11 @@ void p2m_init_altp2m_ept(struct domain *d, unsigned 
>>>> int i)
>>>>     struct p2m_domain *hostp2m = p2m_get_hostp2m(d);
>>>>     struct ept_data *ept;
>>>>
>>>> +    p2m->max_mapped_pfn = hostp2m->max_mapped_pfn;
>>>
>>> Why we need to copy this value?
>>>
>>> I’ve just spent the afternoon tracing around the source code, and while I’m 
>> pretty sure it won’t hurt, I’m also not sure why it’s necessary.
>>
>> I think I did that because I assumed that it would matter for
>> ept_get_entry() and misconfigurations, when:
>>
>>  954     /* This pfn is higher than the highest the p2m map currently
>> holds */
>>  955     if ( gfn > p2m->max_mapped_pfn )
> 
> The question is whether under any condition such checks require that
> the accumulated value be in sync between the host and the various
> alt p2m-s.
> 
>>  956     {
>>  957         for ( i = ept->wl; i > 0; --i )
>>  958             if ( (gfn & ~((1UL << (i * EPT_TABLE_ORDER)) - 1)) >
>>  959                  p2m->max_mapped_pfn )
>>  960                 break;
>>  961         goto out;
>>  962     }
>>
>> but I am not certain it is required either. I can certainly remove this
>> assignment and see if anything breaks.
> 
> I've seen you mention such or alike a couple of times now while
> discussing this series, and I have to admit that I'm a little frightened:
> Making a change just based on the observation that nothing breaks
> is setting us up for breakage in some corner case. That is - seeing
> something break is a good indication that a change was wrong, but
> not seeing any breakage doesn't alone justify a change.
> 
> In the particular case here I think whether the copying of the field
> is needed depends on what else is being copied (I'm sorry, I'm not
> that familiar with altp2m): If mappings get cloned from the host p2m,
> then this value ought to be cloned too. For any mappings that
> might be kept in sync between p2m-s, I think this field also would
> need to be updated in sync.

So the basic question is, does "max_mapped_pfn" mean, "Maximum pfn _for
the domain_", or "Maximum pfn _for this p2m_".  When the element was
added to the p2m struct those were the same thing.  Which do the various
use cases expect it to be, and which would be the most robust going forward?

I spent a bunch of time going through the code yesterday, and I'm pretty
sure that as long as the value in the p2m is one or the other, there
will be at the moment no _correctness_ issues.  It looked to me like in
the case where altp2m->max_mapped_pfn > altp2m->max_remapped_gfn, then
the p2m machinery would do a certain amount of unnecessary work, but
that's all.

It also looked to me like before this patch, the value mostly ends up
being  "maximum pfn ever mapped in this p2m (even across altp2m
desroys)".  That's because when the altp2m is allocated, it starts as 0;
every time an entry is propagated from the hostp2m to the altp2m,
ept_set_entry() updates max_mapped_pfn; but nothing sets it back to zero.

Also, hostp2m->max_mapped_pfn is never decreased, only increased.

So either before or after this patch, altp2m->max_remapped_gfn <=
altp2m->max_mapped_pfn <= hostp2m->max_mapped_pfn.

In the vast majority of cases, max_mapped_pfn is explicitly being read
from the hostp2m.

Below are the cases where it may be read from an altp2m:

 - ept_get_entry(), ept_walk_tables(): If ==max_remapped_gfn, it will
return INVALID_MFN early.  If higher than max_remapped_gfn, it falls
back to walking the altp2m's ept tables, which will give you the same
answer, just a bit more slowly.

 - ept_dump_p2m_tables(): If ==max_remapped_gfn, it will dump only up to
the current map; if >max_remapped_gfn, it will dump a number of
unnecessary INVALID_MFN entries, but no more entries than the hostp2m.

 - p2m.c:change_type_range(): If ==max_remapped_gfn, it will only
invalidate entries in the altp2m as high have been currently remapped.
If >max_remapped_gfn, it will write invalid ept entries that *haven't*
yet been copied over.  But I don't think either one should cause a
correctness issue: either way, accessing a gfn > max_remapped_gfn will
cause a fault, at which point either the correct value will be copied
from the hostp2m (perhaps going through resolve_misconfig() on the
hostp2m in the process) or the correct value will be calculated via
resolve_misconfig().

So, it seemed from my investigation like it would be more useful if we
got rid of altp2m->max_remapped_gfn, and used atlp2m->max_mapped_pfn
instead.  If people want to know the maximum gfn mapped by the domain as
a whole, they should just use hostp2m->max_mapped_pfn.

The code is definitely complicated enough, though, that I may have
missed something, which is why I asked Razvan if there was a reason he
changed it.

For the purposes of this patch, I propose having p2m_altp2m_init_ept()
set max_mapped_pfn to 0 (if that works), and leaving "get rid of
max_remapped_pfn" for a future clean-up series.

That said -- Razvan, I realize the code is quite complex, but it's still
important that when you modify things you work to have a clear
understanding of why you're making changes.  The analysis I did above
with how max_mapped_pfn is used and what the effects would be one way or
the other should have been something you did before modifying it.
Otherwise things will only get more incomprehensible and buggy.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.