[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH RFC] x86/altp2m: fix display frozen when switching to a new view early


  • To: Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>, <george.dunlap@xxxxxxxxxxxxx>
  • From: George Dunlap <george.dunlap@xxxxxxxxxx>
  • Date: Wed, 19 Sep 2018 13:15:53 +0100
  • Autocrypt: addr=george.dunlap@xxxxxxxxxx; prefer-encrypt=mutual; keydata= xsFNBFPqG+MBEACwPYTQpHepyshcufo0dVmqxDo917iWPslB8lauFxVf4WZtGvQSsKStHJSj 92Qkxp4CH2DwudI8qpVbnWCXsZxodDWac9c3PordLwz5/XL41LevEoM3NWRm5TNgJ3ckPA+J K5OfSK04QtmwSHFP3G/SXDJpGs+oDJgASta2AOl9vPV+t3xG6xyfa2NMGn9wmEvvVMD44Z7R W3RhZPn/NEZ5gaJhIUMgTChGwwWDOX0YPY19vcy5fT4bTIxvoZsLOkLSGoZb/jHIzkAAznug Q7PPeZJ1kXpbW9EHHaUHiCD9C87dMyty0N3TmWfp0VvBCaw32yFtM9jUgB7UVneoZUMUKeHA fgIXhJ7I7JFmw3J0PjGLxCLHf2Q5JOD8jeEXpdxugqF7B/fWYYmyIgwKutiGZeoPhl9c/7RE Bf6f9Qv4AtQoJwtLw6+5pDXsTD5q/GwhPjt7ohF7aQZTMMHhZuS52/izKhDzIufl6uiqUBge 0lqG+/ViLKwCkxHDREuSUTtfjRc9/AoAt2V2HOfgKORSCjFC1eI0+8UMxlfdq2z1AAchinU0 eSkRpX2An3CPEjgGFmu2Je4a/R/Kd6nGU8AFaE8ta0oq5BSFDRYdcKchw4TSxetkG6iUtqOO ZFS7VAdF00eqFJNQpi6IUQryhnrOByw+zSobqlOPUO7XC5fjnwARAQABzSRHZW9yZ2UgVy4g RHVubGFwIDxkdW5sYXBnQHVtaWNoLmVkdT7CwYAEEwEKACoCGwMFCwkIBwMFFQoJCAsFFgID AQACHgECF4ACGQEFAlpk2IEFCQo9I54ACgkQpjY8MQWQtG1A1BAAnc0oX3+M/jyv4j/ESJTO U2JhuWUWV6NFuzU10pUmMqpgQtiVEVU2QbCvTcZS1U/S6bqAUoiWQreDMSSgGH3a3BmRNi8n HKtarJqyK81aERM2HrjYkC1ZlRYG+jS8oWzzQrCQiTwn3eFLJrHjqowTbwahoiMw/nJ+OrZO /VXLfNeaxA5GF6emwgbpshwaUtESQ/MC5hFAFmUBZKAxp9CXG2ZhTP6ROV4fwhpnHaz8z+BT NQz8YwA4gkmFJbDUA9I0Cm9D/EZscrCGMeaVvcyldbMhWS+aH8nbqv6brhgbJEQS22eKCZDD J/ng5ea25QnS0fqu3bMrH39tDqeh7rVnt8Yu/YgOwc3XmgzmAhIDyzSinYEWJ1FkOVpIbGl9 uR6seRsfJmUK84KCScjkBhMKTOixWgNEQ/zTcLUsfTh6KQdLTn083Q5aFxWOIal2hiy9UyqR VQydowXy4Xx58rqvZjuYzdGDdAUlZ+D2O3Jp28ez5SikA/ZaaoGI9S1VWvQsQdzNfD2D+xfL qfd9yv7gko9eTJzv5zFr2MedtRb/nCrMTnvLkwNX4abB5+19JGneeRU4jy7yDYAhUXcI/waS /hHioT9MOjMh+DoLCgeZJYaOcgQdORY/IclLiLq4yFnG+4Ocft8igp79dbYYHkAkmC9te/2x Kq9nEd0Hg288EO/OwE0EVFq6vQEIAO2idItaUEplEemV2Q9mBA8YmtgckdLmaE0uzdDWL9To 1PL+qdNe7tBXKOfkKI7v32fe0nB4aecRlQJOZMWQRQ0+KLyXdJyHkq9221sHzcxsdcGs7X3c 17ep9zASq+wIYqAdZvr7pN9a3nVHZ4W7bzezuNDAvn4EpOf/o0RsWNyDlT6KECs1DuzOdRqD oOMJfYmtx9hMzqBoTdr6U20/KgnC/dmWWcJAUZXaAFp+3NYRCkk7k939VaUpoY519CeLrymd Vdke66KCiWBQXMkgtMGvGk5gLQLy4H3KXvpXoDrYKgysy7jeOccxI8owoiOdtbfM8TTDyWPR Ygjzb9LApA8AEQEAAcLBZQQYAQoADwIbDAUCWmTXMwUJB+tP9gAKCRCmNjwxBZC0bb+2D/9h jn1k5WcRHlu19WGuH6q0Kgm1LRT7PnnSz904igHNElMB5a7wRjw5kdNwU3sRm2nnmHeOJH8k Yj2Hn1QgX5SqQsysWTHWOEseGeoXydx9zZZkt3oQJM+9NV1VjK0bOXwqhiQyEUWz5/9l467F S/k4FJ5CHNRumvhLa0l2HEEu5pxq463HQZHDt4YE/9Y74eXOnYCB4nrYxQD/GSXEZvWryEWr eDoaFqzq1TKtzHhFgQG7yFUEepxLRUUtYsEpT6Rks2l4LCqG3hVD0URFIiTyuxJx3VC2Ta4L H3hxQtiaIpuXqq2D4z63h6vCx2wxfZc/WRHGbr4NAlB81l35Q/UHyMocVuYLj0llF0rwU4Aj iKZ5qWNSEdvEpL43fTvZYxQhDCjQTKbb38omu5P4kOf1HT7s+kmQKRtiLBlqHzK17D4K/180 ADw7a3gnmr5RumcZP3NGSSZA6jP5vNqQpNu4gqrPFWNQKQcW8HBiYFgq6SoLQQWbRxJDHvTR YJ2ms7oCe870gh4D1wFFqTLeyXiVqjddENGNaP8ZlCDw6EU82N8Bn5LXKjR1GWo2UK3CjrkH pTt3YYZvrhS2MO2EYEcWjyu6LALF/lS6z6LKeQZ+t9AdQUcILlrx9IxqXv6GvAoBLJY1jjGB q+/kRPrWXpoaQn7FXWGfMqU+NkY9enyrlw==
  • Cc: andrew.cooper3@xxxxxxxxxx, kevin.tian@xxxxxxxxx, wei.liu2@xxxxxxxxxx, jbeulich@xxxxxxxx, jun.nakajima@xxxxxxxxx
  • Delivery-date: Wed, 19 Sep 2018 12:16:01 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Openpgp: preference=signencrypt

On 09/03/2018 09:25 AM, Razvan Cojocaru wrote:
> When an new altp2m view is created very early on guest boot, the
> display will freeze (although the guest will run normally). This
> may also happen on resizing the display. The reason is the way
> Xen currently (mis)handles logdirty VGA: it intentionally
> misconfigures VGA pages so that they will fault.
> 
> The problem is that it only does this in the host p2m. Once we
> switch to a new altp2m, the misconfigured entries will no longer
> fault, so the display will not be updated.

Hey Razvan, thanks for doing this, and sorry it's taken so long to respond.

> This patch:
> 
> * updates ept_handle_misconfig() to use the active altp2m instead
>   of the hostp2m;

This is probably necessary.

> * has p2m_init_altp2m_ept() copy over max_mapped_pfn,
>   logdirty_ranges, global_logdirty, ept.ad and default_access
>   from the hostp2m (the latter more for completeness than for any
>   other reason).

I think this is probably the right approach.  These values change
rarely, but after a misconfig are read repeatedly.  So it's probably a
lot more efficient to propagate changes when they happen, rather than
trying to keep a single master copy.  However...

> We should discuss if just copying over
>   logdirty_ranges (which is a pointer) is sufficient, or if
>   this code requires more synchronization).

It's clearly not sufficient. :-)  The logdirty_ranges struct is
protected by the lock of the p2m structure that contains it; if you
point to it from a different p2m structure, then you'll have
inconsistent logging, and you'll have problems if one vcpu is reading
the structure while another is modifying it.

Another issue is that it doesn't look like you're propagating updates to
this shared state either -- if someone enables or disables
global_logdirty, or changes default_access, the altp2ms will still have
the old value.

I wonder if we should collect the various bits that need to be kept in
sync between hostp2m/altp2ms, put them all in a 'sync' sub-struct within
the p2m, and enforce using a function / macro to modify the values inside.

> Also, it's worth
>   clarifying if these variables (and which of them) should be
>   copied over from the hostp2m or the currently active p2m;
> * modifies p2m_change_entry_type_global() and
>   p2m_change_type_range() to propagate their changes to all
>   valid altp2ms.
> 
> Another aspect is that, while new modifications should work with
> these changes, _old_ modifications (written to the host2pm
> _before_ we've created the new altp2m) will, if I understand the
> code correctly be lost. That is to say, misconfigurations
> performed before p2m_init_altp2m_ept() in the hostp2m will
> presumably not trigger the necessary faults after switching to
> the new altp2m.

You're worried about the following sequence?

1. Misconfigure hostp2m
2. Enable altp2m
3. Switch to altp2m 1
4. Fault on a previously-misconfigured p2m entry

Actually, I'm pretty sure you don't have to worry about this (unless
I've completely forgotten how things work). Correct me if I'm wrong:

* The altp2ms start out as empty, and entries are copied from the host
p2m as needed, using hostp2m->get_entry() (at some level)

* There are a *lot* of users that call p2m->get_entry() without causing
a fault; these callers need to get the right values during a
misconfigure (the remnants of which may last indefinitely -- i.e.,
misconfigured entries may *never* be fixed up for regions of the p2m
which aren't touched)

So the fault in 4 should end up copying over the correct value (unless I
missed something).

Just one comment on the code...

>  static void ept_enable_pml(struct p2m_domain *p2m)
>  {
> +    unsigned int i;
> +    struct domain *d = p2m->domain;
> +
>      /* Domain must have been paused */
> -    ASSERT(atomic_read(&p2m->domain->pause_count));
> +    ASSERT(atomic_read(&d->pause_count));
>  
>      /*
>       * No need to return whether vmx_domain_enable_pml has succeeded, as
>       * ept_p2m_type_to_flags will do the check, and write protection will be
>       * used if PML is not enabled.
>       */
> -    if ( vmx_domain_enable_pml(p2m->domain) )
> +    if ( vmx_domain_enable_pml(d) )
>          return;
>  
>      /* Enable EPT A/D bit for PML */
>      p2m->ept.ad = 1;
> -    vmx_domain_update_eptp(p2m->domain);
> +
> +    if ( altp2m_active(d) )
> +        for ( i = 0; i < MAX_ALTP2M; i++ )
> +        {
> +            if ( d->arch.altp2m_eptp[i] == mfn_x(INVALID_MFN) )
> +                continue;
> +
> +            p2m = d->arch.altp2m_p2m[i];
> +            p2m->ept.ad = 1;
> +        }

You're not grabbing the respective p2m locks here -- I'm pretty sure
this will end up being three separate instructions (read, set ad bit,
write).

But there would something a bit funny here about grabbing the main p2m
lock in p2m.c, and then grabbing altp2m locks within the function.  But
on the other hand, you clearly only want to call this...

> +    vmx_domain_update_eptp(d);

...once.  Some refactoring might be wanted.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.