[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] x86/ept: fix shattering of special pages


  • To: "Tian, Kevin" <kevin.tian@xxxxxxxxx>, "Beulich, Jan" <JBeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Wed, 29 Jun 2022 11:10:30 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=546m18QLwJdo3blaNNbrIhLdTBlakkue4/DRTTBf2/s=; b=SigGM7lUvRBJvMVj+3mnVZQsTgVu560CFf7WJa34gGaT9NnGR5ML1VFIKRHaTtyoMRdDxwhSyY3eK+OlKktU5kBNwSGb+Svop5ScQ/S5HdoWNNJBp9irndgX+mfdGQjQo3KywApcQtbrXMrYNfdfXREMSwOP6kRi6lPWaI36AGeDIobupgq9kUA4/a96wgPo5m+rgvGTVf6uOdBiDR5cVXLLKUvkcJIURpEJWX25NS329UbFrcgaWqEgl79FV5qB1zkIKdmUuoIPE0ZZVAdDwM0+9Sgzpj7sz+XCLJy5z8Zyhju1VrfFwyHwwB7LivrQzO7xvLzkKyyziaJByOYJ8g==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=F60QSMG2JrSPiCIBAhqIvhVAP+cmAItPVHjOwGA2BD6Kk7LsfTtstkE20h8CkwNQcTro1ILJN3k818Jh8uS+w8/FNelxfb8iNZ0vt1UNuCJHEaLBHau/bJr+MpHJxvA/C0t+dYM1DYRc05uW2FTDpiDKYkl44sItyLELjgu9U3yBHOoztCIEfr0xAHXjnXryqZLRVR24yrFA+n7nQqk1TpPxBBnjSbD6vqH+kE+tzi/MwOv0rf+GYWm1xq5GcoYGZwZBDVQ7Q+8BrjRXTwntWMKbyiMdBpcVUU+9mXiC62Dgy2iA8Db9bjIhOXAKT66bQ3aYA8exWHppyzpLB+RYIA==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "Nakajima, Jun" <jun.nakajima@xxxxxxxxx>, "Cooper, Andrew" <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Paul Durrant <paul@xxxxxxx>
  • Delivery-date: Wed, 29 Jun 2022 09:10:48 +0000
  • Ironport-data: A9a23:NgDGp6DLqJ6tyxVW/zviw5YqxClBgxIJ4kV8jS/XYbTApDlx1zEFz DQYUWvUMvmCZWWjc90nao/j80JV6MfXn9djQQY4rX1jcSlH+JHPbTi7wuYcHM8wwunrFh8PA xA2M4GYRCwMZiaA4E/raNANlFEkvU2ybuOU5NXsZ2YgH2eIdA970Ug5w7Bj09Yx6TSEK1jlV e3a8pW31GCNg1aYAkpMg05UgEoy1BhakGpwUm0WPZinjneH/5UmJMt3yZWKB2n5WuFp8tuSH I4v+l0bElTxpH/BAvv9+lryn9ZjrrT6ZWBigVIOM0Sub4QrSoXfHc/XOdJFAXq7hQllkPhak NpMiJaZdDwgGYbnhNsnYRx5DAhXaPguFL/veRBTsOS15mifKj7G5aUrC0s7e4oF5uxwHGdCs +QCLywAZQyCgOTwx6+nTu5rhYIoK8yD0IE34yk8i22GS6t2B8mdEs0m5vcBtNs0rtpJEvvEI dIQdBJkbQjaYg0JMVASYH47tLj02CehKW0EwL6TjZolpG3D6hB06uTGENiSatrQdOQWx3/N8 woq+Ey8WHn2Lue36xCI73atje/nhj7gVcQZE7jQ3u5nhhify3IeDDUSVECnur+ph0imQdVdJ kcIvC00osAa9lGtCN/0XBS6oXuNlh8aR9dUVeY97Wml1a788wufQG8eQVZpcNU7sOcmSDps0 UWG9+4FHhRqubyRDHibprGdqGrrPTBPdDFTIygZUQEC/t/v5pkpiQ7CRcpiF6jzicDpHTb3w HaBqy1Wa6gvsPPnHp6TpTjv6w9AbLCQJuLpzm07hl6Y0z4=
  • Ironport-hdrordr: A9a23:LYLI76haYSiECXQA+NvThgW9UXBQX0h13DAbv31ZSRFFG/FwyP rCoB1L73XJYWgqM03I+eruBEBPewK4yXdQ2/hoAV7EZnichILIFvAa0WKG+VHd8kLFltK1uZ 0QEJSWTeeAd2SS7vyKnzVQcexQp+VvmZrA7Ym+854ud3ANV0gJ1XYENu/xKDwTeOApP+taKH LKjfA32gZINE5nJ/hSQRI+Lpv+juyOsKijTQ8NBhYh5gXLpTS06ITiGxzd+hsFSTtAzZor7G CAymXCl+6emsD+7iWZ+37Y7pxQltek4txfBPaUgsxQDjn3kA6naKloRrXHljEop+OE7kosjb D30l8dFvU2z0mUUnC+oBPr1QWl+DEy60X6wVvdunfnqdyRfkNPN+NxwaZiNjfJ4Uspu99xlI hR2XiCipZRBRTc2Azg+tnhTXhR5wWJiEtntdRWo21UUIMYZrMUh5cY5llpHJAJGz+/wJw7Ed NpENrX6J9tAB+nhkjizyhSKeGXLzQO9k/seDlAhiXV6UkaoJlB9TpX+CRF9U1wtq7USPF/lp H52+pT5fRzp/QtHNNA7dc6MLWK41P2MGLx2UKpUCPa/fI8SgTwgq+yxokJz8eXX7FN5KcOuf 36ISFlXCgJCgjTNfE=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Wed, Jun 29, 2022 at 08:41:43AM +0000, Tian, Kevin wrote:
> > From: Roger Pau Monne <roger.pau@xxxxxxxxxx>
> > Sent: Monday, June 27, 2022 6:01 PM
> > 
> > The current logic in epte_get_entry_emt() will split any page marked
> > as special with order greater than zero, without checking whether the
> > super page is all special.
> > 
> > Fix this by only splitting the page only if it's not all marked as
> > special, in order to prevent unneeded super page shuttering.
> > 
> > Fixes: ca24b2ffdb ('x86/hvm: set 'ipat' in EPT for special pages')
> > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> > ---
> > Cc: Paul Durrant <paul@xxxxxxx>
> > ---
> > It would seem weird to me to have a super page entry in EPT with
> > ranges marked as special and not the full page.  I guess it's better
> > to be safe, but I don't see an scenario where we could end up in that
> > situation.
> > 
> > I've been trying to find a clarification in the original patch
> > submission about how it's possible to have such super page EPT entry,
> > but haven't been able to find any justification.
> > 
> > Might be nice to expand the commit message as to why it's possible to
> > have such mixed super page entries that would need splitting.
> 
> Here is what I dig out.
> 
> When reviewing v1 of adding special page check, Jan suggested
> that APIC access page was also covered hence the old logic for APIC
> access page can be removed. [1]

But the APIC access page is always added using set_mmio_p2m_entry(),
which will unconditionally create an entry for it in the EPT, hence
there's no explicit need to check for it's presence inside of higher
order pages.

> Then when reviewing v2 he found that the order check in removed
> logic was not carried to the new check on special page. [2] 
> 
> The original order check in old APIC access logic came from:
> 
> commit 126018f2acd5416434747423e61a4690108b9dc9
> Author: Jan Beulich <jbeulich@xxxxxxxx>
> Date:   Fri May 2 10:48:48 2014 +0200
> 
>     x86/EPT: consider page order when checking for APIC MFN
> 
>     This was overlooked in 3d90d6e6 ("x86/EPT: split super pages upon
>     mismatching memory types").
> 
>     Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>     Acked-by: Kevin Tian <kevin.tian@xxxxxxxxx>
>     Reviewed-by: Tim Deegan <tim@xxxxxxx>
> 
> I suppose Jan may actually find such mixed super page entry case
> to bring this fix in.

Hm, I guess theoretically it could be possible for contiguous entries
to be coalesced (if we ever implement that for p2m page tables), and
hence result in super pages being created from smaller entries.

It that case it would be less clear to assert that special pages
cannot be coalesced with other contiguous entries.

> Not sure whether Jan still remember the history.
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg01648.html
> [2] https://lore.kernel.org/all/a4856c33-8bb0-4afa-cc71-3af4c229bc27@xxxxxxxx/
> 
> > ---
> >  xen/arch/x86/mm/p2m-ept.c | 20 +++++++++++---------
> >  1 file changed, 11 insertions(+), 9 deletions(-)
> > 
> > diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> > index b04ca6dbe8..b4919bad51 100644
> > --- a/xen/arch/x86/mm/p2m-ept.c
> > +++ b/xen/arch/x86/mm/p2m-ept.c
> > @@ -491,7 +491,7 @@ int epte_get_entry_emt(struct domain *d, gfn_t gfn,
> > mfn_t mfn,
> >  {
> >      int gmtrr_mtype, hmtrr_mtype;
> >      struct vcpu *v = current;
> > -    unsigned long i;
> > +    unsigned long i, special_pgs;
> > 
> >      *ipat = false;
> > 
> > @@ -525,15 +525,17 @@ int epte_get_entry_emt(struct domain *d, gfn_t
> > gfn, mfn_t mfn,
> >          return MTRR_TYPE_WRBACK;
> >      }
> > 
> > -    for ( i = 0; i < (1ul << order); i++ )
> > -    {
> > +    for ( special_pgs = i = 0; i < (1ul << order); i++ )
> >          if ( is_special_page(mfn_to_page(mfn_add(mfn, i))) )
> > -        {
> > -            if ( order )
> > -                return -1;
> > -            *ipat = true;
> > -            return MTRR_TYPE_WRBACK;
> > -        }
> > +            special_pgs++;
> > +
> > +    if ( special_pgs )
> > +    {
> > +        if ( special_pgs != (1ul << order) )
> > +            return -1;
> > +
> > +        *ipat = true;
> > +        return MTRR_TYPE_WRBACK;
> 
> Did you actually observe an impact w/o this fix? 

Yes, the original change has caused a performance regression in some
GPU pass through workloads for XenServer.  I think it's reasonable to
avoid super page shattering if the resulting pages would all end up
using ipat and WRBACK cache attribute, as there's no reason for the
split in the first case.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.