[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.16 v4] gnttab: allow setting max version per-domain


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Tue, 2 Nov 2021 16:42:20 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=/nE6avqwBdBnwLEGc4SMc56P2CAPOw2Oii2xa3UpDCM=; b=NHvsMXcm3h3NOys+5dTiGckglmr9Y/YKkeagU2RPstrrIEmjETqdsxDXMYn8GVvI6k/vRgAV+q5KUDny0CsGczg1xdRXh1Xp7Vmlv3rTC0Xo7ZcUyfPb7SAk/KEX85vN9pRmFv2FWQ5v2KEPZ3uTHMJsRZSKpOfJd8E9E28rna2f/1v64jY88KTLIO79Jx1X7tri3ipi2D1G7+lKgUFteBmgI8c5jE0mP011ch2vWLHXlI92J67mwtJDzahiujemwRKvENSGd70SFTTuFLiY3oLTAWJnhQSikjQex/bHlQ2jaj01wSNYQNaVl6aKguDVGiX2hoIxxZwMYSgDBAXYIA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=djj6+P3uChj9x/kBtbpAuRI0daClCBvgGiAoFV9Zp8E00Lde1Q7vpMtWA8hZ3L3Tkl02/7rucPPx3U/HkeSDFfe3H0JDVZ1pUlhVrmcifGbEGx5ryOPbwAMeT/SmEzO3RzxTl58pp60LeIxtUxj6rt6VcsnLYxKZnNpj5Twsb85u1M8Dm4fFf5qOfhJxZhLFLvSW7+1aIP6kGovTPLyQ7N1d3UMG0FmA7FdDlCHiz705eZKd9nTLea+ucPtEqQrC1qP8MO4vCeRqwJGZkR/MUVioBm3IPC+wFcNrz52Zq86UHrY0lWSAGDUYhJkwm1btnj1BXAt7DGajMsEMMft+fQ==
  • Authentication-results: esa6.hc3370-68.iphmx.com; dkim=pass (signature verified) header.i=@citrix.onmicrosoft.com
  • Cc: <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, "Christian Lindig" <christian.lindig@xxxxxxxxxx>, David Scott <dave@xxxxxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Ian Jackson <iwj@xxxxxxxxxxxxxx>
  • Delivery-date: Tue, 02 Nov 2021 15:42:50 +0000
  • Ironport-data: A9a23:tamIyait96uZtQu1oPitcqIjX161MRYKZh0ujC45NGQN5FlHY01je htvX27Sa6nfZTPwftxzbYSx90tU68LSzdUwTlNl+yBhEX8b9cadCdqndUqhZCn6wu8v7a5EA 2fyTvGacajYm1eF/k/F3oAMKRCQ7InQLlbGILes1htZGEk0F0/NtTo5w7Rg29cw24Dja++wk YiaT/P3aQfNNwFcagr424rbwP+4lK2v0N+wlgVWicFj5DcypVFMZH4sDfjZw0/DaptVBoaHq 9Prl9lVyI97EyAFUbtJmp6jGqEDryW70QKm0hK6UID66vROS7BbPg/W+5PwZG8O4whlkeydx /1Ckqy1FTpqIZH+neEvUTRqA3p0P5NJreqvzXiX6aR/zmXDenrohf5vEFs3LcsT/eMf7WNmr KJCbmpXN1ba2rzwkOnTpupE36zPKOHxO4wSoDd4xCzxBvc6W5HTBa7N4Le02R9t2JARR6iGP 6L1bxJUQEyQeUQUKm0OAc00p8CEoHPfWmFX/Qf9Sa0fvDGIkV0ZPKLWGNbcZNGiX8hemUec4 GXc8AzRAB4AM8eE4SGY6X/qjejK9Qv6R4A6BLC+7uRtglCY2ioUEhJ+fWW8pf61m0uvQeV1I kYf+jcthaUq/UntRd74NzWju2KNtBMYX9tWEsU55RuLx66S5ByWbkAbShZRZdpgs9U5LRQ62 1nMk973CDhHtLyOVWnb5rqStSm1OyUeMSkFfyBsZQkY59jupqkjgxSJScxseIa3hNDoHTD7w xiRsTMzwb4UiKY2O76TpA6dxWj2/96QE1Bztl6/sn+ZAh1Ra6mVYtCVqkfn3/NFHoS0fEnat no6lJ3LhAwRNq2lmCuISeQLObim4feZLTHR6WJS84kdGyeFoCD6I90JiN1qDAIwa5tfJ2e1C KPGkVoJvMc7AZe8UUNgj2td4ewOxLOoK9nqX+u8gjFmMskoL1/vEM2DiCetM4HRfKoEzfFX1 XSzK5/E4ZMm5UJPlmPeqwA1iudD+8zG7TmPLa0XNjz+uVZkWFabSK0eLHyFZf0j4aWPrW39q ogEapTbkU0ACrenPkE7FLL/y3hQdBDX4rit86RqmhOreFI6SAnN9deImdvNhLCJb4wKz7yVr xlRq2dTyUblhG2vFOl5QisLVV8bZr4m9ShTFXV1ZT6AgiF/Ca7yvPZ3X8ZmJtEPqb08pcOYu tFYIq1s9NwUEW+Zk9ncBLGgxLFfmOOD3FnTYnH6PGBnF3OiLiSQkuLZksLU3HBmJgK8tNcko q3m0QXeQJEZQB9lAtqQY/Wqp25dd1BE8A6rd0eXcNRVZmv2941md374gvMtepleIhTf3DqKk Q2RBE5A9+XKpoY09vjPhLyF8Nj1Q7cvQBICEjmJ96uyOAnb4nGnnd1KXtGXcG2PT2jz4qijO 7lYlqmuLP0dkV9WmINgCLI3n7km7t7iquYCnARpFXnGdXqxDbZkLiXU1MVDrPQVlLRYpRG3S gSE/dwDYeeFP8bsEVgwIgs5b7vciaFIy2eKtflseRf0/i56+raDQH5+BRjUhXwPNqZxPaMk3 fwl5Jwc5Tugh0d4Kd2BlC1VqTiBdyRST6U9u5gGK4b3kQ53mEpaaJnRByKqspGCb9JAbhsjL jOO3feQgr1dwgzJcmYpFGiL1u1Y3MxctBdPxV4EBlKIhtua2aNngEwPqWw6HlZP0xFK8+NvI Ww6ZUR6KJKH8ypsmMUeDXunHBtMBUHB90H8o7fTeLY1k6V8urTxEVAA
  • Ironport-hdrordr: A9a23:UMoGKatlotd38b47l3ZRypLP7skC6oMji2hC6mlwRA09TyXGra 6TdaUguiMc1gx8ZJhBo7C90KnpewK7yXdQ2/htAV7EZnibhILIFvAZ0WKG+Vzd8kLFh4tgPM tbAsxD4ZjLfCdHZKXBkXmF+rQbsaG6GcmT7I+0pRodLnAJGtJdBkVCe32m+yVNNXh77PECZe OhD6R81l2dkSN9VLXEOpBJZZmPm/T70LbdJTIWDR8u7weDyRuu9b7BChCdmjMTSSlGz7sO+X XM11WR3NTtj9iLjjvnk0PD5ZVfn9XsjvNFGcy3k8AQbhHhkByhaohNU6CL+Bo1vOaswlA3l8 SkmWZtA+1Dr1fqOk2lqxrk3AftlB4o9n/Z0FedxUDupMToLQhKQ/ZptMZ8SF/0+kAgtNZz3O ZgxGSCradaChvGgWDU+8XIfwsCrDv1nVMS1cooy1BPW4oXb7Fc6aYF+llOLZsGFCXmrKg6De hVCt3G7vo+SyLaU5nghBgs/DWQZAV3Iv/fKXJy/vB9kgIm0kyR9nFoh/D2xRw7hdUAo5ot3Z WMDk0nrsAJciYsV9MJOA42e7rANoX8e2O/DIusGyWSKEgmAQOHl3el2sR+2AmVEKZ4u6fa3q 6xCW9liQ==
  • Ironport-sdr: UIV3ommM2bS4c22BOEnKhsnaLMxQMw2azHqKGPSzMXZhRWV9FKBOUXcHg99jbBdxocRM2rMdAx nTf850mLgYT7SnrnC4NC6rsayptUJXJ33NTzf2vSLVdABaWwalpWYkw1m9Q3kWExPTLZQfZrmX HZxyC3q47jjpSoQtDCkmfkEKLmlEMluphqMtBR1sgp8ZDPkhJKdofqVRMBAzTZT2LroAfS+TnO pasLkK65ApHAV6HTUMmeq3XILKJfeHuVq0JRTbvGmqQqeLhT7HTkBUlBeRgl/bWjIwVP/SdpIz QkYPcnWFnI7Wk4HPH3NwJa7G
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Tue, Nov 02, 2021 at 02:34:03PM +0000, Andrew Cooper wrote:
> On 30/10/2021 08:53, Roger Pau Monné wrote:
> > On Fri, Oct 29, 2021 at 05:39:52PM +0100, Andrew Cooper wrote:
> >> On 29/10/2021 08:59, Roger Pau Monne wrote:
> >>> diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
> >>> index e510395d08..f94f0f272c 100644
> >>> --- a/xen/common/grant_table.c
> >>> +++ b/xen/common/grant_table.c
> >>> @@ -1917,11 +1918,33 @@ active_alloc_failed:
> >>>  }
> >>>  
> >>>  int grant_table_init(struct domain *d, int max_grant_frames,
> >>> -                     int max_maptrack_frames)
> >>> +                     int max_maptrack_frames, unsigned int options)
> >>>  {
> >>>      struct grant_table *gt;
> >>> +    unsigned int max_grant_version = options & 
> >>> XEN_DOMCTL_GRANT_version_mask;
> >>>      int ret = -ENOMEM;
> >>>  
> >>> +    if ( max_grant_version == XEN_DOMCTL_GRANT_version_default )
> >>> +        max_grant_version = opt_gnttab_max_version;
> >>> +    if ( !max_grant_version )
> >>> +    {
> >>> +        dprintk(XENLOG_INFO, "%pd: invalid grant table version 0 
> >>> requested\n",
> >>> +                d);
> >>> +        return -EINVAL;
> >>> +    }
> >>> +    if ( max_grant_version > opt_gnttab_max_version )
> >>> +    {
> >>> +        dprintk(XENLOG_INFO,
> >>> +                "%pd: requested grant version (%u) greater than 
> >>> supported (%u)\n",
> >>> +                d, max_grant_version, opt_gnttab_max_version);
> >>> +        return -EINVAL;
> >>> +    }
> >> I think this wants to live in sanitise_domain_config() along with all
> >> the other auditing of flags and settings.
> > The reason to place those there is that the sanity checks for the
> > other grant table related parameters (max_grant_frames and
> > max_maptrack_frames) are performed in this function also. I think it's
> > better to keep the checks together.
> >
> > We should consider exporting the relevant values from grant table
> > code and then moving all the checks to sanitise_domain_config, but
> > likely a follow up work given the current point in the release.
> >
> >> Also, it can be simplified:
> >>
> >> if ( max_grant_version < 1 ||
> >>     max_grant_version > opt_gnttab_max_version )
> >> {
> >>     dprintk(XENLOG_INFO, "Requested gnttab max version %u outside of
> >> supported range [%u, %u]\n", ...);
> >> }
> > It was originally done this way so that the first check
> > (!max_grant_version) could be adjusted when support for
> > max_grant_version == 0 was introduced [0] in order to signal the
> > disabling of grant tables altogether.
> >
> >>
> >>> +    if ( unlikely(max_page >= PFN_DOWN(TB(16))) && is_pv_domain(d) &&
> >>> +         max_grant_version < 2 )
> >>> +        dprintk(XENLOG_INFO,
> >>> +                "%pd: host memory above 16Tb and grant table v2 
> >>> disabled\n",
> >>> +                d);
> >> This is rather more complicated.
> >>
> >> For PV, this going wrong in the first place is conditional on 
> >> CONFIG_BIGMEM.
> >> For HVM, it the guest address size, not the host.
> >> For ARM, I don't even know, because I've lost track of which bits of the
> >> ABI are directmap in an otherwise translated domain.
> > This was only aiming to cover the PV case, which I think it's the more
> > likely one. It's possible there's people attempting to create PV
> > guests on a 16TB machine, but I think it's more unlikely that the
> > guest itself will have 16TB of RAM.
> >
> >> I think it is probably useful to do something about it, but probably not
> >> in this patch.
> > I'm fine with this, we had no warning at all before, so I don't think
> > it should be a hard requirement to add one now. It would be nice if
> > there's consensus, but otherwise let's just skip it.
> >
> >> Perhaps modify domain_set_alloc_bitsize() to impose an upper limit for
> >> the "host memory size matters" cases?
> >>
> >> For the guest address size cases, this possibly wants to feed in to the
> >> max policy calculations in the same way that shadow kinda does.
> > So grant table version will basically clamp the amount of memory a
> > guest can use?
> >
> > What about guests that doesn't use grant tables at all, do we expect
> > those to set the future max_grant_version to 0 in order to avoid the
> > clamping without having to expose grant v2?
> >
> >>> diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> >>> index 51017b47bc..0ec57614bd 100644
> >>> --- a/xen/include/public/domctl.h
> >>> +++ b/xen/include/public/domctl.h
> >>> @@ -89,14 +89,20 @@ struct xen_domctl_createdomain {
> >>>      /*
> >>>       * Various domain limits, which impact the quantity of resources
> >>>       * (global mapping space, xenheap, etc) a guest may consume.  For
> >>> -     * max_grant_frames and max_maptrack_frames, < 0 means "use the
> >>> -     * default maximum value in the hypervisor".
> >>> +     * max_grant_frames, max_maptrack_frames and max_gnttab_version < 0
> >>> +     * means "use the default maximum value in the hypervisor".
> >>>       */
> >>>      uint32_t max_vcpus;
> >>>      uint32_t max_evtchn_port;
> >>>      int32_t max_grant_frames;
> >>>      int32_t max_maptrack_frames;
> >>>  
> >>> +/* Grant version, use low 4 bits. */
> >>> +#define XEN_DOMCTL_GRANT_version_mask    0xf
> >>> +#define XEN_DOMCTL_GRANT_version_default 0xf
> >> This needs to be a toolstack decision, not something in Xen.  This
> >> doesn't fix the case where VMs can't cope with change underfoot.
> >>
> >> It is fine for the user say "use the default", but this must be turned
> >> into an explicit 1 or 2 by the toolstack, so that the version(s) visible
> >> to the guest remains invariant while it is booted.
> > Please bear with me, as I'm afraid I don't understand why this is
> > relevant. Allowed max grant version can only change as a result of a
> > migration
> 
> No.  Allowed max grant version is (well - needs to be) a fixed property
> of the VM, even across migration.

Right, but I think we agreed we where going to punt this to post 4.16,
as noted in:

https://lore.kernel.org/xen-devel/24954.44919.8320.63375@xxxxxxxxxxxxxxxxxxxxxxxx/

It's strictly no worse than the current code, where you can migrate
from a host with a default max grant version of 2 to one with a
default max grant version of 1 and migration will succeed.

> It was a fundamentally mistake to ever have gnttab v2 active by default,
> without an enumeration, and with no way of turning it off.  Same too for
> evtchn, but we've already taken a patch to knobble fifo support.
> 
> 
> The toolstack needs to explicitly select v1 or v2.  It's fine to pick a
> default on behalf a user which doesn't care, but what moves in the
> migrate stream must an explicit, unambiguous value, so the destination
> Xen and toolstack can reconstruct the VM exactly.
> 
> "default" is ambiguous, and cannot be recovered after the fact.  In
> particular, a vm with no explicit configuration, when booted on a Xen
> with gnttab limited to v1 on the command line, should not have v2 become
> accessible by migrating to a second Xen with no command line limit.  It
> is fine, if that VM subsequently reboots, to find that v2 is now available.

There are other grant table options that have the defaults set in Xen
(ie: max_grant_frames and max_maptrack_frames), which will need to be
fetched on a per-domain basis already in order to be migrated, so I
was planning on doing something similar with the max grant version, so
that we could fetch all the grant table related parameters.

Or else we should also remove setting max_grant_frames and
max_maptrack_frames to -1 (default), and instead force the toolstack
to explicitly set those. In any case, I think we need to handle the
grant table version and max_{grant,maptrack}_frames in the same way,
and it's likely better to leave that for later.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.