Re: Semantics of XEN_DOMCTL_SHADOW_OP_SET_ALLOCATION


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 28 Sep 2021 10:24:35 +0200
  • Cc: Tim Deegan <tim@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 28 Sep 2021 08:24:50 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 27.09.2021 23:01, Andrew Cooper wrote:
> A recent ABI change in Xen caused a total breakage under the Xapi
> toolstack, and the investigation led to this.

I'm curious which change this was; while it's likely one of mine, I
can't easily guess which.

> First of all, the memory pool really needs renaming, because (not naming
> names) multiple developers were fooled into thinking that the bug was
> caused by a VM being unexpectedly in shadow mode.
> 
> Second, any MB value >= 0x1000000 will truncate to 0 between
> {hap,shadow}_domctl() and {hap,shadow}_set_allocation().

This wants fixing of course. I assume a patch is already in the works.
If not, let me know and I'll see about making one.
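
For completeness, a minimal self-contained sketch of how the truncation
occurs, assuming the usual MiB-to-pages conversion with PAGE_SHIFT == 12
and a 32-bit "pages" parameter (variable names here are illustrative,
not the actual Xen code):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t mb = 0x1000000ULL;           /* first MB value to truncate */
    uint64_t pages = mb << (20 - 12);     /* MiB -> 4KiB pages: 0x100000000 */
    unsigned int pages32 = pages;         /* 32-bit parameter keeps low bits */

    printf("%#llx MiB -> pages parameter %#x\n",
           (unsigned long long)mb, pages32);  /* prints 0 */
    return 0;
}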

> But for the main issue, passing 0 in at the hypercall level is broken.
> 
> hap_enable() forces a minimum of 256 pages.  A subsequent hypercall
> trying to set 0 frees {tot 245, free 244, p2m 11} all the way down to
> {tot 1, free 0, p2m 11} before failing with -ENOMEM because there are no
> more free pages to free.  Getting -ENOMEM from this kind of operation
> isn't really correct.

It's questionable, but I wouldn't call it outright "not correct": the
function was asked to obtain memory (from the pool), so one may view
this as an allocation. The set-allocation functions really are both
allocations and frees at the same time (moving pages from one pool to
another).
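
To make the failure mode above concrete, here's a toy model of the
shrink half of such a loop (struct and names made up for illustration),
under the assumption that only free pages can be handed back, while
in-use ones, e.g. the p2m pages, cannot:

#include <errno.h>
#include <stdio.h>

struct pool {
    unsigned int total_pages;
    unsigned int free_pages;
    unsigned int p2m_pages;   /* in use; not freeable from this path */
};

static int shrink_to(struct pool *p, unsigned int target)
{
    while ( p->total_pages > target )
    {
        if ( !p->free_pages )
            return -ENOMEM;   /* only in-use pages are left */
        p->free_pages--;
        p->total_pages--;     /* page handed back to the domheap */
    }
    return 0;
}

int main(void)
{
    struct pool p = { .total_pages = 245, .free_pages = 244, .p2m_pages = 11 };
    int rc = shrink_to(&p, 0);

    /* Ends at {tot 1, free 0, p2m 11} with rc == -ENOMEM, matching the
     * numbers quoted above. */
    printf("tot %u free %u p2m %u rc %d\n",
           p.total_pages, p.free_pages, p.p2m_pages, rc);
    return 0;
}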

> Passing 0 cannot possibly function.  There are non-zero p2m frames by
> the time createdomain completes, as we allocate the top of the p2m, and
> they cannot be freed without the teardown logic which releases memory in
> the correct order.
> 
> In fact, passing anything lower than the current number of in-use
> pages (i.e. total minus free) is guaranteed to fail.  Continuations
> also mean that you can't pick a value which is guaranteed not to fail,
> because even a well (poorly?) placed foreign map in dom0 could change
> the properties of the pool.

Well, I suppose that outside of domain cleanup, shrinking of the pool
was always meant as a kind of best-effort operation.

> The shadow side rejects hypercall attempts using 0

I haven't been able to spot this rejection logic. Instead I'm getting
the impression that the BUG() at the bottom of _shadow_prealloc()
would be hit when shrinking the pool beyond what can really be freed
(i.e. in particular if any pages are in use for the p2m).

> (but can be bypassed
> with the above truncation bug), and will try to drop shadows to get down
> to the limit.  This represents a difference vs HAP, and in practice 1M
> granularity is probably enough to ensure that you can't fail to set the
> shadow allocation that low.  There is also a reachable BUG() somewhere
> in this path as reported several times on xen-devel, but I still haven't
> figured out how to tickle it.

Any pointer to one such report? I don't recall any, and hence it's
not clear to me whether that's the one in _shadow_prealloc().
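
For illustration, a toy sketch of the shape I'd expect that path to
have (hypothetical; not the actual _shadow_prealloc() code):

#include <assert.h>
#include <stdbool.h>

/* Stand-in for the reclaim attempts (unpinning / unshadowing) the real
 * code performs before giving up; purely illustrative. */
static bool try_reclaim(unsigned int *free_pages, unsigned int needed)
{
    return *free_pages >= needed;
}

static void prealloc_sketch(unsigned int *free_pages, unsigned int needed)
{
    if ( *free_pages >= needed )
        return;
    if ( try_reclaim(free_pages, needed) )
        return;
    /* Nothing reclaimable left, e.g. all remaining pages hold the p2m:
     * no way to make progress, which is where the real code BUG()s. */
    assert(!"prealloc failed");
}

int main(void)
{
    unsigned int free_pages = 0;

    prealloc_sketch(&free_pages, 1);  /* hits the assert() stand-in */
    return 0;
}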

> There is no code or working usecase for reducing the size of the shadow
> pool, except on domain destruction.  I think we should prohibit the
> ability to shrink the shadow pool, and defer most of this mess to anyone
> who turns up with a plausible usecase.

The absence of a present use case for reducing the pool is a fair
argument for dropping support for doing so. That'll still mean an error
return, which - according to what you have written near the top - may
still upset the Xapi toolstack.
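
If we go that route, a minimal sketch of the shape the check could take
(hypothetical names; not actual Xen code), rejecting shrink attempts up
front:

#include <errno.h>

/* Reject any target below the current pool size before entering the
 * grow/shrink loop; -EOPNOTSUPP is one plausible error to return. */
static int set_allocation_checked(unsigned int total_pages,
                                  unsigned int target_pages)
{
    if ( target_pages < total_pages )
        return -EOPNOTSUPP;
    /* ... grow path unchanged ... */
    return 0;
}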

Jan