Re: Semantics of XEN_DOMCTL_SHADOW_OP_SET_ALLOCATION


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 28 Sep 2021 10:24:35 +0200
  • Cc: Tim Deegan <tim@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 28 Sep 2021 08:24:50 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 27.09.2021 23:01, Andrew Cooper wrote:
> A recent ABI change in Xen caused a total breakage under the Xapi
> toolstack, and the investigation led to this.

I'm curious which change this was; while it's likely one of mine, I
can't easily guess which.

> First of all, the memory pool really needs renaming, because (not naming
> names) multiple developers were fooled into thinking that the bug was
> caused by a VM being unexpectedly in shadow mode.
> 
> Second, any MB value >= 0x1000000 will truncate to 0 between
> {hap,shadow}_domctl() and {hap,shadow}_set_allocation().

This wants fixing of course. I assume a patch is already in the works.
If not, let me know and I'll see about making one.
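
For completeness, a minimal self-contained sketch of how the truncation
occurs, assuming the usual MiB-to-pages conversion with PAGE_SHIFT == 12
and a 32-bit "pages" parameter (variable names here are illustrative,
not the actual Xen code):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t mb = 0x1000000ULL;           /* first MB value to truncate */
    uint64_t pages = mb << (20 - 12);     /* MiB -> 4KiB pages: 0x100000000 */
    unsigned int pages32 = pages;         /* 32-bit parameter keeps low bits */

    printf("%#llx MiB -> pages parameter %#x\n",
           (unsigned long long)mb, pages32);  /* prints 0 */
    return 0;
}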

> But for the main issue, passing 0 in at the hypercall level is broken.
> 
> hap_enable() forces a minimum of 256 pages.  A subsequent hypercall
> trying to set 0 frees {tot 245, free 244, p2m 11} all the way down to
> {tot 1, free 0, p2m 11} before failing with -ENOMEM because there are no
> more free pages to free.  Getting -ENOMEM from this kind of operation
> isn't really correct.

It's questionable, but I wouldn't call it outright "not correct": the
function was asked to obtain memory (from the pool), so one may view
this as an allocation. The set-allocation functions really are both
allocations and frees at the same time (moving pages from one pool to
another).
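
To make the failure mode above concrete, here's a toy model of the
shrink half of such a loop (struct and names made up for illustration),
under the assumption that only free pages can be handed back, while
in-use ones, e.g. the p2m pages, cannot:

#include <errno.h>
#include <stdio.h>

struct pool {
    unsigned int total_pages;
    unsigned int free_pages;
    unsigned int p2m_pages;   /* in use; not freeable from this path */
};

static int shrink_to(struct pool *p, unsigned int target)
{
    while ( p->total_pages > target )
    {
        if ( !p->free_pages )
            return -ENOMEM;   /* only in-use pages are left */
        p->free_pages--;
        p->total_pages--;     /* page handed back to the domheap */
    }
    return 0;
}

int main(void)
{
    struct pool p = { .total_pages = 245, .free_pages = 244, .p2m_pages = 11 };
    int rc = shrink_to(&p, 0);

    /* Ends at {tot 1, free 0, p2m 11} with rc == -ENOMEM, matching the
     * numbers quoted above. */
    printf("tot %u free %u p2m %u rc %d\n",
           p.total_pages, p.free_pages, p.p2m_pages, rc);
    return 0;
}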

> Passing 0 cannot possibly function.  There are non-zero p2m frames by
> the time createdomain completes, as we allocate the top of the p2m, and
> they cannot be freed without the teardown logic which releases memory in
> the correct order.
> 
> In fact, passing anything lower than the current number of in-use
> pages (i.e. total minus free) is guaranteed to fail.  Continuations
> also mean that you can't pick a value which is guaranteed not to fail,
> because even a well (poorly?) placed foreign map in dom0 could change
> the properties of the pool.

Well, I suppose that outside of domain cleanup, shrinking of the pool
was always meant as a kind of best-effort operation.

> The shadow side rejects hypercall attempts using 0

I haven't been able to spot this rejection logic. Instead I'm getting
the impression that the BUG() at the bottom of _shadow_prealloc()
would be hit when shrinking the pool beyond what can really be freed
(i.e. in particular if any pages are in use for the p2m).

> (but can be bypassed
> with the above truncation bug), and will try to drop shadows to get down
> to the limit.  This represents a difference vs HAP, and in practice 1M
> granularity is probably enough to ensure that you can't fail to set the
> shadow allocation that low.  There is also a reachable BUG() somewhere
> in this path as reported several times on xen-devel, but I still haven't
> figured out how to tickle it.

Any pointer to one such report? I don't recall any, and hence it's
not clear to me whether that's the one in _shadow_prealloc().
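
For illustration, a toy sketch of the shape I'd expect that path to
have (hypothetical; not the actual _shadow_prealloc() code):

#include <assert.h>
#include <stdbool.h>

/* Stand-in for the reclaim attempts (unpinning / unshadowing) the real
 * code performs before giving up; purely illustrative. */
static bool try_reclaim(unsigned int *free_pages, unsigned int needed)
{
    return *free_pages >= needed;
}

static void prealloc_sketch(unsigned int *free_pages, unsigned int needed)
{
    if ( *free_pages >= needed )
        return;
    if ( try_reclaim(free_pages, needed) )
        return;
    /* Nothing reclaimable left, e.g. all remaining pages hold the p2m:
     * no way to make progress, which is where the real code BUG()s. */
    assert(!"prealloc failed");
}

int main(void)
{
    unsigned int free_pages = 0;

    prealloc_sketch(&free_pages, 1);  /* hits the assert() stand-in */
    return 0;
}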

> There is no code or working usecase for reducing the size of the shadow
> pool, except on domain destruction.  I think we should prohibit the
> ability to shrink the shadow pool, and defer most of this mess to anyone
> who turns up with a plausible usecase.

The absence of a present use case for reducing the pool is a fair
argument for dropping support for doing so. That'll still mean an error
return, which - according to what you have written near the top - may
still upset the Xapi toolstack.
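
If we go that route, a minimal sketch of the shape the check could take
(hypothetical names; not actual Xen code), rejecting shrink attempts up
front:

#include <errno.h>

/* Reject any target below the current pool size before entering the
 * grow/shrink loop; -EOPNOTSUPP is one plausible error to return. */
static int set_allocation_checked(unsigned int total_pages,
                                  unsigned int target_pages)
{
    if ( target_pages < total_pages )
        return -EOPNOTSUPP;
    /* ... grow path unchanged ... */
    return 0;
}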

Jan