Xen project Mailing List

Re: [PATCH 3/4] xen/version: Drop bogus return values for XENVER_platform_parameters

To: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>

Date: Thu, 5 Jan 2023 08:57:39 +0100

Cc: George Dunlap <George.Dunlap@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Thu, 05 Jan 2023 07:58:03 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 04.01.2023 20:55, Andrew Cooper wrote:
> On 04/01/2023 4:40 pm, Jan Beulich wrote:
>> On 03.01.2023 21:09, Andrew Cooper wrote:
>>> A split in virtual address space is only applicable for x86 PV guests.
>>> Furthermore, the information returned for x86 64bit PV guests is wrong.
>>>
>>> Explain the problem in version.h, stating the other information that PV guests
>>> need to know.
>>>
>>> For 64bit PV guests, and all non-x86-PV guests, return 0, which is strictly
>>> less wrong than the values currently returned.
>> I disagree for the 64-bit part of this. Seeing Linux'es exposure of the
>> value in sysfs I even wonder whether we can change this like you do for
>> HVM. Who knows what is being inferred from the value, and by whom.
>
> Linux's sysfs ABI isn't relevant to us here. The sysfs ABI says it
> reports what the hypervisor presents, not that it will be a nonzero number.

It effectively reports the hypervisor (virtual) base address there. How
can we not care if something (kexec would come to mind) would be using it
for whatever purpose. And thinking of it, the tool stack has uses, too.
Assuming you audited them, did you consider removing dead uses in a prereq
patch (and discuss the effects on live ones in the description)?

>>> --- a/xen/include/public/version.h
>>> +++ b/xen/include/public/version.h
>>> @@ -42,6 +42,26 @@ typedef char xen_capabilities_info_t[1024];
>>>  typedef char xen_changeset_info_t[64];
>>>  #define XEN_CHANGESET_INFO_LEN (sizeof(xen_changeset_info_t))
>>>
>>> +/*
>>> + * This API is problematic.
>>> + *
>>> + * It is only applicable to guests which share pagetables with Xen (x86 PV
>>> + * guests), and is supposed to identify the virtual address split between
>>> + * guest kernel and Xen.
>>> + *
>>> + * For 32bit PV guests, it mostly does this, but the caller needs to know that
>>> + * Xen lives between the split and 4G.
>>> + *
>>> + * For 64bit PV guests, Xen lives at the bottom of the upper canonical range.
>>> + * This previously returned the start of the upper canonical range (which is
>>> + * the userspace/Xen split), not the Xen/kernel split (which is 8TB further
>>> + * on). This now returns 0 because the old number wasn't correct, and
>>> + * changing it to anything else would be even worse.
>> Whether the guest runs user mode code in the low or high half (or in yet
>> another way of splitting) isn't really dictated by the PV ABI, is it?
>
> No, but given a choice of reporting the thing which is an architectural
> boundary, or the one which is the actual split between the two adjacent
> ranges, reporting the architectural boundary is clearly the unhelpful thing.

Hmm. To properly parallel the 32-bit variant, a [start,end] range would need
exposing for 64-bit, rather than exposing nothing. Not the least because ...

>> So
>> whether the value is "wrong" is entirely unclear. Instead ...
>>
>>> + * For all guest types using hardware virt extentions, Xen is not mapped into
>>> + * the guest kernel virtual address space. This now return 0, where it
>>> + * previously returned unrelated data.
>>> + */
>>>  #define XENVER_platform_parameters 5
>>>  struct xen_platform_parameters {
>>>      xen_ulong_t virt_start;
>> ... the field name tells me that all that is being conveyed is the virtual
>> address of where the hypervisor area starts.
>
> IMO, it doesn't matter what the name of the field is. It dates from the
> days when 32bit PV was the only type of guest.
>
> 32bit PV guests really do have a variable split, so the guest kernel
> really does need to get this value from Xen.
>
> The split for 64bit PV guests is compile-time constant, hence why 64bit
> PV kernels don't care.

... once we get to run Xen in 5-level mode, 4-level PV guests could also
gain a variable split: Like for 32-bit guests now, only the r/o M2P would
need to live in that area, and this may well occupy less than the full
range presently reserved for the hypervisor.

> For compat HVM, it happens to pick up the -1 from:
>
> #ifdef CONFIG_PV32
>     HYPERVISOR_COMPAT_VIRT_START(d) =
>         is_pv_domain(d) ? __HYPERVISOR_COMPAT_VIRT_START : ~0u;
> #endif
>
> in arch_domain_create(), whereas for non-compat HVM, it gets a number in
> an address space it has no connection to in the slightest. ARM guests
> end up getting XEN_VIRT_START (== 2M) handed back, but this absolutely
> an internal detail that guests have no business knowing.

Well, okay, this looks to be good enough an argument to make the adjustment
you propose for !PV guests.

> The only reason I'm not issuing an XSA for this is because we don't have
> any pretence of KASLR in Xen. Pretty much every other kernel gets CVEs
> for infoleaks like this.
>
> We feasibly could do KASLR in !PV builds, at which point this would
> qualify for an XSA.

I would question that, but I can see your view as one possible one.

Jan
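
For reference, the 64-bit arithmetic discussed above (the start of the
upper canonical range versus the Xen/kernel split "8TB further on") can be
sketched as follows. This is only an illustration, not code from the patch:
the constant and helper names are hypothetical, and the base address assumes
4-level paging with 48-bit virtual addresses.

```c
#include <assert.h>
#include <stdint.h>

/*
 * ASSUMPTION: 4-level paging (48-bit virtual addresses), so the upper
 * canonical range starts here. The name is illustrative, not Xen's.
 */
#define UPPER_CANONICAL_START UINT64_C(0xffff800000000000)

/* The thread states the Xen/kernel split is 8TB past that start. */
#define XEN_RESERVED_SIZE     (UINT64_C(8) << 40)

/* Hypothetical helper: first address past the hypervisor-reserved range. */
static uint64_t xen_kernel_split(void)
{
    return UPPER_CANONICAL_START + XEN_RESERVED_SIZE;
}

/*
 * Hypothetical helper: non-zero if addr lies in the half-open
 * [start, end) range that a 64-bit PV guest shares with Xen.
 */
static int in_xen_range(uint64_t addr)
{
    return addr >= UPPER_CANONICAL_START && addr < xen_kernel_split();
}

/* Sanity checks on the arithmetic above. */
static void check_layout(void)
{
    assert(xen_kernel_split() == UINT64_C(0xffff880000000000));
    assert(in_xen_range(UPPER_CANONICAL_START));
    assert(!in_xen_range(xen_kernel_split()));
}
```

Under these assumptions a single virt_start value can name only one end of
the range, which is why a [start,end] pair would be needed to parallel the
32-bit case.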
