Xen project Mailing List

Re: [Xen-devel] [PATCH, RFC] x86: make the GDT per-CPU

To: "Keir Fraser" <keir.fraser@xxxxxxxxxxxxx>

From: "Jan Beulich" <jbeulich@xxxxxxxxxx>

Date: Thu, 11 Sep 2008 13:28:07 +0100

Delivery-date: Thu, 11 Sep 2008 05:28:04 -0700

List-id: Xen developer discussion <xen-devel.lists.xensource.com>

>>> Keir Fraser <keir.fraser@xxxxxxxxxxxxx> 11.09.08 12:54 >>>
>On 10/9/08 15:35, "Jan Beulich" <jbeulich@xxxxxxxxxx> wrote:
>
>> The major issue with supporting a significantly larger number of physical
>> CPUs appears to be the use of per-CPU GDT entries - at present, x86-64
>> could support only up to 126 CPUs (with code changes to also use the
>> top-most GDT page, that would be 254). Instead of trying to go with
>> incremental steps here, by converting the GDT itself to be per-CPU,
>> limitations in that respect go away entirely.
>
>Two thoughts:
>
>Firstly, we don't really need the LDT and TSS GDT slots to be always valid.
>Actually we always initialise the slot immediately before LTR or LLDT. So we
>could even have per-CPU LDT and TSS initialisation share a single slot.
>Then, with the extra reserved page, we'd be good for nearly 512 CPUs.

No, this would break 32-bit at least: The GDT entry for the selector
loaded into TR must remain a valid, busy TSS descriptor for the whole
lifetime of the system. So it can't be shared with the LDT. But even
for 64-bit I would fear using the same GDT slot for both LDT and TR
loading.

>Secondly: Actually your patch looks not too bad. But the double LGDT in
>context switch is nasty. But also I do not see why it is necessary?
>Presumably your fear is about using the prev->vcpu_id's mapped GDT in
>next->vcpu_id's page tables? But we should only be relying on GDT entries
>(HYPERVISOR_CS, HYPERVISOR_DS, for example) which are identical in all
>per-CPU GDTs. So why do you need to add that LGDT before CR3 switch at all?

The goal is that the per-CPU descriptor be valid at all times (see the
check_cpu() calls I put in there for debugging). As the double fault
handlers have no way of deriving the current processor other than from
that GDT entry (actually, I think x86-64 could, but didn't so far, so I
didn't change that now), they'd break during that window.
While you may argue that double faults are rare, my point here is that
if we ever see one, analyzing its dump shouldn't be made more difficult
than it likely already will be.

>You would need to use l1e_write_atomic() in the context-switch code, to make
>sure all VCPU's hypervisor reserved GDT mappings are always valid. Actually
>you must at least use l1e_write() in any case -- it is not safe to not use
>one of those macros on a live pagetable (by which I mean possibly in use by
>some CPU) because a direct write of a PAE pte is not atomic and can cause
>the pte to pass through a bogus intermediate state (which could be bogusly
>prefetched by a CPU into its TLB. Yuk!).

Ah, yes. l1e_write() should be sufficient, though, as the slot(s) that
get(s) written cannot be validly in use on any CPU (for other than
speculation).

Jan
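
To illustrate the PAE point above: a PAE pte is 64 bits wide, but a 32-bit
CPU stores it as two 32-bit halves, so a plain assignment can briefly expose
a new low half (with the Present bit set) next to a stale high half. The
following is a minimal sketch of the two ways around that, not Xen's actual
l1e_write() or l1e_write_atomic() macros. The names pae_pte_t,
pte_write_ordered(), pte_write_atomic() and pte_read() are hypothetical, and
the C11 fences merely stand in for whatever write barrier the hypervisor
would use.

```c
#include <stdint.h>
#include <stdatomic.h>

/* Hypothetical model of a PAE pte as seen by a 32-bit CPU: two halves. */
typedef struct {
    volatile uint32_t lo;   /* bit 0 is the Present bit */
    volatile uint32_t hi;
} pae_pte_t;

#define PTE_PRESENT 0x1u

/*
 * Ordered, non-atomic update: first make the entry not-present, then
 * fill in the high half, and only then publish the low half. A CPU that
 * speculatively walks the table sees either "not present" or the
 * complete new entry, never a mix of old and new halves.
 */
static void pte_write_ordered(pae_pte_t *p, uint64_t val)
{
    p->lo = 0;                                  /* entry is now not-present */
    atomic_thread_fence(memory_order_release);  /* stand-in for a write barrier */
    p->hi = (uint32_t)(val >> 32);
    atomic_thread_fence(memory_order_release);
    p->lo = (uint32_t)val;                      /* Present bit goes in last */
}

/*
 * Fully atomic update: a single 64-bit store. This is what is needed
 * if the entry may be in active (not just speculative) use and must
 * never appear not-present, even for a moment.
 */
static void pte_write_atomic(_Atomic uint64_t *p, uint64_t val)
{
    atomic_store_explicit(p, val, memory_order_release);
}

/* Helper for inspecting the two-halves model. */
static uint64_t pte_read(const pae_pte_t *p)
{
    return ((uint64_t)p->hi << 32) | p->lo;
}
```

The ordered variant corresponds to the case discussed in the reply: if the
slot being written cannot be validly in use on any CPU, a transient
not-present state is harmless, and only the torn intermediate value has to
be avoided.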
