[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 3/3] x86/boot: Don't double-copy the stack during physical relocation
On 07.12.2021 11:53, Andrew Cooper wrote: > cpu0_stack is contained within .data, which means the memcpy() already takes a > snapshot at the start of the critical region. > > Later, when we switch to the relocated Xen, we do end up losing updates to the > local variables, You word this as if that was the case already before your change, ... > but that's fine because the only variables we've modified go > out of scope after the printk(). Use this properly to avoid copying the whole > stack (32k) a second time. ... not the least because of "Use this properly ...". But aiui this is only a result of your change, in that you no longer "re-sync the stack" (as the comment says that you remove). I can appreciate that copying 32k is too much. Yet I do think that we're putting ourselves at risk if we rely on local variables going out of scope which have been updated. Even more so with no comment next to their declarations making clear that it is imperative for them to not move to outer scopes. (Seeing e.g. another "i" and "j" in the outer scope, one might actually be inclined to do so. For "j" this would, afaict, even work, while "i" collides with the outer one no matter what.) You mentioning the printk() - did you consider someone, perhaps just for debugging purposes, adding uses there of some of these local variables? We could limit the copying to just the primary stack(s), for example. Otoh I'm not convinced trying to save 20k or even 32k of copied memory is worth it in this boot-time code. There are larger gains to be had elsewhere - see e.g. my "x86: memcpy() / memset() (non-)ERMS flavors plus fallout" and "IOMMU: superpage support when not sharing pagetables" which continue to be pending. > --- a/xen/arch/x86/setup.c > +++ b/xen/arch/x86/setup.c > @@ -1183,6 +1183,7 @@ void __init noreturn __start_xen(unsigned long mbi_p) > l3_pgentry_t *pl3e; > l2_pgentry_t *pl2e; > int i, j, k; > + unsigned long tmp; > > /* Select relocation address. */ > xen_phys_start = end - reloc_size; > @@ -1193,7 +1194,8 @@ void __init noreturn __start_xen(unsigned long mbi_p) > * Perform relocation to new physical address. > * Before doing so we must sync static/global data with main > memory > * with a barrier(). After this we must *not* modify > static/global > - * data until after we have switched to the relocated pagetables! > + * data, or locals that need to survive, until after we have > + * switched to the relocated pagetables! > */ > barrier(); > memcpy(__va(__pa(_start)), _start, _end - _start); > @@ -1239,18 +1241,21 @@ void __init noreturn __start_xen(unsigned long mbi_p) > *pl2e = l2e_from_intpte(l2e_get_intpte(*pl2e) + > xen_phys_start); > } > > - /* Re-sync the stack and then switch to relocated pagetables. */ > + /* > + * Switch to relocated pagetables. This also discards updates to > + * any local variables since the memmove() call above, but that's > + * fine because we don't use any of them again. > + */ > asm volatile ( > - "rep movsq ; " /* re-sync the stack */ > - "movq %%cr4,%%rsi ; " > - "andb $0x7f,%%sil ; " > - "movq %%rsi,%%cr4 ; " /* CR4.PGE == 0 */ > - "movq %[pg],%%cr3 ; " /* CR3 == new pagetables */ > - "orb $0x80,%%sil ; " > - "movq %%rsi,%%cr4 " /* CR4.PGE == 1 */ > - : "=&S" (i), "=&D" (i), "=&c" (i) /* All outputs discarded. > */ > - : [pg] "r" (__pa(idle_pg_table)), "0" (cpu0_stack), > - "1" (__va(__pa(cpu0_stack))), "2" (STACK_SIZE / 8) > + "mov %%cr4, %[cr4]\n\t" > + "andb $~%c[pge], %b[cr4]\n\t" Why not simply %[pge] (name subject to improvement) with the value below becoming ~X86_CR4_PGE (and the constraint becoming "K")? > + "mov %[cr4], %%cr4\n\t" /* CR4.PGE = 0 */ > + "mov %[cr3], %%cr3\n\t" /* CR3 = new pagetables */ > + "orb $%c[pge], %b[cr4]\n\t" Oh, I see - the constant gets reused here. Jan > + "mov %[cr4], %%cr4\n\t" /* CR4.PGE = 1 */ > + : [cr4] "=&a" (tmp) /* Could be "r", but "a" makes better > asm */ > + : [cr3] "r" (__pa(idle_pg_table)), > + [pge] "i" (X86_CR4_PGE) > : "memory" ); > > printk("New Xen image base address: %#lx\n", xen_phys_start); >
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |