[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2-ish] x86/boot: Factor move_xen() out of __start_xen()
On 04.01.2023 21:04, Andrew Cooper wrote: > On 21/12/2022 7:40 am, Jan Beulich wrote: >> On 20.12.2022 21:56, Andrew Cooper wrote: >>> On 20/12/2022 1:51 pm, Jan Beulich wrote: >>>> On 16.12.2022 21:17, Andrew Cooper wrote: >>>>> + "mov %[cr4], %%cr4\n\t" /* CR4.PGE = 1 */ >>>>> + : [cr4] "=&a" (tmp) /* Could be "r", but "a" makes better asm */ >>>>> + : [cr3] "r" (__pa(idle_pg_table)), >>>>> + [pge] "i" (X86_CR4_PGE) >>>>> + : "memory" ); >>>> The removed stack copying worries me, to be honest. Yes, for local >>>> variables of ours it doesn't matter because they are about to go out >>>> of scope. But what if the compiler instantiates any for its own use, >>>> e.g. ... >>>> >>>>> + /* >>>>> + * End of the critical region. Updates to locals and globals now >>>>> work as >>>>> + * expected. >>>>> + * >>>>> + * Updates to local variables which have been spilled to the stack >>>>> since >>>>> + * the memcpy() have been lost. But we don't care, because they're >>>>> all >>>>> + * going out of scope imminently. >>>>> + */ >>>>> + >>>>> + printk("New Xen image base address: %#lx\n", xen_phys_start); >>>> ... the result of the address calculation for the string literal >>>> here? Such auxiliary calculations can happen at any point in the >>>> function, and won't necessarily (hence the example chosen) become >>>> impossible for the compiler to do with the memory clobber in the >>>> asm(). And while the string literal's address is likely cheap >>>> enough to calculate right in the function invocation sequence (and >>>> an option would also be to retain the printk() in the caller), >>>> other instrumentation options could be undermined by this as well. >>> Right now, the compiler is free to calculate the address of the string >>> literal in a register, and move it the other side of the TLB flush. >>> This will work just fine. >>> >>> However, the compiler cannot now, or ever in the future, spill such a >>> calculation to the stack. >> I'm afraid the compiler's view at things is different: Whatever it puts >> on the stack is viewed as virtual registers, unaffected by a memory >> clobber (of course there can be effects resulting from uses of named >> variables). Look at -O3 output of gcc12 (which is what I happened to >> play with; I don't think it's overly version dependent) for this >> (clearly contrived) piece of code: >> >> int __attribute__((const)) calc(int); >> >> int test(int i) { >> int j = calc(i); >> >> asm("nopl %0" : "+m" (j)); >> asm("nopq %%rsp" ::: "memory", "ax", "cx", "dx", "bx", "bp", "si", "di", >> "r8", "r9", "r10", "r11", "r12", "r13", "r14", >> "r15"); >> j = calc(i); >> asm("nopl %0" :: "m" (j)); >> >> return j; >> } >> >> It instantiates a local on the stack for the result of calc(); it does >> not re-invoke calc() a 2nd time. Which means the memory clobber in the >> middle asm() does not affect that (and by implication: any) stack slot. >> >> Using godbolt I can also see that clang15 agrees with gcc12 in this >> regard. I didn't bother trying other versions. > > Well this is problematic, because it contradicts what we depend on > asm("":::"memory") doing... Does it? I'm unaware of instances where we use "memory" to deal with local variables. > https://godbolt.org/z/xeGMc3sM9 > > But I don't fully agree with the conclusions drawn by this example. > > It only instantiates a local on the stack because you force a memory > operand to satisfy the "m" constraints, not to satisfy the "memory" > constraint. Sure, all I wanted to show is that the compiler may make such transformations. As said, the example is clearly contrived, but I also didn't want to spend too much time on finding a yet better one. > By declaring calc as const, you are permitting the compiler to make an > explicit transformation to delete one of the calls, irrespective of > anything else in the function. > > It is weird that 'j' ends up taking two stack slots when would be > absolutely fine for it to only have 1, and indeed this is what happens > when you remove the first and third asm()'s. It is these which force > 'j' to be on the stack, not the memory clobber in the middle. > > Observe that after commenting those two out, Clang transforms things to > spill 'i' onto the stack, rather than 'j', and then tail-call calc() on > the way out. This is actually deleting the first calc() call, rather > than the second. Which would still demonstrate what I wanted to demonstrate - we can't assume that "memory" guards against the use of stack locals by the compiler. Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |