[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 2/3] x86/boot: Drop move_memory() and use memcpy() directly



On 07/12/2021 12:03, Jan Beulich wrote:
> On 07.12.2021 11:53, Andrew Cooper wrote:
>> @@ -1243,7 +1196,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>>               * data until after we have switched to the relocated 
>> pagetables!
>>               */
>>              barrier();
>> -            move_memory(e, XEN_IMG_OFFSET, _end - _start, 1);
>> +            memcpy(__va(__pa(_start)), _start, _end - _start);
>>  
>>              /* Walk idle_pg_table, relocating non-leaf entries. */
>>              pl4e = __va(__pa(idle_pg_table));
>> @@ -1300,8 +1253,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>>                     "1" (__va(__pa(cpu0_stack))), "2" (STACK_SIZE / 8)
>>                  : "memory" );
>>  
>> -            bootstrap_map(NULL);
>> -
>>              printk("New Xen image base address: %#lx\n", xen_phys_start);
>>          }
> This bootstrap_map() must have been dead code already before, except
> for the "keep" argument above needlessly having got passed as 1? Afaict
> passing 1 was pointless without using the function's return value.

bootstrap_map(NULL) is necessary to zap the constructed mappings, but it
seems like the use of the return address was dropped by c/s 0b76ce20de
"x86/setup: don't relocate the VGA hole" in 2013.

>
>> @@ -1325,9 +1276,10 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>>                   (headroom ||
>>                    ((end - size) >> PAGE_SHIFT) > mod[j].mod_start) )
>>              {
>> -                move_memory(end - size + headroom,
>> -                            (uint64_t)mod[j].mod_start << PAGE_SHIFT,
>> -                            mod[j].mod_end, 0);
>> +                memcpy(__va(end - size + headroom),
>> +                       __va((uint64_t)mod[j].mod_start << PAGE_SHIFT),
>> +                       mod[j].mod_end);
> I'm not convinced this can be memcpy() - consider_modules() specifically
> allows for the current module's source and destination areas to overlap.
> See also the comment ahead of its invocation a few lines up from here.

The comment which says:

/* Don't overlap with other modules (or Xen itself). */
end = consider_modules(s, e, size, mod,
                       mbi->mods_count + relocated, j);

?

memmove() in move_memory() is broken, and in fact always results in a
backwards copy, which means that one way or another, overlapping source
and destination doesn't work.

If it was really broken before, then it can be fixed now by using
memmove() here, because using 2 directmap mappings means the
forward/backward check will now work as expected.

> I'm also not convinced we have the source range (fully) direct-mapped at
> this point. Only full superpages have been mapped so far, and only those
> for the current or higher address E820 entries (plus of course the pre-
> built mappings of the space below 1Gb [PREBUILT_MAP_LIMIT]) located
> below 4Gb.

PREBUILT_MAP_LIMIT is 2M, and that's only to cover the fact that we
build l1_directmap[] with the VGA UC range at build time.  I was hoping
to remove it in due course.

As to the other mappings, that is awkward.  Perhaps what we ought to do
is split the loops.  First fill in all 2M superpages into the directmap,
then relocate Xen, at which point we've got plenty of frames to feed
into the allocator, to let us do a second pass filling in non-2M regions.

We can depend on the modules living in RAM regions, but might want to
explicitly confirm.

> As to the 2nd argument - if this can indeed be converted in the first
> place, may I suggest to also switch to using pfn_to_paddr()?

Honestly, that's taking a terrible situation and making it worse.

Calling pfn_to_paddr() on what is logically a paddr_t already ought to
be a compilation error, and the logic which makes this change
deliberately is some of the most nack-worthy logic I've ever come across.

It's very much not ok to have mod_start be a paddr or pfn, and for
mod_end to either be an end or a sized, epending on where you are during
boot.

~Andrew



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.