[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2.1] hvmloader: indicate dynamically allocated memory as ACPI NVS in e820


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
  • Date: Tue, 1 Sep 2020 11:29:37 +0100
  • Authentication-results: esa4.hc3370-68.iphmx.com; dkim=none (message not signed) header.i=none
  • Cc: <xen-devel@xxxxxxxxxxxxxxxxxxxx>, <jbeulich@xxxxxxxx>, <andrew.cooper3@xxxxxxxxxx>, <wl@xxxxxxx>, <iwj@xxxxxxxxxxxxxx>
  • Delivery-date: Tue, 01 Sep 2020 10:29:44 +0000
  • Ironport-sdr: eCfQlcxT7uqDbPRNo/kH5ZoZ4KCawWe0iHM/+DLmr27YKDO99IzA3El6LEAByssm/krX26jRMR kZ42sxVmm/kP9GdM9Bmk8Ld2VU/wmATx4K7b3/uDYaN6DUnQQfimPm7bFA+GhT1op+Z5yaN0yf vfbSAOskga7BSRbTN5SuN6JmIov92TyTaymseT5QdSoTjTAgaLdjP2ez2W3ChRj1xlGdDl8PDr Azl2WEBOs8FG7v3PmHkw7dPvQE/mLBOgyKymiVQyovHhNxaVAyR++sOsS5LCgeNyL+RdcSY/yE bVk=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 01/09/2020 10:28, Roger Pau Monné wrote:
> On Tue, Sep 01, 2020 at 03:50:34AM +0100, Igor Druzhinin wrote:
>> Guest kernel does need to know in some cases where the tables are located
>> to treat these regions properly. One example is kexec process where
>> the first kernel needs to pass firmware region locations to the second
>> kernel which is now a requirement after 02a3e3cdb7f12 ("x86/boot: Parse SRAT
>> table and count immovable memory regions").
> 
> Can you add a note that this is a Linux commit? Albeit there's a
> reference to kexec above I don't it's entirely clear it's a Linux
> commit.

Ok.

>>
>> The memory that hvmloader allocates in the reserved region mostly contains
>> these useful tables and could be safely indicated as ACPI without the need
>> to designate a sub-region specially for that. Making it non-reclaimable
>> (ACPI NVS) in contrast with ACPI reclaim (ACPI table) memory would avoid
>> potential reuse of this memory by the guest taking into account this region
>> may contain runtime structures like VM86 TSS, etc. If necessary, those
>> can be moved away later and the region marked as reclaimable.
> 
> By looking at domain_construct_memmap from libxl I think the same
> problem is not present on that case as regions are properly marked as
> RESERVED or ACPI as required?

Uhh, I simply forgot that PVH also constructs e820 - it seems it's doing it
properly though. I want to makes e820 map as close as possible between PVH and
HVM - let me see if I can improve my HVM version.

>>
>> Signed-off-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
> 
> Just one question below and one nit.
> 
>> ---
>> Changes in v2.1:
>> - fixed previously missed uint32_t occurence
>>
>> Changes in v2:
>> - gave more information on NVS type selection and potential alternatives
>>   in the description
>> - minor type fixes suggested
>>
>> ---
>>  tools/firmware/hvmloader/e820.c | 21 +++++++++++++++++----
>>  tools/firmware/hvmloader/util.c |  6 ++++++
>>  tools/firmware/hvmloader/util.h |  3 +++
>>  3 files changed, 26 insertions(+), 4 deletions(-)
>>
>> diff --git a/tools/firmware/hvmloader/e820.c 
>> b/tools/firmware/hvmloader/e820.c
>> index 4d1c955..0ad2f05 100644
>> --- a/tools/firmware/hvmloader/e820.c
>> +++ b/tools/firmware/hvmloader/e820.c
>> @@ -155,6 +155,8 @@ int build_e820_table(struct e820entry *e820,
>>  {
>>      unsigned int nr = 0, i, j;
>>      uint32_t low_mem_end = hvm_info->low_mem_pgend << PAGE_SHIFT;
>> +    unsigned long firmware_mem_end =
>> +        RESERVED_MEMORY_DYNAMIC_START + (mem_mfns_allocated() << 
>> PAGE_SHIFT);
>>  
>>      if ( !lowmem_reserved_base )
>>              lowmem_reserved_base = 0xA0000;
>> @@ -199,8 +201,19 @@ int build_e820_table(struct e820entry *e820,
>>      nr++;
>>  
>>      /*
>> +     * Mark populated reserved memory that contains ACPI and other tables as
>> +     * ACPI NVS (non-reclaimable) space - that should help the guest to 
>> treat
>> +     * it correctly later (e.g. pass to the next kernel on kexec).
>> +     */
>> +
>> +    e820[nr].addr = RESERVED_MEMBASE;
>> +    e820[nr].size = firmware_mem_end - RESERVED_MEMBASE;
>> +    e820[nr].type = E820_NVS;
>> +    nr++;
>> +
>> +    /*
>>       * Explicitly reserve space for special pages.
>> -     * This space starts at RESERVED_MEMBASE an extends to cover various
>> +     * This space starts after ACPI region and extends to cover various
>>       * fixed hardware mappings (e.g., LAPIC, IOAPIC, default SVGA 
>> framebuffer).
>>       *
>>       * If igd_opregion_pgbase we need to split the RESERVED region in two.
>> @@ -210,8 +223,8 @@ int build_e820_table(struct e820entry *e820,
>>      {
>>          uint32_t igd_opregion_base = igd_opregion_pgbase << PAGE_SHIFT;
>>  
>> -        e820[nr].addr = RESERVED_MEMBASE;
>> -        e820[nr].size = (uint32_t) igd_opregion_base - RESERVED_MEMBASE;
>> +        e820[nr].addr = firmware_mem_end;
>> +        e820[nr].size = igd_opregion_base - firmware_mem_end;
> 
> Is there anything between firmware_mem_end and igd_opregion_base now?
> You already account for RESERVED_MEMBASE to firmware_mem_end.

It's possible that there is something in between. IGD opregion is allocated
dynamically from above and occupies a couple of pages - there is a gap between 
the
top of a populated region and it where other structures could be located.
I don't want to tie e820 to the current allocation order.

>>          e820[nr].type = E820_RESERVED;
>>          nr++;
>>  
>> @@ -227,7 +240,7 @@ int build_e820_table(struct e820entry *e820,
>>      }
>>      else
>>      {
>> -        e820[nr].addr = RESERVED_MEMBASE;
>> +        e820[nr].addr = firmware_mem_end;
>>          e820[nr].size = (uint32_t)-e820[nr].addr;
>>          e820[nr].type = E820_RESERVED;
>>          nr++;
>> diff --git a/tools/firmware/hvmloader/util.c 
>> b/tools/firmware/hvmloader/util.c
>> index 0c3f2d2..59cde4a 100644
>> --- a/tools/firmware/hvmloader/util.c
>> +++ b/tools/firmware/hvmloader/util.c
>> @@ -444,6 +444,12 @@ void mem_hole_populate_ram(xen_pfn_t mfn, uint32_t 
>> nr_mfns)
>>  static uint32_t alloc_up = RESERVED_MEMORY_DYNAMIC_START - 1;
>>  static uint32_t alloc_down = RESERVED_MEMORY_DYNAMIC_END;
>>  
>> +unsigned long mem_mfns_allocated(void)
> 
> mem_pages_allocated might be better.

Ok, I'll see if I keep this function in a new version.

Igor



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.