[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] hvmloader: pass PCI MMIO layout to OVMF as an info table



On 01/11/21 16:21, Jan Beulich wrote:
> On 11.01.2021 15:49, Laszlo Ersek wrote:
>> On 01/11/21 15:00, Igor Druzhinin wrote:
>>> On 11/01/2021 09:27, Jan Beulich wrote:
>>>> On 11.01.2021 05:53, Igor Druzhinin wrote:
>>>>> We faced a problem with passing through a PCI device with 64GB BAR to
>>>>> UEFI guest. The BAR is expectedly programmed into 64-bit PCI aperture at
>>>>> 64G address which pushes physical address space to 37 bits. OVMF uses
>>>>> address width early in PEI phase to make DXE identity pages covering
>>>>> the whole addressable space so it needs to know the last address it needs
>>>>> to cover but at the same time not overdo the mappings.
>>>>>
>>>>> As there is seemingly no other way to pass or get this information in
>>>>> OVMF at this early phase (ACPI is not yet available, PCI is not yet 
>>>>> enumerated,
>>>>> xenstore is not yet initialized) - extend the info structure with a new
>>>>> table. Since the structure was initially created to be extendable -
>>>>> the change is backward compatible.
>>>>
>>>> How does UEFI handle the same situation on baremetal? I'd guess it is
>>>> in even more trouble there, as it couldn't even read addresses from
>>>> BARs, but would first need to assign them (or at least calculate
>>>> their intended positions).
>>>
>>> Maybe Laszlo or Anthony could answer this question quickly while I'm 
>>> investigating?
>>
>> On the bare metal, the phys address width of the processor is known.
> 
> From CPUID I suppose.
> 
>> OVMF does the whole calculation in reverse because there's no way for it
>> to know the physical address width of the physical (= host) CPU.
>> "Overdoing" the mappings doesn't only waste resources, it breaks hard
>> with EPT -- access to a GPA that is inexpressible with the phys address
>> width of the host CPU (= not mappable successfully with the nested page
>> tables) will behave super bad. I don't recall the exact symptoms, but it
>> prevents booting the guest OS.
>>
>> This is why the most conservative 36-bit width is assumed by default.
> 
> IOW you don't trust virtualized CPUID output?

That's correct; it's not trustworthy / reliable.

One of the discussions (of the many) is here:

https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg04716.html

Thanks
Laszlo




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.