
Re: [Xen-devel] [RFC PATCH 07/12] hvmloader: allocate MMCONFIG area in the MMIO hole + minor code refactoring



On Wed, 28 Mar 2018 10:30:32 +0100
Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:

>On Wed, Mar 28, 2018 at 01:37:29AM +1000, Alexey G wrote:
>> On Tue, 27 Mar 2018 09:45:30 +0100
>> Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
>>   
>> >On Tue, Mar 27, 2018 at 05:42:11AM +1000, Alexey G wrote:  
>> >> On Mon, 26 Mar 2018 10:24:38 +0100
>> >> Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
>> >>     
>> >> >On Sat, Mar 24, 2018 at 08:32:44AM +1000, Alexey G wrote:    
>
>> BTW, another somewhat related problem at the moment is that Xen knows
>> nothing about a chipset-specific MMIO hole(s). Due to this, it is
>> possible for a guest to map PT BARs outside the MMIO hole, leading to
>> errors like this:
>> 
>> (XEN) memory_map:remove: dom4 gfn=c8000 mfn=c8000 nr=2000
>> (XEN) memory_map:add: dom4 gfn=ffffffffc8000 mfn=c8000 nr=2000
>> (XEN) p2m.c:1121:d0v5 p2m_set_entry: 0xffffffffc8000:9 -> -22 (0xc8000)
>> (XEN) memory_map:fail: dom4 gfn=ffffffffc8000 mfn=c8000 nr=2000 ret:-22
>> (XEN) memory_map:remove: dom4 gfn=ffffffffc8000 mfn=c8000 nr=2000
>> (XEN) p2m.c:1228:d0v5 gfn_to_mfn failed! gfn=ffffffffc8000 type:4
>> (XEN) memory_map: error -22 removing dom4 access to [c8000,c9fff]
>> (XEN) memory_map:remove: dom4 gfn=ffffffffc8000 mfn=c8000 nr=2000
>> (XEN) p2m.c:1228:d0v5 gfn_to_mfn failed! gfn=ffffffffc8000 type:4
>> (XEN) memory_map: error -22 removing dom4 access to [c8000,c9fff]
>> (XEN) memory_map:add: dom4 gfn=c8000 mfn=c8000 nr=2000
>> 
>> Note that it was merely a lame BAR sizing attempt from the
>> guest-side SW (a PCI config space viewing tool) -- writing F's to
>> the high part of the MMIO BAR first.  
>
>You should disable memory decoding before attempting to size a BAR.

The problem is that the PCI config space viewer is not mine. :)
Normally it should disable decoding first, yes, but it doesn't. Yet
there are no problems on a real system, while these errors appear when
it is run in a VM. IIRC power-cycling the guest and triggering these
errors multiple times even had a negative impact on the host's
stability, so it's a good test case.

>This error has nothing to do with trying to move a BAR outside of the
>MMIO hole, this error is caused by the gfn being bigger than the guest
>physical address width AFAICT.

In fact, that is the essence of the error -- an attempt to map a range
that should never be mapped at all.
p2m_set_entry is too deep a place to hit this error; it should be caught
much earlier. If we knew the limits within which we can (and cannot) map
PT device BARs, we could check whether we really need to proceed with
the mapping. This way we could handle the "mid-sizing/mid-change"
condition when only half of a 64-bit memory BAR has been written.
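
To illustrate the kind of early check I mean, here is a rough sketch in
C. It is not existing Xen code: struct mmio_window, bar64_base() and
bar_range_mappable() are hypothetical names, and the window bounds and
BAR flag bits used in any example values are assumptions. It only shows
how a half-written 64-bit BAR produces a gfn like the one in the log
above, and how a bounds check could reject it before any mapping is
attempted.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* Hypothetical description of where PT BARs may legally be mapped. */
struct mmio_window {
    uint64_t start;   /* first byte of the window */
    uint64_t end;     /* last byte of the window (inclusive) */
};

/* Combine the two dwords of a 64-bit memory BAR into a base address. */
static uint64_t bar64_base(uint32_t lo, uint32_t hi)
{
    /* Bits 3:0 of the low dword are type/prefetch flags, not address bits. */
    return ((uint64_t)hi << 32) | (lo & ~UINT32_C(0xf));
}

/* True if [base, base + size) lies entirely inside the window. */
static bool bar_range_mappable(const struct mmio_window *w,
                               uint64_t base, uint64_t size)
{
    if (size == 0 || base < w->start || base > w->end)
        return false;
    /* Compare against the remaining room to avoid overflow in base + size. */
    return size - 1 <= w->end - base;
}
```

With the low dword at 0xc8000000 and the high dword overwritten with
F's, bar64_base() gives 0xffffffffc8000000, i.e. gfn ffffffffc8000 after
shifting by PAGE_SHIFT. A check like bar_range_mappable() would refuse
that range as long as the window describes the guest's MMIO hole, so the
request would never reach p2m_set_entry.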

>> If we will know the guest's MMIO hole bounds, we can adapt to this
>> behavior, avoiding erroneous mapping attempts to a wrong address
>> outside the MMIO hole. Only the MMIO hole designated range can be
>> used to map PT device BARs.
>> 
>> So, if we will be actually emulating MCH's MMIO hole related
>> registers in Xen as well -- we can use them as scratchpad registers
>> (write-once of course) to pass this kind of information between Xen
>> and other involved parties as an alternative to eg. a dedicated
>> hypercall.  
>
>I'm not sure where this information is stored in MCH, guest OSes tend
>to fetch this from the ACPI _CRS method of the host-pci bridge device.
>
>I also don't see QEMU emulating such registers, but yes, I won't be
>opposed to storing/reporting this in some registers if that's indeed
>supported. Note that I don't think this should be mandatory for adding
>Q35 support though.

This info is needed by Xen, not by guest OSes -- in order to avoid
errors like the ones described above. If we emulate the MCH inside Xen,
we can emulate these registers as well. That would be simpler than
introducing a new hypercall to inform Xen about the established MMIO
hole range.

Anyway, you're right, it's a side issue. Just an example of what else
the built-in MCH emulation may be useful for.
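
For reference, the write-once behaviour mentioned earlier is trivial to
model. The following is only a minimal sketch in C, not code from the
series: struct wo_reg and its helpers are hypothetical names, and a real
implementation would also need to deal with register widths, reserved
bits and reset handling.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical write-once 32-bit register, as it might be emulated. */
struct wo_reg {
    uint32_t value;
    bool locked;   /* set after the first accepted write */
};

/* Returns true if the write was accepted, false if it was ignored. */
static bool wo_reg_write(struct wo_reg *r, uint32_t val)
{
    if (r->locked)
        return false;
    r->value = val;
    r->locked = true;
    return true;
}

/* Reads are always allowed and return the latched value. */
static uint32_t wo_reg_read(const struct wo_reg *r)
{
    return r->value;
}
```

The first writer latches the value; later writes are silently dropped,
so every party reading the register afterwards sees the same bounds.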

>> >> What this approach will require:
>> >> --------------------------------
>> >> 
>> >> - Changes in QEMU code to support a new chipset-less machine(s).
>> >> In theory might be possible to implement on top of the "null"
>> >> machine concept    
>> >
>> >Not all parts of the chipset should go inside of Xen, ATM I only
>> >foresee Q35 MCH being implemented inside of Xen. So I'm not sure
>> >calling this a chipset-less machine is correct from QEMU PoV.  
>> 
>> Emulating only MCH in Xen will still require lot of changes but 
>> overall benefit will become unclear -- basically, we just move
>> PCIEXBAR emulation to Xen from QEMU.  
>
>At least it would make Xen the one controlling the MCFG area, which is
>important. It would also be the first step into moving other chipset
>functionality into Xen.
>
>Not doing it just perpetuates the bad precedent that we already have
>with the previous chipset.

I think it will be kind of ugly if we emulate just the MCH in Xen
and ICH9 (+ all the rest) in QEMU at the same time. It looks more like
a temporary solution. It would be good to know whether such an approach
would be approved by the maintainers.

>> >What are specifically the registers that you mention?  
>> 
>> Write-once emulation of TOLUD/TOUUD/REMAPBASE/REMAPLIMIT registers
>> for hvmloader to use. That's the approach I'm actually using to make
>> 'hvmloader/allow-memory-relocate=1' to work. Memory relocation
>> without relying on add_to_physmap hypercall for hvmloader (which it
>> does currently) while having MMIO/memory layout synchronized between
>> all parties. There are multiple benefits (mostly for PT needs),
>> including the MMIO hole auto-sizing support but this approach won't
>> be accepted well with "toolstack should do everything" attitude I'm
>> afraid.  
>
>You seem to be trying to fix several issues at the same time, which
>just makes this much more complex than needed. The initial aim of this
>series was to allow HVM guests to use the Q35 chipset. I think that's
>what we should focus on.

Agreed. Initially, the main goal was to allow PCIe extended config
space usage for PT devices. Even this particular feature is not in its
final state: there are other patches for hw/xen/xen-pt*.c pending
(dynamic fields support), but these are more general, not bound to
just Q35.

>As you have listed above (and in other emails) there are many
>limitations with the current HVM approach, which I would be more than
>happy for you to solve. But IMO not all of them must be solved in
>order to add Q35 support.
>
>Since this series and email thread has already gone quite far, would
>you mind writing a design document with the approach that we
>discussed?

I think we must all agree on which approach to implement next.
Basically, whether we need to completely discard option #1 for this
series and move on with #2. That lengthy requirements/risks email was an
attempt to provide some ground for comparison.

Leaving only the required devices like vga/usb/network/storage to QEMU
while emulating everything else in Xen is a good milestone but, as I
understand it, we are currently targeting a less ambitious goal for
option #2 -- emulating only the MCH in Xen while emulating ICH9 etc. in
QEMU.
