
Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru



On Tue, 17 Jan 2023 19:15:57 -0500
Chuck Zmudzinski <brchuckz@xxxxxxx> wrote:

> On 1/17/2023 6:04 AM, Igor Mammedov wrote:
> > On Mon, 16 Jan 2023 13:00:53 -0500
> > Chuck Zmudzinski <brchuckz@xxxxxxxxxxxx> wrote:
> >  
> > > On 1/16/23 10:33, Igor Mammedov wrote:  
> > > > On Fri, 13 Jan 2023 16:31:26 -0500
> > > > Chuck Zmudzinski <brchuckz@xxxxxxx> wrote:
> > > >     
> > > >> On 1/13/23 4:33 AM, Igor Mammedov wrote:    
> > > >> > On Thu, 12 Jan 2023 23:14:26 -0500
> > > >> > Chuck Zmudzinski <brchuckz@xxxxxxx> wrote:
> > > >> >       
> > > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote:      
> > > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote: 
> > > >> >> >        
> > > >> >> >> I think the change Michael suggests is very minimalistic:
> > > >> >> >> Move the if condition around xen_igd_reserve_slot() into
> > > >> >> >> the function itself and always call it there unconditionally
> > > >> >> >> -- basically turning three lines into one. Since
> > > >> >> >> xen_igd_reserve_slot() seems very problem specific, Michael
> > > >> >> >> further suggests to rename it to something more general.
> > > >> >> >> All in all no big changes required.
> > > >> >> > 
> > > >> >> > yes, exactly.
> > > >> >> >         
> > > >> >> 
> > > >> >> OK, got it. I can do that along with the other suggestions.      
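> > > >> >> 
> > > >> >> Something like this minimal sketch, I suppose (assuming
> > > >> >> XEN_PCI_IGD_SLOT_MASK is BIT(2); logging is omitted and the
> > > >> >> call site shown is illustrative):
> > > >> >> 
> > > >> >>     void xen_igd_reserve_slot(PCIBus *pci_bus)
> > > >> >>     {
> > > >> >>         /* the condition moves inside, so callers stay one line */
> > > >> >>         if (!xen_igd_gfx_pt_enabled()) {
> > > >> >>             return;
> > > >> >>         }
> > > >> >>         pci_bus->slot_reserved_mask |= XEN_PCI_IGD_SLOT_MASK;
> > > >> >>     }
> > > >> >> 
> > > >> >>     /* in pc_piix.c, a single unconditional call: */
> > > >> >>     xen_igd_reserve_slot(pci_bus);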
> > > >> > 
> > > >> > have you considered, instead of reservation, putting a slot
> > > >> > check in the device model and, if it's the intel igd being
> > > >> > passed through, failing at realize time if it can't take the
> > > >> > required slot (with an error directing the user to fix the
> > > >> > command line)?
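> > > >> > 
> > > >> > roughly something like this (an illustrative sketch only, not
> > > >> > actual xen_pt code; the helper name is made up):
> > > >> > 
> > > >> >     static void xen_igd_slot_check(XenPCIPassthroughState *s,
> > > >> >                                    Error **errp)
> > > >> >     {
> > > >> >         PCIDevice *d = PCI_DEVICE(s);
> > > >> > 
> > > >> >         /* refuse to realize the IGD anywhere but slot 2 */
> > > >> >         if (is_igd_vga_passthrough(&s->real_device) &&
> > > >> >             PCI_SLOT(d->devfn) != 2) {
> > > >> >             error_setg(errp, "Intel IGD must be assigned to "
> > > >> >                        "slot 2, fix the command line (addr=2.0)");
> > > >> >         }
> > > >> >     }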
> > > >> 
> > > >> Yes, but the core pci code already fails at realize time
> > > >> with a useful error message if the user tries to use slot 2 for the
> > > >> igd, because of the xen platform device which has slot 2. The user
> > > >> can fix this without patching qemu, but having the user fix it on
> > > >> the command line is not the best way to solve the problem, primarily
> > > >> because the user would need to hotplug the xen platform device via a
> > > >> command line option instead of having the xen platform device added by
> > > >> pc_xen_hvm_init functions almost immediately after creating the pci
> > > >> bus, and that delay in adding the xen platform device degrades
> > > >> startup performance of the guest.
> > > >>     
> > > >> > That could be less complicated than dealing with slot
> > > >> > reservations at the cost of being less convenient.
> > > >> 
> > > >> And also at the cost of reduced startup performance
> > > > 
> > > > Could you clarify how it affects performance (and how much).
> > > > (as I see, setup done at board_init time is roughly the same
> > > > as with '-device foo' CLI options, modulo time needed to parse
> > > > options which should be negligible. and both ways are done before
> > > > guest runs)    
> > > 
> > > I preface my answer by saying there is a v9, but you don't
> > > need to look at that. I will answer all your questions here.
> > > 
> > > I am going by what I observe on the main HDMI display with the
> > > different approaches. With the approach of not patching Qemu
> > > to fix this, which requires adding the Xen platform device a
> > > little later, the length of time it takes to fully load the
> > > guest is increased. I also noticed that with Linux guests which
> > > use the grub bootloader, the grub vga driver cannot display the
> > > grub boot menu at the native resolution of the display (1920x1080
> > > in the tested case) when the Xen platform device is added via a
> > > command line option instead of by the pc_xen_hvm_init_pci function
> > > in pc_piix.c; with this patch to Qemu, the grub menu is displayed
> > > at the full 1920x1080 native resolution. Once the guest fully loads,
> > > there is no noticeable difference in performance. It is mainly
> > > a degradation in startup performance, not performance once
> > > the guest OS is fully loaded.  
> >
> > Looking at igd-assign.txt, it recommends adding IGD using the '-device'
> > CLI option, and actually dropping at least graphics defaults explicitly.
> > So it is expected to work fine even when IGD is constructed with
> > '-device'.
> >
> > Could you provide the full CLI that xen currently starts QEMU with,
> > and then the CLI you used (with explicit -device for IGD) that leads
> > to reduced performance?
> >
> > CCing vfio folks who might have an idea what could be wrong based
> > on vfio experience.  
> 
> Actually, the igd is not added with an explicit -device option using Xen:
> 
>    1573 ?        Ssl    0:42 /usr/bin/qemu-system-i386 -xen-domid 1 
> -no-shutdown -chardev 
> socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-1,server,nowait -mon 
> chardev=libxl-cmd,mode=control -chardev 
> socket,id=libxenstat-cmd,path=/var/run/xen/qmp-libxenstat-1,server,nowait 
> -mon chardev=libxenstat-cmd,mode=control -nodefaults -no-user-config -name 
> windows -vnc none -display none -serial pty -boot order=c -smp 4,maxcpus=4 
> -net none -machine xenfv,max-ram-below-4g=3758096384,igd-passthru=on -m 6144 
> -drive file=/dev/loop0,if=ide,index=0,media=disk,format=raw,cache=writeback 
> -drive 
> file=/dev/disk/by-uuid/A44AA4984AA468AE,if=ide,index=1,media=disk,format=raw,cache=writeback
> 
> I think it is added by xl (libxl management tool) when the guest is created
> using the qmp-libxl socket that appears on the command line, but I am not 100
> percent sure. So, with libxl, the command line alone does not tell the whole
> story. The xl.cfg file has a line like this to define the pci devices passed 
> through,
> and in qemu they are type XEN_PT devices, not VFIO devices:
> 
> pci = [ '00:1b.0','00:14.0','00:02.0@02' ]
> 
> This means three host pci devices are passed through: the ones on the
> host at slots 1b.0, 14.0, and 02.0. Of course the device at 02.0 is the
> igd. The @02 means libxl is requesting slot 2 in the guest for the igd;
> the other two devices are just auto-assigned a slot by Qemu. Qemu cannot
> assign the igd to slot 2 for xenfv machines without a patch that prevents
> the Xen platform device from grabbing slot 2. That is what this patch
> accomplishes. The workaround involves using the Qemu pc machine
> instead of the Qemu xenfv machine, in which case the code in Qemu
> that adds the Xen platform device at slot 2 is avoided; in that case
> the Xen platform device is added via a command line option at slot 3
> instead of slot 2.
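> 
> Roughly, the xl.cfg for that workaround looks like this (from memory,
> so treat the exact option values as illustrative):
> 
>     # disable the built-in platform device; libxl then uses the
>     # pc machine instead of xenfv, leaving slot 2 free for the igd
>     xen_platform_pci = 0
>     # add the platform device back explicitly, at slot 3
>     device_model_args_hvm = [ '-device', 'xen-platform,addr=3' ]
>     pci = [ '00:1b.0','00:14.0','00:02.0@02' ]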
> 
> The differences between vfio and the Xen pci passthrough device
> might explain why igd-passthru behaves differently on Xen vs. kvm.
> 
> Also, kvm does not use the Xen platform device, and it seems the
> Xen guests behave better at startup when the Xen platform device
> is added very early, during the initialization of the emulated devices
> of the machine, which is based on the i440fx piix3 machine, instead
> of being added using a command line option. Perhaps the performance
> at startup could be improved by adding the igd via a command line
> option using vfio instead of the canonical way that libxl does pci
> passthrough on Xen, but I have no idea if vfio works on Xen the way it
> works on kvm. I am willing to investigate and experiment with it, though.
> 
> So if any vfio people can shed some light on this, that would help.

ISTR some rumors of Xen thinking about vfio, but AFAIK there is no
combination of vfio with Xen, nor is there any sharing of device quirks
to assign IGD.  IGD assignment has various VM platform dependencies, of
which the bus address is just one.  Vfio support for IGD assignment
takes a much different approach: the user or management tool is
responsible for configuring the VM correctly for IGD assignment,
including assigning the device to the correct bus address and using the
right VM chipset, with the correct slot free for the LPC controller.  If
all the user configuration of the VM aligns correctly, we'll activate
"legacy mode" by inserting the opregion and instantiate the LPC bridge
with the correct fields to make the BIOS and drivers happy.  Otherwise,
maybe the user is trying to use the mythical UPT mode, but given
Intel's lack of follow-through, it probably just won't work.  Or maybe
it's a vGPU and we don't need the legacy features anyway.
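
For example, a legacy mode invocation boils down to something like
this (host address and machine type illustrative, storage and other
devices omitted):

    qemu-system-x86_64 -machine pc -nodefaults \
        -device vfio-pci,host=0000:00:02.0,bus=pci.0,addr=02.0

with vfio/QEMU then handling the opregion and the LPC bridge
automatically when everything lines up.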

Working with an expectation that QEMU needs to do the heavy lifting to
make it all work automatically, with no support from the management
tool for putting devices in the right place, I'm afraid there might not
be much to leverage from the vfio version.  Thanks,

Alex
