
Re: [Xen-devel] PCI passthrough (pci-attach) to HVM guests bug (BAR64 addresses are bogus)



>>> On 12.11.14 at 02:37, <konrad.wilk@xxxxxxxxxx> wrote:
> When we PCI insert a device, the BARs are not set at all - and hence
> the Linux kernel is the one that tries to assign the BARs. The
> reason it cannot fit the device in the MMIO region is that the
> _CRS only advertises certain ranges (even though the MMIO region can
> cover 2GB). See:
> 
> Without any devices (and me doing PCI insertion after that):
> # dmesg | grep "bus resource"
> [    0.366000] pci_bus 0000:00: root bus resource [bus 00-ff]
> [    0.366000] pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7]
> [    0.366000] pci_bus 0000:00: root bus resource [io  0x0d00-0xffff]
> [    0.366000] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff]
> [    0.366000] pci_bus 0000:00: root bus resource [mem 0xf0000000-0xfbffffff]
> 
> With the device (my GPU card) inserted so that hvmloader can enumerate it:
> # dmesg | grep 'resource'
> [    0.455006] pci_bus 0000:00: root bus resource [bus 00-ff]
> [    0.459006] pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7]
> [    0.462006] pci_bus 0000:00: root bus resource [io  0x0d00-0xffff]
> [    0.466006] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff]
> [    0.469006] pci_bus 0000:00: root bus resource [mem 0xe0000000-0xfbffffff]
> 
> I chatted with Bjorn and Rafael on IRC about how PCI insertion works
> on baremetal and it sounds like Thunderbolt device insertion is an
> interesting case. The SMM sets the BAR regions to fit within the MMIO
> (which is advertised by the _CRS) and it then pokes the OS to enumerate
> the BARs. The OS is free to use what the firmware has set or reassign
> them. The end result is that since the SMM 'fits' the BARs inside the
> pre-set _CRS window it all works. We do not do that.
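
(Side note: when the guest kernel finds such an unprogrammed BAR it
sizes it with the standard all-ones write / read-back dance before
trying to place it. A generic sketch of that sizing step - not the
actual Linux code, and with the config-space accesses left out:)

#include <stdint.h>
#include <stdio.h>

/*
 * 'readback' is what a BAR returns after all-ones have been written
 * to it (both dwords combined for a 64-bit BAR). The read-only zero
 * bits encode the size; the low flag bits get masked off.
 */
static uint64_t bar_size_from_readback(uint64_t readback, int is_64bit)
{
    uint64_t mask = is_64bit ? ~0xfull : 0xfffffff0ull;
    uint64_t val = readback & mask;

    return val ? (~val + 1) & mask : 0;   /* 0: BAR not implemented */
}

int main(void)
{
    /* Example: a 64-bit memory BAR reading back 0xffffffffc0000004
       after the all-ones write wants 1GB of MMIO space. */
    printf("BAR size: 0x%llx\n", (unsigned long long)
           bar_size_from_readback(0xffffffffc0000004ull, 1));
    return 0;
}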

Who does the BAR assignment is pretty much orthogonal to the
problem at hand: if the region reserved for MMIO is too small,
no one will be able to fit a device in there. Plus, what is
being reported as a root bus resource need not have any
connection to the ranges actually usable for MMIO, at least
if I assume that the (Dell) system I'm looking at right now
isn't completely screwed:

pci_bus 0000:00: root bus resource [bus 00-ff]
pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
pci_bus 0000:00: root bus resource [mem 0x00000000-0x3fffffffff]

(i.e. it simply reports the full usable 38-bit address space:
0x3fffffffff + 1 = 2^38)

Looking at another (Intel) one, there is no mention of regions
above the 4G boundary at all:

pci_bus 0000:00: root bus resource [bus 00-3d]
pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7]
pci_bus 0000:00: root bus resource [io  0x0d00-0xffff]
pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff]
pci_bus 0000:00: root bus resource [mem 0x000c4000-0x000cbfff]
pci_bus 0000:00: root bus resource [mem 0xfed40000-0xfedfffff]
pci_bus 0000:00: root bus resource [mem 0xd0000000-0xf7ffffff]

Not sure how the OS would know it is safe to assign BARs above
4GB here.
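
To illustrate: an OS that places BARs strictly inside the advertised
windows has nothing above 4GB to pick from on that box, and even
below 4GB a large, naturally aligned BAR may not fit. A toy
placement check (just the containment/alignment logic, not the
Linux resource allocator), fed with the four memory windows quoted
above:

#include <stdint.h>
#include <stdio.h>

struct window { uint64_t start, end; };   /* inclusive, as in dmesg */

/* First-fit placement of a power-of-two sized, naturally aligned BAR
 * into one of the advertised windows; returns 0 if nothing fits. */
static uint64_t place_bar(const struct window *w, int n, uint64_t size)
{
    for (int i = 0; i < n; i++) {
        uint64_t base = (w[i].start + size - 1) & ~(size - 1);
        if (base + size - 1 <= w[i].end)
            return base;
    }
    return 0;
}

int main(void)
{
    const struct window win[] = {   /* the Intel box's memory windows */
        { 0x000a0000, 0x000bffff },
        { 0x000c4000, 0x000cbfff },
        { 0xfed40000, 0xfedfffff },
        { 0xd0000000, 0xf7ffffff },
    };
    uint64_t base = place_bar(win, 4, 0x20000000);   /* 512MB BAR */

    if (base)
        printf("BAR placed at 0x%llx\n", (unsigned long long)base);
    else
        printf("no advertised window can hold the BAR\n");
    return 0;
}

With those windows a 512MB BAR cannot be placed at all - the
0xd0000000-0xf7ffffff window is 640MB, but no 512MB-aligned block
fits inside it - and nothing above 4GB would ever be tried.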

In any event, what you need is an equivalent of the frequently
seen BIOS option controlling the size of the space to be reserved
for MMIO (often allowing it to be 1, 2, or 3 GB), i.e. an
alternative (or extension) to the dynamic lowering of
pci_mem_start in hvmloader.
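
Purely as a sketch of what I mean (the mmio_hole_size value and its
plumbing from the guest config into hvmloader are made up here, as
is the 3GB ceiling; only the hole boundaries match the dmesg output
quoted above), the two could be combined like this:

#include <stdint.h>
#include <stdio.h>

#define PCI_MEM_START 0xf0000000u   /* default hole base (cf. dmesg above) */
#define PCI_MEM_END   0xfc000000u   /* top of the below-4GB hole */
#define MAX_HOLE      0xc0000000u   /* arbitrary 3GB ceiling */

static uint32_t compute_pci_mem_start(uint64_t mmio_total,
                                      uint64_t mmio_hole_size)
{
    uint64_t hole = PCI_MEM_END - PCI_MEM_START;

    /* 1) An explicitly configured hole size (the "BIOS option"
     *    equivalent) overrides the default. */
    if (mmio_hole_size && mmio_hole_size <= MAX_HOLE)
        hole = mmio_hole_size;

    /* 2) On top of that, keep doubling the hole until all BARs fit;
     *    guest RAM overlapping the enlarged hole would have to be
     *    relocated above 4GB, which hvmloader can already arrange. */
    while (mmio_total > hole && hole * 2 <= MAX_HOLE)
        hole *= 2;

    return (uint32_t)(PCI_MEM_END - hole);
}

int main(void)
{
    uint64_t mmio_total = 0x30000000;   /* 768MB worth of BARs (example) */

    printf("dynamic hole:    pci_mem_start = 0x%x\n",
           (unsigned)compute_pci_mem_start(mmio_total, 0));
    printf("1GB hole forced: pci_mem_start = 0x%x\n",
           (unsigned)compute_pci_mem_start(mmio_total, 0x40000000));
    return 0;
}

That way a configured hole behaves just like the BIOS option, while
guests that don't set anything keep the current dynamic behaviour.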

Jan




 

