
Re: [Xen-devel] AMD_IOV: IO_PAGE_FALT trying to pass through Mellanox ConnectX HCA (debian testing)



Joerg,

Any idea what this error might signify?
> (XEN) AMD_IOV: IO_PAGE_FALT: domain:3, device id:0x200, fault address:0x7e7ca000
> (XEN) AMD_IOV: IO_PAGE_FALT: domain:3, device id:0x200, fault address:0x7e7ca040
> (XEN) AMD_IOV: IO_PAGE_FALT: domain:3, device id:0x200, fault address:0x7e7ca080
> (XEN) AMD_IOV: IO_PAGE_FALT: domain:3, device id:0x200, fault address:0x7e7ca0c0

We have been stabbing in the dark, enabling certain knobs, but I am
just curious: the fault address is the real physical address, right?
From the looks of it, it is in a normal RAM region, not in the PCI BAR space.
The AMD-Vi chipset doesn't really distinguish between those, or does it?

Ward, can you post your lspci -vvv -s 02:00.0 output? I am curious to see
what the PCI BAR space is.
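
In the meantime, here is a rough sketch of how the fault lines could be
checked by hand. It assumes that the "device id" in the Xen message is the
usual 16-bit PCI requester ID (bus in the upper byte, then 5 bits of device
and 3 bits of function). The BAR base and size below are placeholders, not
values from this machine; they would have to come from the lspci output.
The helper names are made up for this illustration.

```python
import re

# Matches one Xen AMD IOMMU fault record, tolerating the mail line wrap
# ("fault \n> address:...") seen in the quoted log.
FAULT_RE = re.compile(
    r"IO_PAGE_FALT:\s*domain:(\d+),\s*device id:(0x[0-9a-fA-F]+),"
    r"\s*fault\s*(?:>\s*)?address:(0x[0-9a-fA-F]+)"
)

SAMPLE_LOG = """\
(XEN) AMD_IOV: IO_PAGE_FALT: domain:3, device id:0x200, fault address:0x7e7ca000
(XEN) AMD_IOV: IO_PAGE_FALT: domain:3, device id:0x200, fault address:0x7e7ca040
(XEN) AMD_IOV: IO_PAGE_FALT: domain:3, device id:0x200, fault address:0x7e7ca080
(XEN) AMD_IOV: IO_PAGE_FALT: domain:3, device id:0x200, fault address:0x7e7ca0c0
"""


def parse_faults(log_text):
    """Return a list of (domain, device_id, fault_address) tuples."""
    return [
        (int(dom), int(dev, 16), int(addr, 16))
        for dom, dev, addr in FAULT_RE.findall(log_text)
    ]


def decode_bdf(device_id):
    """Split a 16-bit requester ID into (bus, device, function)."""
    return (device_id >> 8) & 0xFF, (device_id >> 3) & 0x1F, device_id & 0x7


def in_range(addr, base, size):
    """True if addr lies inside [base, base + size)."""
    return base <= addr < base + size


# Placeholder BAR window: substitute the real values from lspci.
HYPOTHETICAL_BAR_BASE = 0xFD000000
HYPOTHETICAL_BAR_SIZE = 0x00100000

faults = parse_faults(SAMPLE_LOG)
bus, dev, fn = decode_bdf(faults[0][1])
print(f"faulting device: {bus:02x}:{dev:02x}.{fn}")  # prints 02:00.0

# Distance between consecutive fault addresses.
strides = [b[2] - a[2] for a, b in zip(faults, faults[1:])]
print("strides:", [hex(s) for s in strides])

for _, _, addr in faults:
    where = "inside" if in_range(addr, HYPOTHETICAL_BAR_BASE,
                                 HYPOTHETICAL_BAR_SIZE) else "outside"
    print(f"{addr:#x} is {where} the placeholder BAR window")
```

Under that assumption, device id 0x200 decodes to 02:00.0, which matches the
device pciback hands over, and the four faults are 0x40 bytes apart.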


On Thu, Feb 03, 2011 at 06:24:33PM -0500, Ward Vandewege wrote:
> On Mon, Jan 31, 2011 at 03:03:22PM -0500, Konrad Rzeszutek Wilk wrote:
> > > > > > you might need to make sure your driver is using the VM_IO flag.
> > > > > > 
> > > > > > There was some discussion on LKML about this and they proposed
> > > > > > a patch that wasn't necessary. Don't remember the details but I can
> > > > > > look that up next week.
> > > > 
> > > > Found it.. it was from Vivien but in another thread:
> > > > http://www.mail-archive.com/linux-rdma@xxxxxxxxxxxxxxx/msg06980.html
> > > 
> > > Ah. Is your 
> > > 
> > >   devel/p2m-identity.v4.5
> > > 
> > > still the one I should test with to see if it fixes this problem? I see
> > > you've got newer versions (up to v4.7) now too. 
> > 
> > It has a bug that I am working on. I would just look for the VM_IO flag
> > and see if it has been applied somewhere. Or vice-versa - look for where
> > it has _not_ been applied.
> 
> There are no VM_IO references in the mlx4 driver (the one from OFED 1.5.2).
> Analogous with what Vivien did, I added 
> 
> --- a/drivers/infiniband/hw/mlx4/main.c
> +++ b/drivers/infiniband/hw/mlx4/main.c
> @@ -548,6 +548,8 @@
>     return -EINVAL;
> 
>   if (vma->vm_pgoff == 0) {
> +   vma->vm_flags |= VM_IO;
> +   vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
>     vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> 
>     if (io_remap_pfn_range(vma, vma->vm_start,
> @@ -555,6 +557,8 @@
>                PAGE_SIZE, vma->vm_page_prot))
>       return -EAGAIN;
>   } else if (vma->vm_pgoff == 1 && dev->dev->caps.bf_reg_size != 0) {
> +   vma->vm_flags |= VM_IO;
> +   vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
>     vma->vm_page_prot = pgprot_wc(vma->vm_page_prot);
> 
>     if (io_remap_pfn_range(vma, vma->vm_start,
> 
> But that didn't change a thing. The driver still complains when loaded:
> 
> [    1.984843] mlx4_core: Initializing 0000:00:00.0
> [    1.985007] mlx4_core 0000:00:00.0: enabling device (0000 -> 0002)
> [    1.985007] mlx4_core 0000:00:00.0: Xen PCI enabling IRQ: 19
> [    2.994953] mlx4_core 0000:00:00.0: Installed FW has unsupported command interface revision 0.
> [    2.994997] mlx4_core 0000:00:00.0: (Installed FW version is 0.0.000)
> [    2.995058] mlx4_core 0000:00:00.0: This driver version supports only revisions 2 to 3.
> [    2.995087] mlx4_core 0000:00:00.0: QUERY_FW command failed, aborting.
> 
> And it still generates this in Xen's dmesg on the dom0:
> 
> [ 2862.038307] pciback: vpci: 0000:02:00.0: assign to virtual slot 0
> [ 2862.041910] pciback 0000:02:00.0: device has been assigned to another domain! Over-writting the ownership, but beware.
> [ 2863.076729] blkback: ring-ref 9, event-channel 10, protocol 1 (x86_64-abi)
> [ 2863.097501] blkback: ring-ref 10, event-channel 11, protocol 1 (x86_64-abi)
> [ 2864.863782] pciback 0000:02:00.0: enabling device (0000 -> 0002)
> [ 2864.864217] xen_allocate_pirq: returning irq 19 for gsi 19
> [ 2864.864867] Already setup the GSI :19
> [ 2864.865232] pciback 0000:02:00.0: PCI INT A -> GSI 19 (level, low) -> IRQ 19

> (XEN) AMD_IOV: IO_PAGE_FALT: domain:3, device id:0x200, fault address:0x7e7ca000
> (XEN) AMD_IOV: IO_PAGE_FALT: domain:3, device id:0x200, fault address:0x7e7ca040
> (XEN) AMD_IOV: IO_PAGE_FALT: domain:3, device id:0x200, fault address:0x7e7ca080
> (XEN) AMD_IOV: IO_PAGE_FALT: domain:3, device id:0x200, fault address:0x7e7ca0c0
> 
> I guess there must be something else going on, and/or the above change is not
> the right one.
> 
> Thanks,
> Ward.
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel



 

