
Re: [Xen-devel] Xen-unstable-staging: Xen BUG at iommu_map.c:455



On 10/04/15 11:24, Sander Eikelenboom wrote:
> Hi Andrew,
>
> Finally got some time to figure this out .. and i have narrowed it down to:
> git://xenbits.xen.org/staging/qemu-upstream-unstable.git
> commit 7665d6ba98e20fb05c420de947c1750fd47e5c07 "Xen: Use the ioreq-server 
> API when available"
> A straight revert of this commit prevents the issue from happening.
>
> The reason i had a hard time figuring this out was:
> - I wasn't aware of this earlier, since git pulling the main xen tree
>   doesn't auto update the qemu-* trees.

This has caught me out so many times.  It is very non-obvious behaviour.

> - So i happen to get this when i cloned a fresh tree to try to figure out the 
>   other issue i was seeing.
> - After that, checking out previous versions of the main xen tree didn't
>   resolve this new issue, because the qemu tree doesn't get auto updated
>   and is set to "master".
> - Cloning a xen-stable-4.5.0 made it go away .. because that has a specific 
>   git://xenbits.xen.org/staging/qemu-upstream-unstable.git tag which is not 
>   master.
>
> *sigh* 
>
> This is tested with xen main tree at last commit 
> 3a28f760508fb35c430edac17a9efde5aff6d1d5
> (normal xen-unstable, not the staging branch)
>
> Ok so i have added some extra debug info (see attached diff) and this is the 
> output when it crashes due to something the commit above triggered, the 
> level is out of bounds and the pfn looks fishy too.
> Complete serial log from both bad and good (specific commit reverted) are 
> attached.

Just to confirm, you are positively identifying a qemu changeset as
causing this crash?

If so, the qemu change has discovered a pre-existing issue in the
toolstack pci-passthrough interface.  Whatever qemu is or isn't doing,
it should not be able to cause a crash like this.

With this in mind, I need to brush up on my AMD-Vi details.

In the meantime, can you run with the following patch to identify what
is going on, domctl-wise?  I assume it is the assign_device call which is
failing, but it would be nice to observe the differences between the
working and failing cases, which might offer a hint.

diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 9f3413c..57eb311 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1532,6 +1532,11 @@ int iommu_do_pci_domctl(
         max_sdevs = domctl->u.get_device_group.max_sdevs;
         sdevs = domctl->u.get_device_group.sdev_array;
 
+        printk("*** %pv->d%d: get_device_group({%04x:%02x:%02x.%u, %u})\n",
+               current, d->domain_id,
+               seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
+               max_sdevs);
+
         ret = iommu_get_device_group(d, seg, bus, devfn, sdevs, max_sdevs);
         if ( ret < 0 )
         {
@@ -1558,6 +1563,10 @@ int iommu_do_pci_domctl(
         bus = (domctl->u.assign_device.machine_sbdf >> 8) & 0xff;
         devfn = domctl->u.assign_device.machine_sbdf & 0xff;
 
+        printk("*** %pv->d%d: test_assign_device({%04x:%02x:%02x.%u})\n",
+               current, d->domain_id,
+               seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
+
         if ( device_assigned(seg, bus, devfn) )
         {
             printk(XENLOG_G_INFO
@@ -1582,6 +1591,10 @@ int iommu_do_pci_domctl(
         bus = (domctl->u.assign_device.machine_sbdf >> 8) & 0xff;
         devfn = domctl->u.assign_device.machine_sbdf & 0xff;
 
+        printk("*** %pv->d%d: assign_device({%04x:%02x:%02x.%u})\n",
+               current, d->domain_id,
+               seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
+
         ret = device_assigned(seg, bus, devfn) ?:
               assign_device(d, seg, bus, devfn);
         if ( ret == -ERESTART )
@@ -1604,6 +1617,10 @@ int iommu_do_pci_domctl(
         bus = (domctl->u.assign_device.machine_sbdf >> 8) & 0xff;
         devfn = domctl->u.assign_device.machine_sbdf & 0xff;
 
+        printk("*** %pv->d%d: deassign_device({%04x:%02x:%02x.%u})\n",
+               current, d->domain_id,
+               seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
+
         spin_lock(&pcidevs_lock);
         ret = deassign_device(d, seg, bus, devfn);
         spin_unlock(&pcidevs_lock);
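
For reading the resulting log lines, here is a minimal standalone sketch of
how a machine_sbdf value is split into the seg:bus:slot.func form that the
printk()s above emit.  The bus and devfn extraction mirrors the context
lines in the patch; taking the segment from the upper 16 bits is an
assumption on my part, and struct sbdf_parts, decode_sbdf() and
format_sbdf() are hypothetical helper names, not Xen code.

```c
#include <stdint.h>
#include <stdio.h>

/* Same bit layout as the PCI_SLOT()/PCI_FUNC() macros used in the patch. */
#define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn) ((devfn) & 0x07)

/* Hypothetical helper type for illustration only; not part of Xen. */
struct sbdf_parts {
    uint16_t seg;
    uint8_t bus;
    uint8_t devfn;
};

/* Split a packed machine_sbdf into segment, bus and devfn. */
static struct sbdf_parts decode_sbdf(uint32_t machine_sbdf)
{
    struct sbdf_parts p;

    p.seg = (uint16_t)(machine_sbdf >> 16); /* assumption: upper 16 bits */
    p.bus = (machine_sbdf >> 8) & 0xff;     /* as in the patch context */
    p.devfn = machine_sbdf & 0xff;          /* as in the patch context */
    return p;
}

/* Render the value with the same format string the debug printk()s use. */
static int format_sbdf(char *buf, size_t len, uint32_t machine_sbdf)
{
    struct sbdf_parts p = decode_sbdf(machine_sbdf);

    return snprintf(buf, len, "%04x:%02x:%02x.%u",
                    (unsigned int)p.seg, (unsigned int)p.bus,
                    (unsigned int)PCI_SLOT(p.devfn),
                    (unsigned int)PCI_FUNC(p.devfn));
}
```

With this layout, a machine_sbdf of 0x00010329 would show up in the log as
0001:03:05.1 (segment 1, bus 3, slot 5, function 1).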




 

