

[Xen-devel] Re: IOMMU faults

To: Tim Deegan <Tim.Deegan@xxxxxxxxxxxxx>
Subject: [Xen-devel] Re: IOMMU faults
From: Jean Guyader <jean.guyader@xxxxxxxxxxxxx>
Date: Thu, 16 Jun 2011 10:47:57 +0100
Cc: Wei Wang <wei.wang2@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Allen Kay <allen.m.kay@xxxxxxxxx>, Jean Guyader <Jean.Guyader@xxxxxxxxxx>
In-reply-to: <20110616092509.GH17634@xxxxxxxxxxxxxxxxxxxxxxx>
References: <20110616092509.GH17634@xxxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.21 (2010-09-15)
On 16/06 10:25, Tim Deegan wrote:
> Hi, IOMMU maintainers,
> What should Xen do when an IOMMU fault happens?  As far as I can
> see both the AMD and Intel code clears the error in the IOMMU and
> carries on, but I suspect some more vigorous action is appropriate.
> I've seen traces from an Intel machine that seemed to be livelocked on
> IOMMU faults from a passed-through VGA card, until it was killed by the
> watchdog.  I think I can see two things that contribute to that:
>  - The Intel IOMMU fault handler prints quite a lot of info in interrupt
>    context, making it easier to livelock.  Still I think the general
>    problem applies on AMD too.
>  - Domain destruction re-assigns passed though cards to dom0, but the
>    cards don't seem to get reset.  So there's nothing to stop a card
>    battering away at DMA in the meantime.  That seems like a problem
>    independent of livelock, actually.
> In any case, it seems like it would be a good idea to stop a
> broken/malicious/deassigned card from flooding Xen with IOMMU faults.
> I was considering just writing 0 to the faulting card's PCI command
> register, but I'm told that's not always enough to properly deactivate
> a card, and it might be a little over-zealous to do it on the first
> offence. 
> Ideas?

Hi Tim,

We have seen such behavior when testing GPU assignment, especially with
the Intel GPU. The problem is that domain destruction in Xen is asynchronous,
and right now the PCI device reset is done in dom0 with some help of the 

In the Intel GPU case we need to make sure that the guest memory and the IOMMU
are still in place while we perform the reset; otherwise the device drifts into
an unstable state.

There are probably cleaner ways to do that, but we decided to move
the PCI reset code into Xen, so we are sure we perform the reset while the 
is in a known (functioning) state.
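
The ordering argument can be illustrated with a toy model. This is only a
sketch under stated assumptions: struct mock_dev and every helper below are
hypothetical names invented for illustration. They are not Xen structures or
functions, and they are not taken from the attached patch. The model assumes
that a device still doing DMA after its IOMMU mappings are removed raises a
fault, and that a reset stops DMA.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of a passed-through device (not a Xen type). */
struct mock_dev {
    bool iommu_mapped;  /* guest IOMMU mappings still in place */
    bool dma_active;    /* device may still issue DMA */
    int  faults;        /* IOMMU faults observed so far */
};

/* One DMA attempt: it faults only if the device is live and unmapped. */
static void mock_dma(struct mock_dev *d)
{
    assert(d != 0);
    if (d->dma_active && !d->iommu_mapped)
        d->faults++;
}

/* Assumption: a reset quiesces the device, so it stops issuing DMA. */
static void mock_reset(struct mock_dev *d) { d->dma_active = false; }

/* Remove the guest's IOMMU mappings. */
static void mock_unmap(struct mock_dev *d) { d->iommu_mapped = false; }

/* Reset while the mappings are still present, then tear them down. */
static int teardown_reset_first(struct mock_dev *d)
{
    mock_reset(d);
    mock_dma(d);    /* device is quiet: no fault */
    mock_unmap(d);
    mock_dma(d);    /* still quiet: no fault */
    return d->faults;
}

/* Tear down the mappings first and reset the device later. */
static int teardown_unmap_first(struct mock_dev *d)
{
    mock_unmap(d);
    mock_dma(d);    /* device still live and unmapped: fault */
    mock_reset(d);
    mock_dma(d);    /* quiet after the reset: no further fault */
    return d->faults;
}
```

In this model, calling teardown_reset_first() on a freshly initialised device
(mapped, DMA active, zero faults) records no faults, while
teardown_unmap_first() records one for each DMA attempt made in the window
between the unmap and the reset. A real device would not be limited to a
single attempt in that window.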

Attached is the patch we have in XenClient that moves the PCI reset into Xen.
The modifications we have made to the VT-d code should go in the generic IOMMU
section. I apologise, but this patch is based on Xen 3.4; if we think this is
the right way to do it, I can submit a proper patch against unstable and 4.1.


Attachment: iommu-vtd-pci-reset-on-reassign
Description: Text document
