xen-devel

RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)

To: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>, "Zhang, Xiantao" <xiantao.zhang@xxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxxxx>
Subject: RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
From: "Cinco, Dante" <Dante.Cinco@xxxxxxx>
Date: Thu, 22 Oct 2009 10:33:32 -0600
Accept-language: en-US
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, "He, Qing" <qing.he@xxxxxxxxx>
Delivery-date: Thu, 22 Oct 2009 09:34:08 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <C705E782.B44A%keir.fraser@xxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcpS6aDaWFCn2jAWTyGt0OCtMnyH8QAAEEAgAASCyWcADevYcA==
Thread-topic: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
Xiantao,

I'm sorry, I forgot to mention that I did apply your two patches, but they
had no effect (interrupts are still lost after changing smp_affinity, and the
"No handler for irq vector" message still appears). I added a dprintk in
msi_set_mask_bit() and realized that my device's MSI capability has no mask
bit (per-vector masking is optional for MSI, whereas MSI-X always provides
it). My PCI device uses MSI, not MSI-X. I placed my dprintk inside the
condition below and it never triggered.

    switch (entry->msi_attrib.type) {
    case PCI_CAP_ID_MSI:
        if (entry->msi_attrib.maskbit) {
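
For reference, whether a device supports per-vector masking is advertised in its MSI Message Control register. The sketch below decodes that bit; the macro and function names are local to this example (they are not taken from the Xen source), and only the bit positions follow the PCI specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit layout of the MSI Message Control register (PCI specification).
 * The macro names are local to this sketch. */
#define MSI_FLAGS_ENABLE   0x0001u  /* bit 0: MSI enabled */
#define MSI_FLAGS_64BIT    0x0080u  /* bit 7: 64-bit message address capable */
#define MSI_FLAGS_MASKBIT  0x0100u  /* bit 8: per-vector masking capable */

/* Returns true when the function advertises per-vector masking. */
static bool msi_has_mask_bit(uint16_t msg_ctrl)
{
    return (msg_ctrl & MSI_FLAGS_MASKBIT) != 0;
}
```

A device whose Message Control word leaves bit 8 clear would never reach a branch guarded by a mask-bit check, which is consistent with the dprintk not firing.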

While debugging this problem, I thought about the potential problem of an
interrupt firing between the writes of the MSI message address and the MSI
message data. I noticed that pci_conf_write() uses spin_lock_irqsave() to
disable interrupts before issuing the "out" instruction, but the address and
data are written by two separate pci_conf_write() calls. To me, it would be
safer to write the address and data in a single call, preceded by a single
spin_lock_irqsave(). That way, by the time interrupts are re-enabled, the
address and data have both been updated.

Dante
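
The difference between the two write patterns discussed above can be sketched with a simulated configuration space. Everything here is a hypothetical stand-in: `fake_cfg`, `cfg_lock()`, `cfg_unlock()`, and the register offsets used with them are illustrative, not Xen code. The sketch shows only that the combined variant takes the lock once; whether that also stops the device from sending a message between the two writes is not established in this thread.

```c
#include <stdint.h>

/* Hypothetical stand-ins: a fake 256-byte config space and a counter that
 * records how many times the config lock was taken. Not Xen code. */
static uint32_t fake_cfg[64];
static unsigned int lock_acquisitions;

/* In a real hypervisor, spin_lock_irqsave() would be taken here. */
static void cfg_lock(void)
{
    lock_acquisitions++;
}

/* In a real hypervisor, spin_unlock_irqrestore() would be called here. */
static void cfg_unlock(void)
{
}

/* One locked write per call: the existing pattern described in the message. */
static void cfg_write32(unsigned int reg, uint32_t val)
{
    cfg_lock();
    fake_cfg[reg / 4] = val;
    cfg_unlock();
}

/* Proposed variant: address and data written inside one critical section. */
static void cfg_write_msi_msg(unsigned int addr_reg, uint32_t addr,
                              unsigned int data_reg, uint32_t data)
{
    cfg_lock();
    fake_cfg[addr_reg / 4] = addr;
    fake_cfg[data_reg / 4] = data;
    cfg_unlock();
}
```

Calling `cfg_write32()` twice takes the lock twice, leaving a window between the calls in which interrupts are enabled; `cfg_write_msi_msg()` takes it once and updates both registers before releasing it.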

-----Original Message-----
From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx] 
Sent: Thursday, October 22, 2009 2:42 AM
To: Zhang, Xiantao; Jan Beulich
Cc: He, Qing; xen-devel@xxxxxxxxxxxxxxxxxxx; Cinco, Dante
Subject: Re: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP 
ProLiant G6 with dual Xeon 5540 (Nehalem)

On 22/10/2009 09:41, "Zhang, Xiantao" <xiantao.zhang@xxxxxxxxx> wrote:

>> Hmm, then I don't understand which case your patch was a fix for: I 
>> understood that it addresses an issue when the affinity of an 
>> interrupt gets changed (requiring a re-write of the address/data 
>> pair). If the hypervisor can deal with it without masking, then why 
>> did you add it?
> 
> Hmm, sorry, it seems I misunderstood your question. If the MSI doesn't
> support a mask bit (clearing the MSI enable bit doesn't help in this case),
> the issue may still exist. I just checked the Linux side: it doesn't seem to
> perform a mask operation when programming MSI, but I don't know why Linux
> doesn't have such issues. Actually, we do see inconsistent interrupt
> messages from the device without this patch, and after applying the patch,
> the issue is gone. Why Linux doesn't need the mask operation may need
> further investigation.

Linux is quite careful about when it will reprogram vector/affinity info,
isn't it? Doesn't it mark such an update as pending and only flush it through
during the next interrupt delivery, or something like that? Do we need some of
the upstream Linux patches for this?

 -- Keir
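
The "mark pending, flush on next delivery" idea raised above could be sketched roughly as follows. This is a minimal illustration under assumed names: `struct irq_state`, `irq_request_affinity()`, and `irq_handle()` are hypothetical and do not correspond to actual Linux or Xen functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-IRQ state; field names are illustrative only. */
struct irq_state {
    uint32_t affinity;          /* CPU mask currently programmed */
    uint32_t pending_affinity;  /* CPU mask requested but not yet applied */
    bool     move_pending;      /* true while an update is waiting */
};

/* "Set affinity" path: record the request without touching the hardware. */
static void irq_request_affinity(struct irq_state *s, uint32_t mask)
{
    s->pending_affinity = mask;
    s->move_pending = true;
}

/* Interrupt path: the device has just signalled, so apply any deferred
 * update now, before continuing with normal handling. */
static void irq_handle(struct irq_state *s)
{
    if (s->move_pending) {
        /* Hardware reprogramming (address/data rewrite) would go here. */
        s->affinity = s->pending_affinity;
        s->move_pending = false;
    }
    /* Normal interrupt handling would follow. */
}
```

The point of the design is that the address/data pair is rewritten only from the interrupt path, at a moment when the device has just delivered a message, rather than at an arbitrary time chosen by whoever writes smp_affinity.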


