
Re: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)



On Wed, Oct 14, 2009 at 01:54:33PM -0600, Cinco, Dante wrote:
> I switched over to Xen 3.5-unstable (changeset 20303) and pv_ops dom0 
> 2.6.31.1 hoping that this would resolve the IRQ SMP affinity problem. I had 
> to use pci-stub to hide the PCI devices since pciback wasn't working. With 
> vcpus=16 (APIC routing is physical flat), the interrupts were working in domU 
> and being routed to CPU0 with the default smp_affinity (ffff), but as soon as 
> I changed it to any 16-bit one-hot value, or even rewrote the same default 
> value, all interrupts were lost (even on the devices whose smp_affinity I had 
> not changed). With vcpus=4 (APIC routing is 
> logical flat), I can see the interrupts being load balanced across all CPUs 
> but as soon as I changed smp_affinity to any value, the interrupts stopped. 
> This used to work reliably with the non-pv_ops kernel. I attached the logs in 
> case anyone wants to take a look.
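
For reference, the smp_affinity values mentioned above are hexadecimal CPU bitmasks, where bit N selects CPU N. A minimal sketch of that mapping follows; the helper names are hypothetical and it does not touch /proc, it only converts between masks and CPU numbers:

```python
def cpus_in_affinity_mask(mask_hex):
    """Return the CPU indices whose bits are set in an smp_affinity hex string."""
    # On large systems the kernel prints comma-separated 32-bit groups.
    mask = int(mask_hex.replace(",", ""), 16)
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


def one_hot_mask(cpu, width=4):
    """Return the zero-padded hex mask that selects exactly one CPU."""
    return format(1 << cpu, "0{}x".format(width))


if __name__ == "__main__":
    print(cpus_in_affinity_mask("ffff"))  # default mask: CPUs 0 through 15
    print(cpus_in_affinity_mask("0010"))  # one-hot mask: CPU 4 only
    print(one_hot_mask(4))                # 0010
```

So smp_affinity=0010 asks for CPU4 alone, which is why the domU MSI address below is expected to carry the APIC ID of CPU4.
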
> 
> I did see the MSI message address/data change in both domU and dom0 (using 
> "lspci -vv"):
> 
> vcpus=16:
> 
> domU MSI message address/data with default smp_affinity: Address: 
> 00000000fee00000  Data: 40a9
> domU MSI message address/data after smp_affinity=0010:   Address: 
> 00000000fee08000  Data: 40b1 (8 is APIC ID of CPU4)

What does Xen tell you (hit Ctrl-A three times and then 'z')? Specifically, look 
for vectors 169 (0xa9) and 177 (0xb1).
Do those values match what you see in DomU and Dom0? Mainly, check that vector 
177 has a dest_id of 8.
Also check the guest interrupt information to see whether those values 
match.
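
To cross-check the numbers by hand, the "lspci -vv" address/data pairs can be decoded into destination ID and vector. The sketch below assumes the standard x86 MSI layout (destination ID in address bits 19:12, destination mode in bit 2, vector in data bits 7:0, delivery mode in data bits 10:8); the function name is hypothetical and this is not code from Xen or the kernel:

```python
def decode_msi(address, data):
    """Split an x86 MSI address/data pair into its main fields."""
    return {
        "dest_id": (address >> 12) & 0xFF,        # target APIC ID
        "redirection_hint": (address >> 3) & 1,
        "dest_mode": "logical" if (address >> 2) & 1 else "physical",
        "vector": data & 0xFF,
        "delivery_mode": (data >> 8) & 0x7,       # 0 means fixed delivery
        "level_assert": (data >> 14) & 1,
        "trigger": "level" if (data >> 15) & 1 else "edge",
    }


if __name__ == "__main__":
    # Values quoted from the domU and dom0 "lspci -vv" output in this thread.
    samples = [
        ("domU default", 0xFEE00000, 0x40A9),
        ("domU 0010   ", 0xFEE08000, 0x40B1),
        ("dom0 default", 0xFEE00000, 0x4094),
        ("dom0 0010   ", 0xFEE00000, 0x409C),
    ]
    for label, address, data in samples:
        fields = decode_msi(address, data)
        print(label, "dest_id=%d vector=%d" % (fields["dest_id"], fields["vector"]))
```

With the values in this thread, the domU pair moves from dest_id 0, vector 169 to dest_id 8, vector 177, while the dom0 pair keeps dest_id 0 and only its vector changes, from 148 to 156.
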
> 
> dom0 MSI message address/data with default smp_affinity: Address: 
> 00000000fee00000  Data: 4094
> dom0 MSI message address/data after smp_affinity=0010:   Address: 
> 00000000fee00000  Data: 409c
> 
> Aside from "lspci -vv" what other means are there to track down this problem? 
> Is there some way to print the interrupt vector table? I'm considering adding 
> printk's to the code that Qing mentioned in his previous email (see below). 
> Any suggestions on where in the code to add the printk's?

Hit Ctrl-A three times and you can get a wealth of information. The IO APIC 
area might also be of interest: you can see whether the vector in question is 
masked.

> 
> Thanks.
> 
> Dante
> 
> -----Original Message-----
> From: Qing He [mailto:qing.he@xxxxxxxxx] 
> Sent: Sunday, October 11, 2009 10:55 PM
> To: Cinco, Dante
> Cc: Keir Fraser; xen-devel@xxxxxxxxxxxxxxxxxxx; xiantao.zhang@xxxxxxxxx
> Subject: Re: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on 
> HP ProLiant G6 with dual Xeon 5540 (Nehalem)
> 
> On Mon, 2009-10-12 at 13:25 +0800, Cinco, Dante wrote:
> > With vcpus < 4, logical flat mode works fine (no error message). I can 
> > change smp_affinity to any value > 0 and < 16 and the interrupts go to 
> > the proper CPU(s). Could you point me to the code that handles MSI so 
> > that I can better understand the MSI implementation?
> 
> There are two parts:
>   1) init or changing the data and address of MSI:
>     1) qemu-xen: hw/passthrough.c: pt_msg.*_write, MSI accesses are
>                  trapped here first. And then pt_update_msi in
>                  hw/pt-msi.c is called to update the MSI binding.
>     2) xen:      drivers/passthrough/io.c: pt_irq_create_bind_vtd,
>                  where MSI is actually bound to the guest.
> 
>   2) on MSI reception:
>     In drivers/passthrough/io.c, hvm_do_IRQ_dpci and hvm_dirq_assist
>     are the routines responsible for handling all assigned irqs
>     (including MSI), and if an MSI is received, vmsi_deliver in
>     arch/x86/vmsi.c gets called to deliver the MSI to the corresponding
>     vlapic.
> 
> And I just learned from Xiantao Zhang that the guest Linux kernel enables 
> per-cpu vectors when it is in physical mode, which looks more likely to be 
> relevant to this problem. Older Xen had a problem handling this, and 
> changeset 20253 is supposed to fix it, although I noticed your Xen version is 
> 20270.
> 
> Thanks,
> Qing


> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel




 

