
Re: [Xen-devel] MSI badness in xen-unstable



There are probably more problems. You could also try a xen-unstable from before
the commit that changed this code (msi.c).
Another thing that could make it easier to debug would be to put some printk's
around the WARN_ON's in msi.c at the line numbers that gave the warnings,
showing both sides of the comparison in each WARN_ON.
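
A minimal sketch of that idea, written as a user-space stand-in so it can be
compiled outside the hypervisor. The helper name warn_if_differs() and the
label passed to it are made up for illustration, and the WARN_ON() below is a
simplified replacement, not Xen's macro. The actual conditions at msi.c:636,
649 and 656 are not shown in this thread, so this only illustrates the
pattern of printing both operands before warning:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* User-space stand-in for the hypervisor's WARN_ON(): it only reports the
 * location, much like the "(XEN) Xen WARN at msi.c:NNN" lines do. */
#define WARN_ON(cond) \
    do { if (cond) printf("WARN at %s:%d\n", __FILE__, __LINE__); } while (0)

/* Hypothetical helper: print both operands of a comparison, then warn.
 * Returns 1 if the two values differ, 0 if they match. */
static int warn_if_differs(const char *what, uint64_t expected, uint64_t actual)
{
    assert(what != NULL);
    if (expected != actual)
        printf("%s: expected %#llx, got %#llx\n", what,
               (unsigned long long)expected, (unsigned long long)actual);
    WARN_ON(expected != actual);
    return expected != actual;
}
```

Inside Xen the printf() calls would become printk(), placed directly in front
of each WARN_ON that fires, so the log shows which value is off.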

--

Sander

Saturday, October 16, 2010, 7:14:11 PM, you wrote:

> On Sat, Oct 16, 2010 at 9:29 AM, Sander Eikelenboom
> <linux@xxxxxxxxxxxxxx> wrote:
>> Hi Bruce,
>>
>> I tripped over the same warning trying to solve my freezes.
>> Jan Beulich has posted a patch which is not in xen-unstable yet: [Xen-devel] 
>> [PATCH] x86/msi: fix inverted masks in c/s 22182:68cc3c514a0a
>>
>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxxxx>
>>
>> --- a/xen/arch/x86/msi.c
>> +++ b/xen/arch/x86/msi.c
>> @@ -549,14 +549,14 @@ static u64 read_pci_mem_bar(u8 bus, u8 s
>>         return 0;
>>     if ( (addr & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == PCI_BASE_ADDRESS_MEM_TYPE_64 )
>>     {
>> -        addr &= ~PCI_BASE_ADDRESS_MEM_MASK;
>> +        addr &= PCI_BASE_ADDRESS_MEM_MASK;
>>         if ( ++bir >= limit )
>>             return 0;
>>         return addr |
>>                ((u64)pci_conf_read32(bus, slot, func,
>>                                      PCI_BASE_ADDRESS_0 + bir * 4) << 32);
>>     }
>> -    return addr & ~PCI_BASE_ADDRESS_MEM_MASK;
>> +    return addr & PCI_BASE_ADDRESS_MEM_MASK;
>>  }
>>
>>  /**
>>
>>
>>
>> That fixes the warning, but my machine still keeps freezing nonetheless
>> (it also does so with pci=nomsi, so it's not MSI-specific in my case).
>>
>> --
>>
>> Sander
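
To illustrate why the inverted mask in the patch above matters, here is a
small stand-alone sketch. The three constants use the values I believe
pci_regs.h gives them (PCI_BASE_ADDRESS_MEM_MASK being ~0x0fUL); the function
bar_address() and its inverted_mask switch are hypothetical and only mimic the
shape of read_pci_mem_bar(), they are not the Xen code:

```c
#include <assert.h>
#include <stdint.h>

/* Values assumed to match pci_regs.h. */
#define PCI_BASE_ADDRESS_MEM_TYPE_MASK 0x06
#define PCI_BASE_ADDRESS_MEM_TYPE_64   0x04
#define PCI_BASE_ADDRESS_MEM_MASK      (~0x0fUL)

/* Hypothetical helper: build a memory BAR address from its low dword and,
 * for 64-bit BARs, the following high dword. With inverted_mask set it
 * reproduces the pre-patch behaviour (addr & ~PCI_BASE_ADDRESS_MEM_MASK). */
static uint64_t bar_address(uint32_t lo, uint32_t hi, int inverted_mask)
{
    uint32_t mask = (uint32_t)PCI_BASE_ADDRESS_MEM_MASK;   /* 0xfffffff0 */
    uint32_t low = lo & (inverted_mask ? ~mask : mask);

    if ((lo & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == PCI_BASE_ADDRESS_MEM_TYPE_64)
        return low | ((uint64_t)hi << 32);
    return low;
}
```

For a 64-bit BAR with low dword 0xfebf0004 and high dword 0x1, the corrected
mask gives 0x1febf0000, while the inverted mask keeps only the flag bits and
gives 0x100000004, i.e. the address bits of the low dword are thrown away.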

> Hi Sander,

> Thank you.  I tried it against 4.1.0-22240 with no effect.
> I confirmed I had the right patch:

> 0 %> hg diff xen/arch/x86/msi.c

> diff -r 38ad3633ecaf xen/arch/x86/msi.c
> --- a/xen/arch/x86/msi.c        Wed Oct 13 12:01:30 2010 +0100
> +++ b/xen/arch/x86/msi.c        Sat Oct 16 10:12:31 2010 -0700
> @@ -549,14 +549,14 @@
>          return 0;
>      if ( (addr & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == PCI_BASE_ADDRESS_MEM_TYPE_64 )
>      {
> -        addr &= ~PCI_BASE_ADDRESS_MEM_MASK;
> +        addr &= PCI_BASE_ADDRESS_MEM_MASK;
>          if ( ++bir >= limit )
>              return 0;
>          return addr |
>                 ((u64)pci_conf_read32(bus, slot, func,
>                                       PCI_BASE_ADDRESS_0 + bir * 4) << 32);
>      }
> -    return addr & ~PCI_BASE_ADDRESS_MEM_MASK;
> +    return addr & PCI_BASE_ADDRESS_MEM_MASK;
>  }

>  /**

> The boot time msi warn messages were unchanged.

> -Bruce

>>
>> Saturday, October 16, 2010, 6:14:17 PM, you wrote:
>>
>>> On Mon, Oct 11, 2010 at 2:05 PM, Bruce Edge <bruce.edge@xxxxxxxxx> wrote:
>>>> On Mon, Oct 11, 2010 at 10:12 AM, Gianni Tedesco
>>>> <gianni.tedesco@xxxxxxxxxx> wrote:
>>>>> On Fri, 2010-10-08 at 10:33 +0100, Gianni Tedesco wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I've been trying to boot stefano's minimal dom0 kernel from
>>>>>> git://xenbits.xen.org/people/sstabellini/linux-pvhvm.git
>>>>>> 2.6.36-rc1-initial-domain-v2+pat
>>>>>>
>>>>>> On xen-unstable, I get the following WARN_ON()'s from Xen when bringing
>>>>>> up the NIC's, then the machine hangs forever when trying to login either
>>>>>> over serial or NIC.
>>>>>>
>>>>>> (XEN) Xen WARN at msi.c:649
>>>>
>>>> I get the same Xen WARN messages using the current pvops/xen-next with
>>>> xen-unstable, here's the complete list for one boot, grep'd for WARN:
>>>>
>>>> (XEN) Xen WARN at msi.c:636
>>>> (XEN) Xen WARN at msi.c:649
>>>> (XEN) Xen WARN at msi.c:636
>>>> (XEN) Xen WARN at msi.c:649
>>>> (XEN) Xen WARN at msi.c:656
>>>> (XEN) Xen WARN at msi.c:636
>>>> (XEN) Xen WARN at msi.c:649
>>>> (XEN) Xen WARN at msi.c:636
>>>> (XEN) Xen WARN at msi.c:649
>>>> (XEN) Xen WARN at msi.c:656
>>>> (XEN) Xen WARN at msi.c:636
>>>> (XEN) Xen WARN at msi.c:649
>>>> (XEN) Xen WARN at msi.c:656
>>>> (XEN) Xen WARN at msi.c:636
>>>> (XEN) Xen WARN at msi.c:649
>>>> (XEN)    0000000080287db8 0(XEN) Xen WARN at msi.c:636
>>>> (XEN) Xen WARN at msi.c:649
>>>> (XEN) Xen WARN at msi.c:656
>>>>
>>>> The complete boot seq is attached.
>>>>
>>>> I do get a login at the end of the boot seq though.
>>>> My situation goes pear-shaped when I try to start a pv domU. The dom0
>>>> locks up after printing this on the console:
>>>>
>>>> (XEN) tmem: all pools frozen for all domains
>>>> (XEN) tmem: all pools thawed for all domains
>>>> (XEN) tmem: all pools frozen for all domains
>>>> (XEN) tmem: all pools thawed for all domains
>>>> mapping kernel into physical memory
>>>> about to get started...
>>>>
>>>> then prints these once a minute:
>>>> [  589.490894] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]
>>>>
>>>> The xen console is still active and I can generate a diag dump, also 
>>>> attached.
>>>>
>>>> This dom0 lockup behavior started with pv-ops 2.6.32.21, all the way
>>>> to .24, rendering the later pvops kernels unusable for dom0.
>>>> The 2.6.32.18 kernel is the last one that functioned as a dom0.
>>>>
>>>> This behavior is consistent across platforms: HP ProLiant 380DL G6 and
>>>> G7, as well as i7 Supermicros.
>>>>
>>>> -Bruce
>>>>
>>>>>
>>>>> Hmm so this appears not to be an issue with XCP kernel, in that case I
>>>>> get the warnings but everything still works fine.
>>>>>
>>>>> I will investigate further when I have some time.
>>>>>
>>>>> Gianni
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Xen-devel mailing list
>>>>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>>>>> http://lists.xensource.com/xen-devel
>>>>>
>>>>
>>
>>> The latest xen-unstable, 22240 has the same "  (XEN) Xen WARN at
>>> msi.c:636 " messages with associated stack traces.
>>
>>> I spent a little more time working with this version, and except for
>>> these disconcerting messages, which do look like they are initiated by
>>> the ethernet card discovery, the system appears functional.
>>> In all cases the first occurrence is immediately after the NIC discovery:
>>
>>>  e1000e: Intel(R) PRO/1000 Network Driver - 1.0.2-k2
>>> | e1000e: Copyright (c) 1999-2008 Intel Corporation.
>>> | xen: registering gsi 16 triggering 0 polarity 1
>>> | xen_allocate_pirq: returning irq 16 for gsi 16
>>>   xen: --> irq=16
>>>   Already setup the GSI :16
>>>   e1000e 0000:06:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
>>>   e1000e 0000:06:00.0: setting latency timer to 64
>>>     alloc irq_desc for 493 on node 0
>>>     alloc kstat_irqs on node 0
>>>   (XEN) Xen WARN at msi.c:636
>>>   (XEN) ----[ Xen-4.1-unstable  x86_64  debug=y  Not tainted ]----
>>> ....
>>
>>> In case it's a NIC specific issue, I'm seeing it with both
>>>     06:00.0 Ethernet controller: Intel Corporation 82574L Gigabit
>>> Network Connection
>>> and
>>>     02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II
>>> BCM5709 Gigabit Ethernet (rev 20)
>>> NICs
>>
>>> -Bruce
>>
>>
>>
>>
>>
>> --
>> Best regards,
>>  Sander                            mailto:linux@xxxxxxxxxxxxxx
>>
>>



-- 
Best regards,
 Sander                            mailto:linux@xxxxxxxxxxxxxx


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

