WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-devel

To: Andreas Kinzler <ml-xen-devel@xxxxxx>
Subject: Re: [Xen-devel] Instability with Xen, interrupt routing frozen, HPET broadcast
From: Andrew Lyon <andrew.lyon@xxxxxxxxx>
Date: Wed, 29 Sep 2010 20:34:28 +0100
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, JBeulich@xxxxxxxxxx, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Delivery-date: Wed, 29 Sep 2010 12:35:22 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4CA38093.9070802@xxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4C88A6F3.9020207@xxxxxx> <20100921115604.GP2804@xxxxxxxxxxx> <4CA38093.9070802@xxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
On Wed, Sep 29, 2010 at 7:08 PM, Andreas Kinzler <ml-xen-devel@xxxxxx> wrote:
> On 21.09.2010 13:56, Pasi Kärkkäinen wrote:
>>>
>>> I have been talking with Jan (via email) for a while now to track down
>>> the following problem, and he suggested that I report it on xen-devel:
>>>
>>> Jul  9 01:48:04 virt kernel: aacraid: Host adapter reset request. SCSI
>>> hang ?
>>> Jul  9 01:49:05 virt kernel: aacraid: SCSI bus appears hung
>>> Jul  9 01:49:10 virt kernel: Calling adapter init
>>> Jul  9 01:49:49 virt kernel: IRQ 16/aacraid: IRQF_DISABLED is not
>>> guaranteed on shared IRQs
>>> Jul  9 01:49:49 virt kernel: Acquiring adapter information
>>> Jul  9 01:49:49 virt kernel: update_interval=30:00 check_interval=86400s
>>> Jul  9 01:53:13 virt kernel: aacraid: aac_fib_send: first asynchronous
>>> command timed out.
>>> Jul  9 01:53:13 virt kernel: Usually a result of a PCI interrupt routing
>>> problem;
>>> Jul  9 01:53:13 virt kernel: update mother board BIOS or consider
>>> utilizing one of
>>> Jul  9 01:53:13 virt kernel: the SAFE mode kernel options (acpi, apic
>>> etc)
>>>
>>> After the VMs have been running a while the aacraid driver reports a
>>> non-responding RAID controller. Most of the time the NIC is also no
>>> longer working.
>>> I tried nearly every combination of dom0 kernel (pvops0, xenified suse
>>> 2.6.31.x, xenified suse 2.6.32.x, xenified suse 2.6.34.x) with Xen
>>> hypervisor 3.4.2, 3.4.4-cs19986, 4.0.1, unstable.
>>> No success in two months. Every combination sooner or later showed the
>>> problem above. I did extensive tests to make sure that the
>>> hardware is OK. And it is - I am sure it is a Xen/dom0 problem.
>>>
>>> Jan suggested to try the fix in c/s 22051 but it did not help. My answer
>>> to him:
>>>
>>>> In the meantime I did try xen-unstable c/s 22068 (contains staging c/s
>>>> 22051) and it did not fix the problem at all. I was able to fix a problem
>>>> with the serial console and so I got some debug info that is attached to
>>>> this email. The following line looks suspicious to me (irr=1,
>>>> delivery_status=1):
>>>>
>>>> (XEN)     IRQ 16 Vec216:
>>>> (XEN)       Apic 0x00, Pin 16: vector=216, delivery_mode=1, dest_mode=logical,
>>>>             delivery_status=1, polarity=1, irr=1, trigger=level, mask=0, dest_id:1
>>>>
>>>> IRQ 16 is the aacraid controller which after some while seems to be
>>>> unable to receive interrupts. Can you see from the debug info what is
>>>> going on?
>>>
>>> I also applied a small patch which disables HPET broadcast. The machine
>>> has now been running for 110 hours without a crash, while normally it
>>> crashes within a few minutes. Is there something wrong (race, deadlock)
>>> with HPET broadcasts in relation to blocked interrupt reception (see
>>> above)?
>>
>> What kind of hardware does this happen on?
>
> It is a Supermicro X8SIL-F, Intel Xeon 3450 system.
>
>> Should this patch be merged?
>
> Not easy to answer. I spent more than 10 weeks searching nearly full time
> for the cause of the stability issues. Finally I was able to track it down
> to the HPET broadcast code.
>
> We need to find the developer of the HPET broadcast code. Then, he should
> try to fix the code. I consider it quite a severe bug as it renders Xen
> nearly useless on affected systems. That is why I (and my boss who pays me)
> spent so much time (developing/fixing Xen is not really my core job) and
> money (buying an E5620 machine just for testing Xen).
>
> I think many people on affected systems are having problems. See
> http://lists.xensource.com/archives/html/xen-users/2010-09/msg00370.html
>
> Regards Andreas
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
>
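
For readers following the quoted IO-APIC dump, here is a minimal Python
sketch that pulls the key=value fields out of one such "(XEN) Apic ... Pin ..."
entry and flags the combination Andreas calls suspicious (irr=1 together with
delivery_status=1). The function names are hypothetical, the sample text is
the entry quoted above with its wrapped lines rejoined, and whether that
combination really identifies a stuck interrupt is not established in this
thread.

```python
import re

# Matches "name=value" and "name:value" pairs such as "irr=1" or "dest_id:1".
# Keys are restricted to lowercase letters and underscores so that
# "Pin 16:" is not mistaken for a field.
FIELD_RE = re.compile(r"([a-z_]+)[=:](\w+)")


def parse_ioapic_entry(text):
    """Return the fields of one IO-APIC pin entry as a dict of strings."""
    return dict(FIELD_RE.findall(text))


def looks_suspicious(entry):
    """Heuristic taken from the quoted mail: irr=1 and delivery_status=1.

    This is only the pattern pointed out in the thread, not a confirmed
    diagnosis of a lost or blocked interrupt.
    """
    return entry.get("irr") == "1" and entry.get("delivery_status") == "1"


# The entry quoted above, with the mail client's line wrapping removed.
sample = (
    "(XEN)       Apic 0x00, Pin 16: vector=216, delivery_mode=1, "
    "dest_mode=logical, delivery_status=1, polarity=1, irr=1, "
    "trigger=level, mask=0, dest_id:1"
)

entry = parse_ioapic_entry(sample)
print(entry["vector"], entry["trigger"], looks_suspicious(entry))
```

Running the same check over every pin in a saved serial console capture would
show whether IRQ 16 is the only line in that state.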

I will test that patch on my Supermicro X7DWA-N based dual Xeon
workstation. I always use a Xenified kernel rather than pv_ops, as it
supports some features that I need and is compatible with the nvidia
binary drivers, but I've always had very occasional hard/soft
lockups :(.

I've ruled out the nvidia drivers: before going on holiday a few weeks
ago I upgraded Xen to 4.0.1 (from 3.4.2) and the kernel to 2.6.34
patched with the latest suse xen patches, and I did not compile the
nvidia module or run X with any other driver. The system still locked
up after 11 days of runtime under moderate load. Unfortunately my
serial to tcp/ip device was not working, so I could not check the
serial console remotely and had to reboot the system.

This problem has happened with kernels 2.6.29, 2.6.30, 2.6.31, 2.6.32
and 2.6.34 on Xen 3.4.1, 3.4.2 and 4.0.1. I've also tried using the
full suse patch set rather than the minimal set of Xen patches that I
usually use, with no change, so I think this is a Xen problem.

Usually a soft lockup is reported by the Linux kernel, but it is
impossible to diagnose further as no I/O is possible, so commands like
xm do not work. More rarely the system locks hard, with no response at
all on the serial console and no errors logged.

In perhaps 1 in 20 cases the lockup is temporary and the system
returns to normal performance, but usually it is terminal.

The machine is my main workstation and the problem is rare enough that
I've tolerated it. We recently got another dual Xeon workstation with a
Supermicro X8DAL-i board, so it will be interesting to see if that has
the same issue.

Some example soft lockup errors. The system recovered from this one:

BUG: soft lockup - CPU#3 stuck for 2796s! [swapper:0]
Modules linked in: fuse cifs nvidia(P) ipv6 coretemp w83627hf w83793
hwmon_vid sco rfcomm bnep l2cap crc16 xen_scsibk st ftdi_sio usbserial
snd_usb_audio snd_hwdep snd_usb_lib snd_rawmidi snd_hda_codec_realtek
snd_hda_intel snd_hda_codec snd_pcm snd_timer snd sym53c8xx iTCO_wdt
i2c_i801 iTCO_vendor_support igb i5k_amb snd_page_alloc btusb
bluetooth [last unloaded: nvidia]
CPU 3
Modules linked in: fuse cifs nvidia(P) ipv6 coretemp w83627hf w83793
hwmon_vid sco rfcomm bnep l2cap crc16 xen_scsibk st ftdi_sio usbserial
snd_usb_audio snd_hwdep snd_usb_lib snd_rawmidi snd_hda_codec_realtek
snd_hda_intel snd_hda_codec snd_pcm snd_timer snd sym53c8xx iTCO_wdt
i2c_i801 iTCO_vendor_support igb i5k_amb snd_page_alloc btusb
bluetooth [last unloaded: nvidia]

Pid: 0, comm: swapper Tainted: P           2.6.34-xen-r4 #1 X7DWA/X7DWA
RIP: e030:[<ffffffff802013aa>]  [<ffffffff802013aa>] 0xffffffff802013aa
RSP: e02b:ffff8803ec4cdf10  EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffffffff8088a158 RCX: ffffffff802013aa
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
RBP: ffff8803ec4cdfd8 R08: 0000000000000000 R09: ffffffff8088a158
R10: ffff880071aeecc0 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS:  00007f1a34310710(0000) GS:ffff880001049000(0000) knlGS:0000000000000000
CS:  e033 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 00007f5b6b5fa000 CR3: 00000000363be000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffff8803ec4cc000, task ffff8803ec4be000)
Stack:
 000000000000a280 0000000000000000 ffffffff802062e1 ffffffff80209761
<0> ffffffff8088a158 ffffffff8020361f ffffffff804b2e73 0000000000000000
<0> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
 [<ffffffff802062e1>] ? xen_safe_halt+0xc/0xd
 [<ffffffff80209761>] ? xen_idle+0x4f/0x85
 [<ffffffff8020361f>] ? cpu_idle+0x4b/0x80
 [<ffffffff804b2e73>] ? force_evtchn_callback+0x9/0xa
Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc
cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41>
5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
Call Trace:
 [<ffffffff802062e1>] ? xen_safe_halt+0xc/0xd
 [<ffffffff80209761>] ? xen_idle+0x4f/0x85
 [<ffffffff8020361f>] ? cpu_idle+0x4b/0x80
 [<ffffffff804b2e73>] ? force_evtchn_callback+0x9/0xa

Some older examples which were terminal:

Sep 25 05:10:12 ubermicro kernel: BUG: soft lockup - CPU#6 stuck for
61s! [xenvnc.sh:12180]
Sep 25 05:10:12 ubermicro kernel: Modules linked in: cifs nvidia(P)
ipv6 coretemp w83627hf w83793 hwmon_vid sco bnep rfcomm l2cap crc16
xen_scsibk st ftdi_sio usbserial snd_usb_audio snd_hwdep snd_usb_lib
snd_rawmidi btusb bluetooth snd_hda_codec_realtek snd_hda_intel
snd_hda_codec snd_pcm snd_timer iTCO_wdt snd iTCO_vendor_support
i2c_i801 snd_page_alloc igb sym53c8xx i5k_amb [last unloaded:
microcode]
Sep 25 05:10:12 ubermicro kernel: CPU 6
Sep 25 05:10:12 ubermicro kernel: Modules linked in: cifs nvidia(P)
ipv6 coretemp w83627hf w83793 hwmon_vid sco bnep rfcomm l2cap crc16
xen_scsibk st ftdi_sio usbserial snd_usb_audio snd_hwdep snd_usb_lib
snd_rawmidi btusb bluetooth snd_hda_codec_realtek snd_hda_intel
snd_hda_codec snd_pcm snd_timer iTCO_wdt snd iTCO_vendor_support
i2c_i801 snd_page_alloc igb sym53c8xx i5k_amb [last unloaded:
microcode]
Sep 25 05:10:12 ubermicro kernel:
Sep 25 05:10:12 ubermicro kernel: Pid: 12180, comm: xenvnc.sh Tainted:
P           2.6.34-xen-r4 #1 X7DWA/X7DWA
Sep 25 05:10:12 ubermicro kernel: RIP: e030:[<ffffffff8025328d>]
[<ffffffff8025328d>] smp_call_function_many+0x187/0x19c
Sep 25 05:10:12 ubermicro kernel: RSP: e02b:ffff88004e0c7dd8  EFLAGS: 00000202
Sep 25 05:10:12 ubermicro kernel: RAX: ffff880001086ac0 RBX:
ffff880001089b30 RCX: 00007f3124b39000
Sep 25 05:10:12 ubermicro kernel: RDX: ffff88000107f000 RSI:
0000000000000020 RDI: 0000000000000020
Sep 25 05:10:12 ubermicro kernel: RBP: ffff880001089b00 R08:
0000000000000000 R09: ffff880001089b30
Sep 25 05:10:12 ubermicro kernel: R10: 0000000000007ff0 R11:
ffff8803a73c21c0 R12: ffff8803a73c21c0
Sep 25 05:10:12 ubermicro kernel: R13: ffffffff80216f7f R14:
0000000000000006 R15: ffffffff8088a158
Sep 25 05:10:12 ubermicro kernel: FS:  00007f3125469700(0000)
GS:ffff88000107f000(0000) knlGS:0000000000000000
Sep 25 05:10:12 ubermicro kernel: CS:  e033 DS: 0000 ES: 0000 CR0:
0000000080050033
Sep 25 05:10:12 ubermicro kernel: CR2: 00007f3124b39a90 CR3:
00000000008fd000 CR4: 0000000000002660
Sep 25 05:10:12 ubermicro kernel: DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
Sep 25 05:10:12 ubermicro kernel: DR3: 0000000000000000 DR6:
00000000ffff0ff0 DR7: 0000000000000400
Sep 25 05:10:12 ubermicro kernel: Process xenvnc.sh (pid: 12180,
threadinfo ffff88004e0c6000, task ffff88004cdc3980)
Sep 25 05:10:12 ubermicro kernel: Stack:
Sep 25 05:10:12 ubermicro kernel: 0000000000000000 0100000000000010
ffff880005ab4588 ffff8803a73c21c0
Sep 25 05:10:12 ubermicro kernel: <0> ffff88004cdc3980
ffff88004cdc3e4c ffff8803a73c2220 0000000000000232
Sep 25 05:10:12 ubermicro kernel: <0> 0000000000000001
ffffffff80216f40 00007f3124b39a90 ffff8803a73c21c0
Sep 25 05:10:12 ubermicro kernel: Call Trace:
Sep 25 05:10:12 ubermicro kernel: [<ffffffff80216f40>] ?
arch_exit_mmap+0x44/0x83
Sep 25 05:10:12 ubermicro kernel: [<ffffffff80298322>] ? exit_mmap+0x49/0x16c
Sep 25 05:10:12 ubermicro kernel: [<ffffffff8022a2a9>] ? mmput+0x28/0xe5
Sep 25 05:10:12 ubermicro kernel: [<ffffffff8022e03d>] ? exit_mm+0x108/0x113
Sep 25 05:10:12 ubermicro kernel: [<ffffffff802484d5>] ?
hrtimer_try_to_cancel+0x92/0x9d
Sep 25 05:10:12 ubermicro kernel: [<ffffffff8022fc58>] ? do_exit+0x1f2/0x6e0
Sep 25 05:10:12 ubermicro kernel: [<ffffffff8022f8dd>] ? sys_wait4+0xa5/0xb5
Sep 25 05:10:12 ubermicro kernel: [<ffffffff802301f4>] ? do_group_exit+0xae/0xd8
Sep 25 05:10:12 ubermicro kernel: [<ffffffff80230230>] ?
sys_exit_group+0x12/0x17
Sep 25 05:10:12 ubermicro kernel: [<ffffffff80204248>] ?
system_call_fastpath+0x16/0x1b
Sep 25 05:10:12 ubermicro kernel: [<ffffffff802041e0>] ? system_call+0x0/0x52
Sep 25 05:10:12 ubermicro kernel: Code: 7e 80 48 89 2d 55 8f 59 00 48
89 c6 48 89 6a 08 e8 82 0c 3c 00 0f ae f0 48 89 df e8 91 c1 fb ff 80
7c 24 0f 00 75 04 eb 08 f3 90 <f6> 45 20 01 75 f8 48 83 c4 18 5b 5d 41
5c 41 5d 41 5e 41 5f c3
Sep 25 05:10:12 ubermicro kernel: Call Trace:
Sep 25 05:10:12 ubermicro kernel: [<ffffffff80216f40>] ?
arch_exit_mmap+0x44/0x83
Sep 25 05:10:12 ubermicro kernel: [<ffffffff80298322>] ? exit_mmap+0x49/0x16c
Sep 25 05:10:12 ubermicro kernel: [<ffffffff8022a2a9>] ? mmput+0x28/0xe5
Sep 25 05:10:12 ubermicro kernel: [<ffffffff8022e03d>] ? exit_mm+0x108/0x113
Sep 25 05:10:12 ubermicro kernel: [<ffffffff802484d5>] ?
hrtimer_try_to_cancel+0x92/0x9d
Sep 25 05:10:12 ubermicro kernel: [<ffffffff8022fc58>] ? do_exit+0x1f2/0x6e0
Sep 25 05:10:12 ubermicro kernel: [<ffffffff8022f8dd>] ? sys_wait4+0xa5/0xb5
Sep 25 05:10:12 ubermicro kernel: [<ffffffff802301f4>] ? do_group_exit+0xae/0xd8
Sep 25 05:10:12 ubermicro kernel: [<ffffffff80230230>] ?
sys_exit_group+0x12/0x17
Sep 25 05:10:12 ubermicro kernel: [<ffffffff80204248>] ?
system_call_fastpath+0x16/0x1b
Sep 25 05:10:12 ubermicro kernel: [<ffffffff802041e0>] ? system_call+0x0/0x52


Sep 29 02:54:47 ubermicro kernel: BUG: soft lockup - CPU#3 stuck for
2796s! [swapper:0]
Sep 29 02:54:47 ubermicro kernel: Modules linked in: fuse cifs
nvidia(P) ipv6 coretemp w83627hf w83793 hwmon_vid sco rfcomm bnep
l2cap crc16 xen_scsibk st ftdi_sio usbserial snd_usb_audio snd_hwdep
snd_usb_lib snd_rawmidi snd_hda_codec_realtek snd_hda_intel
snd_hda_codec snd_pcm snd_timer snd sym53c8xx iTCO_wdt i2c_i801
iTCO_vendor_support igb i5k_amb snd_page_alloc btusb bluetooth [last
unloaded: nvidia]
Sep 29 02:54:47 ubermicro kernel: CPU 3
Sep 29 02:54:47 ubermicro kernel: Modules linked in: fuse cifs
nvidia(P) ipv6 coretemp w83627hf w83793 hwmon_vid sco rfcomm bnep
l2cap crc16 xen_scsibk st ftdi_sio usbserial snd_usb_audio snd_hwdep
snd_usb_lib snd_rawmidi snd_hda_codec_realtek snd_hda_intel
snd_hda_codec snd_pcm snd_timer snd sym53c8xx iTCO_wdt i2c_i801
iTCO_vendor_support igb i5k_amb snd_page_alloc btusb bluetooth [last
unloaded: nvidia]
Sep 29 02:54:47 ubermicro kernel:
Sep 29 02:54:47 ubermicro kernel: Pid: 0, comm: swapper Tainted: P
      2.6.34-xen-r4 #1 X7DWA/X7DWA
Sep 29 02:54:47 ubermicro kernel: RIP: e030:[<ffffffff802013aa>]
[<ffffffff802013aa>] 0xffffffff802013aa
Sep 29 02:54:47 ubermicro kernel: RSP: e02b:ffff8803ec4cdf10  EFLAGS: 00000246
Sep 29 02:54:47 ubermicro kernel: RAX: 0000000000000000 RBX:
ffffffff8088a158 RCX: ffffffff802013aa
Sep 29 02:54:47 ubermicro kernel: RDX: 0000000000000000 RSI:
0000000000000000 RDI: 0000000000000001
Sep 29 02:54:47 ubermicro kernel: RBP: ffff8803ec4cdfd8 R08:
0000000000000000 R09: ffffffff8088a158
Sep 29 02:54:47 ubermicro kernel: R10: ffff880071aeecc0 R11:
0000000000000246 R12: 0000000000000000
Sep 29 02:54:47 ubermicro kernel: R13: 0000000000000000 R14:
0000000000000000 R15: 0000000000000000
Sep 29 02:54:47 ubermicro kernel: FS:  00007f1a34310710(0000)
GS:ffff880001049000(0000) knlGS:0000000000000000
Sep 29 02:54:47 ubermicro kernel: CS:  e033 DS: 002b ES: 002b CR0:
000000008005003b
Sep 29 02:54:47 ubermicro kernel: CR2: 00007f5b6b5fa000 CR3:
00000000363be000 CR4: 0000000000002660
Sep 29 02:54:47 ubermicro kernel: DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
Sep 29 02:54:47 ubermicro kernel: DR3: 0000000000000000 DR6:
00000000ffff0ff0 DR7: 0000000000000400
Sep 29 02:54:47 ubermicro kernel: Process swapper (pid: 0, threadinfo
ffff8803ec4cc000, task ffff8803ec4be000)
Sep 29 02:54:47 ubermicro kernel: Stack:
Sep 29 02:54:47 ubermicro kernel: 000000000000a280 0000000000000000
ffffffff802062e1 ffffffff80209761
Sep 29 02:54:47 ubermicro kernel: <0> ffffffff8088a158
ffffffff8020361f ffffffff804b2e73 0000000000000000
Sep 29 02:54:47 ubermicro kernel: <0> 0000000000000000
0000000000000000 0000000000000000 0000000000000000
Sep 29 02:54:47 ubermicro kernel: Call Trace:
Sep 29 02:54:47 ubermicro kernel: [<ffffffff802062e1>] ? xen_safe_halt+0xc/0xd
Sep 29 02:54:47 ubermicro kernel: [<ffffffff80209761>] ? xen_idle+0x4f/0x85
Sep 29 02:54:47 ubermicro kernel: [<ffffffff8020361f>] ? cpu_idle+0x4b/0x80
Sep 29 02:54:47 ubermicro kernel: [<ffffffff804b2e73>] ?
force_evtchn_callback+0x9/0xa
Sep 29 02:54:47 ubermicro kernel: Code: cc 51 41 53 b8 1c 00 00 00 0f
05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc
cc cc cc cc cc cc cc cc
Sep 29 02:54:47 ubermicro kernel: Call Trace:
Sep 29 02:54:47 ubermicro kernel: [<ffffffff802062e1>] ? xen_safe_halt+0xc/0xd
Sep 29 02:54:47 ubermicro kernel: [<ffffffff80209761>] ? xen_idle+0x4f/0x85
Sep 29 02:54:47 ubermicro kernel: [<ffffffff8020361f>] ? cpu_idle+0x4b/0x80
Sep 29 02:54:47 ubermicro kernel: [<ffffffff804b2e73>] ?
force_evtchn_callback+0x9/0xa


Sep  8 18:16:30 ubermicro kernel: BUG: soft lockup - CPU#2 stuck for
61s! [xenvnc.sh:29385]
Sep  8 18:16:30 ubermicro kernel: Modules linked in: fuse cifs
nvidia(P) ipv6 coretemp w83627hf w83793 hwmon_vid sco bnep rfcomm
l2cap crc16 xen_scsibk st ftdi_sio usbserial snd_usb_audio snd_hwdep
snd_usb_lib snd_rawmidi btusb bluetooth snd_hda_codec_realtek
snd_hda_intel snd_hda_codec snd_pcm snd_timer snd i2c_i801 iTCO_wdt
iTCO_vendor_support snd_page_alloc igb i5k_amb sym53c8xx [last
unloaded: microcode]
Sep  8 18:16:30 ubermicro kernel: CPU 2
Sep  8 18:16:30 ubermicro kernel: Modules linked in: fuse cifs
nvidia(P) ipv6 coretemp w83627hf w83793 hwmon_vid sco bnep rfcomm
l2cap crc16 xen_scsibk st ftdi_sio usbserial snd_usb_audio snd_hwdep
snd_usb_lib snd_rawmidi btusb bluetooth snd_hda_codec_realtek
snd_hda_intel snd_hda_codec snd_pcm snd_timer snd i2c_i801 iTCO_wdt
iTCO_vendor_support snd_page_alloc igb i5k_amb sym53c8xx [last
unloaded: microcode]
Sep  8 18:16:30 ubermicro kernel:
Sep  8 18:16:30 ubermicro kernel: Pid: 29385, comm: xenvnc.sh Tainted:
P           2.6.34-xen-r3 #1 X7DWA/X7DWA
Sep  8 18:16:30 ubermicro kernel: RIP: e030:[<ffffffff802531ef>]
[<ffffffff802531ef>] smp_call_function_many+0x185/0x19c
Sep  8 18:16:30 ubermicro kernel: RSP: e02b:ffff88009b18fdd8  EFLAGS: 00000202
Sep  8 18:16:30 ubermicro kernel: RAX: ffff88000103eac0 RBX:
ffff880001041b30 RCX: 00007f89e1853000
Sep  8 18:16:30 ubermicro kernel: RDX: ffff880001037000 RSI:
0000000000000020 RDI: 0000000000000020
Sep  8 18:16:30 ubermicro kernel: RBP: ffff880001041b00 R08:
0000000000000000 R09: ffff880001041b30
Sep  8 18:16:30 ubermicro kernel: R10: 0000000000007ff0 R11:
ffff8803d7d12800 R12: ffff8803d7d12800
Sep  8 18:16:30 ubermicro kernel: R13: ffffffff80216f7f R14:
0000000000000002 R15: ffffffff8088a158
Sep  8 18:16:30 ubermicro kernel: FS:  00007f89e2183700(0000)
GS:ffff880001037000(0000) knlGS:0000000000000000
Sep  8 18:16:30 ubermicro kernel: CS:  e033 DS: 0000 ES: 0000 CR0:
0000000080050033
Sep  8 18:16:30 ubermicro kernel: CR2: 00007f89e1853a90 CR3:
00000000008fd000 CR4: 0000000000002660
Sep  8 18:16:30 ubermicro kernel: DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
Sep  8 18:16:30 ubermicro kernel: DR3: 0000000000000000 DR6:
00000000ffff0ff0 DR7: 0000000000000400
Sep  8 18:16:30 ubermicro kernel: Process xenvnc.sh (pid: 29385,
threadinfo ffff88009b18e000, task ffff8801004ad320)
Sep  8 18:16:30 ubermicro kernel: Stack:
Sep  8 18:16:30 ubermicro kernel: 0000000000000000 0100000000000010
ffff880006b3c398 ffff8803d7d12800
Sep  8 18:16:30 ubermicro kernel: <0> ffff8801004ad320
ffff8801004ad7ec ffff8803d7d12860 0000000000000403
Sep  8 18:16:30 ubermicro kernel: <0> 0000000000000001
ffffffff80216f40 00007f89e1853a90 ffff8803d7d12800
Sep  8 18:16:30 ubermicro kernel: Call Trace:
Sep  8 18:16:30 ubermicro kernel: [<ffffffff80216f40>] ?
arch_exit_mmap+0x44/0x83
Sep  8 18:16:30 ubermicro kernel: [<ffffffff80298202>] ? exit_mmap+0x49/0x16c
Sep  8 18:16:30 ubermicro kernel: [<ffffffff8022a2a9>] ? mmput+0x28/0xe5
Sep  8 18:16:30 ubermicro kernel: [<ffffffff8022e025>] ? exit_mm+0x108/0x113
Sep  8 18:16:30 ubermicro kernel: [<ffffffff80248439>] ?
hrtimer_try_to_cancel+0x92/0x9d
Sep  8 18:16:30 ubermicro kernel: [<ffffffff8022fc40>] ? do_exit+0x1f2/0x6e0
Sep  8 18:16:30 ubermicro kernel: [<ffffffff8022f8c5>] ? sys_wait4+0xa5/0xb5
Sep  8 18:16:30 ubermicro kernel: [<ffffffff802301dc>] ? do_group_exit+0xae/0xd8
Sep  8 18:16:30 ubermicro kernel: [<ffffffff80230218>] ?
sys_exit_group+0x12/0x17
Sep  8 18:16:30 ubermicro kernel: [<ffffffff80204248>] ?
system_call_fastpath+0x16/0x1b
Sep  8 18:16:30 ubermicro kernel: [<ffffffff802041e0>] ? system_call+0x0/0x52
Sep  8 18:16:30 ubermicro kernel: Code: d0 c1 7e 80 48 89 2d f1 8f 59
00 48 89 c6 48 89 6a 08 e8 0e 08 3c 00 0f ae f0 48 89 df e8 2d c2 fb
ff 80 7c 24 0f 00 75 04 eb 08 <f3> 90 f6 45 20 01 75 f8 48 83 c4 18 5b
5d 41 5c 41 5d 41 5e 41
Sep  8 18:16:30 ubermicro kernel: Call Trace:
Sep  8 18:16:30 ubermicro kernel: [<ffffffff80216f40>] ?
arch_exit_mmap+0x44/0x83
Sep  8 18:16:30 ubermicro kernel: [<ffffffff80298202>] ? exit_mmap+0x49/0x16c
Sep  8 18:16:30 ubermicro kernel: [<ffffffff8022a2a9>] ? mmput+0x28/0xe5
Sep  8 18:16:30 ubermicro kernel: [<ffffffff8022e025>] ? exit_mm+0x108/0x113
Sep  8 18:16:30 ubermicro kernel: [<ffffffff80248439>] ?
hrtimer_try_to_cancel+0x92/0x9d
Sep  8 18:16:30 ubermicro kernel: [<ffffffff8022fc40>] ? do_exit+0x1f2/0x6e0
Sep  8 18:16:30 ubermicro kernel: [<ffffffff8022f8c5>] ? sys_wait4+0xa5/0xb5
Sep  8 18:16:30 ubermicro kernel: [<ffffffff802301dc>] ? do_group_exit+0xae/0xd8
Sep  8 18:16:30 ubermicro kernel: [<ffffffff80230218>] ?
sys_exit_group+0x12/0x17
Sep  8 18:16:30 ubermicro kernel: [<ffffffff80204248>] ?
system_call_fastpath+0x16/0x1b
Sep  8 18:16:30 ubermicro kernel: [<ffffffff802041e0>] ? system_call+0x0/0x52
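
Reports like the ones above are easy to lose in a long syslog, so a small
Python sketch along these lines could list them. The regular expression is
written only against the three "BUG: soft lockup" headers quoted in this mail
and tolerates the line wrapping added by the mail client; the function and
variable names are hypothetical.

```python
import re
from collections import Counter

# "\s+" lets the pattern span the line break that the mail client inserted
# between "stuck for" and the duration.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for\s+(?P<secs>\d+)s!\s+"
    r"\[(?P<comm>[^\]:]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(log_text):
    """Return one dict per soft lockup header found in log_text."""
    return [
        {
            "cpu": int(m.group("cpu")),
            "seconds": int(m.group("secs")),
            "comm": m.group("comm"),
            "pid": int(m.group("pid")),
        }
        for m in LOCKUP_RE.finditer(log_text)
    ]


# The three headers quoted in this mail, wrapped as they appear above.
sample = """\
Sep 25 05:10:12 ubermicro kernel: BUG: soft lockup - CPU#6 stuck for
61s! [xenvnc.sh:12180]
Sep 29 02:54:47 ubermicro kernel: BUG: soft lockup - CPU#3 stuck for
2796s! [swapper:0]
Sep  8 18:16:30 ubermicro kernel: BUG: soft lockup - CPU#2 stuck for
61s! [xenvnc.sh:29385]
"""

reports = find_soft_lockups(sample)
by_comm = Counter(r["comm"] for r in reports)

for r in reports:
    print(f"CPU#{r['cpu']} stuck {r['seconds']}s in {r['comm']} (pid {r['pid']})")
print(dict(by_comm))
```

Grouping by process name makes it easier to see that two of the three reports
come from xenvnc.sh exiting, while the third is the idle task.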

I should be able to apply the patch tomorrow and will report back as
soon as I have some results.

Andy
