xen-devel


To: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Subject: Re: [Xen-devel] [xen-4.0.1-rc5-pre] [pvops 2.6.32.16] Complete freeze within 2 days, no info in serial log
From: Sander Eikelenboom <linux@xxxxxxxxxxxxxx>
Date: Fri, 6 Aug 2010 11:21:11 +0200
Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Delivery-date: Fri, 06 Aug 2010 02:21:59 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20100805145214.GC5697@xxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Organization: Eikelenboom IT services
References: <698099271.20100803173057@xxxxxxxxxxxxxx> <20100803154541.GA16122@xxxxxxxxxxxxxxxxxxx> <4C583AFE.7080001@xxxxxxxx> <1048476317.20100805114844@xxxxxxxxxxxxxx> <20100805145214.GC5697@xxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Hi Konrad,

Hmm, it seems that the 2.6.33 tree does work for one VM with a videograbber,
but not for the VM which seems to cause the freeze.
It does spit out some stack traces after a while of not functioning, but since
this is an OOM, it is probably fallout from something else and not anywhere
near the root cause.
At least this didn't freeze the complete system :-)
I will try some more configurations to see if I can find a pattern somehow ...

--
Sander

[ 1269.032133] submit of urb 0 failed (error=-90)
[ 1274.153341] motion: page allocation failure. order:6, mode:0xd4
[ 1274.153375] Pid: 1884, comm: motion Not tainted 2.6.33 #5
[ 1274.153391] Call Trace:
[ 1274.153416]  [<ffffffff810e4665>] __alloc_pages_nodemask+0x5b2/0x62b
[ 1274.153440]  [<ffffffff810338b9>] ? xen_force_evtchn_callback+0xd/0xf
[ 1274.153461]  [<ffffffff810e46f5>] __get_free_pages+0x17/0x5f
[ 1274.153483]  [<ffffffff8128042e>] xen_swiotlb_alloc_coherent+0x3c/0xe2
[ 1274.153507]  [<ffffffff81410931>] hcd_buffer_alloc+0xfa/0x11f
[ 1274.153527]  [<ffffffff81403e0c>] usb_buffer_alloc+0x17/0x1d
[ 1274.153562]  [<ffffffffa003f39e>] em28xx_init_isoc+0x16a/0x32b [em28xx]
[ 1274.153585]  [<ffffffff815ec0b9>] ? __down_read+0x47/0xed
[ 1274.153613]  [<ffffffffa003a4ac>] buffer_prepare+0xd7/0x10d [em28xx]
[ 1274.153639]  [<ffffffffa0016dac>] videobuf_qbuf+0x308/0x3f4 [videobuf_core]
[ 1274.153667]  [<ffffffffa0039cb3>] vidioc_qbuf+0x35/0x3a [em28xx]
[ 1274.153697]  [<ffffffffa0028229>] __video_do_ioctl+0x11ab/0x373b [videodev]
[ 1274.153720]  [<ffffffff814b51cd>] ? sock_def_readable+0x54/0x5f
[ 1274.153743]  [<ffffffff81541f65>] ? unix_dgram_sendmsg+0x3f1/0x43e
[ 1274.153764]  [<ffffffff810313b5>] ? __raw_callee_save_xen_pud_val+0x11/0x1e
[ 1274.153793]  [<ffffffffa0039c7e>] ? vidioc_qbuf+0x0/0x3a [em28xx]
[ 1274.153814]  [<ffffffff814b208b>] ? sock_sendmsg+0xa3/0xbc
[ 1274.153837]  [<ffffffff8123349b>] ? avc_has_perm+0x4e/0x60
[ 1274.153855]  [<ffffffff810338b9>] ? xen_force_evtchn_callback+0xd/0xf
[ 1274.153880]  [<ffffffffa002aab1>] video_ioctl2+0x2f8/0x3af [videodev]
[ 1274.153901]  [<ffffffff810357df>] ? __switch_to+0x265/0x277
[ 1274.153924]  [<ffffffffa0026122>] v4l2_ioctl+0x38/0x3a [videodev]
[ 1274.153944]  [<ffffffff8111ec90>] vfs_ioctl+0x72/0x9e
[ 1274.153961]  [<ffffffff8111f1d7>] do_vfs_ioctl+0x4a0/0x4e1
[ 1274.153980]  [<ffffffff8111f26d>] sys_ioctl+0x55/0x77
[ 1274.154000]  [<ffffffff81112e6a>] ? sys_write+0x60/0x70
[ 1274.154009]  [<ffffffff81036cc2>] system_call_fastpath+0x16/0x1b
[ 1274.154126] Mem-Info:
[ 1274.154138] DMA per-cpu:
[ 1274.154151] CPU    0: hi:    0, btch:   1 usd:   0
[ 1274.154165] CPU    1: hi:    0, btch:   1 usd:   0
[ 1274.154180] DMA32 per-cpu:
[ 1274.154202] CPU    0: hi:  186, btch:  31 usd:   0
[ 1274.154220] CPU    1: hi:  186, btch:  31 usd:  78
[ 1274.154241] active_anon:248 inactive_anon:326 isolated_anon:0
[ 1274.154244]  active_file:132 inactive_file:105 isolated_file:41
[ 1274.154247]  unevictable:0 dirty:0 writeback:19 unstable:0
[ 1274.154250]  free:1309 slab_reclaimable:642 slab_unreclaimable:3111
[ 1274.154254]  mapped:100846 shmem:4 pagetables:1187 bounce:0
[ 1274.154313] DMA free:2036kB min:80kB low:100kB high:120kB active_anon:0kB inactive_anon:24kB active_file:20kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:14752kB mlocked:0kB dirty:0kB writeback:0kB mapped:12804kB shmem:0kB slab_reclaimable:16kB slab_unreclaimable:40kB kernel_stack:0kB pagetables:24kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 1274.154375] lowmem_reserve[]: 0 489 489 489
[ 1274.154415] DMA32 free:3200kB min:2788kB low:3484kB high:4180kB active_anon:992kB inactive_anon:1280kB active_file:508kB inactive_file:420kB unevictable:0kB isolated(anon):0kB isolated(file):164kB present:500960kB mlocked:0kB dirty:0kB writeback:76kB mapped:390580kB shmem:16kB slab_reclaimable:2552kB slab_unreclaimable:12404kB kernel_stack:592kB pagetables:4724kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:160 all_unreclaimable? no
[ 1274.154481] lowmem_reserve[]: 0 0 0 0
[ 1274.154508] DMA: 7*4kB 1*8kB 1*16kB 0*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 2036kB
[ 1274.154571] DMA32: 409*4kB 33*8kB 2*16kB 0*32kB 0*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 3212kB
[ 1274.154634] 429 total pagecache pages
[ 1274.154646] 161 pages in swap cache
[ 1274.154658] Swap cache stats: add 344422, delete 344260, find 99167/143153
[ 1274.154673] Free swap  = 476756kB
[ 1274.154684] Total swap = 524280kB
[ 1274.160880] 131072 pages RAM
[ 1274.160902] 21934 pages reserved
[ 1274.160914] 101195 pages shared
[ 1274.160925] 6309 pages non-shared
[ 1274.160963] unable to allocate 185088 bytes for transfer buffer 4
[ 1287.634682] motion invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0
[ 1287.634719] motion cpuset=/ mems_allowed=0
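
As a rough sanity check on the numbers in the log above, here is a small
sketch (not part of the original thread). It assumes 4 kB pages, as the "4kB"
entries in the buddy lists suggest, and the function names are made up for
illustration. It shows why a 185088-byte transfer buffer would need an order-6
allocation, which appears to match the "order:6" failure, and how much of the
free DMA32 memory sits in blocks that large.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 kB pages, as the "4kB" buddy entries suggest


def alloc_order(nbytes, page_size=PAGE_SIZE):
    """Smallest order such that page_size * 2**order >= nbytes."""
    pages = -(-nbytes // page_size)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order


def parse_buddy_line(text):
    """Parse the 'COUNT*SIZEkB' entries into {block_size_kB: free_block_count}."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", text)}


def free_kb_at_or_above(blocks, order, page_size=PAGE_SIZE):
    """Free memory (kB) held in blocks big enough for an allocation of `order`."""
    min_kb = (page_size << order) // 1024
    return sum(size * count for size, count in blocks.items() if size >= min_kb)


# The DMA32 buddy line from the log above, rejoined into one string.
dma32 = ("DMA32: 409*4kB 33*8kB 2*16kB 0*32kB 0*64kB 0*128kB 1*256kB "
         "0*512kB 1*1024kB 0*2048kB 0*4096kB = 3212kB")

order = alloc_order(185088)          # size from "unable to allocate 185088 bytes"
blocks = parse_buddy_line(dma32)
total_kb = sum(size * count for size, count in blocks.items())
big_kb = free_kb_at_or_above(blocks, order)

print(order, total_kb, big_kb)       # prints: 6 3212 1280
```

So 185088 bytes rounds up to 46 pages and then to a 64-page (256 kB) block,
and only two free blocks in DMA32 are that large. This alone does not explain
the failure: the allocator also honours the zone watermarks, and DMA32 free
memory (3200kB) is already close to its min (2788kB) in the log.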




Thursday, August 5, 2010, 4:52:14 PM, you wrote:

> On Thu, Aug 05, 2010 at 11:48:44AM +0200, Sander Eikelenboom wrote:
>> Hi Konrad/Jeremy,
>> 
>> I have tested for the last 2 days with the VMs that have passed-through
>> devices shut down, and there has been no freeze so far.
>> I'm now running one of those VMs, which runs an old 2.6.33 kernel from an
>> old tree from Konrad together with some hacked-up patches for xhci/usb3
>> support.
>> That seems to have been running fine for some time now (although not a full
>> 2 days yet).
>> 
>> So my other VM seems to cause the freeze.
>> 
>> - This one uses devel/merge.2.6.35-rc6.t2 as its domU kernel; I think I
>> should perhaps try an older version of pci-front/xen-swiotlb.
>> - It has both a usb2 and a usb3 controller passed through, but the xhci
>> module has changed a lot since the hacked-up patches in the kernel of the
>> working domU VM.
>> - Most probably the drivers for the videograbbers have changed as well.
>> 
>> So I suspect:
>>    - the newer pci-front / xen-swiotlb
>>    - the xhci/usb3 driver
>>    - the videograbber drivers
>> 
>> Most probable would be a rogue DMA transfer that can't be caught by Xen /
>> pciback, I guess, and would therefore be hard to debug?

> The SWIOTLB "brains" by themselves haven't changed since the
> uhh...2.6.33. The code internals that just got Ack-ed upstream look quite
> similar to the ones that Jeremy carries in xen/stable-2.6.32.x. The
> outside plumbing parts are the ones that changed.

> The fixes in the pci-front, well, most of those are "bureaucratic" in
> nature - set the ownership to this, make hotplug work, etc. The big
> fixes were the MSI/MSI-X ones but those were big news a couple of months
> ago (and I think that was when 2.6.34 came out).

> The videograbber (v4l) stack trace you sent to me some time ago looked
> like a mutex was held for a very, very long time... which makes me wonder
> if that is the cmpxchg compiler bug that has hit some folks. Are you using
> Debian?

> But we can do something easy. I can rebase my 2.6.33 kernel with the
> latest Xen-SWIOTLB/SWIOTLB engine + Xen PCI front, and we can rule out the
> SWIOTLB/PCIfront being at fault here. Let me do that if your 2.6.33
> VM guest has been running fine for the last two days.




-- 
Best regards,
 Sander                            mailto:linux@xxxxxxxxxxxxxx


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
