Hi.
We have hit bugs with 3.0.3 and blktap, so we can't use it there. (I saw
some fixes on the devel list that I think address these problems in 3.0.4.)
3.0.4 has crashed within an hour to a day of enabling any loopback-mounted
file systems, so we can't use loopback-mounted devices on 3.0.4; 3.0.3 is
fine. We reported this via the bug tracker, but we don't have a
reproducible test case for it.
Performance on 3.0.4 (with sparse blktap files) is a lot slower
(unusably so) than on 3.0.3 (with sparse loopback-mounted files).
On the 3.0.4 server we notice that IO overall (even in dom0) is very
slow; hdparm -t /dev/md1 returns about 3-8 MB/s, whereas it normally
returns 46-55 MB/s.
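If anyone wants to compare, something along these lines run in dom0 and then
again inside a domU should show the gap (the test file path is only an example):

# buffered/cached read timings on the RAID device
hdparm -tT /dev/md1
# crude sequential write test; sync so the page cache doesn't hide the result
sync; time dd if=/dev/zero of=/tmp/ddtest bs=1M count=256; sync
rm /tmp/ddtest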
Any ideas what the culprit could be? Is anyone else seeing IO issues?
If IO is good on your 3.0.4 server, would you care to describe your setup?
With blktap we don't get any vbd output in xentop, and I cannot see how
to associate the tapdisk processes shown in ps auxf with their respective
domUs. How would one do this?
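The closest I have got is to look at which image file each tapdisk has open
and then match that against the tap backend entries in xenstore - untested
guesswork on my part, and I'm assuming the backend keeps its image path under
/local/domain/0/backend/tap:

for pid in $(pgrep tapdisk); do
    echo "=== tapdisk pid $pid ==="
    ls -l /proc/$pid/fd 2>/dev/null | grep -v -e pipe -e socket
done
# each domid/devid entry should carry a 'params' key with the aio:/path it serves
xenstore-ls /local/domain/0/backend/tap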
And I've just received the following bug from 3.0.4 (apparently
tapdisk-related).
Regards, Peter
------------[ cut here ]------------
kernel BUG at fs/aio.c:511!
invalid opcode: 0000 [#1]
Modules linked in: xt_physdev iptable_filter ip_tables x_tables
dm_mirror dm_mod tg3 ext3 jbd raid1
CPU: 0
EIP: 0061:[<c0173291>] Not tainted VLI
EFLAGS: 00010086 (2.6.16.33-xen0 #2)
EIP is at __aio_put_req+0x26/0x10a
eax: cb71bdc0 ebx: cb71bdc0 ecx: cb71be30 edx: ffffffff
esi: c5f62180 edi: c58b9f68 ebp: c5f62180 esp: c58b9f18
ds: 007b es: 007b ss: 0069
Process tapdisk (pid: 18570, threadinfo=c58b8000 task=c6470540)
Stack: <0>00001000 c5f62180 c5f62180 cb71bdc0 c0173393 c5f62180 cb71bdc0 00000000
       cb71bdc0 c0174620 cb71bdc0 0805d400 00000001 0805d400 c5f62180 0805f184
       c0174731 c58b9f68 0805d400 00000040 00000142 00000000 00000000 00000000
Call Trace:
[<c0173393>] aio_put_req+0x1e/0x63
[<c0174620>] io_submit_one+0x154/0x1d2
[<c0174731>] sys_io_submit+0x93/0xde
[<c01049e9>] syscall_call+0x7/0xb
Code: e9 87 14 0e 00 56 53 83 ec 08 8b 5c 24 18 8b 74 24 14 8b 53 0c 83
ea 01 85 d2 89 53 0c 78 0c 31 c0 85 d2 74 10 83 c4 08 5b 5e c3 <0f> 0b
ff 01 ea b3 34 c0 eb ea 8d 4b 70 8b 43 70 8b 51 04 89 50
<7>exit_aio:ioctx still alive: 2 1 0
blk_tap: Can't get UE info!
Daniel P. Berrange wrote:
On Wed, Feb 14, 2007 at 12:26:32PM +1300, Peter wrote:
Originally we tried 3.0.4-0 with loopback-mounted file systems.
For some reason dom0 crashed (after running for a day or so). It
did this a couple of times on one host server for us, and once more on
another server.
We have just tried 3.0.4-1 with tap:aio file systems on the domUs.
Since then we've gone a couple of days with no kernel crash. Good so far.
You don't say whether the underlying file you are pointing to is sparse
or pre-allocated (non-sparse)? In the sparse case it is expected that
performance is terrible - because every write requires the underlying FS
to allocate some more blocks - which in turn causes a journal sync. If
you use a non-sparse file then all the blocks are pre-allocated, so you
don't get the journal bottleneck.
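For reference, the difference when creating the backing file looks roughly
like this (path and size are just placeholders) - writing the zeros out
pre-allocates every block, while seeking past them leaves holes:

# non-sparse: all 4GB written up front, so guest writes don't trigger new allocations
dd if=/dev/zero of=/var/xen/guest.img bs=1M count=4096
# sparse: only the final block is allocated; the rest are holes filled on demand
dd if=/dev/zero of=/var/xen/guest.img bs=1M count=1 seek=4095
# 'ls -lh' vs 'du -h' on the file shows apparent size vs blocks actually allocated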
However it seems that performance is a lot slower.
e.g. on a domU:
:/$ time sudo du -s
2020684 .
real 9m25.646s
user 0m0.044s
sys 0m0.144s
On a laptop with a puny 5400 rpm drive:
$ time sudo du -s
86923472 .
real 0m18.376s
user 0m0.532s
sys 0m10.737s
And things like bonnie seem to make no real progress.
And on startup things 'seem' slower.
Slower than what? I'd certainly expect tap:aio: to be slower than file:,
because file: is not actually flushing your data to disk - it hangs around
in memory and is flushed by the host kernel VM as needed. Since this isn't
remotely safe for your data, it's not even worth comparing tap:aio with file:.
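(For reference, the two domU disk lines being compared look roughly like
this - the image path and device name are only examples:

disk = [ 'tap:aio:/var/xen/guest.img,xvda,w' ]   # blktap: O_DIRECT + AIO, writes reach the disk
disk = [ 'file:/var/xen/guest.img,xvda,w' ]      # loopback: writes sit in dom0's page cache

)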
Has anyone else experienced slower disk IO with 3.0.4/blktap?
Yes, when using sparse files on top of a journalled fs.
Regards,
Dan.
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users