So, we just moved to some much faster hardware (Intel Q6600 CPU, 8GB
unbuffered ECC RAM, ICH7 SATA with 2x1TB disks), and we were irritated
and puzzled to find that the new setup had really, really slow I/O.
The odd thing is that the performance is fine if you just mount the LV
directly from the Dom0... but if you xm block-attach it to the Dom0
and then mount it, you get 1/10th the speed.
We are running CentOS 5.1, kernel 2.6.18-53.1.14.el5.
Full writeup:
We noted that the performance of mirrored logical volumes accessed
through xenblk was about 1/10th that of non-mirrored LVs, or of LVs
mirrored with the --corelog option. Mirrored LVs performed fine when
accessed normally within the dom0, but performance dropped when
accessed via xm block-attach. This was, to our minds, ridiculous.
First, we created two logical volumes in the volume group "test":
one with mirroring and a mirror log and one with the --corelog option.
# lvcreate -m 1 -L 2G -n test_mirror test
# lvcreate -m 1 --corelog -L 2G -n test_core test
Then we made filesystems and mounted them:
# mke2fs -j /dev/test/test_mirror
# mke2fs -j /dev/test/test_core
# mkdir -p /mnt/test/mirror
# mkdir -p /mnt/test/core
# mount /dev/test/test_mirror /mnt/test/mirror
# mount /dev/test/test_core /mnt/test/core
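The mounts above go directly to the LV in the dom0. For the slow path described earlier, the LV is first attached to the dom0 through xenblk and the resulting frontend device is mounted instead. A minimal sketch of that step follows. The frontend name xvdb is an illustrative choice, not taken from the original test, and the RUN variable is a hypothetical dry-run switch that defaults to echo so the commands are only printed. The sketch also assumes the LV is not currently mounted.

```shell
#!/bin/sh
# Sketch: attach an LV to dom0 via xenblk, then mount the frontend device.
# RUN defaults to "echo" (dry run); invoke with RUN= to really execute.
RUN="${RUN-echo}"
LV=/dev/test/test_mirror   # backend LV from the lvcreate step
FRONT=xvdb                 # assumed frontend device name
MNT=/mnt/test/mirror       # mount point created above

attach_and_mount() {
    # Domain 0 is the dom0 itself; "w" attaches read-write.
    $RUN xm block-attach 0 "phy:$LV" "$FRONT" w
    $RUN mount "/dev/$FRONT" "$MNT"
}

attach_and_mount
```

Running the same benchmark against this mount and against the direct mount is what exposes the roughly 10x difference.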
Next we started oprofile, instructing it to count BUS_IO_WAIT events:
# opcontrol --start --event=BUS_IO_WAIT:500:0xc0 \
    --xen=/usr/lib/debug/boot/xen-syms-2.6.18-53.1.14.el5.debug \
    --vmlinux=/usr/lib/debug/lib/modules/2.6.18-53.1.14.el5xen/vmlinux \
    --separate=all
Then we ran bonnie on each device in sequence, stopping oprofile and
saving the output each time.
# bonnie++ -d /mnt/test/mirror
# opcontrol --stop
# opcontrol --save=mirrorlog
# opcontrol --reset
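The per-volume sequence above can be wrapped in a loop so both LVs are profiled identically. This is only a sketch under several assumptions: the session names iowait_mirror and iowait_core follow the naming used in the opreport invocation, the events are assumed to be already configured by the earlier opcontrol call, -u root is added because bonnie++ refuses to run as root without it, and RUN is a hypothetical dry-run switch that defaults to echo.

```shell
#!/bin/sh
# Sketch: profile one bonnie++ run per mounted test volume.
# RUN defaults to "echo" (dry run); invoke with RUN= to really execute.
RUN="${RUN-echo}"

profile_volume() {
    name=$1                                   # "mirror" or "core"
    $RUN opcontrol --start                    # reuse the configured events
    $RUN bonnie++ -u root -d "/mnt/test/$name"
    $RUN opcontrol --stop
    $RUN opcontrol --save="iowait_$name"      # assumed session naming
    $RUN opcontrol --reset                    # clear samples for next run
}

for name in mirror core; do
    profile_volume "$name"
done
```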
The LV with the corelog displayed negligible iowait, as expected.
However, the other experienced quite a bit:
# opreport -t 1 --symbols session:iowait_mirror
warning: /ahci could not be found.
CPU: Core 2, speed 2400.08 MHz (estimated)
Counted BUS_IO_WAIT events (IO requests waiting in the bus queue) with a unit mask of 0xc0 (All cores) count 500
Processes with a thread ID of 0
Processes with a thread ID of 463
Processes with a thread ID of 14185
samples  %        samples  %        samples  %        app name                           symbol name
32       91.4286  15       93.7500  0        0        xen-syms-2.6.18-53.1.14.el5.debug  pit_read_counter
1         2.8571  0        0        0        0        ahci                               (no symbols)
1         2.8571  0        0        0        0        vmlinux                            bio_put
1         2.8571  0        0        0        0        vmlinux                            hypercall_page
From this, it seemed clear that the culprit was in the
pit_read_counter function.
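As a quick sanity check on the first sample column, the four listed rows sum to 35 samples, and the reported percentages are consistent with that total. The helper name share below is ours, introduced only for illustration:

```shell
#!/bin/sh
# Sketch: recompute the per-symbol share of samples for the first column.
share() {
    # $1 = samples for one symbol, $2 = total samples in the column
    awk -v n="$1" -v total="$2" 'BEGIN { printf "%.4f\n", n * 100 / total }'
}

total=$((32 + 1 + 1 + 1))   # rows listed for thread ID 0
share 32 "$total"           # pit_read_counter
share 1 "$total"            # each of the remaining symbols
```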
Any ideas on where to take it from here?
Credit to Chris Takemura <chris@xxxxxxxxx> for reproducing the problem with
oprofile, and for the writeup.
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users