xen-users
[Xen-users] [Iscsitarget-devel] tracking down cause of filesystem corrup
Hi there,
This has just been posted to the following mailing lists:
open-iscsi@xxxxxxxxxxxxxxxx
iscsitarget-devel@xxxxxxxxxxxxxxxxxxxxx
I have been advised to send it to the xen mailing list as well, so here
we are!
:)
I've been testing iscsi for use in a XEN virtualisation environment and
have been getting pretty bad filesystem corruption after only about 30
minutes of use.
I don't have the time or resources to track down exactly what is causing
this without some help and guidance; for example, it's a mystery to me
whether it's the initiator end, the target end, or something else
again that is causing the problems.
1. The initiator is on a xen3 dom0 host running Debian Etch.
2. The target is provided by another xen3 dom0 host running Debian Etch.
3. The domU (virtual machine) is running Debian Etch and sees the iscsi
target as a block device given to it by XEN; the domU knows nothing
about iscsi.
4. The filesystem being presented via iscsi is ext3 with default mount
options.
5. uname -a on all machines reads pretty much the same:
Linux fileserver 2.6.18-3-xen-686 #1 SMP Mon Dec 4 20:48:20 UTC 2006 i686 GNU/Linux
I get errors like this on the initiator:
<errors>
Jan 31 09:13:15 xen5 kernel: sd 3:0:0:0: SCSI error: return code = 0x00010000
Jan 31 09:13:15 xen5 kernel: end_request: I/O error, dev sde, sector 6996544
</errors>
and the domU sees its root filesystem disappear and hangs.
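For anyone reading the log lines above: Linux packs the SCSI result into a single 32-bit word (driver byte, host byte, message byte, status byte), so the return code can be split apart mechanically. The sketch below does that and converts the failing sector to a byte offset, assuming the kernel's usual 512-byte sector units. The function names are made up for illustration; the host byte names follow the kernel's include/scsi/scsi.h.

```python
# Host byte values as named in the Linux kernel's include/scsi/scsi.h.
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
}


def decode_scsi_result(code):
    """Split a 32-bit SCSI result word into its four bytes (hypothetical helper)."""
    host = (code >> 16) & 0xFF
    return {
        "driver_byte": (code >> 24) & 0xFF,
        "host_byte": host,
        "host": HOST_BYTE_NAMES.get(host, "unknown"),
        "msg_byte": (code >> 8) & 0xFF,
        "status_byte": code & 0xFF,
    }


def sector_to_byte_offset(sector, sector_size=512):
    """Byte offset on the device, assuming 512-byte kernel sectors."""
    return sector * sector_size


if __name__ == "__main__":
    print(decode_scsi_result(0x00010000))
    print(sector_to_byte_offset(6996544))
```

Here the host byte is 0x01 (DID_NO_CONNECT) and the other three bytes are zero, so the error was raised by the host/transport layer rather than reported by the disk as a SCSI status. That would be consistent with the initiator having lost its session to the target, though that reading is an assumption and not something the log proves on its own.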
I see no errors in logs on the machine providing the target.
Neither the target daemon nor the initiator daemon was restarted or sent
a HUP during these tests.
The network interfaces on the initiator and target show no errors, no
dropped packets and no overruns (according to ifconfig).
My ietd.conf for the iscsi target used by the domU is attached
(passwords changed) as is the iscsid node config (again, passwords changed).
You will notice that this domU has been running spamassassin; I figured
that would give the iscsi layer a good workout since it was running
spamassassin for 3 separate mail servers, each with fairly high throughput.
I am hoping that there is some config parameter which I have set
inappropriately and that there is an easy fix as iscsi + xen would be
very useful!
If there is any further info that anyone needs please ask.
If there are any command lines I can run to extract info or perform some
diagnostics to see where the problems are occurring, please send them to
me and I'll send back the results.
iscsid node config (initiator):
node.name = iqn.2006-12.fileserver:spamassassin
node.transport_name = tcp
node.tpgt = 1
node.active_conn = 1
node.startup = automatic
node.session.initial_cmdsn = 0
node.session.auth.authmethod = None
node.session.auth.username = spamassassin
node.session.auth.password = password
node.session.timeo.replacement_timeout = 120
node.session.err_timeo.abort_timeout = 10
node.session.err_timeo.reset_timeout = 30
node.session.iscsi.InitialR2T = Yes
node.session.iscsi.ImmediateData = No
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
node.session.iscsi.DefaultTime2Retain = 0
node.session.iscsi.DefaultTime2Wait = 0
node.session.iscsi.MaxConnections = 1
node.session.iscsi.MaxOutstandingR2T = 1
node.session.iscsi.ERL = 0
node.conn[0].address = 10.10.10.129
node.conn[0].port = 3260
node.conn[0].startup = automatic
node.conn[0].tcp.window_size = 524288
node.conn[0].tcp.type_of_service = 0
node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.auth_timeout = 45
node.conn[0].timeo.active_timeout = 5
node.conn[0].timeo.idle_timeout = 60
node.conn[0].timeo.ping_timeout = 5
node.conn[0].timeo.noop_out_interval = 0
node.conn[0].timeo.noop_out_timeout = 0
node.conn[0].iscsi.MaxRecvDataSegmentLength = 65536
node.conn[0].iscsi.HeaderDigest = CRC32C
node.conn[0].iscsi.DataDigest = CRC32C
node.conn[0].iscsi.IFMarker = No
node.conn[0].iscsi.OFMarker = No
ietd.conf (target):
Target iqn.2006-12.fileserver:spamassassin
Lun 0 Path=/dev/volumes/spamassassin,Type=fileio
Lun 2 Path=/dev/volumes/spamassassin-swap,Type=fileio
Alias spamassassin
IncomingUser spamassassin password
InitialR2T Yes
ImmediateData No
MaxRecvDataSegmentLength 8192
MaxXmitDataSegmentLength 8192
MaxBurstLength 262144
FirstBurstLength 65536
DefaultTime2Wait 2
DefaultTime2Retain 20
MaxOutstandingR2T 8
DataPDUInOrder Yes
DataSequenceInOrder Yes
ErrorRecoveryLevel 0
HeaderDigest CRC32C
DataDigest CRC32C
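The two configs above disagree on several session parameters (burst lengths, MaxOutstandingR2T, the Time2Wait/Time2Retain pair), so the values actually in effect are whatever login negotiation produces. As a rough sketch, the result functions defined in RFC 3720 can be applied to the numbers from both files. The dictionary and function names are illustrative only, and whether the running open-iscsi and ietd versions arrive at exactly these results is an assumption that should be checked against what the tools report for the live session. MaxRecvDataSegmentLength is declared per direction rather than negotiated, so it is left out.

```python
# Result functions for a few negotiated keys, following RFC 3720.
RESULT_FUNCTIONS = {
    "InitialR2T": lambda i, t: i or t,        # boolean OR
    "ImmediateData": lambda i, t: i and t,    # boolean AND
    "FirstBurstLength": min,
    "MaxBurstLength": min,
    "DefaultTime2Wait": max,
    "DefaultTime2Retain": min,
    "MaxOutstandingR2T": min,
    "ErrorRecoveryLevel": min,
}

# Values copied from the iscsid node config (Yes/No mapped to True/False).
INITIATOR = {
    "InitialR2T": True,
    "ImmediateData": False,
    "FirstBurstLength": 262144,
    "MaxBurstLength": 16776192,
    "DefaultTime2Wait": 0,
    "DefaultTime2Retain": 0,
    "MaxOutstandingR2T": 1,
    "ErrorRecoveryLevel": 0,
}

# Values copied from ietd.conf.
TARGET = {
    "InitialR2T": True,
    "ImmediateData": False,
    "FirstBurstLength": 65536,
    "MaxBurstLength": 262144,
    "DefaultTime2Wait": 2,
    "DefaultTime2Retain": 20,
    "MaxOutstandingR2T": 8,
    "ErrorRecoveryLevel": 0,
}


def negotiate(initiator, target):
    """Apply each key's result function to the two offered values."""
    return {
        key: fn(initiator[key], target[key])
        for key, fn in RESULT_FUNCTIONS.items()
    }


negotiated = negotiate(INITIATOR, TARGET)

if __name__ == "__main__":
    for key, value in negotiated.items():
        print(f"{key} = {value}")
```

Under these rules the session would be expected to run with the target's smaller burst lengths and the initiator's MaxOutstandingR2T of 1.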
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
- [Xen-users] [Iscsitarget-devel] tracking down cause of filesystem corruption, Steve Wray