Re: [Xen-users] iscsi conn error: Xen related?
Hey Tomasz,
I managed to get an interesting and clear trace when things started going
south this morning... for no apparent reason (i.e. no load). Load
shouldn't be a problem in this environment yet.
It started with nop-out timing out, then sank from there until all I/O failed.
I indeed also opened a ticket with open-e, but haven't gotten an answer
yet.
I also launched a ping -s 8192 -i 3 -I ethXX to the storage, to see if I
am losing icmp packets when the iscsi connections are lost.
Upgrading may be an option soon. I also saw that xen 3.1.2 is out, so I may
upgrade everything at once in a while if the problem persists and no
solution is found.
The switches don't have anything in their logs that could indicate any
issue with jumbo frames, or anything else for that matter.
Thanks all,
fred
Tomasz Chmielewski wrote:
Fred Blaise wrote:
Hello all,
I got some severe iscsi connection loss on my dom0 (Gentoo
2.6.20-xen-r6, xen 3.1.1). Happening several times a day.
open-iscsi version is 2.0.865.12. Target iscsi is the open-e DSS product.
Here is a snip of my messages log file:
May 5 16:52:50 ying connection226:0: iscsi: detected conn error (1011)
May 5 16:52:51 ying iscsid: connect failed (111)
May 5 16:52:51 ying iscsid: Kernel reported iSCSI connection 226:0 error (1011) state (3)
May 5 16:52:53 ying connection215:0: iscsi: detected conn error (1011)
May 5 16:52:53 ying iscsid: connect failed (111)
May 5 16:52:53 ying iscsid: connect failed (111)
May 5 16:52:53 ying iscsid: connect failed (111)
May 5 16:52:53 ying iscsid: connect failed (111)
[...]
and sometimes:
May 5 16:53:11 ying iscsid: connection227:0 is operational after recovery (6 attempts)
May 5 16:53:11 ying iscsid: connection221:0 is operational after recovery (6 attempts)
May 5 16:53:12 ying iscsid: connection214:0 is operational after recovery (9 attempts)
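To see which connections drop most often and how long recovery takes, the messages above can be tallied with a small script. This is only a minimal sketch: the function name `summarize` and the two regular expressions are assumptions written against the lines quoted here (with mail-wrapped lines rejoined), not against any documented open-iscsi log format.

```python
import re
from collections import Counter

# Lines copied from the excerpt above; wrapped lines are rejoined.
SAMPLE = """\
May 5 16:52:50 ying connection226:0: iscsi: detected conn error (1011)
May 5 16:52:51 ying iscsid: connect failed (111)
May 5 16:52:51 ying iscsid: Kernel reported iSCSI connection 226:0 error (1011) state (3)
May 5 16:52:53 ying connection215:0: iscsi: detected conn error (1011)
May 5 16:53:11 ying iscsid: connection227:0 is operational after recovery (6 attempts)
May 5 16:53:12 ying iscsid: connection214:0 is operational after recovery (9 attempts)
"""

# Assumed patterns, derived only from the quoted messages.
CONN_ERROR = re.compile(r"connection(\d+:\d+): iscsi: detected conn error \((\d+)\)")
RECOVERED = re.compile(r"connection(\d+:\d+) is operational after recovery \((\d+) attempts\)")


def summarize(text):
    """Return (errors per connection, recovery attempts per connection)."""
    errors = Counter()
    recoveries = {}
    for line in text.splitlines():
        match = CONN_ERROR.search(line)
        if match:
            errors[match.group(1)] += 1
            continue
        match = RECOVERED.search(line)
        if match:
            # Keep the attempt count from the latest recovery message.
            recoveries[match.group(1)] = int(match.group(2))
    return errors, recoveries


if __name__ == "__main__":
    errors, recoveries = summarize(SAMPLE)
    for conn, count in sorted(errors.items()):
        print(f"connection {conn}: {count} conn error(s)")
    for conn, attempts in sorted(recoveries.items()):
        print(f"connection {conn}: recovered after {attempts} attempts")
```

Feeding it the full messages file instead of `SAMPLE` would show whether the errors cluster on particular connections or hit all of them at the same time.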
I doubt it's Xen related.
I'm running lots of dom0s and domUs (and non-Xen) running as iSCSI
initiator mostly without such problems.
If it ever happens, it can mean a problem with:
1) iSCSI target implementation,
2) either the target or initiator is very loaded (or both).
Did you try changing the iSCSI target, either to tgt or SCST? I'm not
sure what target you have with open-e; I think they wanted to migrate to
SCST, but used the buggy IET before (or still use it, I'm not sure).
Any other messages/logs?
2.6.25 has a nice feature with soft lockup detection, i.e. it will
print messages like this when the machine is severely loaded (which may
indicate some problems):
May 3 00:46:33 backup1 kernel: INFO: task sync:4875 blocked for more than 120 seconds.
snap_msglog.txt
Description: Text document
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users