WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 

xen-devel

RE: [Xen-devel] network hang again

I tracked the glitch back to the 2.4.27 domain-1 (unprivileged; it uses
EVMS blocks from dom0 and serves them out as iSCSI targets via file-io),
with this error message being the trigger point of the collapse.

Sep 15 00:16:55 localhost kernel: fileio_make_request(85) Bad things happened 4096, -5

Lines 76 to 85 of kernel/file-io.c seem to be the error point:
                        if (rw == READ)
                                ret = generic_file_read(filp, buf, count, &ppos);
                        else
                                ret = generic_file_write(filp, buf, count, &ppos);

                        if (ret != count)
                                printk("%s(%d) Bad things happened %lld, %d\n",
                                       __FUNCTION__, __LINE__, count, ret);
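
As an aside, the check above reports a negative error return and a short transfer through the same message. A minimal user-space sketch of telling the two apart is below; the helper `classify_io_result` and the `io_outcome` names are hypothetical and are not part of IET.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper, not part of IET: classify the return value of a
 * read/write-style call that was asked to transfer `count` bytes. */
enum io_outcome { IO_OK, IO_SHORT, IO_ERROR };

static enum io_outcome classify_io_result(ssize_t ret, size_t count)
{
    if (ret < 0)
        return IO_ERROR;   /* negative errno, e.g. -5 for -EIO */
    if ((size_t)ret != count)
        return IO_SHORT;   /* fewer bytes moved than requested */
    return IO_OK;
}
```

With the values from the log line (count 4096, ret -5) this reports IO_ERROR rather than a short transfer, which points at the underlying device or file rather than at the request size.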


-5 is -EIO in linux-2.4.27/include/asm-i386/errno.h:8
#define EIO              5      /* I/O error */
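
For anyone decoding similar return values, here is a small user-space sketch of the lookup. It assumes the user-space errno numbering matches the kernel's, which holds for EIO on Linux; `describe_kernel_ret` is a hypothetical name used only for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: kernel-style return values carry the error as a
 * negative errno, so negate the value before looking it up. Returns NULL
 * when the value is not an error. */
static const char *describe_kernel_ret(long ret)
{
    if (ret >= 0)
        return NULL;
    return strerror((int)-ret);
}
```

Passing -5 returns the C library's message for EIO; passing a non-negative byte count returns NULL.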

I do NOT get any errors from domain0, so I can't trace through to dom0
right now. 8-(

This error coincides exactly in time with the linux-iscsi initiator
errors I got earlier this week, so I believe that this is what's
triggering the iscsi-initiator error.

Any advice on how to figure out what is causing the I/O error would be
greatly appreciated. Right now it is the ONLY thing that is holding me
back from using the IET iSCSI target.

Thanks!

Brian Wolfe

On Tue, 2004-09-14 at 21:50, James Harper wrote:
> When I explained about the patch on the iet list, I was asked if I was
> getting frequent disconnections :)
> 
> It sounds like the network issues I'm seeing in xen are probably
> triggering the crash in iscsi.
> 
> I'm running iet 0.3.3 + 2.6 patch + my additional 2.6 patch on dom0, and
> linux-iscsi 4.0.1.8 on dom1.
> 
> James
> 
> > -----Original Message-----
> > From: Brian Wolfe [mailto:ahzz@xxxxxxxxxxx]
> > Sent: Wednesday, 15 September 2004 02:22
> > To: James Harper
> > Cc: xen-devel@xxxxxxxxxxxxxxxxxxxxx
> > Subject: Re: [Xen-devel] network hang again
> > 
> > I have been running IET 0.3.3 on 2.4.27 on one machine, and cisco's
> > linux-iscsi on 2.6.8.1 on a second physical machine for a couple days
> > now. So far the only thing that I have run into is a dump message
> > concerning OOM on the linux-iscsi machine.
> > 
> > 
> > Sep 13 00:20:11 vhost1 kernel: iSCSI: 4.0.1 ( 9-Feb-2004) built for Linux 2.6.8-tbc-vhost-Xen0
> > Sep 13 00:20:11 vhost1 kernel: iSCSI: will translate deferred sense to current sense on disk command responses
> > Sep 13 00:20:11 vhost1 kernel: iSCSI: control device major number 254
> > Sep 13 00:20:11 vhost1 kernel: scsi_proc_hostdir_add: proc_mkdir failed for <NULL>
> > Sep 13 00:20:11 vhost1 kernel: scsi17 : Cisco iSCSI driver
> > Sep 13 00:20:11 vhost1 kernel: iSCSI:detected HBA host #17
> > Sep 13 00:20:11 vhost1 kernel: iSCSI: bus 0 target 0 = iqn.2001-04.dmz.iscsi1:wnhttp
> > Sep 13 00:20:11 vhost1 kernel: iSCSI: bus 0 target 0 portal 0 = address 10.11.7.1 port 3260 group 1
> > Sep 13 00:20:11 vhost1 kernel: iSCSI: starting timer thread at 21835751
> > Sep 13 00:20:11 vhost1 kernel: iSCSI: bus 0 target 0 trying to establish session to portal 0, address 10.11.7.1 port 3260 group 1
> > Sep 13 00:20:12 vhost1 kernel: iSCSI: session c1478000 authenticated by target iqn.2001-04.dmz.iscsi1:wnhttp
> > Sep 13 00:20:12 vhost1 kernel: iSCSI: bus 0 target 0 established session #1, portal 0, address 10.11.7.1 port 3260 group 1
> > Sep 13 00:20:12 vhost1 kernel:   Vendor: LINUX     Model: ISCSI             Rev: 0
> > Sep 13 00:20:12 vhost1 kernel:   Type:   Direct-Access                      ANSI SCSI revision: 03
> > Sep 13 00:20:12 vhost1 kernel: SCSI device sda: 16777212 512-byte hdwr sectors (8590 MB)
> > Sep 13 00:20:12 vhost1 kernel: SCSI device sda: drive cache: write back
> > Sep 13 00:20:12 vhost1 kernel:  sda: unknown partition table
> > Sep 13 00:20:12 vhost1 kernel: Attached scsi disk sda at scsi17, channel 0, id 0, lun 0
> > Sep 13 00:20:12 vhost1 kernel:   Vendor: LINUX     Model: ISCSI             Rev: 0
> > Sep 13 00:20:12 vhost1 kernel:   Type:   Direct-Access                      ANSI SCSI revision: 03
> > Sep 13 00:20:12 vhost1 kernel: SCSI device sdb: 65536 512-byte hdwr sectors (34 MB)
> > Sep 13 00:20:12 vhost1 kernel: SCSI device sdb: drive cache: write back
> > Sep 13 00:20:12 vhost1 kernel:  sdb: unknown partition table
> > Sep 13 00:20:12 vhost1 kernel: Attached scsi disk sdb at scsi17, channel 0, id 0, lun 1
> > Sep 13 00:21:55 vhost1 kernel: ReiserFS: sda: found reiserfs format "3.6" with standard journal
> > Sep 13 00:21:55 vhost1 kernel: ReiserFS: sda: using ordered data mode
> > Sep 13 00:21:55 vhost1 kernel: ReiserFS: sda: journal params: device sda, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
> > Sep 13 00:21:55 vhost1 kernel: ReiserFS: sda: checking transaction log (sda)
> > Sep 13 00:21:55 vhost1 kernel: ReiserFS: sda: replayed 1 transactions in 0 seconds
> > Sep 13 00:21:55 vhost1 kernel: ReiserFS: sda: Using r5 hash to sort names
> > Sep 13 00:28:51 vhost1 kernel: iscsi-tx: page allocation failure. order:1, mode:0x20
> > Sep 13 00:28:51 vhost1 kernel:  [__alloc_pages+728/848]
> > __alloc_pages+0x2d8/0x350
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [__get_free_pages+31/64]
> > __get_free_pages+0x1f/0x40
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [kmem_getpages+30/224]
> > kmem_getpages+0x1e/0xe0
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [cache_grow+159/336]
> > cache_grow+0x9f/0x150
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [cache_alloc_refill+318/512]
> > cache_alloc_refill+0x13e/0x200
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [__kmalloc+139/160]
> > __kmalloc+0x8b/0xa0
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [alloc_skb+71/224] alloc_skb+0x47/0xe0
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [pg0+38296326/1002676224]
> > rhine_rx+0x156/0x460 [via_rhine]
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [pg0+38295340/1002676224]
> > rhine_interrupt+0x1ac/0x1d0 [via_rhine]
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [handle_IRQ_event+73/144]
> > handle_IRQ_event+0x49/0x90
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [do_IRQ+109/240] do_IRQ+0x6d/0xf0
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [evtchn_do_upcall+156/256]
> > evtchn_do_upcall+0x9c/0x100
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [hypervisor_callback+51/73]
> > hypervisor_callback+0x33/0x49
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [csum_partial_copy_generic+63/248]
> > csum_partial_copy_generic+0x3f/0xf8
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [tcp_sendmsg+578/4176]
> > tcp_sendmsg+0x242/0x1050
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [inet_sendmsg+77/96]
> > inet_sendmsg+0x4d/0x60
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [sock_sendmsg+165/192]
> > sock_sendmsg+0xa5/0xc0
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [__do_softirq+149/160]
> > __do_softirq+0x95/0xa0
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [do_softirq+69/80]
> > do_softirq+0x45/0x50
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [do_IRQ+194/240] do_IRQ+0xc2/0xf0
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [pg0+39270168/1002676224]
> > iscsi_xmit_queued_cmnds+0x188/0x3c0 [iscsi]
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [pg0+39254271/1002676224]
> > iscsi_sendmsg+0x4f/0x70 [iscsi]
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [pg0+39271874/1002676224]
> > iscsi_xmit_data+0x472/0x8d0 [iscsi]
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [__do_softirq+149/160]
> > __do_softirq+0x95/0xa0
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [pg0+39273273/1002676224]
> > iscsi_xmit_r2t_data+0x119/0x1f0 [iscsi]
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [pg0+39165617/1002676224]
> > iscsi_tx_thread+0x711/0x8d0 [iscsi]
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [autoremove_wake_function+0/96]
> > autoremove_wake_function+0x0/0x60
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [autoremove_wake_function+0/96]
> > autoremove_wake_function+0x0/0x60
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [default_wake_function+0/32]
> > default_wake_function+0x0/0x20
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [pg0+39163808/1002676224]
> > iscsi_tx_thread+0x0/0x8d0 [iscsi]
> > Sep 13 00:28:51 vhost1 kernel:
> > Sep 13 00:28:51 vhost1 kernel:  [kernel_thread_helper+5/16]
> > kernel_thread_helper+0x5/0x10
> > Sep 13 00:28:51 vhost1 kernel:
> > 
> > The only reason I'm posting the "trace" from linux-iscsi is because it
> > contains the hypervisor_callback function and it's in the rx phase
> > of the via_rhine driver.
> > 
> > What iSCSI are you running on each machine? (Sorry if I missed it;
> > I've been offline for a few days now. 8-( ) I'd be interested to know
> > if this is in any way similar to your issue.
> > 
> > Brian
> > 
> > 
> > On Tue, 2004-09-14 at 07:38, James Harper wrote:
> > > I'm now seeing this network hang a lot, to the point where it makes
> > > my iscsi testing unusable. I believe this has more to do with the
> > > sort of testing I'm doing now than with a bug that has suddenly
> > > appeared.
> > >
> > > My setup is this:
> > > Dom0:
> > > 2.6.8.1
> > > Iscsitarget 0.3.3 + 2.6 patches + my own 2.6 patches.
> > > No conntrack or other netfilter related modules
> > > Bridged eth0 to Dom1
> > > /usr/src exported via nfs
> > >
> > > Dom1:
> > > 2.6.8.1
> > > Linux-iscsi 4.0.1.8
> > > No conntrack or other netfilter related modules
> > > /usr/src mounted from Dom0
> > >
> > > Iscsi works for a while, normally crashing in Dom0 due to another
> > > non-xen related bug before it hits this bug, but if I try to do a
> > > compile on Dom1 in the nfs mounted /usr/src, the network locks up
> > > almost instantly, but then clears up shortly after if I kill the
> > > compile.
> > >
> > > The logs show absolutely nothing of any use.
> > >
> > > I've just tried a few netperf tests. A quick hammering goes off
> > > without a hitch, but afterwards I see random dropped packets. I'll
> > > keep testing.
> > >
> > > James
> > >
> > >
> > > -------------------------------------------------------
> > > This SF.Net email is sponsored by: YOU BE THE JUDGE. Be one of 170
> > > Project Admins to receive an Apple iPod Mini FREE for your judgement
> on
> > > who ports your project to Linux PPC the best. Sponsored by IBM.
> > > Deadline: Sept. 13. Go here: http://sf.net/ppc_contest.php
> > > _______________________________________________
> > > Xen-devel mailing list
> > > Xen-devel@xxxxxxxxxxxxxxxxxxxxx
> > > https://lists.sourceforge.net/lists/listinfo/xen-devel
> > 
> 
> 
> 
> -------------------------------------------------------
> This SF.Net email is sponsored by: thawte's Crypto Challenge Vl
> Crack the code and win a Sony DCRHC40 MiniDV Digital Handycam
> Camcorder. More prizes in the weekly Lunch Hour Challenge.
> Sign up NOW http://ad.doubleclick.net/clk;10740251;10262165;m
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxxxx
> https://lists.sourceforge.net/lists/listinfo/xen-devel



