WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-devel

[Xen-devel] Question on XenVbd_HwScsiResetBus in PV driver

To: xen devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] Question on XenVbd_HwScsiResetBus in PV driver
From: MaoXiaoyun <tinnycloud@xxxxxxxxxxx>
Date: Fri, 22 Jul 2011 17:38:10 +0800
Cc: james.harper@xxxxxxxxxxxxxxxx
Delivery-date: Fri, 22 Jul 2011 02:38:52 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
Importance: Normal
In-reply-to: <BLU157-w1080D4E6CD8170842EFCD9DA4A0@xxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <BLU157-w1080D4E6CD8170842EFCD9DA4A0@xxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Hi:
 
     Well, AFAIK, there is a KeSwapProcessOrStack thread in the Windows kernel that swaps thread
kernel stacks in and out, and it can cause BSOD codes 0x77/0x7E, which mean that a paging I/O request
could not complete successfully because of a disk failure. This is reproducible by periodically running
"gdb attach" on the tapdisk process in dom0 to simulate a long I/O response time, longer than 10 s.
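
A minimal sketch of a scripted stand-in for the manual gdb attach, assuming a Linux dom0. It is not the procedure used above: it swaps in SIGSTOP/SIGCONT, which also freezes the target process for a chosen time. The function names and parameters are hypothetical, and the pid of the tapdisk process has to be supplied by the caller.

```c
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Freeze process `pid` for `hold_s` seconds, then let it run again.
 * Returns 0 on success, -1 if either signal could not be delivered. */
int stall_once(pid_t pid, unsigned hold_s)
{
    if (kill(pid, SIGSTOP) != 0)
        return -1;
    sleep(hold_s);                 /* the backend answers no I/O during this time */
    return kill(pid, SIGCONT);
}

/* Repeat the freeze `rounds` times, letting the process run for
 * `gap_s` seconds between freezes. Returns 0 on success, -1 on failure. */
int stall_periodically(pid_t pid, unsigned hold_s, unsigned gap_s, unsigned rounds)
{
    for (unsigned i = 0; i < rounds; i++) {
        if (stall_once(pid, hold_s) != 0)
            return -1;
        sleep(gap_s);
    }
    return 0;
}
```

For example, stall_periodically(pid, 14, 30, 10) would freeze the process for 14 seconds ten times with 30-second gaps; those numbers are only illustrative.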
 
      In fact, the I/O stream from tapdisk is written to our own storage cluster, which supports
failover, but failover takes time, so during failover the I/O hangs from the VM side. When this
happens, we run into some bluescreens.
 
     I've also done some experiments, testing two scenarios:
     1) use the current XenVbd_HwScsiResetBus, which completes the I/O with SRB_STATUS_BUS_RESET
     2) do nothing in XenVbd_HwScsiResetBus
    Just using gdb on tapdisk to hold I/O periodically shows that 1) is more likely to bluescreen
than 2) (in fact, we haven't hit a bluescreen in 2)).
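
The difference between the two scenarios can be pictured with a toy model. This is not the XenVbd driver code and does not use the Windows driver headers; the types, function names, and status constants below are hypothetical stand-ins chosen only to show what each policy does to requests that are still outstanding when a reset arrives.

```c
#include <stddef.h>

/* Stand-ins for SRB status values; the numbers are illustrative only. */
enum srb_status { ST_PENDING = 0x00, ST_SUCCESS = 0x01, ST_BUS_RESET = 0x0E };

/* Scenario 1) and scenario 2) from the experiment above. */
enum reset_policy { RESET_COMPLETE_WITH_BUS_RESET, RESET_DO_NOTHING };

struct request { enum srb_status status; };

/* Model of the reset-bus callback. Under scenario 1) every outstanding
 * request is completed with the bus-reset status; under scenario 2) the
 * requests are left pending. Returns how many requests were failed. */
size_t on_reset_bus(struct request *reqs, size_t n, enum reset_policy policy)
{
    size_t failed = 0;
    if (policy == RESET_DO_NOTHING)
        return 0;
    for (size_t i = 0; i < n; i++) {
        if (reqs[i].status == ST_PENDING) {
            reqs[i].status = ST_BUS_RESET;
            failed++;
        }
    }
    return failed;
}

/* Model of the backend answering after the stall: requests that are
 * still pending complete successfully. Returns how many succeeded. */
size_t on_backend_resume(struct request *reqs, size_t n)
{
    size_t ok = 0;
    for (size_t i = 0; i < n; i++) {
        if (reqs[i].status == ST_PENDING) {
            reqs[i].status = ST_SUCCESS;
            ok++;
        }
    }
    return ok;
}
```

In this model, scenario 1) turns every stalled request into a failure seen by the OS, while scenario 2) lets the same requests succeed once the backend resumes. Whether that is what Windows actually observes is exactly the open question.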
 
     From the log, I see XenVbd_HwScsiResetBus called every 14 seconds (10 seconds + 4 s hold time)
in scenario 1), but in 2) I saw only a few of them (fewer than 10). It looks like reset bus is
called only a few times.
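
As a small worked example of the timing above: the 14-second interval is the sum of the two quoted figures, and from it one can estimate how many resets a log window would contain if that interval held throughout. The function names are hypothetical, and treating the 10 seconds as a fixed wait before each reset is an assumption.

```c
/* Seconds between consecutive resets: the wait before a reset fires
 * plus the time the I/O is held. */
unsigned reset_period_s(unsigned wait_s, unsigned hold_s)
{
    return wait_s + hold_s;
}

/* Number of resets expected in a log window of `window_s` seconds if a
 * reset fires once per period. Returns 0 for a zero-length period. */
unsigned expected_resets(unsigned window_s, unsigned wait_s, unsigned hold_s)
{
    unsigned period = reset_period_s(wait_s, hold_s);
    return period ? window_s / period : 0;
}
```

With a 10-second wait and a 4-second hold, reset_period_s gives 14, matching the interval seen in scenario 1).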
 
     So, I have the following assumptions and questions:
     1) Only some I/O failures cause a BSOD.
     2) Doing nothing in XenVbd_HwScsiResetBus is relatively good for minimizing the chance of a bluescreen.
     3) I am still confused about how XenVbd_HwScsiResetBus gets called, and why it is not
called when nothing is done in XenVbd_HwScsiResetBus.
     4) Is it OK to do nothing in XenVbd_HwScsiResetBus?
 
      Could you help to clarify? Many thanks.
 
 
    
   
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel