Can you please try again with the following patch attached:
diff -r 34e72b071e51 xenpci/xenpci_dbgprint.c
--- a/xenpci/xenpci_dbgprint.c Tue Mar 01 23:47:47 2011 +1100
+++ b/xenpci/xenpci_dbgprint.c Wed Mar 02 17:27:31 2011 +1100
@@ -69,10 +69,23 @@
static void XenDbgPrint(PCHAR string, ULONG length)
{
ULONG i;
+ ULONGLONG j;
+ LARGE_INTEGER current_time;
//KIRQL old_irql = 0;
while(InterlockedCompareExchange(&debug_print_lock, 1, 0) == 1)
KeStallExecutionProcessor(1);
+
+ KeQuerySystemTime(&current_time);
+ current_time.QuadPart /= 10000; /* convert to ms */
+ for (j = 1000000000000000000L; j >= 1; j /= 10)
+ if (current_time.QuadPart / j)
+ break;
+ for (; j >= 1; j /= 10)
+ WRITE_PORT_UCHAR(XEN_IOPORT_LOG, '0' + ((current_time.QuadPart / j) % 10));
+ WRITE_PORT_UCHAR(XEN_IOPORT_LOG, ':');
+ WRITE_PORT_UCHAR(XEN_IOPORT_LOG, ' ');
+
for (i = 0; i < length; i++)
WRITE_PORT_UCHAR(XEN_IOPORT_LOG, string[i]);
/* release the lock */
That will put a timestamp on each debug message which will help a lot in
diagnosing the problem.
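
For anyone who wants to check the digit-printing logic outside the driver, here is a minimal user-mode sketch of the same two loops. It is only an illustration: format_timestamp_ms and its buffer parameters are hypothetical names that are not part of xenpci, and it writes into a buffer instead of calling WRITE_PORT_UCHAR. In the driver the input comes from KeQuerySystemTime, which counts 100 ns units, so dividing by 10000 gives milliseconds.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical user-mode stand-in for the timestamp part of the patch.
 * Writes the decimal digits of ms followed by ": " into buf and returns
 * the number of characters written (not counting the terminating NUL).
 * Output is truncated if buf is too small.
 */
static size_t format_timestamp_ms(uint64_t ms, char *buf, size_t buflen)
{
    uint64_t j;
    size_t n = 0;

    /* Find the largest power of ten that does not exceed ms.
     * Starting at 10^18 mirrors the patch; values of 10^19 or more
     * would lose their leading digit, which no real timestamp reaches. */
    for (j = 1000000000000000000ULL; j >= 1; j /= 10)
        if (ms / j)
            break;

    /* Emit one digit per remaining power of ten. When ms is 0 the first
     * loop leaves j at 0, so no digits are emitted, same as the patch. */
    for (; j >= 1; j /= 10)
        if (n + 1 < buflen)
            buf[n++] = (char)('0' + (ms / j) % 10);

    /* Separator between the timestamp and the debug message. */
    if (n + 1 < buflen)
        buf[n++] = ':';
    if (n + 1 < buflen)
        buf[n++] = ' ';
    if (buflen > 0)
        buf[n] = '\0';
    return n;
}
```

Calling format_timestamp_ms(12345, buf, sizeof buf) with a 32 byte buffer leaves "12345: " in buf, which is the prefix each log line gets before the message text.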
James
> -----Original Message-----
> From: MaoXiaoyun [mailto:tinnycloud@xxxxxxxxxxx]
> Sent: Wednesday, 2 March 2011 14:02
> To: James Harper
> Cc: xen devel
> Subject: RE: [Xen-devel] RE: blue screen in windows balloon driver
>
>
> Attached are the three crash logs.
> cp17 & 21 crash on
> Assertion failed: srb != NULL
>
> thanks.
>
> > Subject: RE: [Xen-devel] RE: blue screen in windows balloon driver
> > Date: Tue, 1 Mar 2011 23:48:04 +1100
> > From: james.harper@xxxxxxxxxxxxxxxx
> > To: tinnycloud@xxxxxxxxxxx
> > CC: xen-devel@xxxxxxxxxxxxxxxxxxx
> >
> > I've pushed a possible fix for the reset code for Windows 2000, XP and
> > 2003. I haven't fixed the Vista/2008/7/2008R2 storport driver yet.
> >
> > I'll see what I can do tomorrow to actually test a scsi reset but I
> > can't reproduce the problem you are seeing on my system. You'll still
> > see the reset messages in the logs which I think simply indicates that
> > your system is too loaded to complete the requests in time and Windows
> > thinks the scsi bus is hung, but this way it might pick itself up again
> > afterwards. On the other hand it may be that too many timeouts and
> > resets will cause windows to throw its hands in the air and give up and
> > declare the scsi device offline, in which case there might not be much
> > we can do.
> >
> > James
> >
> > > -----Original Message-----
> > > From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of James Harper
> > > Sent: Tuesday, 1 March 2011 23:36
> > > To: MaoXiaoyun
> > > Cc: xen devel
> > > Subject: [Xen-devel] RE: blue screen in windows balloon driver
> > >
> > > Hold off on testing. I'm fixing up the reset code so that it does what
> > > Windows wants. I'll post something soon if it doesn't take too long.
> > >
> > > James
> > >
> > > > -----Original Message-----
> > > > From: MaoXiaoyun [mailto:tinnycloud@xxxxxxxxxxx]
> > > > Sent: Tuesday, 1 March 2011 23:34
> > > > To: James Harper
> > > > Cc: xen devel
> > > > Subject: RE: blue screen in windows balloon driver
> > > >
> > > > I will have the new driver tested.
> > > > Attached is the xentop snapshot.
> > > >
> > > > thanks.
> > > >
> > > > > Subject: RE: blue screen in windows balloon driver
> > > > > Date: Tue, 1 Mar 2011 23:11:14 +1100
> > > > > From: james.harper@xxxxxxxxxxxxxxxx
> > > > > To: tinnycloud@xxxxxxxxxxx
> > > > >
> > > > > >
> > > > > > exe attached, thanks.
> > > > > >
> > > > > > I have three machines. On each I summed the
> > > > > > *XenVbd_HwScsiResetBus* events across 24 VMs with:
> > > > > > grep XenVbd_HwScsiResetBus qemu-dm-w3.MR_cp* | wc -l
> > > > > >
> > > > > > machine 25: VMs crash easily, the sum is 200
> > > > > > machine 23: VMs never crash, the sum is 10
> > > > > > machine 212: VMs never crash, the sum is 16
> > > > > >
> > > > > > It seems that machine 25 has many more XenVbd_HwScsiResetBus
> > > > > > events than the other two machines.
> > > > > >
> > > > > > BTW, when starting 24 VMs concurrently, the startup is quite
> > > > > > slow; it takes about 20 minutes or more for all of them to start.
> > > > > >
> > > > > > I commented out line 505 in xenpci_pdo.c to avoid the timeout.
> > > > > >
> > > > > > 505 //remaining -= thiswait;
> > > > > >
> > > > >
> > > > > It sounds like you are overloading your disk IO bandwidth. With many
> > > > > DomU's swapping heavily, Dom0 may simply not be able to keep up with
> > > > > the IO throughput required resulting in windows thinking that the scsi
> > > > > device isn't responding. Can you check xentop and see what sort of IO
> > > > > operations per second you are getting?
> > > > >
> > > > > I have just pushed a change to dump out the in-flight scsi requests
> > > > > (srb) when HwScsiResetBus is called. Please apply the patch and send
> > > > > me the next crash.
> > > > >
> > > > > Thanks
> > > > >
> > > > > James
> > >
> > >
> > > _______________________________________________
> > > Xen-devel mailing list
> > > Xen-devel@xxxxxxxxxxxxxxxxxxx
> > > http://lists.xensource.com/xen-devel