> Subject: RE: PV resume failed after self migration failed
> Date: Wed, 22 Jun 2011 14:06:18 +1000
> From: james.harper@xxxxxxxxxxxxxxxx
> To: tinnycloud@xxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx
>
> > >
> > > The xenvbd driver doesn't do any timeout, windows does the timeout and
> > > tells xenvbd to reset. I haven't tested the scenario you describe very
> > > recently, and xenvbd is now two different drivers, one for scsiport (<=
> > > 2003) and one for storport (>= Vista), so there could be bugs in either.
> > >
> >
> > The bug can be reproduced in 2003 32bit system. We are using scsi driver.
> > I put some log in XenVbd_HwScsiResetBus to see if there are not completed
> > srb(Like below)
> > but I didn't see the log when XenVbd_HwScsiResetBus called. So No IO is in
> > queue.
>
> Just to confirm, is this the issue that only happens when the migration
> fails in xen and is cancelled?
>
Exactly.
I've noticed a difference in the logs.
On a normal resume, the log shows the event ports assigned like this:
pdo_event_channel = 5 (Notifying event channel 5)
suspend event channel = 6
XEN_INIT_TYPE_EVENT_CHANNEL - event-channel = 7 (for VBD)
XEN_INIT_TYPE_EVENT_CHANNEL - event-channel = 8 (VIF)
When the guest resumes locally from suspend (that is, the migration failed in Xen
after the guest had already suspended, so it needs to resume), the log shows:
pdo_event_channel = 7 (Notifying event channel 7)
suspend event channel = 8
XEN_INIT_TYPE_EVENT_CHANNEL - event-channel = 9 (VIF)
The VBD port is not allocated, since the pdo is waiting for the fdo to change state.
It looks like ports 5 and 6 are still occupied, or is pdo_event_channel bound twice?
It works when I unbind and close pdo_event_channel and suspend_evtchn before suspending:
===================================================================
--- xenpci_fdo.c (revision 4304)
+++ xenpci_fdo.c (working copy)
@@ -656,6 +656,12 @@
}
WdfChildListEndIteration(child_list, &child_iterator);
+ EvtChn_Unbind(xpdd, xpdd->pdo_event_channel);
+ EvtChn_Close(xpdd, xpdd->pdo_event_channel);
+
+ EvtChn_Unbind(xpdd, xpdd->suspend_evtchn);
+ EvtChn_Close(xpdd, xpdd->suspend_evtchn);
+
XenBus_Suspend(xpdd);
EvtChn_Suspend(xpdd);
XenPci_HighSync(XenPci_Suspend0, XenPci_SuspendN, xpdd);
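A toy model of why I think the ports shift. This is not xenpci or hypervisor code: every toy_* name is made up for illustration, and I am assuming that ports 0-4 are already taken by something else, that the lowest free port is always handed out, and that the VBD and VIF ports are released on suspend while the pdo and suspend ports are not.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_PORTS 16
#define TOY_RESERVED  5   /* assumption: ports 0..4 are in use elsewhere */

static unsigned char toy_in_use[TOY_MAX_PORTS];

/* Hand out the lowest free port, or -1 if the table is full. */
static int toy_alloc_port(void)
{
    for (int p = 0; p < TOY_MAX_PORTS; p++) {
        if (!toy_in_use[p]) {
            toy_in_use[p] = 1;
            return p;
        }
    }
    return -1;
}

/* Release a port so a later allocation can reuse it. */
static void toy_close_port(int p)
{
    assert(p >= 0 && p < TOY_MAX_PORTS);
    toy_in_use[p] = 0;
}

/*
 * Simulate boot, suspend and a local resume in the same domain.
 * If close_before_suspend is nonzero, the pdo and suspend ports are
 * released before suspending, which is what the patch does.
 * out[0] = pdo port, out[1] = suspend port, out[2] = next device port.
 */
static void toy_local_resume(int close_before_suspend, int out[3])
{
    memset(toy_in_use, 0, sizeof(toy_in_use));
    memset(toy_in_use, 1, TOY_RESERVED);

    int pdo = toy_alloc_port();      /* first free port after the reserved ones */
    int suspend = toy_alloc_port();
    int vbd = toy_alloc_port();
    int vif = toy_alloc_port();

    /* Assumption: child devices release their own ports on suspend. */
    toy_close_port(vbd);
    toy_close_port(vif);

    if (close_before_suspend) {
        toy_close_port(pdo);
        toy_close_port(suspend);
    }

    /* Local resume: the port table is unchanged, so allocate again. */
    out[0] = toy_alloc_port();
    out[1] = toy_alloc_port();
    out[2] = toy_alloc_port();
}
```

With close_before_suspend = 0 the model hands out 7, 8, 9, which matches the failed-migration log; with close_before_suspend = 1 it hands out 5, 6, 7, as in a normal resume.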
BTW, is there a "break" missing before "default" in XenVbd_HwScsiInterrupt
(xenvbd_scsiport.c:928)? It looks harmless, since the default branch only repeats
the same state update and notify.
924 case SR_STATE_RUNNING:
925 KdPrint((__DRIVER_NAME " New pdo state %d\n", suspend_resume_state_pdo));
926 xvdd->device_state->suspend_resume_state_fdo = suspend_resume_state_pdo;
927 xvdd->vectors.EvtChn_Notify(xvdd->vectors.context, xvdd->device_state->pdo_event_channel);
928 ScsiPortNotification(NextRequest, DeviceExtension);
929 default:
930 KdPrint((__DRIVER_NAME " New pdo state %d\n", suspend_resume_state_pdo));
931 xvdd->device_state->suspend_resume_state_fdo = suspend_resume_state_pdo;
932 xvdd->vectors.EvtChn_Notify(xvdd->vectors.context, xvdd->device_state->pdo_event_channel);
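To show what the fall-through into default does, here is a small standalone sketch. The toy_* names are hypothetical and not the driver's; the counter only stands in for the EvtChn_Notify call.

```c
enum toy_state { TOY_STATE_RUNNING, TOY_STATE_OTHER };

static int toy_notify_count;

/* Stand-in for the notify call: just count invocations. */
static void toy_notify(void)
{
    toy_notify_count++;
}

/* Mirrors the listing: no break after the RUNNING case. */
static int toy_handle_without_break(enum toy_state s)
{
    toy_notify_count = 0;
    switch (s) {
    case TOY_STATE_RUNNING:
        toy_notify();
        /* falls through */
    default:
        toy_notify();
        break;
    }
    return toy_notify_count;
}

/* Same switch with the break added. */
static int toy_handle_with_break(enum toy_state s)
{
    toy_notify_count = 0;
    switch (s) {
    case TOY_STATE_RUNNING:
        toy_notify();
        break;
    default:
        toy_notify();
        break;
    }
    return toy_notify_count;
}
```

In this sketch the RUNNING case notifies twice without the break and once with it, so the only visible effect is one redundant notify, which is why I think it is harmless.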
Thanks.
>> James
>>