Hi James,
In addition, I think the if statement in XenVbd_HwScsiResetBus might need to check
suspend_resume_state_fdo, not suspend_resume_state_pdo.
suspend_resume_state_pdo may already have changed to SR_STATE_SUSPENDING while there
are still unfinished IO requests. Checking the fdo state would let those requests be
completed when a reset happens.
What do you think?
Thanks.
static BOOLEAN
XenVbd_HwScsiResetBus(PVOID DeviceExtension, ULONG PathId)
{
  PXENVBD_DEVICE_DATA xvdd = DeviceExtension;
  srb_list_entry_t *srb_entry;
  PSCSI_REQUEST_BLOCK srb;
  int i;
  UNREFERENCED_PARAMETER(DeviceExtension);
  UNREFERENCED_PARAMETER(PathId);
  FUNCTION_ENTER();
  KdPrint((__DRIVER_NAME " IRQL = %d\n", KeGetCurrentIrql()));
  /* the condition below is the line in question */
  if (xvdd->ring_detect_state == RING_DETECT_STATE_COMPLETE && xvdd->device_state->suspend_resume_state_pdo == SR_STATE_RUNNING)
  {
    while((srb_entry = (srb_list_entry_t *)RemoveHeadList(&xvdd->srb_list)) != (srb_list_entry_t *)&xvdd->srb_list)
    {
      srb = srb_entry->srb;
      srb->SrbStatus = SRB_STATUS_BUS_RESET;
      KdPrint((__DRIVER_NAME " completing queued SRB %p with status SRB_STATUS_BUS_RESET\n", srb));
      ScsiPortNotification(RequestComplete, xvdd, srb);
    }
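To make the fdo/pdo suggestion above concrete, here is a minimal standalone sketch of the two checks. It is only a model: the enum values and the struct layout are stand-ins I made up for illustration, not the driver's real definitions, and the function names reset_may_complete_pdo and reset_may_complete_fdo are hypothetical.

```c
#include <stdbool.h>

/* Stand-in state values for illustration only; the real definitions
 * live in the driver headers and may differ. */
enum sr_state { SR_STATE_RUNNING, SR_STATE_SUSPENDING, SR_STATE_RESUMING };
enum ring_state { RING_DETECT_STATE_NOT_STARTED, RING_DETECT_STATE_COMPLETE };

/* Simplified model of the shared state: one field written by the pdo
 * side, one field holding what the fdo side has acknowledged. */
struct device_state {
    enum sr_state suspend_resume_state_pdo;
    enum sr_state suspend_resume_state_fdo;
};

/* Check as it is in the quoted code: gate on the pdo state. */
bool reset_may_complete_pdo(enum ring_state ring, const struct device_state *ds)
{
    return ring == RING_DETECT_STATE_COMPLETE
        && ds->suspend_resume_state_pdo == SR_STATE_RUNNING;
}

/* Suggested check: gate on the state the fdo has acknowledged. */
bool reset_may_complete_fdo(enum ring_state ring, const struct device_state *ds)
{
    return ring == RING_DETECT_STATE_COMPLETE
        && ds->suspend_resume_state_fdo == SR_STATE_RUNNING;
}
```

In this model, if the pdo field is already SR_STATE_SUSPENDING while the fdo field is still SR_STATE_RUNNING, the first check skips the queued SRBs on reset and the second one completes them.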
>> Subject: RE: PV resume failed after self migration failed
>> Date: Wed, 22 Jun 2011 14:06:18 +1000
>> From: james.harper@xxxxxxxxxxxxxxxx
>> To: tinnycloud@xxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx
>>
>> > >
>> > > The xenvbd driver doesn't do any timeout, windows does the timeout and
>> > > tells xenvbd to reset. I haven't tested the scenario you describe very
>> > > recently, and xenvbd is now two different drivers, one for scsiport (<=
>> > > 2003) and one for storport (>= Vista), so there could be bugs in either.
>> > >
>> >
>> > The bug can be reproduced on a 2003 32-bit system. We are using the scsi driver.
>> > I put some logging in XenVbd_HwScsiResetBus to see whether there are uncompleted
>> > SRBs (like below),
>> > but I didn't see that log when XenVbd_HwScsiResetBus was called. So no IO is in
>> > the queue.
>>
>> Just to confirm, is this the issue that only happens when the migration
>> fails in xen and is cancelled?
>>
>>Exactly.
>>I've noticed some differences in the log.
>
>In a normal resume, the log shows the event ports assigned like this:
>pdo_event_channel = 5 (Notifying event channel 5)
>suspend event channel = 6
>XEN_INIT_TYPE_EVENT_CHANNEL - event-channel = 7 (for VBD)
>XEN_INIT_TYPE_EVENT_CHANNEL - event-channel = 8 (VIF)
>
>>When the guest resumes locally from suspend (that is, the migration failed in xen
>>after the guest had already suspended, so it needs to resume):
>
>>pdo_event_channel = 7 ( Notifying event channel 7)
>>suspend event channel = 8
>>XEN_INIT_TYPE_EVENT_CHANNEL - event-channel = 9 (vif)
>
>>The VBD port is not allocated, since the pdo is waiting for the fdo state to change.
>
>>It looks like ports 5 and 6 are still occupied, or is pdo_event_channel bound twice?
>
>It works when I unbind pdo_event_channel and suspend_evtchn:
>===================================================================
>--- xenpci_fdo.c (revision 4304)
>+++ xenpci_fdo.c (working copy)
>@@ -656,6 +656,12 @@
> }
> WdfChildListEndIteration(child_list, &child_iterator);
>
>+ EvtChn_Unbind(xpdd, xpdd->pdo_event_channel);
>+ EvtChn_Close(xpdd, xpdd->pdo_event_channel);
>+
>+ EvtChn_Unbind(xpdd, xpdd->suspend_evtchn);
>+ EvtChn_Close(xpdd, xpdd->suspend_evtchn);
>+
> XenBus_Suspend(xpdd);
> EvtChn_Suspend(xpdd);
> XenPci_HighSync(XenPci_Suspend0, XenPci_SuspendN, xpdd);
>
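To illustrate why leaving the two old ports open could shift the numbering seen in the logs above, here is a toy port table. It rests on assumptions: that the allocator hands out the lowest free port, and that ports 0-4 are already taken by something else so that the numbers line up with the log. The names port_table, port_alloc, port_close and port_reserve_low are made up for this sketch and are not Xen or driver APIs.

```c
#include <stdbool.h>

#define MAX_PORTS 16

/* Toy port table: in_use[p] is true while port p is bound. */
struct port_table {
    bool in_use[MAX_PORTS];
};

/* Assumption of this model: allocation returns the lowest free port,
 * or -1 when the table is full. */
int port_alloc(struct port_table *t)
{
    for (int p = 0; p < MAX_PORTS; p++) {
        if (!t->in_use[p]) {
            t->in_use[p] = true;
            return p;
        }
    }
    return -1;
}

/* Release a port so a later allocation can reuse it. */
void port_close(struct port_table *t, int p)
{
    if (p >= 0 && p < MAX_PORTS)
        t->in_use[p] = false;
}

/* Mark ports 0..n-1 as taken, to mimic ports owned by other users. */
void port_reserve_low(struct port_table *t, int n)
{
    for (int p = 0; p < n && p < MAX_PORTS; p++)
        t->in_use[p] = true;
}
```

In this model, allocating twice after reserving ports 0-4 gives 5 and 6. Allocating again without closing them gives 7 and 8, which matches the failed-migration log. Closing both first gives 5 and 6 again, which matches the normal resume.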
>
>BTW, is there a missing "break" in XenVbd_HwScsiInterrupt, xenvbd_scsiport.c:928
>before default? Well, it is harmless.
>
>924 case SR_STATE_RUNNING:
>925 KdPrint((__DRIVER_NAME " New pdo state %d\n", suspend_resume_state_pdo));
>926 xvdd->device_state->suspend_resume_state_fdo = suspend_resume_state_pdo;
>927 xvdd->vectors.EvtChn_Notify(xvdd->vectors.context, xvdd->device_state->pdo_event_channel);
>928 ScsiPortNotification(NextRequest, DeviceExtension);
>929 default:
>930 KdPrint((__DRIVER_NAME " New pdo state %d\n", suspend_resume_state_pdo));
>931 xvdd->device_state->suspend_resume_state_fdo = suspend_resume_state_pdo;
>932 xvdd->vectors.EvtChn_Notify(xvdd->vectors.context, xvdd->device_state->pdo_event_channel);
>933 break;
>
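A small standalone sketch of what the missing "break" in the switch quoted above does. It only counts how often each action would run; the enum and the counters struct are stand-ins for illustration, and the two function names are hypothetical.

```c
/* Stand-in state values for illustration only. */
enum sr_state { SR_STATE_RUNNING, SR_STATE_SUSPENDING, SR_STATE_RESUMING };

/* Counts how many times each action in the switch would run. */
struct counters {
    int state_updates;  /* fdo state set from pdo state */
    int notifies;       /* event channel notifications */
    int next_requests;  /* NextRequest notifications */
};

/* Same shape as the quoted code: RUNNING falls through into default. */
void handle_state_without_break(enum sr_state s, struct counters *c)
{
    switch (s) {
    case SR_STATE_RUNNING:
        c->state_updates++;
        c->notifies++;
        c->next_requests++;
        /* falls through */
    default:
        c->state_updates++;
        c->notifies++;
        break;
    }
}

/* Variant with the break added before default. */
void handle_state_with_break(enum sr_state s, struct counters *c)
{
    switch (s) {
    case SR_STATE_RUNNING:
        c->state_updates++;
        c->notifies++;
        c->next_requests++;
        break;
    default:
        c->state_updates++;
        c->notifies++;
        break;
    }
}
```

Without the break, the RUNNING case sets the state twice (same value, so no change) and notifies the event channel twice. With the break, each happens once. Whether the extra notify is really harmless depends on the other end of the channel, which this sketch does not model.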
>Thanks.
>>> James
>>>
>