Xen project Mailing List

Re: [Xen-devel] vbd flushing during migration?

To: "Andrew Warfield" <andrew.warfield@xxxxxxxxxxxx>, "John Byrne" <john.l.byrne@xxxxxx>

From: "Charles Coffing" <ccoffing@xxxxxxxxxx>

Date: Tue, 01 Aug 2006 15:28:20 -0400

Delivery-date: Tue, 01 Aug 2006 12:28:53 -0700

List-id: Xen developer discussion <xen-devel.lists.xensource.com>

I've got a patch in our tree that does (basically) what John is describing.

The exact bug we hit was that a "xm shutdown -w vm" did not wait until the vbds were cleared out before returning. So now I wait until the backend/vbd nodes go away before returning.

This could probably be done more cleanly with watches, and should be abstracted out to be sure it applies equally to migration, and so forth. But for the sake of discussion, the patch is attached.

-Charles

>>> On Mon, Jul 31, 2006 at 4:26 PM, in message <44CE83B1.1090605@xxxxxx>, John Byrne <john.l.byrne@xxxxxx> wrote:
> It would be a bit ugly, but mostly straightforward to watch for the
> destruction of the vbds (or all devices) after the destroyDomain() is
> done and then sending an all-clear. (The last time I looked there wasn't
> a waitForDomainDestroy() anywhere, so it would probably be best to write
> one.) This would guarantee correctness: which is the most important thing.
>
> The problem I see with that strategy is the effect on downtime during a
> live-move. Ideally you'd like to start the vbd cleanup when the final
> suspend is done and hope to parallelize the any final device operations
> with the final pass of live-move. How to do that and play nice with
> domain destruction on the normal path and handle errors seems a lot less
> clear to me.
>
> So, are you just ignoring the notion of minimizing downtime for the
> moment or is there something I'm missing?
>
> John
>
> Andrew Warfield wrote:
>> It's slightly more than a flush that's required. The migration
>> protocol needs to be extended so that execution on the target host
>> doesn't start until all of the outstanding (i.e. issued by the
>> backend) block requests have been either cancelled or acknowledged.
>> This should be pretty straight forward given that the backend driver
>> ref counts a blkif's state based on pending requests, and won't tear
>> down the backend directory in xenstore until all the outstanding
>> requests have cleared. All that is likely required is to have the
>> migration code register watches on the backend vbd directories, and
>> wait for them to disappear before giving the all-clear to the new
>> host.
>>
>> We've talked about this enough to know how to fix it, but haven't had
>> a chance to hack it up. (I think Julian has looked into the problem a
>> bit for blktap, but not yet done a general fix.) Patches would
>> certainly be welcome though. ;)
>>
>> a.
>>
>> On 7/31/06, John Byrne <john.l.byrne@xxxxxx> wrote:
>>>
>>> Hi,
>>>
>>> I don't see any obvious flush to disk taking place for vbd's on the
>>> source host in XendCheckpoint.py before the domain is started on the new
>>> host. Is there a guarantee that all written data is on disk somewhere
>>> else or is something needed?
>>>
>>> Thanks,
>>>
>>> John Byrne
>>>
>>>
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>>> http://lists.xensource.com/xen-devel
>>>
>>
>
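
For illustration only, the approach discussed in this thread (do not return, or give the all-clear, until the backend vbd nodes for the domain have gone away) could be sketched roughly as below. This is not the attached patch and not xend code. The xenstore path layout, the function names, and the `list_dir` callable are all assumptions made for the example, and it polls rather than using watches. `FakeStore` is an in-memory stand-in so the loop can be run without a Xen host.

```python
import time


def vbd_backend_path(domid, backend_domid=0):
    # Assumed xenstore layout for vbd backends; check it against your Xen version.
    return "/local/domain/%d/backend/vbd/%d" % (backend_domid, domid)


def wait_for_vbd_teardown(list_dir, domid, timeout=30.0, interval=0.1,
                          clock=time.monotonic, sleep=time.sleep):
    """Poll until no vbd backend nodes remain for domid.

    list_dir(path) is a hypothetical callable that returns the child
    names under path, or None / an empty list once the directory is gone.
    Returns True if the nodes disappeared, False if the timeout expired.
    """
    path = vbd_backend_path(domid)
    deadline = clock() + timeout
    while True:
        if not list_dir(path):
            return True          # backend directory is gone: safe to proceed
        if clock() >= deadline:
            return False         # requests still outstanding; caller decides
        sleep(interval)


class FakeStore:
    """In-memory stand-in for xenstore, used only to exercise the loop."""

    def __init__(self, tree, polls_until_empty):
        self.tree = tree
        self.polls_left = polls_until_empty

    def list_dir(self, path):
        # Simulate the backend removing its directory after a few polls.
        if self.polls_left <= 0:
            self.tree.pop(path, None)
        self.polls_left -= 1
        return self.tree.get(path)


if __name__ == "__main__":
    store = FakeStore({vbd_backend_path(5): ["768"]}, polls_until_empty=3)
    print(wait_for_vbd_teardown(store.list_dir, 5, timeout=5.0, interval=0.01))
```

A watch-based version, as Andrew suggests, would avoid the polling interval and so add less to migration downtime. The polling form is shown only because it is the simplest to read.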

Attachment: xen-shutdown-wait.diff
Description: Binary data

