To: Steven Hand <Steven.Hand@xxxxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH 00 of 10] Teach xm save to checkpoint a running domain
From: Brendan Cully <brendan@xxxxxxxxx>
Date: Fri, 15 Dec 2006 16:04:28 -0800
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
In-reply-to: <E1Gv86l-0006kd-00@xxxxxxxxxxxxxxxxx>
Mail-followup-to: Steven.Hand@xxxxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxx
References: <patchbomb.1166168316@xxxxxxxxxxxxxxxxx> <E1Gv86l-0006kd-00@xxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.13 (2006-11-21)

I think I forgot to mention that I have successfully
checkpointed domains and restored them from checkpoints (with
file-system activity between checkpoints). It seems to work pretty
well. I'll try to put together a demo of this next week.

Regarding full device disconnection, my understanding is that guest
domains are already prepared to deal with back-end driver crashes (by
maintaining shadows of the ring, etc.), so a forced reconnect on resume
should be able to recover even if there wasn't an orderly shutdown
before the suspend. When I looked over the code, I thought that the
reconnect path did a paranoid forced disconnect first anyway (e.g.,
checking for existing event channels and resetting them).

On the other hand, if checkpoints are taken more frequently than they
are restored, it seems odd to be constantly detaching and reattaching
back-ends in the parent.

But if this is unsafe, it should be fairly easy to make the code do a
full disconnect before suspend. It might be as easy as changing xm
save to write 'suspend' to control/shutdown instead of 'checkpoint'.

On Friday, 15 December 2006 at 08:07, Steven Hand wrote:
> 
> >I'm not too sure about the last couple of patches in this
> >series. Because the checkpointing domain doesn't disconnect before
> >calling suspend, it retains a few references to pages it doesn't
> >own. These trigger a PT race detector in xc_linux_save, which causes
> >it to abort. So the last couple of patches explicitly identify the
> >references I've found so far (shared_info and some grant table shared
> >pages) and simply zero those PTEs during save, since they'll be
> >recreated on restore. Finding the grant table pages is a bit fragile -
> >I walk the page table loaded in CR3 at the time of suspend looking for
> >the virtual address I've stowed in the suspend record. I've only got
> >code for two-level page tables at the moment, since I'm not convinced
> >this is the right approach. Under what circumstances would a non-live
> >save have an unsafe PTE race? 
> 
> Pretty much any PT race in a non-live save/migrate is a bug; the 
> domain is (in theory) suspended at this point, and all of the 
> devices are disconnected. Since you've chosen not to 'disconnect' 
> the devices, you'll get random updates occurring to any shared 
> pages (shared via grants or directly shared with Xen). 
> 
> > Maybe it's fine to simply zero these ptes without checking them. 
> 
> I'd think not. 

To clarify, the pages that have caused races in my experiments are
always the same 5: shared_info and four grant table shared pages. The
reason these don't cause races in plain save is simply that they are
unmapped before suspend is called. Since I've adjusted the kernel to
recreate these specific pages on restore (but not in the parent when
checkpoint returns), my patches do just zero out the PTEs (simulating
in the save code what had previously been done in the guest).
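
A minimal sketch of that zeroing step, for illustration only. This is
not the code from the patch series. It assumes 32-bit non-PAE PTEs
(1024 entries per L1 page, frame number in bits 12 and up), and the
names zero_listed_ptes and mfn_in_list are made up for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT   0x1u
#define PTE_MFN_SHIFT 12      /* 4 KiB pages */
#define PTES_PER_PAGE 1024    /* assumption: 32-bit non-PAE L1 table */

/* Return 1 if mfn is one of the pages known to be recreated on restore. */
static int mfn_in_list(uint32_t mfn, const uint32_t *mfns, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (mfns[i] == mfn)
            return 1;
    return 0;
}

/*
 * Zero every present PTE in one L1 page whose frame is in the list
 * (e.g. shared_info and the grant table shared pages).
 * Returns the number of entries cleared.
 */
static size_t zero_listed_ptes(uint32_t *l1, const uint32_t *mfns, size_t n)
{
    size_t cleared = 0;

    assert(l1 != NULL);
    for (size_t i = 0; i < PTES_PER_PAGE; i++) {
        if (!(l1[i] & PTE_PRESENT))
            continue;
        if (mfn_in_list(l1[i] >> PTE_MFN_SHIFT, mfns, n)) {
            l1[i] = 0;
            cleared++;
        }
    }
    return cleared;
}
```

Entries pointing at any other frame are left alone, so an unexpected
foreign reference would still be visible to the race detector.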

Finding the guest grant table pages is a little annoying though. I
ended up having the guest put the virtual address of its mapping into
an unused field in the suspend record, then walking the page table to
find the MFN. I was thinking it might be better either to get Xen to
export a list of pages that the guest has references to, or to assume
that any unowned MFNs in the page tables are pages that will be
recreated on restore anyway, and just zero them out. In short, I wonder
how often that PT race code has stopped a non-live save. If the answer
is 'never', then zeroing out the PTEs might be fine. Especially since
the original domain is still intact after the checkpoint.
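
For reference, the two-level walk described above looks roughly like
the sketch below. It is an illustration under stated assumptions, not
the patch code: 32-bit non-PAE layout (10-bit L2 index, 10-bit L1
index, 4 KiB pages), no 4 MiB superpage handling, and a hypothetical
map_table_fn callback standing in for however the save code maps a
foreign page-table page. The toy_mem mapper exists only so the sketch
can be exercised without Xen.

```c
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT  0x1u
#define PAGE_SHIFT   12
#define L2_SHIFT     22
#define TABLE_MASK   0x3ffu
#define ENTRIES      1024
#define INVALID_MFN  0xffffffffu   /* real MFNs fit in 20 bits here */

/* Hypothetical: return the 1024 entries of the table page at mfn, or NULL. */
typedef const uint32_t *(*map_table_fn)(uint32_t mfn, void *ctx);

/* Walk L2 then L1 from the table in CR3 and return the MFN backing va. */
static uint32_t va_to_mfn(uint32_t cr3_mfn, uint32_t va,
                          map_table_fn map, void *ctx)
{
    const uint32_t *l2 = map(cr3_mfn, ctx);
    if (l2 == NULL)
        return INVALID_MFN;

    uint32_t l2e = l2[(va >> L2_SHIFT) & TABLE_MASK];
    if (!(l2e & PTE_PRESENT))
        return INVALID_MFN;

    const uint32_t *l1 = map(l2e >> PAGE_SHIFT, ctx);
    if (l1 == NULL)
        return INVALID_MFN;

    uint32_t l1e = l1[(va >> PAGE_SHIFT) & TABLE_MASK];
    if (!(l1e & PTE_PRESENT))
        return INVALID_MFN;

    return l1e >> PAGE_SHIFT;
}

/* Toy stand-in for machine memory: page i of the array is "MFN i". */
struct toy_mem {
    uint32_t (*pages)[ENTRIES];
    uint32_t npages;
};

static const uint32_t *toy_map(uint32_t mfn, void *ctx)
{
    struct toy_mem *mem = ctx;
    return mfn < mem->npages ? mem->pages[mfn] : NULL;
}
```

In this model the guest would supply va through the suspend record,
and the returned MFN is the frame whose PTE gets zeroed during save.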

Thanks again for looking this over.