
[Xen-devel] Re: Is continuous replication of state possible?


  • To: xen-devel@xxxxxxxxxxxxxxxxxxxxx
  • From: George Washington Dunlap III <dunlapg@xxxxxxxxx>
  • Date: Mon, 10 Jan 2005 11:53:26 -0500 (EST)
  • Delivery-date: Mon, 10 Jan 2005 16:56:20 +0000
  • List-id: List for Xen developers <xen-devel.lists.sourceforge.net>

I did a little reading on this subject a couple of years back, and it
seems that on Pentiums getting deterministic execution is impossible
even for uniprocessors, as long as you allow preemptive multitasking.
Because (according to the Intel manuals) the precision of the Pentium
performance counters cannot be relied on, the timer and other
interrupts will essentially act as a random number generator. Naturally
you can do periodic checkpointing, but there is no way to guarantee
correctness unless you coordinate all outgoing traffic between replicas
before making it visible to the outside world.
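
To make the "coordinate outgoing traffic" idea concrete, here is a minimal
sketch in Python. It assumes each replica's pending output is available as
an ordered list of packets; release_outputs is a hypothetical helper, not
part of Xen or any real replication system. Only the prefix on which every
replica agrees is released.

```python
def release_outputs(replica_outputs):
    """Return the longest common prefix of the replicas' output streams.

    replica_outputs: one list of packets (bytes) per replica, in send order.
    Anything after the first disagreement, or not yet produced by every
    replica, is held back rather than shown to the outside world.
    """
    released = []
    # zip() stops at the shortest stream, so packets a slower replica has
    # not produced yet are implicitly held.
    for packets in zip(*replica_outputs):
        if len(set(packets)) != 1:
            break  # replicas diverged; hold everything from here on
        released.append(packets[0])
    return released


primary = [b"SYN", b"ACK", b"data-1"]
backup = [b"SYN", b"ACK", b"data-2"]
print(release_outputs([primary, backup]))
```

A real system would also need a recovery policy for the point of
divergence (for example, resynchronizing from a checkpoint); the sketch
only shows the gating step.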

There is a paper by Bressoud and Schneider about hypervisor-based fault
tolerance on the PA-RISC (which had precise performance counters) which
is worth reading. I found a copy online at
http://roc.cs.berkeley.edu/294fall01/readings/bressoud.pdf .

There's another paper by Dunlap et al. (see the From field) about implementing deterministic execution on uniprocessor Athlons, and we have since gotten the same thing working for P4s. :-) It can be found here:

 http://www.eecs.umich.edu/CoVirt/papers/revirt.pdf

The main trick is that there are several different counters you could use. The one mainly discussed in the literature is the instruction counter (which is unusable on both Athlons and P4s). However, there are repeatable branch counters on both platforms. Logging the <eip, branch_count> tuple at every interrupt allows us to re-deliver the interrupts precisely. (See Mellor-Crummey89 for a software version of this same idea.)
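
As a rough illustration of why the <eip, branch_count> pair is enough, here
is a toy simulation in Python. Everything in it (ToyCPU, record, replay, the
loop layout, the interrupt handler) is invented for illustration and is not
taken from ReVirt. The point it tries to show: inside a loop the same eip
recurs many times, but eip together with the number of branches taken so far
names exactly one point in the execution.

```python
import random

LOOP_START, LOOP_END, ITERATIONS = 0, 4, 50


class ToyCPU:
    """Toy uniprocessor running a fixed loop over eip 0..4."""

    def __init__(self):
        self.eip = 0
        self.branches = 0  # stands in for the hardware branch counter
        self.acc = 0       # guest-visible state
        self.done = False

    def step(self):
        self.acc = (self.acc * 31 + self.eip) % 1_000_003
        if self.eip == LOOP_END:
            self.branches += 1  # backward branch of the loop
            self.done = self.branches == ITERATIONS
            self.eip = LOOP_START
        else:
            self.eip += 1

    def interrupt(self, payload):
        # The handler changes guest state, so the delivery point matters.
        self.acc = ((self.acc ^ payload) * 7) % 1_000_003


def record(seed):
    """Run with randomly timed interrupts, logging where each one landed."""
    rng = random.Random(seed)
    cpu, log = ToyCPU(), []
    while not cpu.done:
        if rng.random() < 0.05:  # an asynchronous interrupt arrives
            payload = rng.randrange(1 << 16)
            log.append((cpu.eip, cpu.branches, payload))
            cpu.interrupt(payload)
        cpu.step()
    return cpu.acc, log


def replay(log):
    """Re-run, delivering each interrupt when eip and branch count match."""
    cpu, pending = ToyCPU(), list(log)
    while not cpu.done:
        if pending and pending[0][:2] == (cpu.eip, cpu.branches):
            cpu.interrupt(pending.pop(0)[2])
        cpu.step()
    return cpu.acc


final_state, log = record(seed=1)
print(replay(log) == final_state)
```

On real hardware the hard part is stopping at the logged point (counter
overflow interrupts, skid, and so on); the toy version sidesteps that by
checking the pair before every instruction.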

Maybe sometime I'll post a whitepaper about the dirty details of doing deterministic replay on P4s and Athlons. If you're interested, I have an e-mail that I've already sent to several people (including the Xen team) with the details.

And we're currently working on extending deterministic replay for SMP. :-)

 -George Dunlap

G. W. Dunlap, S. T. King, S. Cinar, M. Basrai, and P. M. Chen. ReVirt: Enabling Intrusion Analysis through Virtual-Machine Logging and Replay. In Proceedings of the 2002 Symposium on Operating Systems Design and Implementation, pages 211-224, December 2002.

J. M. Mellor-Crummey and T. J. LeBlanc. A Software Instruction Counter. In Proceedings of the 1989 International Conference on Architectural Support for Programming Languages and Operating Systems, pages 78-86, April 1989.

