
Re: [Xen-devel] [PATCH v10] remus drbd: Implement remus drbd replicated disk



Shriram Rajagopalan writes ("Re: [PATCH v10] remus drbd: Implement remus drbd 
replicated disk"):
> It does. But the design is such that the disk and memory checkpoints are
> simultaneously transmitted. So by the time this call is made, the ack is
> already in the system.

One packet might get lost while the other gets through.

Risking locking up the whole of the process is unfortunately not
acceptable.

> -- this is the common case. Covers about 90% of the calls (since disk traffic
> is pretty low compared to memory checkpoint).
> 
> > What if the network is broken ?  Might it not then delay indefinitely ?
> 
> Nope.  I designed the relevant drbd code such that the ioctl wait times out
> (configurable) in worst case, returning an error. The time out is generally
> about 300ms. This code path is exercised only during failures.

If you think this ioctl will, when there is no error, complete
immediately, can we have a non-blocking version, and fall back to the
fork trick ?
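
For illustration only, the "fork trick" could look roughly like the sketch below: the call that may block runs in a child process, and the parent gets a pipe fd that becomes readable once the child has a result. All names here (blocking_op_fn, forked_op_start, forked_op_finish, demo_slow_op) are hypothetical and not taken from libxl or DRBD; demo_slow_op merely stands in for the real ioctl.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical callback type standing in for the possibly-blocking ioctl. */
typedef int (*blocking_op_fn)(void *arg);

struct forked_op {
    pid_t pid;      /* child running the blocking call */
    int result_fd;  /* read end of the pipe; readable when the child is done */
};

/* Stand-in for the real operation: sleeps ~50ms, then returns *arg. */
int demo_slow_op(void *arg)
{
    poll(NULL, 0, 50);
    return *(int *)arg;
}

/* Run fn(arg) in a child process. Returns 0 on success, -1 on error. */
int forked_op_start(struct forked_op *op, blocking_op_fn fn, void *arg)
{
    int fds[2];

    assert(op != NULL && fn != NULL);
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: only this process may block. Report the result and exit. */
        close(fds[0]);
        int rc = fn(arg);
        ssize_t w = write(fds[1], &rc, sizeof rc);
        _exit(w == (ssize_t)sizeof rc ? 0 : 1);
    }

    /* Parent: keep only the read end and return immediately. */
    close(fds[1]);
    op->pid = pid;
    op->result_fd = fds[0];
    return 0;
}

/* Call once result_fd is readable. Stores the child's result in *rc_out. */
int forked_op_finish(struct forked_op *op, int *rc_out)
{
    int rc;
    ssize_t n;

    do {
        n = read(op->result_fd, &rc, sizeof rc);
    } while (n < 0 && errno == EINTR);
    close(op->result_fd);

    /* Reap the child so it does not linger as a zombie. */
    while (waitpid(op->pid, NULL, 0) < 0 && errno == EINTR)
        ;

    if (n != (ssize_t)sizeof rc)
        return -1;  /* child died before reporting a result */
    *rc_out = rc;
    return 0;
}
```

The parent would register result_fd with its event loop and call forked_op_finish only when that fd is reported readable, so the main process never waits on the operation itself.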

Or better still, is there something we could poll() on to find out
when the ioctl will definitely complete ?
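
As a rough sketch of what "something we could poll() on" means for the caller: given any fd that becomes readable when the operation can complete, the wait can be bounded so that a broken network shows up as a timeout rather than a stall. The helper name wait_readable is hypothetical, and whether DRBD exposes a suitable fd is exactly the open question above.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>

/*
 * Wait until fd is readable, for at most timeout_ms milliseconds.
 * Returns 1 if readable (or the peer hung up), 0 on timeout, -1 on error.
 * Note: after EINTR the full timeout is restarted; a real event loop
 * would track an absolute deadline instead.
 */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int n;

    assert(fd >= 0);
    do {
        n = poll(&pfd, 1, timeout_ms);
    } while (n < 0 && errno == EINTR);

    if (n < 0)
        return -1;
    if (n == 0)
        return 0;
    return (pfd.revents & (POLLIN | POLLHUP)) ? 1 : -1;
}
```

With timeout_ms set to 0 this is a pure readiness check that never sleeps, which is what an event loop serving many domains would want.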

> So, a one-time error condition and few slow checkpoints out of an indefinite
> number of checkpoints don't warrant a fork per ioctl call (which usually
> returns immediately).

libxl might be handling a large number of domains.  If libxl blocks, a
lot of everything else might stall.  (Including checkpoints of other
domains, if you care about that.)

Thanks,
Ian.
