

RE: [Xen-users] AoE (Was: iscsi vs nfs for xen VMs)

> -----Original Message-----
> From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-users-
> bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Simon Hobson
> Subject: Re: [Xen-users] AoE (Was: iscsi vs nfs for xen VMs)
> Getting somewhat off-topic, but I'm interested to know how AoE handles
> errors? I assume there is some handshake to make sure packets were
> received, rather than just "fire and forget"!

The open-source Linux aoe driver from Coraid (the one with which I am
most familiar) implements a congestion avoidance and control algorithm
similar to TCP's.  If a response takes longer than twice the average
round-trip time plus 8 times the average deviation, the request is
retransmitted (based on aoe6-75 sources; earlier sources may differ).
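
As a rough illustration of the retransmit rule described above, here is
a minimal Python sketch. Only the threshold (twice the average RTT plus
8 times the average deviation) comes from the description; the class
name, the method names, and the smoothing gains of 1/8 and 1/4 are
assumptions borrowed from TCP-style estimators, not values taken from
the aoe sources.

```python
class RttEstimator:
    """Hypothetical helper: tracks a smoothed RTT and average deviation."""

    def __init__(self, initial_rtt_ms, rtt_gain=0.125, dev_gain=0.25):
        # The gains are assumptions (TCP-style smoothing), not driver values.
        self.avg_rtt = float(initial_rtt_ms)
        self.avg_dev = 0.0
        self.rtt_gain = rtt_gain
        self.dev_gain = dev_gain

    def observe(self, sample_ms):
        """Fold one measured round-trip time into the running averages."""
        error = sample_ms - self.avg_rtt
        self.avg_rtt += self.rtt_gain * error
        self.avg_dev += self.dev_gain * (abs(error) - self.avg_dev)

    def timeout_ms(self):
        """Threshold from the text: 2 * average RTT + 8 * average deviation."""
        return 2.0 * self.avg_rtt + 8.0 * self.avg_dev

    def should_retransmit(self, elapsed_ms):
        """True once an outstanding request has waited past the threshold."""
        return elapsed_ms > self.timeout_ms()
```

An initiator built this way would call observe() for each response it
receives and should_retransmit() for each outstanding request, so one
slow response widens the threshold for the requests that follow.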

What's interesting about aoe vs. TCP is that a round trip measures both
network and disk latency, not just network latency.  A normal read
request sends a request packet, after which the target performs a disk
read and returns a response packet with the disk sector contents.  A
normal write request sends a request with the sector contents, upon
which the target performs a disk write and returns a status packet.
Disk latency is orders of magnitude greater than network latency, and
more variable.  We typically see an RTT of 5-10 ms under light usage.

Under heavy disk I/O, this time can rise, possibly to tenths of a
second, leading to apparent packet loss and an RTT adjustment by the
driver.  So it's not uncommon for a target to receive and process a
duplicate request, which is okay because each request is idempotent.
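
To show why a duplicate request is harmless, here is a small Python
sketch of a target that stores sectors in memory. The ToyTarget class
and the dictionary-based request format are hypothetical and are not
the AoE wire format; the only point is that applying the same read or
write twice leaves the disk in the same state and produces the same
response.

```python
SECTOR_SIZE = 512  # bytes per sector, assumed for this sketch


class ToyTarget:
    """Hypothetical in-memory target keyed by sector address (LBA)."""

    def __init__(self):
        self.sectors = {}

    def handle(self, request):
        """Process one request dict and return a response dict."""
        op = request["op"]
        lba = request["lba"]
        if op == "write":
            # Writing identical contents to the same sector again
            # changes nothing, so a retransmitted write is safe.
            self.sectors[lba] = request["data"]
            return {"tag": request["tag"], "status": "ok"}
        if op == "read":
            # Unwritten sectors read back as zeros in this toy model.
            data = self.sectors.get(lba, bytes(SECTOR_SIZE))
            return {"tag": request["tag"], "status": "ok", "data": data}
        raise ValueError("unknown operation: " + str(op))
```

In this model the initiator can match responses to requests by the tag
and simply discard the second response to a retransmitted request.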

Loss of 0.1% to 0.2% is common in our environment, but this does not
have a significant impact on overall aoe performance.

That said, the aoe protocol also supports an asynchronous write
operation, which I suppose really is "fire and forget", unlike normal
reads and writes.  I haven't used an aoe driver that implements
asynchronous writes, however, and I'm not sure I would even if I had
the option, since you have no guarantee that the writes succeed.

