I'm currently looking to establish a small pool of Xen hosts, each
running several guests, with the guests' disks held on disk servers
that are accessible across the network.
I'm hoping to achieve ease of migration, and reliability of service
(particularly in cases of hardware failure), but I'm not pushing for
24/7 guaranteed HA.
I've looked at the different ways of offering network storage to the
guests, and have narrowed it down to what I think are the two best
options:
1) DRBD & Heartbeat (or a similar cluster management tool): Run 2 disk
servers with one 'master', mirroring in the background. In case of
failure of the master, switch the slave to being the master.
  * Server-server communication is efficient
  * Recovery only has to update a small delta
  * Risk of split-brain, if the disk servers stop 'seeing' each other
    and both try to become masters
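To make the split-brain risk noted for option 1 concrete, here is a toy
Python sketch. It is only a model of the failure mode, not of how DRBD
or Heartbeat actually decide anything: the DiskServer class, the naive
"promote when the peer is unreachable" rule and the function names are
all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class DiskServer:
    name: str
    role: str  # "master" or "slave"


def react_to_lost_peer(server, peer_reachable):
    """Naive failover rule (an assumption for this sketch): a slave that
    cannot see its peer promotes itself to master."""
    if server.role == "slave" and not peer_reachable:
        server.role = "master"
    return server.role


def masters_after(master_alive):
    """Return the names of the servers acting as master after server b
    loses sight of server a.

    master_alive=False models a real crash of a.
    master_alive=True models a broken link: a is still up and serving.
    """
    a = DiskServer("a", "master")
    b = DiskServer("b", "slave")
    # From b's point of view the two cases look identical.
    react_to_lost_peer(b, peer_reachable=False)
    survivors = [a, b] if master_alive else [b]
    return [s.name for s in survivors if s.role == "master"]


crash = masters_after(master_alive=False)      # a really died
partition = masters_after(master_alive=True)   # only the link died
print(crash, partition)
```

In the crash case only b ends up as master, which is the intended
failover. In the partition case both a and b believe they are master,
which is the split-brain situation; the slave cannot tell the two cases
apart from the missing heartbeat alone.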
2) Software RAID1 of iSCSI disks: Export an iSCSI disk from each of 2
disk servers, and RAID these together on the host, using multipath and
RAID1 to deal with disk failure.
  * RAID array degrades 'nicely' - no need to switch master/slave
    roles etc.
  * No risk of split brain
  * No human decision making involved during recovery stages
  * RAID rebuild happens on the host, not between servers
  * RAID recovery requires complete rebuild, not just delta of changes
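The difference between "update a small delta" and "complete rebuild"
can be put in rough numbers with a toy Python sketch. The block count,
the list of writes and the idea of tracking changes in a per-block
dirty bitmap are assumptions chosen for illustration; this is not a
description of how DRBD or Linux software RAID track changes
internally.

```python
def blocks_to_copy_delta(dirty_bitmap):
    """Delta resync: copy only the blocks written while the peer was away."""
    return sum(1 for is_dirty in dirty_bitmap if is_dirty)


def blocks_to_copy_full(total_blocks):
    """Complete rebuild: copy every block, changed or not."""
    return total_blocks


TOTAL_BLOCKS = 1000  # hypothetical size of the mirrored device, in blocks

# Hypothetical writes that arrived while one half of the mirror was down.
# Block 17 is written twice but only needs to be copied once.
dirty = [False] * TOTAL_BLOCKS
for block in (3, 17, 17, 256, 999):
    dirty[block] = True

delta_cost = blocks_to_copy_delta(dirty)
full_cost = blocks_to_copy_full(TOTAL_BLOCKS)
print(delta_cost, full_cost)
```

With these made-up numbers the delta resync copies 4 blocks and the
complete rebuild copies all 1000, and in option 2 that traffic flows
through the Xen host rather than directly between the disk servers.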
Looking at the above, I am drawn towards the RAID1 option. While it
might be less efficient in terms of speed and rebuild times, it
completely avoids the risk of split brain, and also there seem to be far
fewer places where manual intervention (i.e. human decisions where a
script can't do the work) might be needed than with DRBD.
However, I'm also aware I might be missing something fundamental here.
Would anyone care to agree, or to enlighten me?
Xen-users mailing list