RE: [Xen-users] high availability iscsi and xen setup
> I was originally planning on having 8 gigabit ports on each DRBD
> (storage) server, going in to 2 separate switches using multipathing.
> each xen host would also have 2 gigabit links in to each switch also
> using multipathing.
I guess it depends on your disk bandwidth. If you have >200Mbytes/second
disk throughput then you'll need to add more network ports to prevent it
becoming a bottleneck. I'd definitely be plumbing the DRBD network
connections through directly though. At least if you have a switch
failure you won't get split brain.
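
To put rough numbers on the port count, here is a back-of-envelope sketch
in Python. The 125 Mbytes/second per gigabit port is just line rate; the
0.9 efficiency factor is my own assumption for protocol overhead, not a
measured figure, and ports_needed is a made-up helper name.

```python
import math

# 1 Gbit/s = 1000 Mbit/s / 8 = 125 Mbytes/s at line rate, before overhead.
GIGABIT_MBYTES_PER_SEC = 125.0


def ports_needed(disk_mbytes_per_sec, efficiency=0.9):
    """Return how many gigabit ports are needed to carry the disk throughput.

    efficiency is an assumed fraction of line rate left after protocol
    overhead (iSCSI/TCP/Ethernet); adjust it to what you actually measure.
    """
    if disk_mbytes_per_sec <= 0:
        return 0
    usable_per_port = GIGABIT_MBYTES_PER_SEC * efficiency
    return math.ceil(disk_mbytes_per_sec / usable_per_port)


if __name__ == "__main__":
    for throughput in (100, 200, 400):
        print(throughput, "Mbytes/s ->", ports_needed(throughput), "port(s)")
```

Under those assumptions 200 Mbytes/second already fills two gigabit links,
which is why anything faster than that needs more ports per host.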
> This way I would be covered for switch failure also, any thoughts on
> that? This would mean running DRBD and "SAN" traffic on the same
> network, is this not advisable?
> alternatively direct connected 10G CX4 port for DRBD traffic and 4
> (2 in to each switch with multipathing) would also be good I guess?
Every write goes to the primary DRBD server which then sends it to the
secondary DRBD server. I figure that keeping them on separate networks
is a good idea.
> > Split brain problems are the worst problems you'll face. If storage
> > server #1 goes down at 9pm (power failure and this machine is on a
> > circuit with a dud UPS) and everything fails over to #2, then #2
> > goes down at 10pm (someone forgot to fill up the generators or the
> > UPS runs out of battery or something), power comes back at 11pm but #2
> > fails to boot and everything starts using storage server #1 which is
> > out of date. You'd have to be having a really bad day for that to
> > happen though.
> I think that's an acceptable risk, there's not much you can really do
> to mitigate it? I guess stonith to make sure #1 doesn't come back, you're
> left with nothing but at least not the "bad" data?
When #1 comes back up DRBD will activate its split brain policy but you
can set that to be anything you want. Data loss is inevitable though as
both #1 and #2 contain data that doesn't exist on the other. But you're
right, it's pretty much impossible to do anything about it - #1 would
need to know that #2 was primary for some time while #1 was down. This
might be known by the xen servers so maybe you could do something there
but what? You'd basically need to keep the whole system down until
someone came in and fixed #2, so I guess you need to choose between
downtime or data loss.
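
If the xen servers did keep track of who was last primary, the check
might look something like the sketch below. This is purely illustrative
Python, not a DRBD or Xen feature: PrimaryWitness, record_primary and
may_promote are names I made up, and a real version would need the record
itself to survive failures.

```python
class PrimaryWitness:
    """Hypothetical record, kept outside the storage servers (for example
    on the xen hosts), of which storage server last served writes."""

    def __init__(self):
        self.last_primary = None

    def record_primary(self, node):
        # Called whenever a storage server takes over as primary.
        self.last_primary = node

    def may_promote(self, node):
        # Safe only if nobody has been primary yet, or this node was the
        # last one to take writes; otherwise its data is out of date.
        return self.last_primary is None or self.last_primary == node


witness = PrimaryWitness()
witness.record_primary("storage1")  # normal operation
witness.record_primary("storage2")  # 9pm: #1 fails, #2 takes over
# 10pm: #2 goes down; 11pm: power returns but only #1 boots.
print(witness.may_promote("storage1"))  # False: #1 is stale, stay down
print(witness.may_promote("storage2"))  # True: #2 has the newest data
```

Refusing to promote #1 here is exactly the downtime side of the trade-off:
nothing runs until someone fixes #2.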
One thing you could do maybe is stack DRBD on each xen server for the
active LV. This makes migrations expensive though (need to sync up the
DRBD on the new xen node) and would be really fiddly to implement for
what is a very rare possibility.