Re: [Xen-users] A simple backup
On Friday, 9 May 2008 15:07, John Haxby wrote:
> Taking a snapshot of both memory and disk image does work to a large
> extent, but in the case of this message store simply bringing up the old
> memory and disk image suddenly leaves the guest OS wondering what to do
> with all these network connections it used to have -- it will generally
> recover and clean up its aborted transactions, but, for example,
> messages that were being sent _out_ at the time of the snapshot will be
> resent because the outgoing connection didn't acknowledge the message.
> Depending on what the message is this will vary from the merely annoying
> all the way through to the downright weird.
> I would imagine that there are other application domains where
> restarting a transaction from the restored domU would have rather
> unpleasant side effects.
And I think this is the point: you have to consider what you want.
If you want fast recovery to a given point in time and accept losing all data
computed since then, saved memory, state and an lvm snapshot are fine.
But normally you don't want to lose a single bit, e.g. when the xen host dies
and takes all domUs down. You wouldn't recover from a state that is hours
old; instead you would simply start the domUs again. And if you copy the
snapshots somewhere else, it takes a long time to put the disk data back in
place. If you leave the lvm snapshot active, perhaps even keeping multiple
snapshots, disk io performance drops considerably.
My normal backup strategy is as follows:
* Let the domU make consistent backups of important data such as databases. A
lot of (big) applications have commands that prepare them for a backup (i.e.
make their data consistent on disk). Otherwise just write-lock the
application and sync to disk. This state only has to last for a short period
of time, until the snapshot is created.
* Make lvm snapshots of the disks from outside the domUs while they are
running (for the fs this is like switching off the computer/harddisks without
shutting down the domUs).
* Inside the domUs, release the locks and tell the applications that they can
continue.
* Mount these snapshots so that the fs can make itself consistent (the fs,
not the data).
* Back up the files somewhere (I use rsync with hardlink copies onto a logical
volume in the same volume group).
* Umount the snapshots and remove them.
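
To make the dom0 part of these steps concrete, here is a rough sketch in
Python that only builds the command lines and prints them (it does not run
anything). It is not my actual script. The volume names, the snapshot size,
the mount point and the function name plan_backup are made up for
illustration; the /backup/YYYY-MM-DD layout is the one I use. Locking and
unlocking the applications happens inside the domU, before and after the
lvcreate step, and is application specific, so it is left out here.

```python
import shlex
from datetime import date


def plan_backup(vg, lv, today, previous=None, snap_size="2G",
                mount_point="/mnt/backup-snap", backup_root="/backup"):
    """Return the dom0 commands for one snapshot-and-rsync run, in order.

    All names and defaults are illustrative assumptions.
    """
    snap = f"{lv}-snap"
    snap_dev = f"/dev/{vg}/{snap}"
    target = f"{backup_root}/{today.isoformat()}/{lv}/"

    rsync = ["rsync", "-a"]
    if previous is not None:
        # Unchanged files become hardlinks into the previous day's copy.
        rsync.append(f"--link-dest={backup_root}/{previous.isoformat()}/{lv}/")
    rsync += [f"{mount_point}/", target]

    return [
        # Snapshot the domU disk while the domU keeps running.
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap, f"/dev/{vg}/{lv}"],
        ["mkdir", "-p", mount_point, target],
        # Mounting lets a journaling fs replay its journal on the snapshot.
        ["mount", snap_dev, mount_point],
        rsync,
        ["umount", mount_point],
        # Drop the snapshot so it no longer slows down disk io.
        ["lvremove", "-f", snap_dev],
    ]


if __name__ == "__main__":
    for cmd in plan_backup("vg0", "mail", date(2008, 5, 9),
                           previous=date(2008, 5, 8)):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Because the snapshot is removed at the end of every run, it only exists for
as long as the rsync takes.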
With this I have the following options:
* Recover single files from the backup without interrupting the domU.
* Recover databases from the database dumps without interrupting the domU.
* If a domU dies unexpectedly, just start it; the fs should replay its
journal, so the fs is consistent. If a database doesn't come up, take
the dumps from the backup.
* If a domU gets badly damaged, e.g. by fs errors or multiple real harddisk
failures, I only have to make a new fs for the domU, copy the files from the
backup onto it and start the domU (this is a disaster recovery). This is very
similar to the backup strategy discussed here, but in my experience it is a
lot faster than handling big dd images or keeping a lot of snapshots active.
The only thing I don't have is a running application. But as already
mentioned here, a running cpu managing a connection that was discarded long
ago can be quite useless.
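
The disaster recovery case could look roughly like the following Python
sketch, which again only builds and prints the commands. The fs type (ext3),
the paths, the domU config file and the function name
plan_disaster_recovery are assumptions made for illustration, and starting
the domU with "xm create" assumes the xm toolstack is in use.

```python
import shlex


def plan_disaster_recovery(vg, lv, backup_day, domu_config, fs_type="ext3",
                           mount_point="/mnt/restore", backup_root="/backup"):
    """Return the dom0 commands to rebuild one domU disk from a file backup.

    All names and defaults are illustrative assumptions.
    """
    dev = f"/dev/{vg}/{lv}"
    source = f"{backup_root}/{backup_day}/{lv}/"
    return [
        # Make a new, empty fs on the domU's logical volume.
        ["mkfs", "-t", fs_type, dev],
        ["mkdir", "-p", mount_point],
        ["mount", dev, mount_point],
        # Copy the files back; numeric ids keep the domU's own uid/gid values.
        ["rsync", "-a", "--numeric-ids", source, f"{mount_point}/"],
        ["umount", mount_point],
        # Boot the domU from the restored disk.
        ["xm", "create", domu_config],
    ]


if __name__ == "__main__":
    for cmd in plan_disaster_recovery("vg0", "mail", "2008-05-09",
                                      "/etc/xen/mail.cfg"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```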
I have been doing this for about two years and have made about five disaster
recoveries (because of user failures). Normally I'm asked to bring back
single files or databases without interrupting the whole domU. Doing a full
disaster recovery is only an option for me if nothing is left (like
deleted/overwritten
PS: And as a very new bonus for my users, I've managed to make
/backup/YYYY-MM-DD/fullfilesystem available over sshfs in all domUs, so that
users can easily get files out of /backup without consulting the backup
operators (well, this is linux only for now).