Thanks Julian, it's much clearer to me now how snapshots work.
- Anthony
On Wed, 2010-01-27 at 02:37 -0800, Julian Chesterfield wrote:
> Ian Pratt wrote:
> >> That means that if the guest Linux is executing "yum install kernel"
> >> when the snapshot is created, the VM created from this snapshot might
> >> not be bootable.
> >>
> >
> > Because xen issues write completions to the guest only when IO is
> > completed, the snapshot will at least be crash consistent from a filesystem
> > point of view (just like a physical system losing power).
> >
> > Linux doesn't have a generic mechanism for doing higher-level 'freeze'
> > operations (see Windows VSS) so there's no way to notify yum that we'd like
> > to take a snapshot. Some Linux filesystems do support a freeze operation,
> > but it's not clear this buys a great deal.
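> >
> > For reference, such a freeze can be driven from userspace via the Linux
> > FIFREEZE/FITHAW ioctls. A minimal Python sketch; the mount point is a
> > placeholder and the ioctl values are the usual x86 ones, so treat both
> > as assumptions:
> >
> >   import fcntl, os
> >
> >   FIFREEZE = 0xc0045877  # _IOWR('X', 119, int), assumed x86 value
> >   FITHAW   = 0xc0045878  # _IOWR('X', 120, int), assumed x86 value
> >
> >   fd = os.open("/mnt/guest-data", os.O_RDONLY)
> >   try:
> >       fcntl.ioctl(fd, FIFREEZE, 0)  # block writers, flush dirty data
> >       # ... take the snapshot here ...
> >   finally:
> >       fcntl.ioctl(fd, FITHAW, 0)    # let writes resume
> >       os.close(fd)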
> >
> Ack. Without application signalling (as provided by VSS) it's unclear
> whether there's any real benefit since the application data may still be
> internally inconsistent.
>
> FYI - for Windows VMs XCP includes a VSS quiesced snapshot option
> (VM.snapshot_with_quiesce) which utilises the agent running in the guest
> as a VSS requestor to quiesce the apps, flush the local cache to disk,
> and then trigger a snapshot of all the VM's disks.
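>
> For reference, calling it over the XAPI XML-RPC interface from Python
> might look something like this (host, credentials and VM name are
> placeholders):
>
>   import xmlrpclib
>
>   xapi = xmlrpclib.Server("https://myhost/")
>   sess = xapi.session.login_with_password("root", "secret")["Value"]
>   vm = xapi.VM.get_by_name_label(sess, "my-windows-vm")["Value"][0]
>   snap = xapi.VM.snapshot_with_quiesce(sess, vm, "quiesced-snap")["Value"]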
>
> - Julian
> > 99 times out of 100 you'll get away with just taking a snapshot of a VM. If
> > you're wanting to use the snapshot as a template for creating other clones
> > you'd be best advised to shut the guest down and get a clean filesystem
> > though. Any snapshot should be fine for general file backup purposes.
> >
> > Ian
> >
> > PS: I'd be surprised if "yum install kernel" didn't actually go to some
> > lengths to be reasonably atomic as regards switching grub over to using the
> > new kernel, otherwise you'd have the same problem on a physical machine
> > crashing or losing power.
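> >
> > The usual crash-safe pattern for that kind of switch-over is write to a
> > temp file, fsync, then rename over the target, since rename is atomic on
> > POSIX filesystems. An illustrative sketch only, not necessarily what yum
> > actually does:
> >
> >   import os
> >
> >   def atomic_replace(path, data):
> >       tmp = path + ".tmp"
> >       f = open(tmp, "w")
> >       f.write(data)
> >       f.flush()
> >       os.fsync(f.fileno())  # data on disk before the rename
> >       f.close()
> >       os.rename(tmp, path)  # readers see old or new, never half-written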
> >
> >
> >> - Anthony
> >>
> >>> Daniel
> >>>
> >>>
> >>>> How does XCP make sure this snapshot is usable, say, that the
> >>>> virtual disk metadata is consistent?
> >>>>
> >>>> Thanks
> >>>> - Anthony
> >>>>
> >>>>
> >>>> On Tue, 2010-01-26 at 13:56 -0800, Ian Pratt wrote:
> >>>>
> >>>>>> I still have the questions below.
> >>>>>>
> >>>>>> 1. if a non-leaf node is coalesce-able, will it be coalesced later
> >>>>>> on regardless of how big the physical size of this node is?
> >>>>>>
> >>>>> Yes: it's always good to coalesce the chain to improve access
> >>>>> performance.
> >>>>>
> >>>>>> 2. there is one leaf node for a snapshot; actually it may be empty.
> >>>>>> Does it exist only because it can prevent coalesce?
> >>>>>>
> >>>>> Not quite sure what you're referring to here. The current code has a
> >>>>> limitation whereby it is unable to coalesce a leaf into its parent,
> >>>>> so after you've created one snapshot you'll always have a chain
> >>>>> length of 2 even if you delete the snapshot (if you create a second
> >>>>> snapshot it can be coalesced).
> >>>>>
> >>>>> Coalescing a leaf into its parent is on the todo list: it's a little
> >>>>> bit different from the other cases because it requires
> >>>>> synchronization if the leaf is in active use. It's not a big deal
> >>>>> from a performance point of view to have the slightly longer chain
> >>>>> length, but it will be good to get this fixed for cleanliness.
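> >>>>>
> >>>>> To make this concrete, a hypothetical tree ('*' marks hidden,
> >>>>> non-leaf nodes):
> >>>>>
> >>>>>   snapshot taken:  base* <- { VM leaf, snap leaf }
> >>>>>   snap deleted:    base* <- VM leaf
> >>>>>
> >>>>> Merging "VM leaf" into base* is exactly the unsupported
> >>>>> leaf-into-parent case, so the chain stays at length 2; a second
> >>>>> snapshot turns the old leaf position into a hidden node, which the
> >>>>> GC can then coalesce as usual.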
> >>>>>
> >>>>>> 3. a clone will introduce a writable snapshot; will it prevent
> >>>>>> coalesce?
> >>>>>>
> >>>>> A clone will produce a new writeable leaf linked to the parent. It
> >>>>> will prevent the linked snapshot from being coalesced, but any other
> >>>>> snapshots above or below on the chain can still be coalesced by the
> >>>>> garbage collector if the snapshots are deleted.
> >>>>>
> >>>>> The XCP storage management stuff is pretty cool IMO...
> >>>>>
> >>>>> Ian
> >>>>>
> >>>>>
> >>>>>> - Anthony
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Tue, 2010-01-26 at 02:34 -0800, Julian Chesterfield wrote:
> >>>>>>
> >>>>>>> Hi Anthony,
> >>>>>>>
> >>>>>>> Anthony Xu wrote:
> >>>>>>>> Hi all,
> >>>>>>>>
> >>>>>>>> Basically snapshot on an LVMoISCSI SR works well: it provides
> >>>>>>>> thin provisioning, so it is fast and disk space efficient.
> >>>>>>>>
> >>>>>>>> But I still have the concern below. There is one more vhd chain
> >>>>>>>> per snapshot created; if I create 16 snapshots, there are 16 vhd
> >>>>>>>> chains, which means that when a VM accesses a disk block it may
> >>>>>>>> need to access 16 vhd lvms one by one to get the right block,
> >>>>>>>> making VM disk access slow. However, that is understandable; it
> >>>>>>>> is part of snapshotting IMO.
> >>>>>>>>
> >>>>>>> The depth and speed of access will depend on the write pattern to
> >>>>>>> the disk. In XCP we add an optimisation called a BATmap which
> >>>>>>> stores one bit per BAT entry. This is a fast lookup table that is
> >>>>>>> cached in memory while the VHD is open, and tells the block device
> >>>>>>> handler whether a block has been fully allocated. Once the block
> >>>>>>> is fully allocated (all logical 2MB written) the block handler
> >>>>>>> knows that it doesn't need to read or write the bitmap that
> >>>>>>> corresponds to the data block; it can go directly to the disk
> >>>>>>> offset. Scanning through the VHD chain can therefore be very
> >>>>>>> quick, i.e. the block handler reads down the chain of BAT tables
> >>>>>>> for each node until it detects a node that is allocated, hopefully
> >>>>>>> with the BATmap value set. The worst case is a random disk write
> >>>>>>> workload which causes the disk to be fragmented and partially
> >>>>>>> allocated. Every read or write will therefore potentially incur a
> >>>>>>> bitmap check at every level of the chain.
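> >>>>>>>
> >>>>>>> As a conceptual sketch of the lookup (hypothetical names, not the
> >>>>>>> actual XCP code):
> >>>>>>>
> >>>>>>>    UNALLOCATED = 0xffffffff   # VHD BAT sentinel for "no block"
> >>>>>>>
> >>>>>>>    def resolve_sector(chain, block, sector):
> >>>>>>>        # chain: VHD nodes ordered leaf first, base last
> >>>>>>>        for node in chain:
> >>>>>>>            offset = node.bat[block]        # BAT: block -> offset
> >>>>>>>            if offset == UNALLOCATED:
> >>>>>>>                continue                    # absent here, try parent
> >>>>>>>            if node.batmap[block]:          # fully allocated block:
> >>>>>>>                return node, offset         # skip the sector bitmap
> >>>>>>>            if node.bitmap[block][sector]:  # partial block: consult
> >>>>>>>                return node, offset         # the per-sector bitmap
> >>>>>>>        return None                         # reads return zeros
> >>>>>>>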
> >>>>>>>> But after I delete all these 16 snapshots, there are still 16
> >>>>>>>> vhd chains, and the disk access is still slow, which is not
> >>>>>>>> understandable or reasonable, even though there may be only a few
> >>>>>>>> KB of difference between each snapshot.
> >>>>>>>>
> >>>>>>> There is a mechanism in XCP called the GC coalesce thread which
> >>>>>>> gets kicked asynchronously following a VDI deletion event. It
> >>>>>>> queries the VHD tree and determines whether there is any
> >>>>>>> coalesceable work to do. Coalesceable work is defined as:
> >>>>>>>
> >>>>>>> 'a hidden child node that has no siblings'
> >>>>>>>
> >>>>>>> Hidden nodes are non-leaf nodes that reside within a chain. When
> >>>>>>> the snapshot leaf node is deleted, therefore, it will leave
> >>>>>>> redundant links in the chain that can be safely coalesced. You can
> >>>>>>> kick off a coalesce by issuing an SR scan, although it should kick
> >>>>>>> off automatically within 30 seconds of deleting the snapshot node,
> >>>>>>> handled by XAPI. If you look in the /var/log/SMlog file you'll see
> >>>>>>> a lot of debug information, including tree dependencies, which
> >>>>>>> will tell you a) whether the GC thread is running, and b) whether
> >>>>>>> there is coalesceable work to do. Note that deleting snapshot
> >>>>>>> nodes does not always mean that there is coalesceable work to do,
> >>>>>>> since there may be other siblings, e.g. VDI clones.
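> >>>>>>>
> >>>>>>> (An SR scan can be kicked off with "xe sr-scan uuid=<sr-uuid>".)
> >>>>>>> As a rough sketch of the rule the GC applies, again with
> >>>>>>> hypothetical names:
> >>>>>>>
> >>>>>>>    def coalesceable(node):
> >>>>>>>        # 'a hidden child node that has no siblings'
> >>>>>>>        return (node.hidden                   # non-leaf, in-chain
> >>>>>>>                and node.parent is not None   # has a merge target
> >>>>>>>                and len(node.parent.children) == 1)
> >>>>>>>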
> >>>>>>>> Is there any way we can reduce the depth of the vhd chain after
> >>>>>>>> deleting snapshots, and get the VM back to normal disk
> >>>>>>>> performance?
> >>>>>>>>
> >>>>>>> The coalesce thread handles this, see above.
> >>>>>>>
> >>>>>>>> And, I notice there are useless vhd volumes left after deleting
> >>>>>>>> snapshots; can we delete them automatically?
> >>>>>>>>
> >>>>>>> No. I do not recommend deleting VHDs manually since they are
> >>>>>>> almost certainly referenced by something else in the chain. If you
> >>>>>>> delete them manually you will break the chain, it will become
> >>>>>>> unreadable, and you will potentially lose critical data. VHD
> >>>>>>> chains must be correctly coalesced in order to maintain data
> >>>>>>> integrity.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Julian
> >>>>>>>
> >>>>>>>> - Anthony
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
_______________________________________________
xen-api mailing list
xen-api@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/mailman/listinfo/xen-api