Re: [Xen-users] Understanding sparse-files
Rustedt, Florian wrote:
> What exactly is the advantage of sparse-files against "normal" files
> with fixed length?
There are both advantages and disadvantages.
> First I thought this is something like an auto-increasing file. But if I
> take a 2GB partition and add two sparse-files with 1GB each, I can't add
> an additional one, the disk is full?
No, that's not it.
> So what about this mystic advantage? Is it only the faster creation of
> that file with dd, because it is not completely filled?
If you create yourself a nice big sparse file like this:

    dd bs=1M seek=10240 count=0 if=/dev/zero of=huge

and then look at what you've got with "ls -lh", you'll see you have a 10G
file that was created almost instantly. On the other hand, "ls -sh"
will show that the file is actually occupying no space at all (well,
almost no space). You can make this file bigger like this:

    dd bs=1M seek=20480 count=0 if=/dev/zero of=huge

and this will make it 20GB while still occupying hardly any space.
I suspect you already know this, but if you didn't, you do now :-)
The advantage of this 20GB file is precisely that it occupies next to no
space on the disk that holds it. I can start writing data into it (that
is, use it as a guest's disk) and the blocks needed will be allocated as
they are used. In fact, I could have a 200GB guest disk image even
though the disk I have at the moment is only 120GB and I'm using quite a
lot of it -- it would only be a problem if the guest actually wanted to
use all that space.
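
To watch the blocks being allocated on demand, here is a small sketch. It assumes GNU coreutils on Linux and a filesystem that supports holes; the file name and sizes are made up for the example. It creates a sparse image, writes 1MB of data into the middle of it, and compares the length of the file with the space actually allocated:

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
img="$workdir/guest.img"

# Create a 1GB file that is one big hole (same trick as above, just smaller).
dd bs=1M seek=1024 count=0 if=/dev/zero of="$img" 2>/dev/null

# %s is the length in bytes (what "ls -l" shows);
# %b * %B is the number of bytes actually allocated on disk.
apparent=$(stat -c %s "$img")
before=$(( $(stat -c %b "$img") * $(stat -c %B "$img") ))

# Write 1MB of real data 512MB into the file; conv=notrunc keeps the length.
dd bs=1M seek=512 count=1 conv=notrunc if=/dev/urandom of="$img" 2>/dev/null
sync

after=$(( $(stat -c %b "$img") * $(stat -c %B "$img") ))

echo "apparent size:          $apparent bytes"
echo "allocated before write: $before bytes"
echo "allocated after write:  $after bytes"
```

The allocated figure should grow by roughly the 1MB that was written, while the apparent size stays at 1GB.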
There are some problems with sparse files: they compress beautifully (gzip
reports 99.9%) but it takes a while to read through the empty space, and
when you uncompress the file you discover that it now actually occupies
disk space: a tool that simply reads the file has no good way to
distinguish between an unallocated block and a block full of zeroes.
This also means that you need to be careful how you back these files up:
you need something a little cleverer than gzip.
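
As one example of "a little cleverer": GNU tar has a --sparse (-S) option that records where the holes are instead of storing them as runs of zeroes, and GNU cp has --sparse=always, which punches holes wherever it finds zeroes. A rough sketch, assuming GNU tar and coreutils; the file names are made up and the file is kept small so it runs quickly:

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir "$workdir/restore"

# A 64MB sparse file with 1MB of real data somewhere in the middle.
dd bs=1M seek=64 count=0 if=/dev/zero of="$workdir/huge" 2>/dev/null
dd bs=1M seek=20 count=1 conv=notrunc if=/dev/urandom of="$workdir/huge" 2>/dev/null

# Plain gzip round trip: on most filesystems the holes come back
# as real blocks of zeroes.
gzip -c "$workdir/huge" > "$workdir/huge.gz"
gunzip -c "$workdir/huge.gz" > "$workdir/restore/huge.from-gzip"

# tar --sparse records the holes and recreates them on extraction.
tar --sparse -czf "$workdir/huge.tar.gz" -C "$workdir" huge
tar -xzf "$workdir/huge.tar.gz" -C "$workdir/restore"

# cp can re-create holes from runs of zeroes in an already inflated copy.
cp --sparse=always "$workdir/restore/huge.from-gzip" "$workdir/restore/huge.from-cp"

# First column is the allocated size in KB; compare it with the 64MB length.
du -k "$workdir/restore/huge.from-gzip" \
      "$workdir/restore/huge" \
      "$workdir/restore/huge.from-cp"
```

The copy restored from tar and the one made with cp should each occupy about 1MB, while the one that went through gzip alone will normally occupy the full 64MB.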
Another problem with sparse files, especially when using them as domU
disks, is that blocks that are contiguous in the file are not necessarily
contiguous on the disk. That means that if, in the guest, you just run
"dd if=/dev/xvda of=/dev/null" then the disk underneath will be seeking
back and forth all over the place to return the blocks in the order that
they're being asked for.
You don't need Xen for this -- when I downloaded the DVD image of
Fedora 10 using transmission (a bittorrent client) a checksum on the
resulting file only managed to read it at about 4MB/s. On the other
hand, when I copied the file, the checksum on the copy ran at closer to
100MB/s -- bittorrent clients like transmission really ought to
pre-allocate the disk space so that you get something contiguous and
also don't embarrassingly run out of space halfway through.
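
For what it's worth, pre-allocating from the shell is cheap on filesystems that support it. A minimal sketch, assuming the fallocate utility from util-linux is installed and falling back to writing real zeroes with dd where it is missing or the filesystem refuses; the file name and size are made up:

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
target="$workdir/download.iso"
size_mb=64

# Ask the filesystem to reserve the blocks up front without writing them.
if ! fallocate -l "${size_mb}MiB" "$target" 2>/dev/null; then
    # Fallback: no fallocate support here, so write real zeroes instead.
    dd bs=1M count="$size_mb" if=/dev/zero of="$target" 2>/dev/null
fi

# Length (ls -l) and allocated size (du -k) should now both be about 64MB.
ls -l "$target"
du -k "$target"
```

Either way the space is claimed before any real data arrives, so running out of disk shows up at the start rather than halfway through, and the filesystem gets a chance to hand out the blocks in one go.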
In a nutshell, though:
pros: over-committed disk space