
Re: [Xen-devel] AFS-based VBD backend

On Thu, Dec 23, 2004 at 12:55:58AM -0800, Steve Traugott wrote:
> Hi All,
> Has anyone ever put any thought (or code) into an (Open)AFS-based
> virtual block device backend for Xen?  This driver would use a large
> file stored in AFS as its underlying "device", as opposed to a loop
> device or an LVM or physical device.  If I understand Xen's interface
> architecture correctly, this backend would be stacked something like
> this:
>     ext3 or other standard filesystem in domN
>     existing block device frontend in domN
>     new "afs file" backend in dom0
>     large file in /afs in dom0
>     AFS client in dom0
>     AFS server somewhere on net
> You would configure this using something like:
>     disk = ['afsfile:/full/path/to/afs/file,sda1,w']
> Only dom0 would need to be an AFS client; the kerberos ticket(s) would
> be owned and renewed by management code in dom0 only; the other domains
> would not need to be kerberized, would not have access to any keytabs in
> dom0, etc.  Each domain could have a single kerberos ID and its own AFS
> volume(s) for storage of its "disks", but the individual users or
> daemons in each domain would not be aware of kerberos at all.  
> I think most of this could be done in python -- the backend driver
> itself might be a relatively thin layer to translate block addressing to
> and from file byte locations, talk to the frontend, and do a periodic
> fsync() on the underlying file to write the changes back to the AFS
> server.  
> There'd be nothing to keep someone from using this backend on top of
> ordinary, non-AFS files; this might provide better performance (one less
> layer) than going through the loop device driver.  Perhaps the VBD type
> name might even want to be 'rawfile' or somesuch instead of 'afsfile',
> though an AFS-specific bind/unbind script might be useful for token
> management.  
> Some potential FAQ answers, before I get bombarded:  ;-)
> Q:  Why is this useful?
> A:  Because AFS can be run over the Internet, has excellent security,
>     client-side caching, server-side replication and snapshots, and
>     would lend itself well to an environment where the AFS clients and
>     servers might be in different geographical locations, owned by two
>     different parties, hosting Xen domN filesystems owned in turn by
>     other parties.  
> Q:  Why not just use native AFS in the unprivileged domains?
> A:  Because then those domains would have to be kerberized AFS clients,
>     and the users/owners of those domains would have to have kerberos
>     IDs, they'd have to be knowledgeable in AFS ACLs, the AFS
>     directory-based security model, daemon token renewal, and so on.
>     The root-user domain owners would have to know how to manage
>     kerberos users, and they'd have to have kerberos admin rights.  This
>     is too much to expect.  The typical Xen customer wants to be able to
>     just use normal *nix tools to add or delete a user -- that won't work
>     with kerberos.  All users of all domains would be in the same
>     kerberos namespace too -- there could be only one "jim" across all
>     domains, even though those domains are supposed to be independent
>     *nix machines owned by different parties -- very difficult to
>     explain.
> Q:  Why not just use a loop device on top of AFS, with the 'file:' 
>     VBD type?
> A:  Loop devices on top of AFS files hang with large volumes of 
>     I/O -- looks like a deadlock of some sort (in my tests, a dd of
>     around 200-300 MB into an AFS-based loop device appears to
>     consistently hang the kernel, even with a 500 MB or larger AFS
>     cache).  In addition, an unmodified loop.c will not fsync() the
>     underlying file; changes won't get written back to the AFS server
>     until loop teardown.  I've added an fsync() to the worker thread of
>     loop.c to take care of this every few seconds; that seems to work
>     but I can't really stress test it much because of the hang problem.
> Q:  Why not use iSCSI, nbd, drbd, gnbd, or enbd?
> A:  While these each seem to do their job well, none offer all of the 
>     maturity, client-side caching, WAN-optimized protocols, volume
>     management, backups, easy snapshots, scalability, central
>     administration, redundancy, replication, and kerberized security of
>     AFS.  
> What did I miss?  ;-)

That it's already possible to use normal files. :)

So there is no need for explicit support in Xen. If your dom0 is already
an AFS client, just specify the file in the Xen configuration:

disk = [ 'file:/afs/file,sda1,w' ]
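
For reference, the "thin layer" described in the quoted proposal (translating
block addresses to file byte offsets and doing a periodic fsync() on the
underlying file) could look roughly like the sketch below. This is only an
illustration under stated assumptions, not Xen backend code: the class name
FileBackedBlockDevice, the 512-byte sector size, and the 5-second flush
interval are all hypothetical, and talking to the block device frontend is
left out entirely.

```python
import os
import time

SECTOR_SIZE = 512  # assumed sector size; the thread does not specify one


class FileBackedBlockDevice:
    """Hypothetical helper mapping sector numbers onto byte offsets in a file."""

    def __init__(self, path, flush_interval=5.0):
        # O_CREAT only so the sketch also runs against a fresh path.
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        self.flush_interval = flush_interval  # seconds between fsync() calls
        self.last_flush = time.monotonic()

    def read_sectors(self, sector, count):
        # Block address -> byte offset is a plain multiplication.
        return os.pread(self.fd, count * SECTOR_SIZE, sector * SECTOR_SIZE)

    def write_sectors(self, sector, data):
        if len(data) % SECTOR_SIZE != 0:
            raise ValueError("data must be a whole number of sectors")
        offset = sector * SECTOR_SIZE
        view = memoryview(data)
        # pwrite() may write fewer bytes than asked, so loop until done.
        while len(view) > 0:
            written = os.pwrite(self.fd, view, offset)
            view = view[written:]
            offset += written
        self._maybe_flush()

    def _maybe_flush(self):
        # Periodic fsync() so changes are pushed to the backing store
        # (the AFS server, in the proposal) without waiting for teardown.
        now = time.monotonic()
        if now - self.last_flush >= self.flush_interval:
            os.fsync(self.fd)
            self.last_flush = now

    def close(self):
        os.fsync(self.fd)
        os.close(self.fd)
```

Flushing on a timer rather than after every write trades a short window of
unwritten data for far fewer round trips to the file server; the right
interval would have to be found by testing.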

Luciano Rocha
