[Xen-users] apparent issues with Xen and PVFS2

To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-users] apparent issues with Xen and PVFS2
From: Matthew Haas <haas@xxxxxxxxxxx>
Date: Sun, 30 Jul 2006 11:28:53 -0400
Delivery-date: Sun, 30 Jul 2006 08:30:05 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-users-request@lists.xensource.com?subject=help>
List-id: Xen user discussion <xen-users.lists.xensource.com>
List-post: <mailto:xen-users@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=unsubscribe>
Organization: SUNY Geneseo
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird 1.5.0.5 (Macintosh/20060719)
Good morning,

I have been tinkering with Xen for the past couple of months, looking to use it as the base of our new cluster infrastructure. So far I've had significant success with it: performance is great, and live migrations are awesome.

However, as I've continued to set up the Xen0 infrastructure the way I would prefer, I've finally run into a wall that I can't seem to get past, so I am tossing out a plea for assistance here on the xen-users list. It seems the error I am seeing has been discussed on the list before, but perhaps not to the satisfaction of all involved.

 That's right, I have the infamous:

        Error: Device 769 (vbd) could not be connected. Backend device
        not found.

 error. Woohoo.

A little bit about my cluster setup... a bunch of Dell OptiPlex GX240s (currently 12, with more on the way), each with a 2.4GHz P4 CPU, 2GB of RAM, and a 30GB - 40GB hard disk, all on a Gig-E interconnect. I currently allocate ~64MB to each Xen0, and more than enough, but not all, of the rest to each XenU, so I can double- and triple-up XenUs when I'm testing stuff.

I'm using Xen 3.0.1, as that is the version I've had the most success with. I am now working with a custom-compiled set of 2.6.12.6 Xen0/XenU kernels for 3.0.1, as I needed to enable things like kernel-based NFS. Debian Testing is my base distro.
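
For reference, these are roughly the kernel options involved (symbol names are from a stock 2.6.12 tree; the Xen-patched source may lay them out slightly differently):

        # File systems -> Network File Systems, via make menuconfig:
        CONFIG_NFS_FS=y     # NFS client, for mounting the image store
        CONFIG_NFSD=y       # kernel-based NFS server
        CONFIG_NFSD_V3=y    # NFSv3 support for the server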

My working setup utilizes an NFS mount which contains all the images and config files I use in Xen... saves and live migrations also go through here.
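
Concretely, it amounts to something like this on each Xen0 (the server name and paths here are made up for illustration):

        # /etc/fstab on each Xen0 node:
        nfsserver:/export/xen  /mnt/xen  nfs  rw,hard,intr  0 0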

And all works fine and dandy. In fact the current NFS server is located on a machine with only 100Mb ethernet, and I have been very impressed with the overall responsiveness when migrating.

My problem cropped up when I decided I wanted to do something about some of the unused disk space on each of my Xen0 machines (I'm only using 6GB root + 2GB swap out of ~40GB on each machine, as I wanted to play around with some fancy network-accessible storage solutions). So I allocated the remaining 20-30GB on each machine to a partition, formatted it, and then proceeded to set up PVFS2 (version 1.5.1). I seemed to get that up without a hitch: 4 I/O servers, 4 metadata servers (on the same machines as the I/O servers), and clients balanced across all my machines (currently about 12).
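
Roughly, the server-side setup followed the PVFS2 quickstart; the hostnames below are hypothetical, and the exact invocations are from memory, so double-check them against the 1.5.1 docs:

        # generate fs.conf plus a server.conf-<hostname> per server (interactive)
        pvfs2-genconfig fs.conf server.conf

        # on each of the 4 I/O+metadata servers: create the storage space, then start
        pvfs2-server fs.conf server.conf-node01 -f
        pvfs2-server fs.conf server.conf-node01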

I'm using the 2.6 kernel PVFS2 module so that it mounts just like a real filesystem and I can use regular utilities and whatnot. I've got a nice 110GB block of space via my PVFS2 mount point, and thought it would be neat to see how well my Xen operations would work out of the PVFS2 storage vs. the NFS storage. So I copied the necessary files over, updated my Xen config .sxp files, and gave it a go. That's when I first got that dreaded error.
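
For completeness, the client side on each node looks roughly like this (the hostname and mount point are again illustrative, and the pvfs2-client invocation is from memory):

        # load the kernel module and start the client daemon
        insmod pvfs2.ko
        pvfs2-client -p /usr/sbin/pvfs2-client-core

        # mount it like a normal filesystem, over the TCP transport
        mount -t pvfs2 tcp://node01:3334/pvfs2-fs /mnt/pvfs2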

As for current debugging efforts: I've gone ahead and allocated up to 128 loopback devices, as was the popular suggestion in the thread I found on this list (max_loop=128 on the appropriate kernel line in grub). A "dmesg | grep loop" indicates this was successful, and /dev/ lists 128 loop devices.
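
For anyone following along, the relevant grub entry looks roughly like this (file names are from my setup and will vary; this assumes the loop driver is compiled into the Xen0 kernel rather than built as a module):

        title  Xen 3.0.1 / XenLinux 2.6.12.6
        kernel /boot/xen-3.0.1.gz
        module /boot/vmlinuz-2.6.12.6-xen0 root=/dev/hda1 ro max_loop=128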

However, this does not fix my problem; the error persists.

I also tried changing how Xen looks for my XenU images. In the config file (as they work via NFS), I access the disks via the "file:" scheme. I have also seen a "phy:" scheme, which I tried; that got me a slightly different error message, saying something to the effect of "it is already mounted, I can't do it again".

So I went in and put "w!" in for the access options, instead of the regular "w" option that was there. This actually got the kernel booting; however, the attempt was in vain because it could not locate the root filesystem (so it really didn't do much for me aside from bypassing the first error).
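
For clarity, here are the disk-line variants I've cycled through in the XenU config (the image path and virtual device are illustrative):

        # "file:" scheme, as works fine over NFS (loopback-mounted image):
        disk = [ 'file:/mnt/pvfs2/images/vm01.img,hda1,w' ]

        # "phy:" scheme -- this is the one that complains the device is
        # already mounted:
        disk = [ 'phy:/mnt/pvfs2/images/vm01.img,hda1,w' ]

        # "w!" forces the open despite the sharing check; the kernel then
        # boots but fails to find its root filesystem:
        disk = [ 'phy:/mnt/pvfs2/images/vm01.img,hda1,w!' ]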

ALSO: If I go back and try the old and working NFS-style XenU guest creation, it also spits back the "backend device not found" error for good old device 769.

I have found that if I turn off the PVFS2 client on the Xen0 host, NFS then seems to work again. So this seems to indicate PVFS2 is doing something... what, exactly, is a good question, but it is doing something.

Are there other schemes I could be using? Since I'm running PVFS2 over Gig-E, I'm using the TCP transport on the default port, 3334... does Xen have any network-based file-access schemes?

 Any suggestions or things you all think I should try?

 Thank you for any pointers you can throw my way.

-Matthew
--
 Matthew Haas
 SUNY Geneseo
 Distributed Systems Lab

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
