You're correct, this is a lot of overhead for applications that are heavy on I/O, but it's no more overhead than the other commercial virtualization systems impose for clustering (e.g. VMware ESX's VMFS3 has a good deal of overhead). If I were running any high-I/O loads, I'd map the FC connections directly through from my SAN to the domU rather than use file-based disks. As it is, most of my domUs are things like Intranet servers, news servers (low-traffic), a few Windows XP domUs, etc., that are not I/O intensive. I'm about to move my e-mail system over to this set of systems, but I'll be passing that SAN volume through dom0 to the domU.
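For anyone curious, passing a SAN volume through dom0 to a domU is just a phy:-backed disk line in the guest config instead of a file:-backed one - a minimal sketch, with a made-up device path:

disk = [ 'phy:/dev/mapper/mailstore,xvda,w' ]    # LUN visible to dom0, handed to the domU as xvda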
-Nick
-----Original Message-----
From: xen-users-request@xxxxxxxxxxxxxxxxxxx
Reply-To: xen-users@xxxxxxxxxxxxxxxxxxx
To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: Xen-users Digest, Vol 47, Issue 120
Date: Tue, 20 Jan 2009 08:50:10 -0700
Send Xen-users mailing list submissions to
xen-users@xxxxxxxxxxxxxxxxxxx
To subscribe or unsubscribe via the World Wide Web, visit
http://lists.xensource.com/mailman/listinfo/xen-users
or, via email, send a message with subject or body 'help' to
xen-users-request@xxxxxxxxxxxxxxxxxxx
You can reach the person managing the list at
xen-users-owner@xxxxxxxxxxxxxxxxxxx
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Xen-users digest..."
Today's Topics:
1. Re: Distributed xen or cluster? (Fajar A. Nugraha)
2. Re: Distributed xen or cluster? (Fajar A. Nugraha)
3. Xen 3.3.0 - QEMU COW disk image with sparse backing file - VM
fails to start (Martin Tröster)
4. sporadic problems relocating guests (J. D.)
5. Re: Distributed xen or cluster? (Nick Couchman)
-------- Forwarded Message --------
From: Fajar A. Nugraha <fajar@xxxxxxxxx>
To: xen-users <xen-users@xxxxxxxxxxxxxxxxxxx>
Subject: Re: [Xen-users] Distributed xen or cluster?
Date: Tue, 20 Jan 2009 22:09:24 +0700
On Tue, Jan 20, 2009 at 8:21 PM, Nick Couchman <Nick.Couchman@xxxxxxxxx> wrote:
> I use SLES10 SP2 for my dom0, which has a few tools that make this possible:
> - EVMS + Heartbeat for shared block devices
> - OCFS2 for a clustered filesystem
> - Heartbeat for maintaining availability.
>
> I have a volume shared out from my SAN that's managed with EVMS on each of
> my Xen servers. I created an OCFS2 filesystem on this volume and have it
> mounted on all of them.
That setup sounds like it has a lot of overhead. In particular, AFAIK
a clustered file system (like OCFS2) has lower I/O throughput
(depending on the workload) than a non-clustered FS. What kind of
workload do you have on your domUs? Are they I/O-hungry (e.g. busy
database servers)?
Also, considering that (according to Wikipedia):
- IBM stopped developing EVMS in 2006
- Novell will be moving to LVM in future products
IMHO it'd be better, performance- and support-wise, to use cLVM and put
the domU filesystems on LVM-backed storage. Better yet, have your SAN give each
domU its own LUN and let all dom0s see them all.
The domU config files should still live on a cluster FS (OCFS2 or GFS/GFS2), though.
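A rough sketch of that approach (volume group, LV, and device names below are made up for illustration):

vgcreate -c y vg_xen /dev/sdb               # clustered VG on the shared SAN LUN (needs clvmd running)
lvcreate -L 20G -n web01-disk vg_xen        # one LV per domU disk

and in the domU config:

disk = [ 'phy:/dev/vg_xen/web01-disk,xvda,w' ]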
Regards,
Fajar
-------- Forwarded Message --------
From: Fajar A. Nugraha <fajar@xxxxxxxxx>
To: xen-users <xen-users@xxxxxxxxxxxxxxxxxxx>
Subject: Re: [Xen-users] Distributed xen or cluster?
Date: Tue, 20 Jan 2009 22:19:26 +0700
On Tue, Jan 20, 2009 at 1:59 PM, lists@xxxxxxxxxxxx <lists@xxxxxxxxxxxx> wrote:
> Thanks Mark, I'm just reading up on it now. It sounds like it allows failover, but I'm not sure that it's an actual cluster, as in redundancy?
Exactly what kind of redundancy are you looking for? Is it (for
example) having several domUs serving the same web content and having
a load balancer in front of them so traffic is balanced among working
domUs?
-------- Forwarded Message --------
From: Martin Tröster <TroyMcClure@xxxxxx>
To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-users] Xen 3.3.0 - QEMU COW disk image with sparse backing file - VM fails to start
Date: Tue, 20 Jan 2009 16:22:12 +0100
Hi,
I upgraded my Xen 3.2-based test system to Xen 3.3. With this installation, I am no longer able to start domUs whose COW disk images are based on qcow sparse backing files. Starting the domU via xm create reports success (with a "Starting Domain test" message printed to the console), but xm list directly afterwards shows no entry for the VM.
The image structure is:
Base File A.img (sparse QCOW2 file)
COW File B.img (QCOW2 file with A.img as backing file) used as disk in Xen config file (see VM config file below)
I am running a CentOS 5.2 server with the pre-built packages from http://www.gitco.de/repro and can reproduce this behaviour on both 3.3.0 and 3.3.1RC1_pre (no 3.3.1 final version is available, and I haven't had time yet to do a build on my own).
In detail, I see the following behaviour:
-> RAW image with disk using file:/ - works
-> QCOW sparse image using tap:qcow - works
-> QCOW image based on RAW image using tap:qcow - works
-> QCOW image based on QCOW sparse image using tap:qcow - fails.
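For reference, an image chain like the failing one can be built with qemu-img (the size below is only a placeholder):

qemu-img create -f qcow2 A.img 10G          # sparse qcow2 base image
qemu-img create -f qcow2 -b A.img B.img     # COW overlay using A.img as its backing file

B.img is then referenced via tap:qcow in the domU config, as shown below.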
Logs of failing case:
=============================
/var/log/xen/xend.log shows that the domain terminates immediately, but gives no ERROR indication:
[2009-01-20 09:13:55 20596] DEBUG (XendDomainInfo:1443) XendDomainInfo.handleShutdownWatch
[2009-01-20 09:13:55 20596] DEBUG (DevController:155) Waiting for devices vif.
[2009-01-20 09:13:55 20596] DEBUG (DevController:160) Waiting for 0.
[2009-01-20 09:13:55 20596] DEBUG (DevController:645) hotplugStatusCallback /local/domain/0/backend/vif/8/0/hotplug-status.
[2009-01-20 09:13:55 20596] DEBUG (DevController:645) hotplugStatusCallback /local/domain/0/backend/vif/8/0/hotplug-status.
[2009-01-20 09:13:55 20596] DEBUG (DevController:659) hotplugStatusCallback 1.
[2009-01-20 09:13:55 20596] DEBUG (DevController:160) Waiting for 1.
[2009-01-20 09:13:55 20596] DEBUG (DevController:645) hotplugStatusCallback /local/domain/0/backend/vif/8/1/hotplug-status.
[2009-01-20 09:13:55 20596] DEBUG (DevController:659) hotplugStatusCallback 1.
[2009-01-20 09:13:55 20596] DEBUG (DevController:155) Waiting for devices vscsi.
[2009-01-20 09:13:55 20596] DEBUG (DevController:155) Waiting for devices vbd.
[2009-01-20 09:13:55 20596] DEBUG (DevController:155) Waiting for devices irq.
[2009-01-20 09:13:55 20596] DEBUG (DevController:155) Waiting for devices vkbd.
[2009-01-20 09:13:55 20596] DEBUG (DevController:155) Waiting for devices vfb.
[2009-01-20 09:13:55 20596] DEBUG (DevController:155) Waiting for devices console.
[2009-01-20 09:13:55 20596] DEBUG (DevController:160) Waiting for 0.
[2009-01-20 09:13:55 20596] DEBUG (DevController:155) Waiting for devices pci.
[2009-01-20 09:13:55 20596] DEBUG (DevController:155) Waiting for devices ioports.
[2009-01-20 09:13:55 20596] DEBUG (DevController:155) Waiting for devices tap.
[2009-01-20 09:13:55 20596] DEBUG (DevController:160) Waiting for 51712.
[2009-01-20 09:13:55 20596] DEBUG (DevController:645) hotplugStatusCallback /local/domain/0/backend/tap/8/51712/hotplug-status.
[2009-01-20 09:13:55 20596] DEBUG (DevController:659) hotplugStatusCallback 1.
[2009-01-20 09:13:55 20596] DEBUG (DevController:155) Waiting for devices vtpm.
[2009-01-20 09:13:55 20596] INFO (XendDomain:1172) Domain test (8) unpaused.
[2009-01-20 09:13:57 20596] INFO (XendDomainInfo:1634) Domain has shutdown: name=test id=8 reason=poweroff.
[2009-01-20 09:13:57 20596] DEBUG (XendDomainInfo:2389) XendDomainInfo.destroy: domid=8
[2009-01-20 09:13:57 20596] DEBUG (XendDomainInfo:2406) XendDomainInfo.destroyDomain(8)
[2009-01-20 09:13:57 20596] DEBUG (XendDomainInfo:1939) Destroying device model
[2009-01-20 09:13:57 20596] DEBUG (XendDomainInfo:1946) Releasing devices
[2009-01-20 09:13:57 20596] WARNING (image:472) domain test: device model failure: no longer running; see /var/log/xen/qemu-dm-test.log
/var/log/xen/qemu-dm-test.log shows nothing spectacular at all (at least for me):
domid: 8
qemu: the number of cpus is 1
config qemu network with xen bridge for tap8.0 xenbr0
config qemu network with xen bridge for tap8.1 eth0
Using xvda for guest's hda
Strip off blktap sub-type prefix to /path/to/test.img (drv 'qcow')
Watching /local/domain/0/device-model/8/logdirty/next-active
Watching /local/domain/0/device-model/8/command
qemu_map_cache_init nr_buckets = 10000 size 3145728
shared page at pfn 3fffe
buffered io page at pfn 3fffc
Time offset set 0
Register xen platform.
Done register platform.
/var/log/messages also shows nothing remarkable. The only interesting entries I had not seen before are:
Jan 19 17:50:02 test kernel: tapdisk[21929] general protection rip:40b315 rsp:42900108 error:0
but these occur all the time on Xen 3.3 and 3.3.1 when using qcow images (and seem to be recovered from).
VM config file:
------------------------------------------------------------------------------------------
name = "test"
device_model = '/usr/lib64/xen/bin/qemu-dm'
builder = "hvm"
kernel = "/usr/lib/xen/boot/hvmloader"
# hardware
memory = "1024"
disk = [ 'tap:qcow:/path/to/B.img,ioemu:xvda,w' ]
vcpus=1
# network
vif = [ 'type=ioemu,mac=02:00:00:00:00:01','type=ioemu,bridge=eth0,mac=02:00:00:00:01:98' ]
dhcp = "dhcp"
# visualization
sdl = 0
vnc = 1
vncviewer = 0
------------------------------------------------------------------------------------------
Any help is appreciated. Thanks!
Cheers,
Martin
-------- Forwarded Message --------
From: J. D. <jdonline@xxxxxxxxx>
To: Xen Users <xen-users@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-users] sporadic problems relocating guests
Date: Tue, 20 Jan 2009 10:50:40 -0500
Hello all,
I am experiencing some problems relocating guests. I could not relocate the guest squid to the physical node xen01,
although other guests would migrate to xen01 without issue. I rebooted that node and now I can relocate my squid guest to xen01.
Now, however, I find that I can no longer relocate the squid guest to xen00. The errors below are what I am seeing
in the messages log on xen00.
We are on Red Hat 5.2 using Cluster Suite and the stock Xen. Any ideas?
Jan 20 10:03:07 xen00 clurgmgrd[11135]: <warning> #68: Failed to start vm:squid; return value: 1
Jan 20 10:03:07 xen00 clurgmgrd[11135]: <notice> Stopping service vm:squid
Jan 20 10:03:07 xen00 kernel: xenbr0: port 7(vif9.1) entering disabled state
Jan 20 10:03:07 xen00 kernel: device vif9.1 left promiscuous mode
Jan 20 10:03:07 xen00 kernel: xenbr0: port 7(vif9.1) entering disabled state
Jan 20 10:03:13 xen00 clurgmgrd[11135]: <notice> Service vm:squid is recovering
Jan 20 10:03:13 xen00 clurgmgrd[11135]: <warning> #71: Relocating failed service vm:squid
Jan 20 10:03:15 xen00 clurgmgrd[11135]: <notice> Service vm:squid is now running on member 3
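For reference, the same relocation can be triggered by hand with rgmanager's tools (service and member names as they appear in the log above), which helps separate a Cluster Suite problem from a Xen migration problem:

clusvcadm -r vm:squid -m xen00     # ask rgmanager to relocate vm:squid to member xen00
clustat                            # check which member now owns the service
xm migrate --live squid xen00      # or test the Xen layer directly, bypassing rgmanager (assuming the domU is named squid)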
Best regards,
J. D.
-------- Forwarded Message --------
From: Nick Couchman <Nick.Couchman@xxxxxxxxx>
To: rbeglinger@xxxxxxxxx
Cc: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-users] Distributed xen or cluster?
Date: Tue, 20 Jan 2009 09:01:55 -0700
My SAN is Active-Active, but you should still be able to accomplish this even with an Active-Passive SAN. It shouldn't be an issue during normal operations at all - it'll just be a matter of whether things can fail over correctly in the event of a SAN controller failure.
-Nick
-----Original Message-----
From: Rob Beglinger <rbeglinger@xxxxxxxxx>
To: xen-users <xen-users@xxxxxxxxxxxxxxxxxxx>
Subject: Re: [Xen-users] Distributed xen or cluster?
Date: Tue, 20 Jan 2009 08:26:24 -0600
Nick,
Is your SAN an Active-Active or Active-Passive SAN? I'm looking to set something up like what you're doing, but my SAN only supports Active-Passive. We originally looked at Win2K8 with Hyper-V but fortunately that requires a SAN that supports Active-Active configuration. I'm using SLES 10 SP2 for dom0, and will be running SLES 10 SP2 domU's as well. I am running Xen 3.2.
On Tue, Jan 20, 2009 at 7:21 AM, Nick Couchman <Nick.Couchman@xxxxxxxxx> wrote:
I use SLES10 SP2 for my dom0, which has a few tools that make this possible:
- EVMS + Heartbeat for shared block devices
- OCFS2 for a clustered filesystem
- Heartbeat for maintaining availability.
I have a volume shared out from my SAN that's managed with EVMS on each of my Xen servers. I created an OCFS2 filesystem on this volume and have it mounted on all of them. This way I do file-based disks for all of my domUs and they are all visible to each of my hosts. I can migrate the domUs from host to host. I'm in the process of getting Heartbeat set up to manage my domUs - Heartbeat can be configured to migrate VMs or restart them if one of the hosts fails.
It isn't a "single-click" solution - it takes a little work to get everything running, but it does work.
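As a rough illustration (mount point, image path, and guest/host names are placeholders), a file-backed domU on the shared OCFS2 mount plus a live migration looks like this:

disk = [ 'file:/vmstore/intranet/disk0.img,xvda,w' ]   # image lives on the OCFS2 volume that every dom0 mounts

xm migrate --live intranet xen02                       # move the running domU to another host that mounts the same volume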
-Nick
>>> "lists@xxxxxxxxxxxx" <lists@xxxxxxxxxxxx> 2009/01/19 23:37 >>>
Anyone aware of any clustering package for Xen, in order to gain redundancy, etc.?
Mike
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users