

Re: [Xen-API] Alternative to Vastsky?

To: xen-api@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-API] Alternative to Vastsky?
From: George Shuklin <george.shuklin@xxxxxxxxx>
Date: Wed, 20 Apr 2011 04:51:30 +0400
I see no split-brain problem with DRBD between two XCP hosts (DRBD between a local drive in the first XCP host and, over the network, a second drive in the second XCP host). XCP ensures that no two copies of the same VM are running in the pool (we are talking about XCP, not xend?). If some host in the pool suddenly goes offline or gets disconnected (which is the same thing), you must manually run vm-reset-powerstate. I think this kind of protection is fairly normal, except that it delays automatic restart after an unexpected host hang. But with XCP this problem exists for every storage solution: the problem is not the storage but the way XCP detects HOST_OFFLINE (XCP assumes a host is down only after a long delay... or never? I have not tested this well yet).

The main sad thing about DRBD is the two-host limit, but it is still better than a plain /dev/sd[abcde] for a pack of 'mission critical applications with new level of performance and effibla-bla-bla'. And (as far as I know the XCP internals) XCP has all the capabilities, maybe with a little tweaking, to support DRBD at the logic level. We have a shared SR with two PBDs on two hosts. We calculate the vm-vbd-vdi-sr-pbd-host paths before sending a task to a slave (start/migrate/evacuate), and we account for them before returning the calculated HA availability (I forget the exact names). To avoid a 'triple conflict' we allow only one DRBD per host: if A has two different DRBDs with B and C, B has the same with C and A, and C with B and A, and we create a VM with two VDIs on the first and second DRBD volumes, we lose any way to migrate it successfully (and, in a certain sense, lose some redundancy).
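
The 'triple conflict' can be illustrated with a minimal Python sketch. It is not XCP code: the SR names, the `SR_HOSTS` table and both functions are hypothetical, and the only assumption is that a VM can run on a host only if that host has a PBD for every SR holding one of the VM's VDIs.

```python
from typing import Dict, FrozenSet, Iterable, Set

# Hypothetical pool: every DRBD-backed SR is plugged (has a PBD) on exactly two hosts.
SR_HOSTS: Dict[str, FrozenSet[str]] = {
    "drbd_ab": frozenset({"A", "B"}),
    "drbd_ac": frozenset({"A", "C"}),
    "drbd_bc": frozenset({"B", "C"}),
}


def runnable_hosts(vdi_srs: Iterable[str], sr_hosts: Dict[str, FrozenSet[str]]) -> Set[str]:
    """Hosts that have a PBD for every SR used by the VM's VDIs."""
    hosts: Set[str] = set().union(*sr_hosts.values())  # start from all known hosts
    for sr in vdi_srs:
        hosts &= sr_hosts[sr]  # keep only hosts that can also reach this SR
    return hosts


def migration_targets(current_host: str, vdi_srs: Iterable[str],
                      sr_hosts: Dict[str, FrozenSet[str]]) -> Set[str]:
    """Hosts other than the current one where the VM could be started."""
    return runnable_hosts(vdi_srs, sr_hosts) - {current_host}


# One VDI on one DRBD volume: the VM can still move from A to B.
print(migration_targets("A", ["drbd_ab"], SR_HOSTS))
# Two VDIs on two different DRBD volumes: only A reaches both, nowhere to migrate.
print(migration_targets("A", ["drbd_ab", "drbd_ac"], SR_HOSTS))
```

With a single DRBD volume per host every VM keeps exactly one alternative host, which is why the one-DRBD-per-host restriction avoids the dead end shown by the second call.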

Thank you for the reply about two iSCSI targets for the same DRBD... I have some doubts about data consistency because of the iSCSI queue...

One last thing: I really DO want to see 2.6.38+ in XCP. In 2.6.38 Red Hat added support for blkio-throttle, the most wanted feature for dom0: it allows shaping IOPS and bandwidth for every process separately (which means 'for every VM'). We have a (not very good, but working) traffic shaper, so a disk shaper is very relevant too...
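
As a rough illustration of what such a disk shaper would write, here is a Python sketch that only builds the rules and writes nothing. The four `blkio.throttle.*_device` file names and the `major:minor limit` rule format come from the kernel's cgroup v1 blkio controller. The mount point, the group name `vm-1` and the idea of one cgroup per VM are assumptions made for the example.

```python
import posixpath
from typing import List, Tuple

# Assumed mount point of the blkio controller; it differs between systems.
CGROUP_ROOT = "/sys/fs/cgroup/blkio"

# Throttle control files of the cgroup v1 blkio controller.
THROTTLE_FILES = {
    ("read", "bps"): "blkio.throttle.read_bps_device",
    ("write", "bps"): "blkio.throttle.write_bps_device",
    ("read", "iops"): "blkio.throttle.read_iops_device",
    ("write", "iops"): "blkio.throttle.write_iops_device",
}


def throttle_rule(major: int, minor: int, limit: int) -> str:
    """Format one rule line: '<major>:<minor> <limit>'."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    return f"{major}:{minor} {limit}"


def throttle_path(group: str, direction: str, unit: str, root: str = CGROUP_ROOT) -> str:
    """Path of the control file for a (hypothetical) per-VM cgroup."""
    return posixpath.join(root, group, THROTTLE_FILES[(direction, unit)])


def plan_vm_limits(group: str, major: int, minor: int,
                   bps: int, iops: int) -> List[Tuple[str, str]]:
    """Return (path, rule) pairs for symmetric read/write limits; nothing is written."""
    plan = []
    for direction in ("read", "write"):
        plan.append((throttle_path(group, direction, "bps"), throttle_rule(major, minor, bps)))
        plan.append((throttle_path(group, direction, "iops"), throttle_rule(major, minor, iops)))
    return plan


# Example: limit device 8:0 to 10 MiB/s and 200 IOPS for the group "vm-1".
for path, rule in plan_vm_limits("vm-1", 8, 0, 10 * 1024 * 1024, 200):
    print(path, "<-", rule)
```

Applying the plan would mean writing each rule to its file as root and placing the process that serves the VM's disk into that cgroup. Which process that is depends on the storage backend, so it is left out here.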

On 20.04.2011 04:09, Tim wrote:
On 19/04/11 23:21, George Shuklin wrote:
I think we shall split this to three different scenarios:

1) local storage redundancy within a single host (e.g. software RAID support; I think this requires a small tweak to the installer so that it creates RAID1 instead of a plain /dev/sda installation)

This works quite well - I do it manually for each host after installing. It would be less painful if XCP could be upgraded using yum. That way there wouldn't be a need to re-do it after each upgrade.

2) local storage redundancy within pool with limited host replication (primary/primary DRBD between two XCP hosts, similar to current /opt/xensource/packages/iso shared ISO SR)

I use this as a backing for an LVM storage repository. The only problem I can foresee, is that I'm not sure if DRBD supports multi-path. Network problems in a primary-primary setup would lead to split-brain with different VMs running on different brains....... I can't imagine that being fun to solve.... I'm using a crossover cable and it seems to work well - very reliable but definitely not scalable.

3) support for external storage that supports replication, clustering and many other enterprise-level buzzwords.

The most interesting is the third.

Right now I plan to test iSCSI over DRBD with multipath from the initiator to both iSCSI targets (I have never tested this, but it may be interesting); the alternative is corosync/pacemaker clustering for NFS/iSCSI + DRBD...

If I am understanding you correctly, I have tried this setup. Two iSCSI targets kept in sync using DRBD, with multi-path between the initiators and targets. This was replaced with the aforementioned solution when the hosts were upgraded. It only required two servers as opposed to four, no additional switches, there were fewer points of failure overall, and it removed the processing overhead/latency caused by the iSCSI layer.

I can imagine it would be of use in a situation where you had multiple initiators, but it would then run the risk of being the bottleneck.

I also tried an active/passive DRBD pair with iSCSI/multi-path, with fail-over managed by pacemaker/heartbeat. Write performance was marginally better, but the problem was ensuring that the fail-over worked as planned.

On 20.04.2011 02:10, Tim Titley wrote:
Has anyone considered a replacement for the vastsky storage backend now that the project is officially dead (at least for now)?

I have been looking at Ceph ( http://ceph.newdream.net/ ). One suggestion, for anyone inclined to do something about it, would be to use the RADOS block device (RBD) and put an LVM volume group on it. This would require modifying the current LVM storage manager code, I assume in a similar way to LVMOISCSI.

This would provide scalable, redundant storage at what I assume would be reasonable performance since the data can be striped across many storage nodes.

Development seems reasonably active and although the project is not officially production quality yet, it is part of the Linux kernel which looks promising, as does the news that they will be providing commercial support.

The only downside is that RBD requires a 2.6.37 kernel. For those "in the know" - how long will it be before this kernel makes it to XCP - considering that this vanilla kernel supposedly works in dom0 (I have yet to get it working)?

Any thoughts?



xen-api mailing list
