xen-devel

[Xen-devel] [PATCH] Remus: increase failover timeout from 500ms to 1s

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-devel] [PATCH] Remus: increase failover timeout from 500ms to 1s
From: Brendan Cully <brendan@xxxxxxxxx>
Date: Thu, 11 Feb 2010 12:09:17 -0800
User-agent: Mercurial-patchbomb/1.4.3+93-1a8df80dfdde
# HG changeset patch
# User Brendan Cully <brendan@xxxxxxxxx>
# Date 1265918930 28800
# Node ID 986a4052bb406856c575597fa923ae5abc7bbe2c
# Parent  83a6621b91bffebdb8696a04b711b4689ee08170
Remus: increase failover timeout from 500ms to 1s

500ms is aggressive enough to trigger split-brain under fairly ordinary
workloads, particularly for HVM. The long-term fix is to integrate with
a real HA monitor like Linux-HA.

Signed-off-by: Brendan Cully <brendan@xxxxxxxxx>

diff --git a/tools/blktap2/drivers/block-remus.c b/tools/blktap2/drivers/block-remus.c
--- a/tools/blktap2/drivers/block-remus.c
+++ b/tools/blktap2/drivers/block-remus.c
@@ -59,7 +59,7 @@
 #include <unistd.h>
 
 /* timeout for reads and writes in ms */
-#define NET_TIMEOUT 500
+#define HEARTBEAT_MS 1000
 #define RAMDISK_HASHSIZE 128
 
 /* connect retry timeout (seconds) */
@@ -604,8 +604,8 @@
        int rc;
        size_t cur = 0;
        struct timeval tv = {
-               .tv_sec = 0,
-               .tv_usec = NET_TIMEOUT * 1000
+               .tv_sec = HEARTBEAT_MS / 1000,
+               .tv_usec = (HEARTBEAT_MS % 1000) * 1000
        };
 
        if (!len)
@@ -649,8 +649,8 @@
        size_t cur = 0;
        int rc;
        struct timeval tv = {
-               .tv_sec = 0,
-               .tv_usec = NET_TIMEOUT * 1000
+               .tv_sec = HEARTBEAT_MS / 1000,
+               .tv_usec = (HEARTBEAT_MS % 1000) * 1000
        };
 
        if (!len)
diff --git a/tools/libxc/xc_domain_restore.c b/tools/libxc/xc_domain_restore.c
--- a/tools/libxc/xc_domain_restore.c
+++ b/tools/libxc/xc_domain_restore.c
@@ -444,7 +444,7 @@
 /* set when a consistent image is available */
 static int completed = 0;
 
-#define HEARTBEAT_MS 500
+#define HEARTBEAT_MS 1000
 
 #ifndef __MINIOS__
 static ssize_t read_exact_timed(int fd, void* buf, size_t size)
@@ -458,8 +458,8 @@
     {
         if ( completed ) {
             /* expect a heartbeat every HEARBEAT_MS ms maximum */
-            tv.tv_sec = 0;
-            tv.tv_usec = HEARTBEAT_MS * 1000;
+            tv.tv_sec = HEARTBEAT_MS / 1000;
+            tv.tv_usec = (HEARTBEAT_MS % 1000) * 1000;
 
             FD_ZERO(&rfds);
             FD_SET(fd, &rfds);
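
Note on the timeval arithmetic: the hunks above replace "tv_usec = <ms> * 1000" with a seconds/microseconds split. At 1000 ms the old form would store 1000000 in tv_usec, and as far as I understand POSIX, tv_usec is expected to stay below 1000000 (some calls may reject larger values). The following is a minimal sketch of the same split as a standalone function. The name ms_to_timeval is hypothetical and does not appear in the patch; only the two expressions are taken from it.

```c
#include <assert.h>
#include <sys/time.h>

/* Hypothetical helper, not part of the patch: split a millisecond
 * count into whole seconds plus leftover microseconds, so that
 * tv_usec always stays in the range 0..999999. */
static struct timeval ms_to_timeval(long ms)
{
    struct timeval tv;

    assert(ms >= 0);                 /* negative timeouts are not handled here */
    tv.tv_sec = ms / 1000;           /* whole seconds */
    tv.tv_usec = (ms % 1000) * 1000; /* remainder, converted from ms to us */
    return tv;
}
```

With this split, 500 ms yields {0, 500000} and 1000 ms yields {1, 0}, so the result can be handed to select() or a similar call for any non-negative timeout without revisiting the arithmetic.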