xen-devel

[Xen-devel] soft lockups during live migrate..

To: "Xen-Devel (E-mail)" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] soft lockups during live migrate..
From: Mukesh Rathor <mukesh.rathor@xxxxxxxxxx>
Date: Thu, 22 Oct 2009 21:21:49 -0700

While trying to migrate a 64-bit PV guest with 64GB of memory, running a
medium to heavy load on Xen 3.4.0, I am seeing a lot of soft lockups. The
soft lockups cause the cluster FS to reboot dom0. The hardware has 256GB
of memory and 32 CPUs.

Looking into the hypervisor through kdb, I see one CPU in sh_resync_all()
while the other 31 appear to be spinning on the shadow_lock. I vaguely
remember seeing a thread on this a while ago, but I can't seem to find it
with Google now. I'm trying to figure out what could be done in the short run.

Now that guests are getting bigger in memory, bugs of this nature are slowly
popping up under medium/heavy load. I've been thinking about what could be
done to address those in the long run. One idea is to create a certain class
of pages that, once migrated, are write-protected, so that any write faults
on them are resolved on the target system. Incidentally, IBM took the
reverse approach: the (VCPU) contexts are migrated and the pages are then
pulled in.
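
To make the "reverse approach" a bit more concrete, here is a toy sketch in
Python. It is not Xen code and it is not IBM's implementation; SourceHost,
TargetGuest, and the flat page numbering are hypothetical names made up for
illustration. The guest is assumed to be already running on the target, and
each page is pulled from the source the first time it is touched.

```python
"""Toy model of pull-style migration. Illustration only, not Xen code."""


class SourceHost:
    """Hypothetical source side: holds the pages until the target asks."""

    def __init__(self, pages):
        self.pages = dict(pages)  # page number -> contents

    def fetch(self, pfn):
        # Hand one page over to the target; the source copy is dropped.
        return self.pages.pop(pfn)


class TargetGuest:
    """Hypothetical target side: VCPU context is here, pages arrive later."""

    def __init__(self, source, num_pages):
        self.source = source
        self.num_pages = num_pages
        self.resident = {}  # pages already pulled in
        self.faults = 0     # accesses that had to wait for a pull

    def _pull(self, pfn):
        self.resident[pfn] = self.source.fetch(pfn)

    def _touch(self, pfn):
        # A first access to a missing page is resolved by pulling it in.
        if pfn not in self.resident:
            self.faults += 1
            self._pull(pfn)

    def read(self, pfn):
        self._touch(pfn)
        return self.resident[pfn]

    def write(self, pfn, value):
        self._touch(pfn)
        self.resident[pfn] = value

    def pull_remaining(self):
        # Background transfer of whatever the guest has not touched yet.
        for pfn in range(self.num_pages):
            if pfn not in self.resident:
                self._pull(pfn)


def demo():
    source = SourceHost({pfn: f"data-{pfn}" for pfn in range(8)})
    guest = TargetGuest(source, num_pages=8)
    guest.write(3, "new")    # first touch of page 3: pulled on fault
    first = guest.read(5)    # first touch of page 5: pulled on fault
    guest.write(3, "newer")  # page 3 is already local, no fault
    guest.pull_remaining()
    return guest.faults, first, guest.resident[3], len(source.pages)
```

In this model each page crosses over exactly once, no matter how often the
guest writes to it, so the source side never has to track dirty pages while
the guest runs. The cost moves to the target, where a first access to a
missing page has to wait for the pull.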


thanks,
Mukesh


