
[Xen-devel] Starting many VMs concurrently causes vif connect errors


  • To: <xen-devel@xxxxxxxxxxxxxxxxxxx>
  • From: "Carb, Brian A" <Brian.Carb@xxxxxxxxxx>
  • Date: Wed, 12 Sep 2007 12:24:38 -0500
  • Cc: "Carb, Brian A" <Brian.Carb@xxxxxxxxxx>
  • Delivery-date: Wed, 12 Sep 2007 10:25:15 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>
  • Thread-index: Acf1Ycu9c/IJDLu0QoCsw8hrC8FveQ==
  • Thread-topic: Starting many VMs concurrently causes vif connect errors

Until now, we have been starting a large number of VMs with a simple loop:
    "for i in vmcfg* ; do xm create $i; done"
This starts the VMs sequentially, but with some concurrency. The domain structure is created and unpaused, and "xm create" completes. While the next iteration of the loop is running "xm create" again, Xen is busy scheduling and running the domains that have already started, each of which boots its OS.
 
Since the total elapsed time to start a large number of VMs in this way is significant, we looked for another way to start them. If we change the script to:
    "for i in vmcfg* ; do xm create $i & done"
then many background tasks are spawned, each running "xm create", and this has an interesting effect. When we monitor with "xm top", we notice that all the domain structures are created first, and only then does Xen begin scheduling the domains so that they boot. This works well for 50 VMs.
 
However, when I try to start 100 VMs with this script, all the domain structures get created, but then every one of them is dismantled and removed, each reporting "Error: Device 0 (vif) could not be connected. Hotplug scripts not working".
 
I've verified that, in fact, networking is still working, because after all the VMs are destroyed, I can then start one or two without error.
 
Any idea what's breaking? Should I try changing the value of DEVICE_CREATE_TIMEOUT in tools/python/xen/xend/server/DevController.py? If so, what would I have to rebuild to use the changed file? Or is there some networking component that gets overloaded? I have no problem starting the 100 VMs sequentially.
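
A possible middle ground between the two loops above is to keep the background starts but cap how many "xm create" processes run at once. The following is only a sketch under stated assumptions, not a confirmed fix for the vif error: the names BATCH, CREATE_CMD and start_in_batches are hypothetical, and the batch size of 25 is an arbitrary guess based only on the observation that 50 concurrent starts worked and 100 did not.

```shell
# Sketch: start VMs in the background, at most BATCH at a time.
# BATCH, CREATE_CMD and start_in_batches are hypothetical names.

BATCH=${BATCH:-25}                    # assumed value, tune for the host
CREATE_CMD=${CREATE_CMD:-xm create}   # overridable, e.g. "echo" for a dry run

start_in_batches() {
    count=0
    for cfg in "$@"; do
        $CREATE_CMD "$cfg" &          # same background start as the one-liner
        count=$((count + 1))
        if [ $((count % BATCH)) -eq 0 ]; then
            wait                      # wait for every create in this batch to return
        fi
    done
    wait                              # collect the final, possibly partial, batch
}
```

It would be invoked as "start_in_batches vmcfg*". Note that "wait" only waits for the "xm create" commands to return, not for the guests to finish booting, so the domains in one batch may still be booting while the next batch is created.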

brian carb
unisys corporation - malvern, pa

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 

