WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-devel

Re: [Xen-devel] pci device hotplug, race accessing xenstore

To: Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>
Subject: Re: [Xen-devel] pci device hotplug, race accessing xenstore
From: Simon Horman <horms@xxxxxxxxxxxx>
Date: Thu, 15 Oct 2009 09:49:19 +1100
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Phung Te Ha <phungte@xxxxxxxxx>
Delivery-date: Wed, 14 Oct 2009 15:49:44 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <alpine.DEB.2.00.0910141351140.11134@kaball-desktop>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <f6cf36180910131213x21218f2am7ed55c0a8a381312@xxxxxxxxxxxxxx> <alpine.DEB.2.00.0910141351140.11134@kaball-desktop>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.20 (2009-06-14)

On Wed, Oct 14, 2009 at 02:34:35PM +0100, Stefano Stabellini wrote:
> On Tue, 13 Oct 2009, Phung Te Ha wrote:
> > Hello Simon,
> > 
> > I took the source as per your message:
> > http://marc.info/?l=xen-devel&m=124748015304566&w=4
> > 
> > compiled it, and ran it on an Intel-DQ35JO with Fedora-10.
> > 
> > When I try to pass a PCI device through at boot time via the configuration
> > file, there is a race between xend and qemu accessing xenstore.
> > 
> > Xend waits in signalDeviceModel(...) for qemu to declare 'running', then
> > writes the devices to be passed through to the dm-command pipe.
> > 
> > On the qemu side, it sets a watch on
> > /local/domain/0/device-model/2/command by calling xs_watch(...) and expects
> > the dm-command to arrive there. xs_watch(...) causes xenstored to run
> > do_watch(...), which at the end runs add_event(...) with the following
> > comment:
> >           /* We fire once up front: simplifies clients and restart. */
> > 
> > 
> > The problem shows when xend is faster: it detects qemu's 'running' state,
> > calls xstransact.Store, and writes to the command pipe before qemu can call
> > main_loop_wait(...) and run one empty loop on the command pipe. This write
> > causes xenstored to run fire_watches(...), and thus another add_event(...).
> > In the qemu log, the problem shows up as an extra dm-command that uses the
> > wrong parameter and fails to initialize, for instance:
> > 
> > 
> > ...
> > xs_read_watch: msg type 15 body /local/domain/0/device-model/3/command
> > read_message: msg type reply pci-ins
> > dm-command: hot insert pass-through pci dev
> > read_message: msg type reply 0000:00:1b.0@100
> > register_real_device: Assigning real physical device 00:1b.0 ...
> > pt_register_regions: IO region registered (size=0x00004000 
> > base_addr=0x90420004)
> > pt_msi_setup: msi mapped with pirq ff
> > register_real_device: Real physical device 00:1b.0 registered successfuly!
> > IRQ type = MSI-INTx
> > read_message: msg type reply OK
> > read_message: msg type reply OK
> > xs_read_watch: msg type 15 body /local/domain/0/device-model/3/command
> > read_message: msg type reply pci-ins
> > dm-command: hot insert pass-through pci dev
> > read_message: msg type reply 0x20
> > hot add pci devfn -1 exceed.
> > read_message: msg type reply OK
> > ...
> > 
> > On the xend side:
> > 
> > ...
> >     (bdf_str, vdevfn))
> > VmError: Cannot pass-through PCI function '0000:00:1b.0@100'. Device model 
> > reported an error: no free hotplug devfn
> > [2009-10-13 10:45:10 4174] ERROR (XendDomainInfo:471) VM start failed
> > Traceback (most recent call last):
> > ...
> > 
> > 
> 
> I think we should take this chance to make the pci-insert protocol more
> reliable.
> In particular we are missing the following things:
> 
> - qemu shouldn't accept any dm-command unless it is in state "running";
> 
> - xend should remove the command node from xenstore after reading
> state "pci-inserted" and before writing state "running" again.
> 
> This way when the second xenstore watch fires the pci-ins command is
> never executed for a second time because either qemu is not in the right
> state (pci-inserted instead of running) or the command node doesn't
> contain any data (it has been removed by xend).

My memory of that code is a bit hazy, but that sounds like a good idea.
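
For illustration only, here is a minimal sketch of the race and of the two
proposed guards. It is a toy model, not xend or qemu code: FakeStore is a
hypothetical in-memory stand-in for xenstore, the "parameter" and "state" node
names are assumptions, and so is the idea that qemu writes the assigned slot
("0x20") back into the parameter node, which would explain the second pci-ins
in the log reading 0x20.

```python
BASE = "/local/domain/0/device-model/3"
CMD, PARAM, STATE = BASE + "/command", BASE + "/parameter", BASE + "/state"


class FakeStore:
    """Hypothetical in-memory stand-in for xenstore (not the real API)."""

    def __init__(self):
        self.nodes = {}
        self.watches = {}  # path -> list of event queues

    def watch(self, path, queue):
        self.watches.setdefault(path, []).append(queue)
        queue.append(path)  # fire once up front, as do_watch() does

    def read(self, path):
        return self.nodes.get(path)

    def write(self, path, value):
        self.nodes[path] = value
        self._fire(path)

    def rm(self, path):
        self.nodes.pop(path, None)
        self._fire(path)  # removing a node also fires its watches

    def _fire(self, path):
        for queue in self.watches.get(path, []):
            queue.append(path)


def qemu_handle_event(store, executed, guarded):
    """Toy version of qemu's dm-command handler."""
    command = store.read(CMD)
    if guarded and (store.read(STATE) != "running" or command is None):
        return  # proposed guard: wrong state or empty command node
    if command == "pci-ins":
        executed.append(store.read(PARAM))
        store.write(PARAM, "0x20")  # assumed: qemu reports the slot here
        store.write(STATE, "pci-inserted")


def run(guarded):
    store, events, executed = FakeStore(), [], []
    store.write(STATE, "running")
    store.watch(CMD, events)  # event 1: fired up front
    # xend wins the race and writes before qemu drains event 1.
    store.write(PARAM, "0000:00:1b.0@100")
    store.write(CMD, "pci-ins")  # event 2
    while events:  # qemu finally reaches its main loop
        events.pop(0)
        qemu_handle_event(store, executed, guarded)
        if guarded and store.read(STATE) == "pci-inserted":
            store.rm(CMD)  # proposed xend step, before "running"
            store.write(STATE, "running")
    return executed


print(run(guarded=False))  # pci-ins runs twice, second time on "0x20"
print(run(guarded=True))   # pci-ins runs once
```

In this sequential toy the command-node removal alone is enough to suppress
the second execution; the state check covers the window in which xend has not
yet removed the node.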

> Another problem is that nothing else can happen while xend waits for the
> device model to be in state running; this also prevents pci coldplug
> from working with stubdoms.
> Is it possible to run signalDeviceModel in a new xend Thread?
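
As a rough sketch of what moving the wait into a thread could look like: the
function below is a hypothetical stand-in for signalDeviceModel, not the xend
implementation, and the get_state/send_command callables, the timeout, and the
polling interval are all assumptions. It is written for current Python 3
rather than the Python 2 that xend used.

```python
import threading
import time


def signal_device_model(get_state, send_command, timeout=10.0, poll=0.05):
    """Hypothetical stand-in: wait for 'running', then send the command."""
    deadline = time.monotonic() + timeout
    while get_state() != "running":
        if time.monotonic() > deadline:
            raise TimeoutError("device model did not reach 'running'")
        time.sleep(poll)
    send_command("pci-ins")


def signal_in_background(get_state, send_command):
    """Run the blocking wait in a worker thread so the caller stays free."""
    result = {}

    def worker():
        try:
            signal_device_model(get_state, send_command)
            result["ok"] = True
        except Exception as exc:  # hand the failure back to the caller
            result["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread, result


# Usage: the caller keeps working while the worker waits for 'running'.
state = {"value": "starting"}
sent = []
thread, result = signal_in_background(lambda: state["value"], sent.append)
state["value"] = "running"  # e.g. the device model finishes starting up
thread.join(2.0)
print(sent, result)
```

Any error raised inside the worker has to be carried back explicitly (here via
the result dict), since an exception in a thread does not propagate to the
code that started it.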

I'm interested to hear a comment on what the status of the OCaml
replacement for xend is. It seems silly to spend time fixing up the
Python code - there is ample scope for fixing - if a replacement
is in the wings. In particular, I'm referring to the toolstack.git
XCI tree.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel