
[Xen-devel] bug in evtchn_cpu_notify (was domains not shutting down properly-theproblemisbackagain)


  • To: "James Harper" <james.harper@xxxxxxxxxxxxxxxx>, "Keir Fraser" <keir.fraser@xxxxxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
  • From: "James Harper" <james.harper@xxxxxxxxxxxxxxxx>
  • Date: Sat, 3 Jan 2009 17:29:27 +1100
  • Cc:
  • Delivery-date: Fri, 02 Jan 2009 22:30:21 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>
  • Thread-topic: bug in evtchn_cpu_notify (was domains not shutting down properly-theproblemisbackagain)

> 
> > > Perhaps multiprocessor dom0, plus a bug in the dom0 kernel which
> > > means that the VCPU which Xen notifies for the virq is not the one
> > > which dom0 kernel is expecting to receive the notification to?
> > > What do you use as dom0 kernel?
> > >
> >
> > That appears to be the problem.
> >
> > 1. xenstore starts up and binds VIRQ_DOM_EXC to port 18
> > 2. xend starts and sets the number of cpus to 1 (dom0-cpus = 1)
> > 3. xen notifies xenstore on port=18, vcpu=1, but vcpu 1 doesn't
> >    exist anymore so the event never gets anywhere
> >
> > The curious thing is that IOCTL_EVTCHN_BIND_VIRQ explicitly sets
> > vcpu = 0, so why is the event getting delivered to vcpu 1???
> >
> 
> Something is making a call to evtchn_bind_vcpu...
> 

I think I've figured out what is going on. The 'per user data' in
drivers/xen/evtchn/evtchn.c is per connection to the event channel
device, so the same 'per user data' may be shared by multiple ports.

Initially, all the event channels opened by xenstored (e.g. 17 and 18)
have '1' in the bind_vcpu field of their user data, indicating that
ports on that connection are bound to vcpu 1.

In evtchn_cpu_notify(CPU_DOWN_PREPARE) (called when xend starts and
reduces the number of CPUs in dom0 to 1), every port is looped through.
Port 17 is found to be bound to vcpu 1 (via the per-user data), which is
about to go away, so the port is rebound to vcpu 0 and the user data is
updated to reflect the new vcpu (I only have 2 CPUs, so it is set to 0
as 1 is going away). Port 18 is checked next, but because the shared
per-user data has already been updated to vcpu=0, nothing is done and
the port stays bound to vcpu 1.
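
To make the sequence above concrete, here is a small user-space model
of it. This is a sketch, not the driver code: the struct layout, the
arrays and every model_* name are made up for illustration, and the
"hypervisor" binding is just an array. The only thing taken from the
description above is the logic: one bind_vcpu value shared by all ports
on a connection, and a loop that decides per port using that shared
value.

```c
#include <stdio.h>

#define NR_PORTS 32

/* Hypothetical stand-in for the driver's per-connection state. */
struct per_user_data {
    unsigned int bind_vcpu;   /* one value shared by every port on the connection */
};

static struct per_user_data *port_user[NR_PORTS]; /* port -> owning connection */
static unsigned int hv_bound_vcpu[NR_PORTS];      /* where events are really delivered */

/* Bind a port on connection u to a vcpu and record it in the shared data. */
void model_bind(struct per_user_data *u, int port, unsigned int vcpu)
{
    port_user[port] = u;
    hv_bound_vcpu[port] = vcpu;
    u->bind_vcpu = vcpu;
}

/* Models the CPU_DOWN_PREPARE loop: the per-port decision is made from
 * the shared per-user value, which the first rebind overwrites. */
void model_cpu_down_prepare(unsigned int dying, unsigned int target)
{
    for (int port = 0; port < NR_PORTS; port++) {
        struct per_user_data *u = port_user[port];
        if (u == NULL || u->bind_vcpu != dying)
            continue;               /* looks like nothing to do for this port */
        hv_bound_vcpu[port] = target;
        u->bind_vcpu = target;      /* hides the connection's remaining ports */
    }
}

/* Replays the xenstored case: ports 17 and 18 on one connection, vcpu 1
 * going away. Returns the vcpu that port 18 is left bound to. */
unsigned int model_stranded_vcpu(void)
{
    static struct per_user_data xenstored;

    model_bind(&xenstored, 17, 1);
    model_bind(&xenstored, 18, 1);
    model_cpu_down_prepare(1, 0);

    printf("port 17 -> vcpu %u, port 18 -> vcpu %u\n",
           hv_bound_vcpu[17], hv_bound_vcpu[18]);
    return hv_bound_vcpu[18];
}
```

Calling model_stranded_vcpu() from a main() prints
"port 17 -> vcpu 0, port 18 -> vcpu 1": port 17 is moved, port 18 is
skipped because the shared value already reads 0.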

I'll try to come up with a solution when I get back to my computer in a
few hours, if nobody beats me to it. Is there another way to check which
vcpu a port is bound to, other than checking the per-user value of
bind_vcpu?
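
One possible direction, shown as a user-space model rather than a
patch: keep the bound vcpu per port instead of per connection, so that
rebinding one port cannot change what the loop sees for another. All
pp_* names and the array below are hypothetical and do not come from
evtchn.c; whether the driver can cheaply keep such a table is an open
question.

```c
#include <stdio.h>

#define PP_NR_PORTS 32
#define PP_UNBOUND  (-1)

/* Hypothetical per-port record of the vcpu each port is bound to. */
static int pp_port_vcpu[PP_NR_PORTS];

/* Mark every port as unbound. */
void pp_init(void)
{
    for (int port = 0; port < PP_NR_PORTS; port++)
        pp_port_vcpu[port] = PP_UNBOUND;
}

/* Record the binding for a single port only. */
void pp_bind(int port, int vcpu)
{
    pp_port_vcpu[port] = vcpu;
}

/* Move every port bound to the dying vcpu onto target.
 * Returns the number of ports that were rebound. */
int pp_cpu_down_prepare(int dying, int target)
{
    int moved = 0;

    for (int port = 0; port < PP_NR_PORTS; port++) {
        if (pp_port_vcpu[port] != dying)
            continue;
        pp_port_vcpu[port] = target;   /* affects this port only */
        moved++;
    }
    return moved;
}

/* Same scenario as in the report: ports 17 and 18 on vcpu 1, vcpu 1 goes away. */
int pp_demo(void)
{
    pp_init();
    pp_bind(17, 1);
    pp_bind(18, 1);

    int moved = pp_cpu_down_prepare(1, 0);

    printf("moved %d ports: 17 -> vcpu %d, 18 -> vcpu %d\n",
           moved, pp_port_vcpu[17], pp_port_vcpu[18]);
    return moved;
}
```

In this model both ports end up on vcpu 0, because the decision for
port 18 no longer depends on a value that handling port 17 overwrote.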

James

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

