[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Race condition on device add hanling in xl devd



On Mon, Dec 17, 2018 at 05:09:19PM +0100, Roger Pau Monné wrote:
> On Mon, Dec 17, 2018 at 02:42:23PM +0000, Paul Durrant wrote:
> > > -----Original Message-----
> > > From: Roger Pau Monne
> > > Sent: 17 December 2018 14:32
> > > To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>
> > > Cc: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>; xen-
> > > devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>; Wei Liu <wei.liu2@xxxxxxxxxx>
> > > Subject: Re: [Xen-devel] Race condition on device add hanling in xl devd
> > > 
> > > On Mon, Dec 17, 2018 at 01:11:11PM +0000, Paul Durrant wrote:
> > > > > -----Original Message-----
> > > > > From: Roger Pau Monne
> > > > > Sent: 17 December 2018 13:06
> > > > > To: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> > > > > Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>; Wei Liu
> > > > > <wei.liu2@xxxxxxxxxx>; Paul Durrant <Paul.Durrant@xxxxxxxxxx>
> > > > > Subject: Re: [Xen-devel] Race condition on device add hanling in xl
> > > devd
> > > > >
> > > > > On Mon, Dec 17, 2018 at 01:23:15PM +0100, Marek Marczykowski-Górecki
> > > > > wrote:
> > > > > > On Mon, Dec 17, 2018 at 01:18:55PM +0100, Roger Pau Monné wrote:
> > > > > > > On Mon, Dec 17, 2018 at 01:00:01PM +0100, Marek Marczykowski-
> > > Górecki
> > > > > wrote:
> > > > > > > > On Mon, Dec 17, 2018 at 10:40:59AM +0100, Roger Pau Monné wrote:
> > > > > > > > > On Sun, Dec 16, 2018 at 02:47:43AM +0100, Marek Marczykowski-
> > > > > Górecki wrote:
> > > > > > > > > > A workaround could be implemented in hotplug script itself -
> > > > > wait for
> > > > > > > > > > the device there. I'm not sure how proper solution could
> > > look
> > > > > like. Some
> > > > > > > > > > synchronization between xl devd and the kernel (like xl devd
> > > > > monitoring
> > > > > > > > > > uevents)?
> > > > > > > > >
> > > > > > > > > There's already a synchronization mechanism, libxl waits for
> > > the
> > > > > > > > > backend to switch to state 2 (XenbusStateInitWait) before
> > > running
> > > > > the
> > > > > > > > > hotplug scripts [0].
> > > > > > > > >
> > > > > > > > > Maybe netback sets state 2 before creating the backend device?
> > > > > > > > >
> > > > > > > > > It looks to me like the backend needs to be sure everything
> > > needed
> > > > > by
> > > > > > > > > the hotplug script is in place before switching to state 2.
> > > > > > > >
> > > > > > > > I've done some more tests and I think that's something else.
> > > I've
> > > > > added
> > > > > > > > a loop waiting for /sys/class/net/$vif to a hotplug script, but
> > > it
> > > > > timed
> > > > > > > > out (5s). I don't see _any_ kernel messages related to the
> > > device.
> > > > > > > >
> > > > > > > > It may be some bug in nested virtualization in KVM...
> > > > > > >
> > > > > > > In your message you said you have also observed this behavior when
> > > > > > > running on bare metal, so it's likely not related to nested
> > > > > > > virtualization?
> > > > > >
> > > > > > Yes, but on bare metal is so hard to reproduce (like 0.1% or even
> > > less
> > > > > > startups), I'm not really sure if that was the same problem, as the
> > > > > > problem doesn't leave that much logs...
> > > > >
> > > > > I'm not very familiar with netback, but I think it's indeed possible
> > > > > for netback to switch to state 2 without having created the vif.
> > > > > Netback switching from state 1 -> 2 seems to be solely controlled by
> > > > > the frontend state (see frontend_changed).
> > > > >
> > > > > I think the patch below could solve this issue, but I haven't even
> > > > > compile tested it, could you give it a spin?
> > > > >
> > > > > I would also like to hear the opinion of netback maintainers, since I
> > > > > might be completely wrong.
> > > >
> > > > IIRC there is a good reason why netback doesn't want the hotplug script
> > > to run before moving into state 2... the script adds the vif to the bridge
> > > and, if this is done on the 1 -> 2 transition then you may end up with a
> > > load of vifs sat on the bridge for which there is no frontend (at least
> > > yet, but maybe never)... so the bridge wastes time in every packet sent to
> > > such a vif.
> > > 
> > > I don't think netback has ever waited for a frontend before running
> > > hotplug scripts.
> > > 
> > > In the udev times the hotplug script would be run upon vif creation,
> > > which happens in netback_probe, and when launching hotplug scripts
> > > from libxl the script is executed when the backend changes to state 2,
> > > which happens almost immediately because netback switches to state 2
> > > when the frotnend is in state 1 which is the initial frontend state.
> > > 
> > 
> > I suspect I must be remembering a XenServer-specific hack^Wpatch then. I'd 
> > have to dig... it's been a while since I messed with the netif state model, 
> > which is of course different the blkif state model.
> 
> Quite likely. With udev scripts is was feasible to only execute
> hotplug scripts for vifs with an attached frontend.
> 
> With libxl this is not possible, since hotplug scripts are run during
> domain creation, at which point the guest is completely paused.
> 
> I'm not that familiar with bridges and vifs, but maybe the vifs status
> can be set to offline until there's a frontend attached in order to
> reduce the bridge distributor load? (if that's not already the case).

I've found was the problem, and with some definition of "race condition"
it could be named this way.
The problem is that for some reason xenstore watch on device add
sometimes does not fire in xl devd. But then, when libxl in dom0
timeouts and remove the device, the xenstore watch in xl devd fire and
hotplug script is called. At this point device is already gone, so
it fails. xl devd then quickly calls hotplug script the second time, for
device removal.

I have no idea why this xenstore watch do not fire, but triggering a
no-op write into watched path (to trigger the watch again) workarounds
the problem. I use a xenstore watch in dom0 for that[1] - which works.
I suspect something related to KVM nested virtualization (lost
interrupt?)...

[1] 
https://github.com/marmarek/openqa-tests-qubesos/blob/3e604b521eb4d4e1b8ff40ad9e278d63d9a3baa3/extra-files/system-tests/xenstore-watch-trigger.py
      

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

Attachment: signature.asc
Description: PGP signature

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.