
Re: [Xen-devel] [PATCH 20 of 29 RFC] libxl: introduce libxl hotplug public API functions



On Thu, 9 Feb 2012, Ian Campbell wrote:
> On Thu, 2012-02-09 at 16:18 +0000, Stefano Stabellini wrote:
> > On Thu, 9 Feb 2012, Ian Campbell wrote:
> > > On Thu, 2012-02-09 at 16:00 +0000, Stefano Stabellini wrote:
> > > > On Thu, 9 Feb 2012, Ian Campbell wrote:
> > > > > On Thu, 2012-02-09 at 15:32 +0000, Stefano Stabellini wrote:
> > > > > > On Thu, 9 Feb 2012, Ian Jackson wrote:
> > > > > > > Stefano Stabellini writes ("Re: [Xen-devel] [PATCH 20 of 29 RFC] 
> > > > > > > libxl: introduce libxl hotplug public API functions"):
> > > > > > > > - we can reuse the "state" based mechanism to establish a connection:
> > > > > > > > again not a great protocol, but very well known and understood.
> > > > > > > 
> > > > > > > I don't think we have, in general, a good understanding of these
> > > > > > > "state" based protocols ...
> > > > > > 
> > > > > > What?! We have netback, netfront, blkback, blkfront, pciback, pcifront,
> > > > > > kbdfront, fbfront, xenconsole, and these are only the ones in Linux!!
> > > > > 
> > > > > And no one I know is able to describe, accurately, exactly what the
> > > > > state diagram for even one of those actually looks like or indeed should
> > > > > look like. It became quite evident in these threads about hotplug script
> > > > > handling etc that no one really knows for sure what (is supposed to)
> > > > > happens when.
> > > > 
> > > > I thought that most of the thread was about the interface with the block
> > > > scripts, that is an entirely different matter and completely obscure.
> > > > If I am mistaken, please point me at the right email.
> > > 
> > > We are talking about reusing the existing xenbus state machine schema
> > > for a new purpose. Ian J pointed out that these are not generally well
> > > understood, you replied that it was and cited some examples. I pointed
> > > out why these were not examples of why this stuff was well understood at
> > > all, in fact quite the opposite.
> > 
> > Sorry but I don't understand how these examples are supposed to be
> > "quite the opposite".
> > I quite like the idea of being able to read a single source file of less
> > than 400 LOC to understand how a protocol works
> > (drivers/input/misc/xen-kbdfront.c).
> 
> That is not a protocol specification, merely one implementation of it.
> What does the BSD driver do? Is it exactly the same as Linux? Should BSD
> driver authors be expected to reverse engineer the protocol from the
> Linux code? What/who arbitrates when the two behave differently?

The lack of documentation is an issue.


> > In fact I don't think that understanding the protocol has been an issue
> > for the GSoC student that had to write a new one.
> 
> Being able to reverse engineer something which works is not proof that
> these things are "well understood" in the general case.
> 
> > I think we are under the influence of a "reinventing the wheel" virus.
> 
> I think we are in danger of making the same mistakes again as have been
> made with the device protocols and this is what I want to avoid.
> 
> Now, perhaps this style of state machine protocol is a reasonable design
> choice in this case, but since we are starting afresh here this specific
> new instance should be well documented _up_front_ not left in the "oh,
> just read the Linux code" state we have now for many of our devices
> which has led to multiple slightly divergent implementations of the
> same basic concept.

I agree.


> > > > > Justin just posted a good description for blkif.h which included a state
> > > > > machine description. We need the same for pciif.h, netif.h etc etc.
> > > >  
> > > > The state machine is the same for block and network.
> > > 
> > > No, it's not. This is exactly what IanJ and I are talking about.
> > 
> > Could you please elaborate?
> > 
> > I am sure you know that the xenstore state machine is handled the same
> > way for all the backends in QEMU (see hw/xen_backend.c).
> > And the same thing is true for the frontends and the backends in Linux.
> 
> A substantial proportion of the threads about this hotplug script stuff
> has been about the fact that no one is quite sure what really happens
> when for all implementations nor what the common semantics are.
> 
> e.g. How do you ask a backend to shut down (do you set it to state 5?
> state 6? do you nuke the xenstore dir?). Neither is anyone sure when the
> correct point to call the hotplug scripts actually is, or even what
> actually happens with them right now across the different backend
> drivers or kernel types.

Yeah, this needs to be documented in advance.
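
For reference on the "state 5" versus "state 6" question above: those numbers are values of the XenbusState enumeration in Xen's public header xen/include/public/io/xenbus.h, so 5 is Closing and 6 is Closed. A minimal C sketch of that mapping follows. The enum values mirror the header; the helper xenbus_state_name() is a hypothetical name introduced here for illustration only and is not part of any Xen API.

```c
/* Values mirror enum xenbus_state in xen/include/public/io/xenbus.h. */
enum xenbus_state {
    XenbusStateUnknown       = 0,
    XenbusStateInitialising  = 1,
    XenbusStateInitWait      = 2,
    XenbusStateInitialised   = 3,
    XenbusStateConnected     = 4,
    XenbusStateClosing       = 5,
    XenbusStateClosed        = 6,
    XenbusStateReconfiguring = 7,
    XenbusStateReconfigured  = 8
};

/* Hypothetical helper (illustration only): number -> readable name. */
static const char *xenbus_state_name(int state)
{
    switch (state) {
    case XenbusStateUnknown:       return "Unknown";
    case XenbusStateInitialising:  return "Initialising";
    case XenbusStateInitWait:      return "InitWait";
    case XenbusStateInitialised:   return "Initialised";
    case XenbusStateConnected:     return "Connected";
    case XenbusStateClosing:       return "Closing";
    case XenbusStateClosed:        return "Closed";
    case XenbusStateReconfiguring: return "Reconfiguring";
    case XenbusStateReconfigured:  return "Reconfigured";
    default:                       return "(invalid)";
    }
}
```

The mapping only names the states. It says nothing about which of them a toolstack should write to request a shutdown, and that is the part that still needs to be documented.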


> The actual state transitions which netback and blkback go through are
> not the same: the netback protocol uses InitWait and the blkback one does
> not, or is it vice versa? I can't remember and it isn't documented. Some
> Linux frontends handled the kexec reconnect sequencing differently, by
> disconnecting or reconnecting the actual underlying devices at subtly
> different times and/or handling the transition from Closing back to Init
> or InitWait differently.

Interesting... However, these differences are only about when netfront or
blkfront decide to bring up the interface/device; they don't affect the
protocol itself, I think. What I mean is that from the toolstack POV we
don't need to worry about this.
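
To make "documented up front" concrete, one option is to write the allowed transitions of the new protocol down as a table that every implementation can be checked against. The C sketch below shows the idea only. The state numbers match XenbusState in xen/include/public/io/xenbus.h, but the ST_* names, the allowed[] table and transition_ok() are hypothetical, and the edges in the table are invented for illustration: they are not taken from netback, blkback or any other existing driver.

```c
#include <stdbool.h>

/* Numeric values match XenbusState; the ST_* names are hypothetical. */
enum {
    ST_INITIALISING = 1,
    ST_INITWAIT     = 2,
    ST_INITIALISED  = 3,
    ST_CONNECTED    = 4,
    ST_CLOSING      = 5,
    ST_CLOSED       = 6,
    ST_COUNT        = 7
};

/*
 * HYPOTHETICAL transition table: allowed[from] is a bitmask of the
 * states that may follow "from". The edges are made up for this
 * example and do not describe any real Xen protocol.
 */
static const unsigned allowed[ST_COUNT] = {
    [ST_INITIALISING] = (1u << ST_INITWAIT) | (1u << ST_INITIALISED) |
                        (1u << ST_CLOSING),
    [ST_INITWAIT]     = (1u << ST_INITIALISED) | (1u << ST_CLOSING),
    [ST_INITIALISED]  = (1u << ST_CONNECTED) | (1u << ST_CLOSING),
    [ST_CONNECTED]    = (1u << ST_CLOSING),
    [ST_CLOSING]      = (1u << ST_CLOSED),
    [ST_CLOSED]       = (1u << ST_INITIALISING),
};

/* Returns true if the table permits moving from "from" to "to". */
static bool transition_ok(int from, int to)
{
    if (from < 1 || from >= ST_COUNT || to < 1 || to >= ST_COUNT)
        return false;
    return ((allowed[from] >> to) & 1u) != 0;
}
```

With a table like this in the protocol header, a question such as "may a backend go from Closing back to InitWait?" has one written answer (here transition_ok(ST_CLOSING, ST_INITWAIT) is false) instead of depending on which driver's source one happens to read.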



 

