
Re: request for feedback on a Xen/Linux compatibility issue



On Thu, 6 Jan 2022, Julien Grall wrote:
> On 06/01/2022 14:03, Jan Beulich wrote:
> > On 06.01.2022 08:13, Juergen Gross wrote:
> > > On 06.01.22 01:40, Stefano Stabellini wrote:
> > > > Hi all,
> > > > 
> > > > Today Xen dom0less guests are not "Xen aware": the hypervisor node
> > > > (compatible = "xen,xen") is missing from dom0less domUs device trees and
> > > > as a consequence Linux initializes as if Xen is not present. The reason
> > > > is that interfaces like grant table and xenstore (xenbus in Linux) don't
> > > > work correctly in a dom0less environment at the moment.
> > > > 
> > > > The good news is that I have patches for Xen to implement PV drivers
> > > > support for dom0less guests. They also add the hypervisor node to device
> > > > tree for dom0less guests so that Linux can discover the presence of Xen
> > > > and related interfaces.
> > > > 
> > > > When the Linux kernel is booting as dom0less kernel, it needs to delay
> > > > the xenbus initialization until the interface becomes ready. Attempts to
> > > > initialize xenbus straight away lead to failure, which is fine because
> > > > xenbus has never worked in Linux when running as dom0less guest up until
> > > > now. It is reasonable that a user needs a newer Linux to take advantage
> > > > of dom0less with PV drivers. So:
> > > > 
> > > > - old Xen + old/new Linux -> Xen not detected in Linux
> > > > - new Xen + old Linux     -> xenbus fails to initialize in Linux
> > > > - new Xen + new Linux     -> dom0less PV drivers working in Linux
> > > > 
> > > > 
> > > > The problem is that Linux until recently couldn't deal with any errors
> > > > in xenbus initialization. Instead of returning error and continuing
> > > > without xenbus, Linux would crash at boot.
> > > > 
> > > > I upstreamed two patches for Linux xenbus_probe to be able to deal with
> > > > initialization errors. With those two fixes, Linux can boot as a
> > > > dom0less kernel with the hypervisor node in device tree. The two fixes
> > > > got applied to master and were already backported to all the supported
> > > > Linux stable trees, so as of today:
> > > > 
> > > > - dom0less with hypervisor node + Linux 5.16+           -> works
> > > > - dom0less with hypervisor node + stable Linux 5.10     -> works
> > > > - dom0less with hypervisor node + unpatched Linux 5.10  -> crashes
> > > > 
> > > > 
> > > > Is this good enough? Or for Xen/Linux compatibility we want to also be
> > > > able to boot vanilla unpatched Linux 5.10 as dom0less kernel? If so,
> > > > the simplest solution is to change compatible string for the hypervisor
> > > > node, so that old Linux wouldn't recognize Xen presence and wouldn't try
> > > > to initialize xenbus (so it wouldn't crash on failure). New Linux can of
> > > > course learn to recognize both the old and the new compatible strings.
> > > > (For instance it could be compatible = "xen,xen-v2".) I have prototyped
> > > > and tested this solution successfully but I am not convinced it is the
> > > > right way to go.
> > > > 
> > > > Do you have any suggestion or feedback?
> > > > 
> > > > The Linux crash on xenbus initialization failure is a Linux bug, not a
> > > > Xen issue. For this reason, I am tempted to say that we shouldn't change
> > > > compatible string to work-around a Linux bug, especially given that the
> > > > Linux stable trees are already all fixed.
> > > 
> > > What about adding an option to your Xen patches to omit the hypervisor
> > > node in the device tree? This would enable the user to have a mode
> > > compatible to today's behavior.
> > 
> > While this sounds nice at the first glance, this would need to be a per-
> > domain setting. Which wouldn't be straightforward to express via command
> > line option (don't know how feasible it would be to express such via other
> > means).
> 
> For dom0less, domains are described in the Device-Tree. We have one node per
> domain, so we could add a property to indicate whether the domain should be
> started in compat mode (or not).
> 
> That said, I am not sure every user will want Linux to use
> grant-table/xenstore (possibly, some users may want one but not the other).
> 
> So how about a more generic property "xen,enhanced" with an optional value
> indicating whether this is disabled, enabled or the list of interfaces (e.g.
> xenbus, grant-table) exposed?

Yeah, I like this idea. It would allow for maximum flexibility while not
requiring any changes to the existing Xen/Linux interface; even the
compatible string would remain unmodified.

I also find the ability to select individual features interesting,
although I don't have a concrete use-case for it yet. I should say that
I do have a concrete use-case for enabling only event-channels but they
are actually already enabled for dom0less guests because they are just
hypercalls. (Nothing disables them at present for dom0less guests so
they get them "by default".)

Let's say we go down this path, which seems nice. The remaining question
is what we want as the default when the new "xen,enhanced" property is
missing. I think it makes sense for the default to be "enabled" because
I expect most people to want the enhancements, and they are generally
harmless if you don't use them (except for old unpatched Linux kernels,
which are the main reason why we need the option).
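
To make the semantics under discussion concrete, here is a rough sketch
in Python (an illustration only, not Xen code). The function name, the
value spellings "enabled" and "disabled", and the space-separated list
format are assumptions made for the sketch; only the "xen,enhanced"
property name and the xenbus and grant-table interfaces come from this
thread.

```python
# Hypothetical model of how a "xen,enhanced" property might be interpreted.
# The interface names are the two mentioned in this thread; the real set
# could differ.
ALL_INTERFACES = frozenset({"xenbus", "grant-table"})


def exposed_interfaces(value):
    """Return the set of interfaces to expose to a dom0less domU.

    value is None when the property is missing from the domain node,
    "" when it is present without a value, and a string otherwise.
    """
    # Proposed default: a missing property means everything is enabled.
    if value is None or value in ("", "enabled"):
        return set(ALL_INTERFACES)
    # Compat mode for old unpatched kernels: expose nothing.
    if value == "disabled":
        return set()
    # Otherwise treat the value as a space-separated list of interfaces.
    requested = set(value.split())
    unknown = requested - ALL_INTERFACES
    if unknown:
        raise ValueError("unknown interface(s): " + ", ".join(sorted(unknown)))
    return requested
```

With this model, a domain node without the property gets both
interfaces, "disabled" reproduces today's behavior, and a value such as
"xenbus" would expose xenbus only.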

In any case thanks for the suggestions, I like this!



 

