[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] xen: detect uninitialized xenbus in xenbus_init



On Thu, 18 Nov 2021, Juergen Gross wrote:
> On 18.11.21 09:47, Jan Beulich wrote:
> > On 18.11.2021 06:32, Juergen Gross wrote:
> > > On 18.11.21 03:37, Stefano Stabellini wrote:
> > > > --- a/drivers/xen/xenbus/xenbus_probe.c
> > > > +++ b/drivers/xen/xenbus/xenbus_probe.c
> > > > @@ -951,6 +951,28 @@ static int __init xenbus_init(void)
> > > >                 err = hvm_get_parameter(HVM_PARAM_STORE_PFN, &v);
> > > >                 if (err)
> > > >                         goto out_error;
> > > > +               /*
> > > > +                * Return error on an invalid value.
> > > > +                *
> > > > +                * Uninitialized hvm_params are zero and return no 
> > > > error.
> > > > +                * Although it is theoretically possible to have
> > > > +                * HVM_PARAM_STORE_PFN set to zero on purpose, in 
> > > > reality it
> > > > is
> > > > +                * not zero when valid. If zero, it means that Xenstore 
> > > > hasn't
> > > > +                * been properly initialized. Instead of attempting to 
> > > > map a
> > > > +                * wrong guest physical address return error.
> > > > +                */
> > > > +               if (v == 0) {
> > > 
> > > Make this "if (v == ULONG_MAX || v== 0)" instead?
> > > This would result in the same err on a new and an old hypervisor
> > > (assuming we switch the hypervisor to init params with ~0UL).

Sure, I can do that


> > > > +                       err = -ENOENT;
> > > > +                       goto out_error;
> > > > +               }
> > > > +               /*
> > > > +                * ULONG_MAX is invalid on 64-bit because is 
> > > > INVALID_PFN.
> > > > +                * On 32-bit return error to avoid truncation.
> > > > +                */
> > > > +               if (v >= ULONG_MAX) {
> > > > +                       err = -EINVAL;
> > > > +                       goto out_error;
> > > > +               }
> > > 
> > > Does it make sense to continue the system running in case of
> > > truncation? This would be a 32-bit guest with more than 16TB of RAM
> > > and the Xen tools decided to place the Xenstore ring page above the
> > > 16TB boundary. This is a completely insane scenario IMO.
> > > 
> > > A proper panic() in this case would make diagnosis of that much
> > > easier (me having doubts that this will ever be hit, though).
> > 
> > While I agree panic() may be an option here (albeit I'm not sure why
> > that would be better than trying to cope with 0 and hence without
> 
> I could imagine someone wanting to run a guest without Xenstore access,
> which BTW will happen in case of a guest created by the hypervisor at
> boot time.
> 
> > xenbus), I'd like to point out that the amount of RAM assigned to a
> > guest is unrelated to the choice of GFNs for the various "magic"
> > items.
> 
> Yes, but this would still be a major tools problem which probably
> would render the whole guest rather unusable.

First let's distinguish between an error due to "hvm_param not
initialized" and an error due to more serious conditions, such as "pfn
above max".

"hvm_param not initialized" could mean v == 0 (as it would be today) or
v == ~0UL (if we change Xen to initialize all hvm_param to ~0UL). I
don't think we want to panic in these cases as they are not actually
true erroneous configurations. We should just stop trying to initialize
xenstore and continue with the rest.


The "pfn above max" case could happen if v is greater than the max pfn.
This is a true error in the configuration because the toolstack should
know that the guest is 32-bit so it should give it a pfn that the guest
is able to use. As Jan wrote in another email, for 32-bit the actual
limit depends on the physical address bits but actually Linux has never
been able to cope with a pfn > ULONG_MAX on 32-bit because xen_store_gfn
is defined as unsigned long. So Linux 32-bit has been truncating
HVM_PARAM_STORE_PFN all along.

There is also an argument that depending on kconfig Linux 32-bit might
only be able to handle addresses < 4G, so I don't think the toolstack
can assume that a 32-bit guest is able to cope with HVM_PARAM_STORE_PFN
> ULONG_MAX.  If Linux is 32-bit and HVM_PARAM_STORE_PFN > ULONG_MAX,
even if HVM_PARAM_STORE_PFN < address_bits_max I think it would be fair
to still consider it an error, but I can see it could be argued either
way. Certainly if HVM_PARAM_STORE_PFN > address_bits_max is an error.

In any case, I think it is still better for Linux to stop trying to
initialize Xenstore but continue with the rest because there is a bunch
of other useful things Linux can do without it. Panic should only be the
last resort if there is nothing else to do. In this case we haven't even
initialized the service and the service is not essential, at least it is
not essential in certain ARM setups.


So in conclusion, I think this patch should:
- if v == 0 return error (uninitialized)
- if v == ~0ULL (INVALID_PFN) return error (uinitialized)
- if v >= ~0UL (32-bit) return error (even if this case could be made to
  work for v < max_address_bits depending on kconfig)

Which leads to something like:

        /* uninitialized */
                if (v == 0 || v == ~0ULL) {
                        err = -ENOENT;
                        goto out_error;
                }
        /* 
         * Avoid truncation on 32-bit.
         * TODO: handle addresses >= 4G
         */
        if ( v >= ~0UL ) {
            err = -EINVAL;
            goto out_error;
        }



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.