
Re: [Xen-devel] [XEN PATCH for-4.13 v1] Reset iomem's gfn to LIBXL_INVALID_GFN on reboot



On Mon, Oct 14, 2019 at 12:28 PM Julien Grall <julien.grall@xxxxxxx> wrote:
>
> Hi Ian,
>
> On 11/10/2019 16:14, Ian Jackson wrote:
> > Oleksandr Grytsov writes ("[PATCH v1] Reset iomem's gfn to 
> > LIBXL_INVALID_GFN on reboot"):
> >> During domain reboot its configuration is partially reused
> >> to re-create a new domain, but iomem's GFN field for the
> >> iomem is only restored for those memory ranges, which are
> >> configured in form of [IOMEM_START,NUM_PAGES[@GFN], but not for
> >> those in form of [IOMEM_START,NUM_PAGES], e.g. without GFN.
> >> For the latter GFN is reset to 0, but while mapping ranges
> >> to a domain during reboot there is a check that GFN treated
> >> as valid if it is not equal to LIBXL_INVALID_GFN, thus making
> >> Xen to map IOMEM_START to address 0 in the guest's address space.
> >>
> >> Work around it by resetting GFN to LIBXL_INVALID_GFN, so xl
> >> can set proper values for mapping on reboot.
> >
> > Thanks for this patch.
> >
> > I confess that I am not sure what is going on here.  Where is this
> > troublesome 0 coming from ?  I see that the default value for gfn in
> > the struct is 0 and looking for assignments before this patch, gfn is
> > defaulted from b_info->iomem[i].start, which is presumably non-0.
> >
> > I suspect that your patch may be fixing this the wrong way.  I have
> > addressed this mail to various people who have touched this area of
> > code and hope they will be able to clarify.
>
> I found a thread from December 2017 related to the problem described here [1].
>
> Looking at the thread, there were no conclusion on the root causes and some
> questions were left unanswered by the contributor (see [2]).
>
> Oleksandr, could you look at the thread and see if you can provide more details
> what's going on? Answering back here would be fine.
>
> >
> > BTW, please do ping this (and your other patches) by email, if the
> > conversation seems to stall.
> >
> > Thanks,
> > Ian.
> >
> >> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
> >> ---
> >>   tools/libxl/libxl_domain.c | 9 +++++++++
> >>   1 file changed, 9 insertions(+)
> >>
> >> diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c
> >> index 9d0eb5aed1..0ae16a5b12 100644
> >> --- a/tools/libxl/libxl_domain.c
> >> +++ b/tools/libxl/libxl_domain.c
> >> @@ -2120,6 +2120,15 @@ static void retrieve_domain_configuration_end(libxl__egc *egc,
> >>           }
> >>       }
> >>
> >> +    /* reset IOMEM's GFN to initial value */
> >> +    {
> >> +        int i;
> >> +
> >> +        for (i = 0; i < d_config->b_info.num_iomem; i++)
> >> +            if (d_config->b_info.iomem[i].gfn == 0)
> >> +                d_config->b_info.iomem[i].gfn = LIBXL_INVALID_GFN;
> >> +    }
> >> +
> >>       /* Devices: disk, nic, vtpm, pcidev etc. */
> >>
> >>       /* The MERGE macro implements following logic:
> >> --
> >> 2.17.1
> >>
>
> Cheers,
>
> [1] <ebf78aec-dcfd-72d9-dac2-06b29e4a66ae@xxxxxxxxx>
> [2] <20180213122432.h4fh22ej4dfe7226@xxxxxxxxxx>
>
> --
> Julien Grall

Julien,

Thanks for pointing me to this old thread. I had completely forgotten about it
(I haven't worked with Xen for a long time). I've done some additional
investigation and found the root cause of the issue. It isn't directly related
to the iomem GFN. The problem is in the JSON parsing of types, at the point
where libxl creates an array of structs.

For example, libxl_domain_config_from_json calls libxl_domain_config_init,
which initializes all child structures and arrays. But when libxl then parses
the JSON and creates an array of structures, it doesn't initialize the array
elements properly (see the iomem parsing in libxl__domain_build_info_parse_json):

p->num_iomem = x->u.array->count;
p->iomem = libxl__calloc(NOGC, p->num_iomem, sizeof(*p->iomem));
if (!p->iomem && p->num_iomem != 0) {
    rc = -1;
    goto out;
}
for (i=0; (t=libxl__json_array_get(x,i)); i++) {
    rc = libxl__iomem_range_parse_json(gc, t, &p->iomem[i]);
    if (rc)
       goto out;
}

libxl allocates the array elements with calloc, so all element fields are
initialized to zero, even though some of them have a default value different
from zero. To get the proper defaults, the dedicated init function should be
called for each element before it is parsed. The above example should be:

for (i=0; (t=libxl__json_array_get(x,i)); i++) {
    libxl_iomem_range_init(&p->iomem[i]);
    rc = libxl__iomem_range_parse_json(gc, t, &p->iomem[i]);
    if (rc)
       goto out;
}
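
To illustrate the difference outside of libxl, here is a minimal standalone
sketch. It does not use any libxl code: the demo_* names and DEMO_INVALID_GFN
are made-up stand-ins for libxl_iomem_range, its init function and
LIBXL_INVALID_GFN, and demo_effective_gfn is only my assumption of how the
"gfn is valid unless it equals the sentinel" check behaves.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for LIBXL_INVALID_GFN (a non-zero sentinel). */
#define DEMO_INVALID_GFN (~(uint64_t)0)

/* Hypothetical stand-in for libxl_iomem_range. */
typedef struct {
    uint64_t start;
    uint64_t number;
    uint64_t gfn;
} demo_iomem_range;

/* Plays the role of the generated init function: sets non-zero defaults. */
static void demo_iomem_range_init(demo_iomem_range *r)
{
    r->start = 0;
    r->number = 0;
    r->gfn = DEMO_INVALID_GFN;
}

/* Plays the role of the JSON parser for an entry that has no gfn key:
 * only the fields present in the input are written, gfn is left alone. */
static void demo_parse_without_gfn(demo_iomem_range *r,
                                   uint64_t start, uint64_t number)
{
    assert(r != NULL);
    r->start = start;
    r->number = number;
}

/* Assumed consumer logic: any gfn other than the sentinel is treated as
 * explicitly requested, otherwise the range is mapped at its start. */
static uint64_t demo_effective_gfn(const demo_iomem_range *r)
{
    return r->gfn != DEMO_INVALID_GFN ? r->gfn : r->start;
}

/* Allocate a one-element array with calloc, optionally call the init
 * function, parse, and return the gfn the consumer would end up using. */
static uint64_t demo_reparse(int call_init, uint64_t start, uint64_t number)
{
    demo_iomem_range *arr = calloc(1, sizeof(*arr));
    uint64_t gfn;

    if (!arr)
        abort();
    if (call_init)
        demo_iomem_range_init(&arr[0]);
    demo_parse_without_gfn(&arr[0], start, number);
    gfn = demo_effective_gfn(&arr[0]);
    free(arr);
    return gfn;
}
```

With call_init == 0 the element keeps the zero written by calloc, so
demo_reparse returns 0 regardless of start; with call_init == 1 it returns
start, because gfn still holds the sentinel when the consumer looks at it.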

I've changed gentypes.py as follows:

diff --git a/tools/libxl/gentypes.py b/tools/libxl/gentypes.py
index 88e5c5f30e..92e28be469 100644
--- a/tools/libxl/gentypes.py
+++ b/tools/libxl/gentypes.py
@@ -454,6 +454,8 @@ def libxl_C_type_parse_json(ty, w, v, indent = " ", parent = None, discrimina
         s += "        goto out;\n"
         s += "    }\n"
         s += "    for (i=0; (t=libxl__json_array_get(x,i)); i++) {\n"
+        if ty.elem_type.init_fn is not None and ty.elem_type.autogenerate_init_fn:
+            s += indent + "    "+"%s_init(&%s[i]);\n" % (ty.elem_type.typename, v)
         s += libxl_C_type_parse_json(ty.elem_type, "t", v+"[i]",
                                      indent + "    ", parent)
         s += "    }\n"

I'm not sure whether this is the right and complete fix.

Ian, could you review?

If the fix is OK, I will submit a patch.

Thanks.
