[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4 5/9] tools/libxc: common code



On Wed, 2014-05-07 at 15:38 +0100, Andrew Cooper wrote:
> On 07/05/14 14:03, Ian Campbell wrote:
> > On Wed, 2014-04-30 at 19:36 +0100, Andrew Cooper wrote:
> >> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> >> Signed-off-by: Frediano Ziglio <frediano.ziglio@xxxxxxxxxx>
> >> Signed-off-by: David Vrabel <david.vrabel@xxxxxxxxxx>
> >> ---
> >>  tools/libxc/saverestore/common.c         |   87 ++++++
> >>  tools/libxc/saverestore/common.h         |  172 ++++++++++++
> >>  tools/libxc/saverestore/common_x86.c     |   54 ++++
> >>  tools/libxc/saverestore/common_x86.h     |   21 ++
> >>  tools/libxc/saverestore/common_x86_hvm.c |   53 ++++
> >>  tools/libxc/saverestore/common_x86_pv.c  |  431 
> >> ++++++++++++++++++++++++++++++
> >>  tools/libxc/saverestore/common_x86_pv.h  |  104 +++++++
> >>  tools/libxc/saverestore/restore.c        |  288 ++++++++++++++++++++
> >>  tools/libxc/saverestore/save.c           |   42 +++
> >>  9 files changed, 1252 insertions(+)
> >>  create mode 100644 tools/libxc/saverestore/common_x86.c
> >>  create mode 100644 tools/libxc/saverestore/common_x86.h
> >>  create mode 100644 tools/libxc/saverestore/common_x86_hvm.c
> >>  create mode 100644 tools/libxc/saverestore/common_x86_pv.c
> >>  create mode 100644 tools/libxc/saverestore/common_x86_pv.h
> >>
> >> diff --git a/tools/libxc/saverestore/common.c 
> >> b/tools/libxc/saverestore/common.c
> >> index de2e727..b159c4c 100644
> >> --- a/tools/libxc/saverestore/common.c
> >> +++ b/tools/libxc/saverestore/common.c
> >> @@ -1,3 +1,5 @@
> >> +#include <assert.h>
> >> +
> >>  #include "common.h"
> >>  
> >>  static const char *dhdr_types[] =
> >> @@ -52,6 +54,91 @@ const char *rec_type_to_str(uint32_t type)
> >>      return "Reserved";
> >>  }
> >>  
> >> +int write_split_record(struct context *ctx, struct record *rec,
> >> +                       void *buf, size_t sz)
> >> +{
> >> +    static const char zeroes[7] = { 0 };
> >> +    xc_interface *xch = ctx->xch;
> >> +    uint32_t combined_length = rec->length + sz;
> >> +    size_t record_length = (combined_length + 7) & ~7UL;
> > Isn't this ROUNDUP(combined_length, 3)? (Where 3 and/or 7 perhaps ought
> > to be given names)
> 
> It is.  I am not certain that 3/7 need specific named, given the
> specification mandating that all records have trailing 0s to align the
> subsequent record on an 8 byte boundary.

If the specification mandates it then that is an argument *for* a
SAVE_RECORD_ALIGNMENT type #define.

> >> +    if ( write_exact(ctx->fd, &rec->type, sizeof rec->type) ||
> > libxc style appears to be "sizeof (rec->type)" (the space seems odd to
> > me but is apparently The Style, I think the brackets are a must).
> 
> libxc style also states no extraneous brackets.

These ones aren't extraneous ;-)

Where does it say that anyway?

> >> +    datasz = (rhdr.length + 7) & ~7U;
> > Another ROUNDUP and #define XXX 3 or 7 opportunity. (I'm not going to
> > mention any more of these I find).
> >
> > Is it not a bug in the input for datasz to not be aligned? I thought you
> > were padding everything.
> 
> It is certainly not a bug.  For the blobs which can have an arbitrary
> length, the record length field reflects the exact length.  Trailing 0s
> are then inserted into the stream to align the start of the next record
> on an 8 byte boundary.

Make sense. In which case can you add a comment /* Include padding not
covered by rhdr.length */ or some such.

> >> +// Hack out junk from the namespace
> >  ... namespace, because/in order to...
> >
> > (to prevent accidents I suppose?)
> 
> because mfn_to_pfn and pfn_to_mfn are gross abortions of macros which
> look like regular functions but take implicit scoped state.

So don't use them?


> I have been collapsing patches from 3 separate people ;)  I will do a
> consistency check before this becomes non-rfc.

nb: these patches are missing the RFC tag if that's what they are.

> >
> >> +/*
> >> + * Writes a split record to the stream, applying correct padding where
> >> + * appropriate.  It is common when sending records containing blobs from 
> >> Xen
> >> + * that the header and blob data are separate.  This function accepts a 
> >> second
> >> + * buffer and length, and will merge it with the main record when sending.
> >> + *
> >> + * Records with a non-zero length must provide a valid data field; records
> >> + * with a 0 length shall have their data field ignored.
> >> + *
> >> + * Returns 0 on success and non0 on failure.
> > "non-0". Do the non-zero values have any particular meanings? (errno
> > codes, other error codes?)
> 
> No - see also the other inconsistencies in libxc.
> 
> The current state of all the new functions are "0 on success, -1 on
> error and errno is unlikely be related".  This is no worse than the vast
> majority of libxc currently, and stating "non-0" on error allows for a
> positive "libxc error" range as discussed down the pub.
> 
> I am not shaving that yak right now, but have tried to leave the code in
> a state where inserting consistent error handing should be less hassle
> than it otherwise could be.

OK.

> In all error cases, the log will (should) have accurate information,
> including errno when relevent.

Mentioning this in the comment is nice too, since it prevents callers
thinking they must log as well.

> >> @@ -0,0 +1,431 @@
> >> +#include <assert.h>
> >> +
> >> +#include "common_x86_pv.h"
> >> +
> >> +xen_pfn_t mfn_to_pfn(struct context *ctx, xen_pfn_t mfn)
> >> +{
> >> +    assert(mfn <= ctx->x86_pv.max_mfn);
> > Just to confirm that anywhere there is an assert like this then if it is
> > based on potentially user controller input then it has already been
> > validated elsewhere first? (with the abstraction I couldn't spot where
> > that was)
> 
> It is validated in every case I am aware of, usually by
> "mfn_in_pseudophysmap()"

Good, because it would be a security issue if not...

> >> +        switch ( type )
> >> +        {
> >> +        case XEN_DOMCTL_PFINFO_L4TAB:
> >> +            ERROR("??? Found L4 table for 32bit guest");
> >> +            errno = EINVAL;
> >> +            return -1;
> >> +
> >> +        case XEN_DOMCTL_PFINFO_L3TAB:
> >> +            /* 32bit guests can only use the first 4 entries of their L3 
> >> tables.
> >> +             * All other are potentially used by Xen. */
> >> +            xen_first = 4;
> >> +            xen_last = 512;
> >> +            break;
> >> +
> >> +        case XEN_DOMCTL_PFINFO_L2TAB:
> >> +            /* It is hard to spot Xen mappings in a 32bit guest's L2.  
> >> Most
> >> +             * are normal but only a few will have Xen mappings.
> > Internally Xen has PGT_pae_xen_l2, I wonder if it could be coaxed into
> > exposing that here too?
> 
> That would require exposing xen struct page_info's for specific mfns. 
> This way seems less hacky.

I meant via the introduction of XEN_DOMCTL_PFIINF_L2XENTAB (name TBD)
(AIUI this interface is essentially exposing that bit of the page info,
isn't it?)

> >> ... normalize_page
> >> +    local_page = malloc(PAGE_SIZE);
> > How bad is the overhead of (what I pressume are) all these allocs and
> > frees? (I've not come across the caller in this patch, maybe it will
> > become clear later).
> >
> > Would it be worth keeping an array around shadowing the "live" batch of
> > pages?
> 
> The allocation and freeing of pages has allowed valgrind to be
> fantastically useful at pointing out when my code was doing something silly.

Fair enough (I wonder if there is some valgrind mechanism for marking
memory as explicitly uninitialised as if it had been freed/reallocated)

> I would expect the overhead is negligible (the slow parts of migration
> are the blocking fd reads/writes, mapping hypercalls and memcpy()s), and
> the aid to debugging is far more important.

True, AIUI glibc's malloc is pretty speedy.

> 
> >
> >> +static int x86_pv_localise_page(struct context *ctx, uint32_t type, void 
> >> *page)
> > "localise" means the inverse of "normalise"?
> 
> Yes.  I specifically wanted to avoid the term "canonicalise" from the
> old code in reference to pages and pagetables, as a canonical 64bit
> address is an architectural term.

OK.

> >> +
> >> +    if ( nr_pages > 0 )
> >> +    {
> >> +        mapping = guest_page = xc_map_foreign_bulk(
> >> +            xch, ctx->domid, PROT_READ | PROT_WRITE,
> >> +            mfns, map_errs, nr_pages);
> >> +        if ( !mapping )
> >> +        {
> >> +            PERROR("Unable to map %u mfns for %u pages of data",
> >> +                   nr_pages, count);
> >> +            goto err;
> >> +        }
> >> +    }
> >> +
> >> +    for ( i = 0, j = 0; i < count; ++i )
> >> +    {
> >> +        switch ( types[i] )
> >> +        {
> >> +        case XEN_DOMCTL_PFINFO_XTAB:
> >> +        case XEN_DOMCTL_PFINFO_BROKEN:
> >> +            /* Nothing at all to do */
> > Is this a deliberate fall-thru? Please comment if so.
> >
> >> +        case XEN_DOMCTL_PFINFO_XALLOC:
> >> +            /* Nothing futher to do */
> > "further"
> 
> They are all fall-though, indicated by the continue on the next line,
> and lack of default.

The existing comment half way through the list is what breaks this
though, since it's no longer clear that it does indicate that.

> 
> >
> >> +    for ( i = 0; i < pages->count; ++i )
> >> +    {
> >> +        pfn = pages->pfn[i] & PAGE_DATA_PFN_MASK;
> >> +        if ( !ctx->ops.pfn_is_valid(ctx, pfn) )
> >> +        {
> >> +            ERROR("pfn %#lx (index %u) outside domain maximum", pfn, i);
> >> +            goto err;
> >> +        }
> >> +
> >> +        type = (pages->pfn[i] & PAGE_DATA_TYPE_MASK) >> 32;
> >> +        if ( ((type >> XEN_DOMCTL_PFINFO_LTAB_SHIFT) >= 5) &&
> >> +             ((type >> XEN_DOMCTL_PFINFO_LTAB_SHIFT) <= 8) )
> > What are 5 and 8?
> >
> > This might all be clearer with the use of a #define or two, and perhaps
> > using a switch over the expected valid types and explicitly invalid
> > types.
> 
> 5 to 8 are the "holes" in the otherwise complete PFINFO_TYPE numerspace.
> 
> My views are that the XEN_DOMCTL_PFINFO_* macros are unconditionally
> horrible to use, whatever you are trying to do, which is why the use of
> them is gauged for local clarity rather than global consistency.
> 
> I am not sure a switch statement would help here.

In that case a comment would help.

> >> diff --git a/tools/libxc/saverestore/save.c 
> >> b/tools/libxc/saverestore/save.c
> >> index c013e62..e842e6c 100644
> >> --- a/tools/libxc/saverestore/save.c
> >> +++ b/tools/libxc/saverestore/save.c
> >> @@ -1,5 +1,47 @@
> >> +#include <arpa/inet.h>
> >> +
> >>  #include "common.h"
> >>  
> >> +int write_headers(struct context *ctx, uint16_t guest_type)
> >> +{
> >> +    xc_interface *xch = ctx->xch;
> >> +    int32_t xen_version = xc_version(xch, XENVER_version, NULL);
> >> +    struct ihdr ihdr =
> >> +        {
> > I think coding style puts this on the previous line and the body should
> > be 4 spaces further in.
> 
> Not according to the emacs mode at the bottom, which would appear to
> create a contradiction in CODING_SYTLE.

CODING_STYLE would take precedence if it said anything about this
specific case, which it doesn't. The prevailing style everywhere else I
could find was with the { on the same line.

(not sure which bit of the emacs marker you think contradicts this,
BSD's style(9) doesn't cover this case either)

Ian.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.