[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [RFC v2] misc/xenmicrocode: Upload /lib/firmware/<some blob> to the hypervisor



On Fri, Jan 30, 2015 at 02:23:48PM +0000, Andrew Cooper wrote:
> On 30/01/15 01:14, Luis R. Rodriguez wrote:
> > From: "Luis R. Rodriguez" <mcgrof@xxxxxxxx>
> >
> > There are several ways that a Xen system can update the
> > CPU microcode on a pvops kernel [0] now, the preferred way
> > is through the early microcode update mechanism. At run
> > time folks should use this new xenmicrocode tool and use
> > the same CPU microcode file as present on /lib/firmware.
> >
> > Some distributions may use the historic sysfs rescan interface,
> > users of that mechanism should be aware that the interface will
> > not be available when using Xen and as such should first check
> > the presence of the interface before usage, as an alternative
> > this xenmicrocode tool can be used on priviledged domains.
> >
> > Folks wishing to update CPU microcode at run time should be
> > aware that not all CPU microcode can be updated on a system
> > and should take care to ensure that only known-to-work and
> > supported CPU microcode updates are used [0]. To avoid issues
> > with delays on the hypercall / microcode update this
> > implementation will quiesce all domains prior to updating
> > the microcode, and then only queisce'd domains will be
> > unpaused once done.
> >
> > [0] http://wiki.xenproject.org/wiki/XenParavirtOps/microcode_update
> >
> > Based on original work by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> > Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> > Cc: Borislav Petkov <bp@xxxxxxx>
> > Cc: Takashi Iwai <tiwai@xxxxxxx>
> > Cc: Olaf Hering <ohering@xxxxxxx>
> > Cc: Jan Beulich <JBeulich@xxxxxxxx>
> > Cc: Jason Douglas <jdouglas@xxxxxxxx>
> > Cc: Juergen Gross <jgross@xxxxxxxx>
> > Cc: Michal Marek <mmarek@xxxxxxx>
> > Cc: Henrique de Moraes Holschuh <hmh@xxxxxxxxxx>
> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> > Signed-off-by: Luis R. Rodriguez <mcgrof@xxxxxxxx>
> > ---
> >
> > Just wrote this, haven't tested it. This does some queiscing prior
> > to doing the microcode update. The quiescing is done by pausing
> > all domains. Once the microcode update is done we only unpause
> > domains which we queisced as part of our work. Let me know if this
> > is on the right track to help avoid issues noted on the list.
> 
> There is also a TOCTOU race with your paused check, which itself is
> buggy as it you should unconditionally pause all VMs (userspace pause
> refcounting has been fixed for a little while now).

Also, currently __domain_pause_by_systemcontroller() has a limit to 255
guests. Are the fixes that you describe to the refcounting sufficient
to remove that limitation from __domain_pause_by_systemcontroller()?

My implementation uses 1024 but has no check on nb_domain (the amount of
domains set) so that requires fixing as well but I figure we should also
review the above first too. Artificial limits bother me and I went with
1024 as I also saw that limit used elsewhere, not sure if it was a stack
concern or what.

> However, xenmicrocode (even indirectly via xc_microcode_update()) is not
> in a position to safely pause all domains as there is no interlock to
> preventing a new domain being created.  Even if all domains do get
> successfully paused, 

I did think about this and figured we could use this as a segway into
a discussion about how we would want to implement this sort of
interlocking or see if there is anything available already for this
sort of thing. Also, are there other future users of this that could benefit
from it ? If so then perhaps we can wrap the requirements up together.

> the idle loops are substantially less trivial than
> ideal.

Interesting, can you elaborate on the possible issues that might creep
up on the idle loops while a guest is paused during a microcode update?
Any single issue would suffice, just curious.

Do we need something more intricate than pause implemented then? Using
suspend seemed rather grotesque to shove down a guest's throat. If
pause / suspend do not suffice perhaps some new artificial virtual
tempory quiesce is needed to ensure integrity here, which would address
some of the idle concerns you highligted might creep up.

> The toolstack should not hack around hypervisor bugs, and indeed is not
> capable of doing so.

Agreed. I figured I'd at least do what I can with what is available
and use this as a discussoin for what is the Right Thing To Do (TM)
in the future.

> >  tools/libxc/include/xenctrl.h |   2 +
> >  tools/libxc/xc_misc.c         | 102 
> > ++++++++++++++++++++++++++++++++++++++++++
> >  tools/misc/Makefile           |   4 ++
> >  tools/misc/xenmicrocode.c     | 101 
> > +++++++++++++++++++++++++++++++++++++++++
> >  4 files changed, 209 insertions(+)
> >  create mode 100644 tools/misc/xenmicrocode.c
> >
> > diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> > index 0ad8b8d..d5db0b8 100644
> > --- a/tools/libxc/include/xenctrl.h
> > +++ b/tools/libxc/include/xenctrl.h
> > @@ -55,6 +55,7 @@
> >  #include <xen/foreign/x86_32.h>
> >  #include <xen/foreign/x86_64.h>
> >  #include <xen/arch-x86/xen-mca.h>
> > +#include <xen/platform.h>
> >  #endif
> >  
> >  #define XC_PAGE_SHIFT           12
> > @@ -2145,6 +2146,7 @@ int xc_cpuid_apply_policy(xc_interface *xch,
> >  void xc_cpuid_to_str(const unsigned int *regs,
> >                       char **strs); /* some strs[] may be NULL if ENOMEM */
> >  int xc_mca_op(xc_interface *xch, struct xen_mc *mc);
> > +int xc_microcode_update(xc_interface *xch, uint8_t *fbuf, size_t len);
> >  #endif
> >  
> >  struct xc_px_val {
> > diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
> > index e253a58..6137e24 100644
> > --- a/tools/libxc/xc_misc.c
> > +++ b/tools/libxc/xc_misc.c
> > @@ -20,6 +20,7 @@
> >  
> >  #include "xc_private.h"
> >  #include <xen/hvm/hvm_op.h>
> > +#include <xen/platform.h>
> >  
> >  int xc_get_max_cpus(xc_interface *xch)
> >  {
> > @@ -250,6 +251,107 @@ int xc_mca_op(xc_interface *xch, struct xen_mc *mc)
> >      xc_hypercall_bounce_post(xch, mc);
> >      return ret;
> >  }
> > +
> > +struct xc_quiesce_request {
> > +    int num_quiesced;
> > +    int domids[1024]; /* never domid 0 */
> > +};
> > +
> > +/*
> > + * Do not pause already paused domains, and allow us to
> > + * unpause only quiesced domains.
> > + */
> > +static int quiesce_all_domains(xc_interface *xch,
> > +                          struct xc_quiesce_request *quiesce_request)
> > +{
> > +    xc_domaininfo_t info[1024];
> > +    int i, nb_domain;
> > +
> > +    nb_domain = xc_domain_getinfolist(xch, 0, 1024, info);
> > +    if ( nb_domain < 0 )
> > +    {
> > +   return -1;
> > +    }
> > +
> > +    for ( i = 0; i < nb_domain; i++ )
> > +    {
> > +        if ( info[i].domain == 0 )
> > +           continue;
> > +   if ( info[i].flags & XEN_DOMINF_paused )
> > +           continue;
> > +
> > +        xc_domain_pause(xch, info[i].domain);
> > +
> > +   quiesce_request->domids[quiesce_request->num_quiesced] = info[i].domain;
> > +   quiesce_request->num_quiesced++;
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> > +static void unquiesce_all_domains(xc_interface *xch,
> > +                                  struct xc_quiesce_request 
> > *quiesce_request)
> > +{
> > +    int i;
> > +
> > +    for ( i = 0; i < quiesce_request->num_quiesced; i++ )
> > +    {
> > +   xc_domain_unpause(xch, quiesce_request->domids[i]);
> > +    }
> > +}
> > +
> > +int xc_microcode_update(xc_interface *xch, uint8_t *fbuf, size_t len)
> > +{
> > +    int ret = 0;
> > +    DECLARE_HYPERCALL_BUFFER(struct xenpf_microcode_update, uc);
> > +    DECLARE_HYPERCALL;
> > +    struct xen_platform_op op_s;
> > +    struct xen_platform_op *op = &op_s;
> > +    DECLARE_HYPERCALL_BOUNCE(op, sizeof(*op), 
> > XC_HYPERCALL_BUFFER_BOUNCE_BOTH);
> > +    struct xc_quiesce_request quiesce_request;
> > +
> > +    memset(&quiesce_request, 0, sizeof(struct xc_quiesce_request));
> > +
> > +    if ( !xch )
> > +    {
> > +        return -1;
> > +    }
> > +
> > +    uc = xc_hypercall_buffer_alloc(xch, uc, len);
> > +    if ( uc == NULL )
> > +    {
> > +        PERROR("Could not allocate memory for xc_microcode_update 
> > hypercall");
> > +        return -1;
> > +    }
> > +
> > +    memcpy(uc, fbuf, len);
> > +
> > +    set_xen_guest_handle(op->u.microcode.data, uc);
> > +    op->cmd = XENPF_microcode_update;
> > +    op->interface_version = XENPF_INTERFACE_VERSION;
> > +    op->u.microcode.length = len;
> > +
> > +    if ( xc_hypercall_bounce_pre(xch, op) )
> > +    {
> > +   xc_hypercall_buffer_free(xch, uc);
> > +        PERROR("Could not bounce memory for xc_microcode_update 
> > hypercall");
> > +        return -1;
> > +    }
> > +
> > +    hypercall.op = __HYPERVISOR_platform_op;
> > +    hypercall.arg[0] = HYPERCALL_BUFFER_AS_ARG(op);
> > +
> > +    quiesce_all_domains(xch, &quiesce_request);
> > +    ret = do_xen_hypercall(xch, &hypercall);
> > +    unquiesce_all_domains(xch, &quiesce_request);
> > +
> > +    xc_hypercall_bounce_post(xch, op);
> > +    xc_hypercall_buffer_free(xch, uc);
> > +    xc_interface_close(xch);
> > +
> > +    return ret;
> > +}
> > +
> >  #endif
> >  
> >  int xc_perfc_reset(xc_interface *xch)
> > diff --git a/tools/misc/Makefile b/tools/misc/Makefile
> > index a255c22..ba4779a 100644
> > --- a/tools/misc/Makefile
> > +++ b/tools/misc/Makefile
> > @@ -21,6 +21,7 @@ INSTALL_SBIN-$(CONFIG_X86)     += xen-hvmcrash
> >  INSTALL_SBIN-$(CONFIG_X86)     += xen-hvmctx
> >  INSTALL_SBIN-$(CONFIG_X86)     += xen-lowmemd
> >  INSTALL_SBIN-$(CONFIG_X86)     += xen-mfndump
> > +INSTALL_SBIN-$(CONFIG_X86)     += xenmicrocode
> 
> "xen-microcode" perhaps?

Yeah that's better.

> >  INSTALL_SBIN                   += xen-ringwatch
> >  INSTALL_SBIN                   += xen-tmem-list-parse
> >  INSTALL_SBIN                   += xencov
> > @@ -74,6 +75,9 @@ xenperf: xenperf.o
> >  xenpm: xenpm.o
> >     $(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
> >  
> > +xenmicrocode: xenmicrocode.o
> > +   $(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
> > +
> >  gtracestat: gtracestat.o
> >     $(CC) $(LDFLAGS) -o $@ $< $(APPEND_LDFLAGS)
> >  
> > diff --git a/tools/misc/xenmicrocode.c b/tools/misc/xenmicrocode.c
> > new file mode 100644
> > index 0000000..d027b13
> > --- /dev/null
> > +++ b/tools/misc/xenmicrocode.c
> > @@ -0,0 +1,101 @@
> > +#define _GNU_SOURCE
> > +
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <sys/mman.h>
> > +#include <errno.h>
> > +#include <string.h>
> > +#include <inttypes.h>
> > +#include <unistd.h>
> > +#include <sys/types.h>
> > +#include <sys/stat.h>
> > +#include <fcntl.h>
> > +#include <xenctrl.h>
> > +
> > +/*
> > + * Documentation:
> > + *
> > + * http://wiki.xenproject.org/wiki/XenParavirtOps/microcode_update
> > + */
> > +
> > +void show_help(void)
> > +{
> > +    fprintf(stderr,
> > +            "xenmicrocode: Xen runtime CPU microcode update tool\n"
> > +            "Usage: xenmicrocode </lib/firmware/microcode-file>\n"
> > +           );
> > +}
> > +
> > +int main(int argc, char *argv[])
> > +{
> > +    int fd = 0;
> > +    uint8_t *fbuf;
> > +    char *filename;
> > +    struct stat buf;
> > +    static xc_interface *xch;
> > +    int ret;
> > +
> > +    if ( argc != 2 )
> > +    {
> > +   show_help();
> > +   return 1;
> > +    }
> > +
> > +    if ( !strcmp("--help", argv[1]) ||
> > +         !strcmp("-h", argv[1]) )
> > +    {
> > +   show_help();
> > +        return 1;
> > +    }
> > +
> > +    /* microcode file as present on /lib/firmware/ */
> > +    filename = argv[1];
> > +
> > +    fd = open(filename, O_RDONLY);
> > +    if ( fd <= 0 )
> > +    {
> > +   show_help();
> > +        fprintf(stderr,
> > +                "Could not open; err: %d(%s)\n", errno, strerror(errno));
> 
> You have mixed tabs and spaces. Please adjust for spaces only

Fixed. 

> and put a
> local variable block at the bottom of the file.

Not sure what you mean by this, but I'll just keep that print in one
line.

> > +        return errno;
> > +    }
> > +
> > +    if ( stat(filename, &buf) != 0 )
> > +    {
> > +        fprintf(stderr, "Could not open; err: %d(%s)\n", errno, 
> > strerror(errno));
> > +        close(fd);
> > +        return errno;
> > +    }
> > +
> > +    xch = xc_interface_open(0,0,0);
> > +    if ( !xch )
> > +    {
> > +        close(fd);
> > +        fprintf(stderr, "failed to get the handler\n");
> > +        return 1;
> > +    }
> > +
> > +    printf("%s: %ld\n", filename, buf.st_size);
> > +
> > +    fbuf = mmap(0, buf.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
> > +    if ( fbuf == MAP_FAILED )
> > +    {
> > +        fprintf(stderr, "Could not map: error: %d(%s)\n", errno,
> > +                strerror(errno));
> > +        close(fd);
> > +        return errno;
> > +    }
> > +
> > +    ret = xc_microcode_update(xch, fbuf, buf.st_size);
> > +
> > +    if ( munmap(fbuf, buf.st_size) )
> > +    {
> > +        fprintf(stderr, "Could not unmap: %d(%s)\n", errno, 
> > strerror(errno));
> > +        close(fd);
> > +        return errno;
> > +    }
> > +
> > +    close(fd);
> > +
> > +    return ret;
> > +}
> 
> The content of xenmicrocode.c is looking ok now, but as indicated, I do
> not believe the xc_microcode_update() call should be pausing domains.

Lets iron that out with your input.

  Luis

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.