[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4 10/11] viridian: add implementation of synthetic timers



> -----Original Message-----
> From: Paul Durrant [mailto:paul.durrant@xxxxxxxxxx]
> Sent: 07 March 2019 14:09
> To: xen-devel@xxxxxxxxxxxxxxxxxxxx
> Cc: Paul Durrant <Paul.Durrant@xxxxxxxxxx>; Wei Liu <wei.liu2@xxxxxxxxxx>; 
> Ian Jackson
> <Ian.Jackson@xxxxxxxxxx>; Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>; George 
> Dunlap
> <George.Dunlap@xxxxxxxxxx>; Jan Beulich <jbeulich@xxxxxxxx>; Julien Grall 
> <julien.grall@xxxxxxx>;
> Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>; Stefano Stabellini 
> <sstabellini@xxxxxxxxxx>; Tim
> (Xen.org) <tim@xxxxxxx>; Roger Pau Monne <roger.pau@xxxxxxxxxx>
> Subject: [PATCH v4 10/11] viridian: add implementation of synthetic timers
> 
> This patch introduces an implementation of the STIMER0-15_CONFIG/COUNT MSRs
> and hence a the first SynIC message source.
> 
> The new (and documented) 'stimer' viridian enlightenment group may be
> specified to enable this feature.
> 
> While in the neighbourhood, this patch adds a missing check for an
> attempt to write the time reference count MSR, which should result in an
> exception (but not be reported as an unimplemented MSR).
> 
> NOTE: It is necessary for correct operation that timer expiration and
>       message delivery time-stamping use the same time source as the guest.
>       The specification is ambiguous but testing with a Windows 10 1803
>       guest has shown that using the partition reference counter as a
>       source whilst the guest is using RDTSC and the reference tsc page
>       does not work correctly. Therefore the time_now() function is used.
>       This implements the algorithm for acquiring partition reference time
>       that is documented in the specifiction.
> 
> Signed-off-by: Paul Durrant <paul.durrant@xxxxxxxxxx>
> Acked-by: Wei Liu <wei.liu2@xxxxxxxxxx>
> ---
> Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Cc: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
> Cc: Jan Beulich <jbeulich@xxxxxxxx>
> Cc: Julien Grall <julien.grall@xxxxxxx>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> Cc: Tim Deegan <tim@xxxxxxx>
> Cc: "Roger Pau Monné" <roger.pau@xxxxxxxxxx>
> 
> v4:
>  - Address comments from Jan
> 
> v3:
>  - Re-worked missed ticks calculation
> ---
>  docs/man/xl.cfg.5.pod.in               |  12 +-
>  tools/libxl/libxl.h                    |   6 +
>  tools/libxl/libxl_dom.c                |   4 +
>  tools/libxl/libxl_types.idl            |   1 +
>  xen/arch/x86/hvm/viridian/private.h    |   9 +-
>  xen/arch/x86/hvm/viridian/synic.c      |  53 +++-
>  xen/arch/x86/hvm/viridian/time.c       | 386 ++++++++++++++++++++++++-
>  xen/arch/x86/hvm/viridian/viridian.c   |  19 ++
>  xen/include/asm-x86/hvm/viridian.h     |  32 +-
>  xen/include/public/arch-x86/hvm/save.h |   2 +
>  xen/include/public/hvm/params.h        |   7 +-
>  11 files changed, 523 insertions(+), 8 deletions(-)
> 
> diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
> index ad81af1ed8..355c654693 100644
> --- a/docs/man/xl.cfg.5.pod.in
> +++ b/docs/man/xl.cfg.5.pod.in
> @@ -2167,11 +2167,19 @@ This group incorporates the crash control MSRs. These 
> enlightenments
>  allow Windows to write crash information such that it can be logged
>  by Xen.
> 
> +=item B<stimer>
> +
> +This set incorporates the SynIC and synthetic timer MSRs. Windows will
> +use synthetic timers in preference to emulated HPET for a source of
> +ticks and hence enabling this group will ensure that ticks will be
> +consistent with use of an enlightened time source (B<time_ref_count> or
> +B<reference_tsc>).
> +
>  =item B<defaults>
> 
>  This is a special value that enables the default set of groups, which
> -is currently the B<base>, B<freq>, B<time_ref_count>, B<apic_assist>
> -and B<crash_ctl> groups.
> +is currently the B<base>, B<freq>, B<time_ref_count>, B<apic_assist>,
> +B<crash_ctl> and B<stimer> groups.
> 
>  =item B<all>
> 
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index a923a380d3..c8f219b0d3 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -324,6 +324,12 @@
>   */
>  #define LIBXL_HAVE_VIRIDIAN_SYNIC 1
> 
> +/*
> + * LIBXL_HAVE_VIRIDIAN_STIMER indicates that the 'stimer' value
> + * is present in the viridian enlightenment enumeration.
> + */
> +#define LIBXL_HAVE_VIRIDIAN_STIMER 1
> +
>  /*
>   * LIBXL_HAVE_BUILDINFO_HVM_ACPI_LAPTOP_SLATE indicates that
>   * libxl_domain_build_info has the u.hvm.acpi_laptop_slate field.
> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> index fb758d2ac3..2ee0f82ee7 100644
> --- a/tools/libxl/libxl_dom.c
> +++ b/tools/libxl/libxl_dom.c
> @@ -269,6 +269,7 @@ static int hvm_set_viridian_features(libxl__gc *gc, 
> uint32_t domid,
>          libxl_bitmap_set(&enlightenments, 
> LIBXL_VIRIDIAN_ENLIGHTENMENT_TIME_REF_COUNT);
>          libxl_bitmap_set(&enlightenments, 
> LIBXL_VIRIDIAN_ENLIGHTENMENT_APIC_ASSIST);
>          libxl_bitmap_set(&enlightenments, 
> LIBXL_VIRIDIAN_ENLIGHTENMENT_CRASH_CTL);
> +        libxl_bitmap_set(&enlightenments, 
> LIBXL_VIRIDIAN_ENLIGHTENMENT_STIMER);
>      }
> 
>      libxl_for_each_set_bit(v, info->u.hvm.viridian_enable) {
> @@ -320,6 +321,9 @@ static int hvm_set_viridian_features(libxl__gc *gc, 
> uint32_t domid,
>      if (libxl_bitmap_test(&enlightenments, 
> LIBXL_VIRIDIAN_ENLIGHTENMENT_SYNIC))
>          mask |= HVMPV_synic;
> 
> +    if (libxl_bitmap_test(&enlightenments, 
> LIBXL_VIRIDIAN_ENLIGHTENMENT_STIMER))
> +        mask |= HVMPV_time_ref_count | HVMPV_synic | HVMPV_stimer;
> +
>      if (mask != 0 &&
>          xc_hvm_param_set(CTX->xch,
>                           domid,
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 9860bcaf5f..1cce249de4 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -236,6 +236,7 @@ libxl_viridian_enlightenment = 
> Enumeration("viridian_enlightenment", [
>      (5, "apic_assist"),
>      (6, "crash_ctl"),
>      (7, "synic"),
> +    (8, "stimer"),
>      ])
> 
>  libxl_hdtype = Enumeration("hdtype", [
> diff --git a/xen/arch/x86/hvm/viridian/private.h 
> b/xen/arch/x86/hvm/viridian/private.h
> index 96a784b840..c272c34cda 100644
> --- a/xen/arch/x86/hvm/viridian/private.h
> +++ b/xen/arch/x86/hvm/viridian/private.h
> @@ -74,6 +74,11 @@
>  int viridian_synic_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val);
>  int viridian_synic_rdmsr(const struct vcpu *v, uint32_t idx, uint64_t *val);
> 
> +bool viridian_synic_deliver_timer_msg(struct vcpu *v, unsigned int sintx,
> +                                      unsigned int index,
> +                                      uint64_t expiration,
> +                                      uint64_t delivery);
> +
>  int viridian_synic_vcpu_init(const struct vcpu *v);
>  int viridian_synic_domain_init(const struct domain *d);
> 
> @@ -93,7 +98,9 @@ void viridian_synic_load_domain_ctxt(
>  int viridian_time_wrmsr(struct vcpu *v, uint32_t idx, uint64_t val);
>  int viridian_time_rdmsr(const struct vcpu *v, uint32_t idx, uint64_t *val);
> 
> -int viridian_time_vcpu_init(const struct vcpu *v);
> +void viridian_time_poll_timers(struct vcpu *v);
> +
> +int viridian_time_vcpu_init(struct vcpu *v);
>  int viridian_time_domain_init(const struct domain *d);
> 
>  void viridian_time_vcpu_deinit(const struct vcpu *v);
> diff --git a/xen/arch/x86/hvm/viridian/synic.c 
> b/xen/arch/x86/hvm/viridian/synic.c
> index f4510d3829..b5f3a79556 100644
> --- a/xen/arch/x86/hvm/viridian/synic.c
> +++ b/xen/arch/x86/hvm/viridian/synic.c
> @@ -340,9 +340,58 @@ void viridian_synic_domain_deinit(const struct domain *d)
>  {
>  }
> 
> -void viridian_synic_poll_messages(const struct vcpu *v)
> +void viridian_synic_poll_messages(struct vcpu *v)
>  {
> -    /* There are currently no message sources */
> +    viridian_time_poll_timers(v);
> +}
> +
> +bool viridian_synic_deliver_timer_msg(struct vcpu *v, unsigned int sintx,
> +                                      unsigned int index,
> +                                      uint64_t expiration,
> +                                      uint64_t delivery)
> +{
> +    struct viridian_vcpu *vv = v->arch.hvm.viridian;
> +    const union viridian_sint_msr *vs = &vv->sint[sintx];
> +    HV_MESSAGE *msg = vv->simp.ptr;
> +    struct {
> +        uint32_t TimerIndex;
> +        uint32_t Reserved;
> +        uint64_t ExpirationTime;
> +        uint64_t DeliveryTime;
> +    } payload = {
> +        .TimerIndex = index,
> +        .ExpirationTime = expiration,
> +        .DeliveryTime = delivery,
> +    };
> +
> +    if ( test_bit(sintx, &vv->msg_pending) )
> +        return false;
> +
> +    BUILD_BUG_ON(sizeof(*msg) != HV_MESSAGE_SIZE);
> +    msg += sintx;
> +
> +    /*
> +     * To avoid using an atomic test-and-set, and barrier before calling
> +     * vlapic_set_irq(), this function must be called in context of the
> +     * vcpu receiving the message.
> +     */
> +    ASSERT(v == current);
> +    if ( msg->Header.MessageType != HvMessageTypeNone )
> +    {
> +        msg->Header.MessageFlags.MessagePending = 1;
> +        __set_bit(sintx, &vv->msg_pending);
> +        return false;
> +    }
> +
> +    msg->Header.MessageType = HvMessageTimerExpired;
> +    msg->Header.MessageFlags.MessagePending = 0;
> +    msg->Header.PayloadSize = sizeof(payload);
> +    memcpy(msg->Payload, &payload, sizeof(payload));
> +
> +    if ( !vs->mask )
> +        vlapic_set_irq(vcpu_vlapic(v), vs->vector, 0);
> +
> +    return true;
>  }
> 
>  bool viridian_synic_is_auto_eoi_sint(const struct vcpu *v,
> diff --git a/xen/arch/x86/hvm/viridian/time.c 
> b/xen/arch/x86/hvm/viridian/time.c
> index 71291d921c..83b7d2c67f 100644
> --- a/xen/arch/x86/hvm/viridian/time.c
> +++ b/xen/arch/x86/hvm/viridian/time.c
> @@ -12,6 +12,7 @@
>  #include <xen/version.h>
> 
>  #include <asm/apic.h>
> +#include <asm/event.h>
>  #include <asm/hvm/support.h>
> 
>  #include "private.h"
> @@ -72,6 +73,7 @@ static void update_reference_tsc(struct domain *d, bool 
> initialize)
>       * ticks per 100ns shifted left by 64.
>       */
>      p->TscScale = ((10000ul << 32) / d->arch.tsc_khz) << 32;
> +    smp_wmb();
> 
>      p->TscSequence++;
>      if ( p->TscSequence == 0xFFFFFFFF ||
> @@ -118,18 +120,265 @@ static int64_t time_ref_count(const struct domain *d)
>      return raw_trc_val(d) + trc->off;
>  }
> 
> +/*
> + * The specification says: "The partition reference time is computed
> + * by the following formula:
> + *
> + * ReferenceTime = ((VirtualTsc * TscScale) >> 64) + TscOffset
> + *
> + * The multiplication is a 64 bit multiplication, which results in a
> + * 128 bit number which is then shifted 64 times to the right to obtain
> + * the high 64 bits."
> + */
> +static uint64_t scale_tsc(uint64_t tsc, uint64_t scale, uint64_t offset)
> +{
> +    uint64_t result;
> +
> +    /*
> +     * Quadword MUL takes an implicit operand in RAX, and puts the result
> +     * in RDX:RAX. Because we only want the result of the multiplication
> +     * after shifting right by 64 bits, we therefore only need the content
> +     * of RDX.
> +     */
> +    asm ( "mulq %[scale]"
> +          : "+a" (tsc), "=d" (result)
> +          : [scale] "rm" (scale) );
> +
> +    return result + offset;
> +}
> +
> +static uint64_t time_now(struct domain *d)
> +{
> +    const struct viridian_page *rt = &d->arch.hvm.viridian->reference_tsc;
> +    HV_REFERENCE_TSC_PAGE *p = rt->ptr;
> +    uint32_t start, end;
> +    uint64_t tsc;
> +    uint64_t scale;
> +    uint64_t offset;
> +
> +    /*
> +     * If the reference TSC page is not enabled, or has been invalidated
> +     * fall back to the partition reference counter.
> +     */
> +    if ( !p || !p->TscSequence )
> +        return time_ref_count(d);
> +
> +    /*
> +     * The following sampling algorithm for tsc, scale and offset is
> +     * documented in the specifiction.
> +     */
> +    do {
> +        start = p->TscSequence;
> +        smp_rmb();
> +
> +        tsc = rdtsc();

I now realize this is wrong as it does not take into account any h/w based 
scaling of TSC in non-root mode. This function needs tsc as the guest would see 
it. I'll send a v5 with this replaced by a call to hvm_get_guest_tsc().

  Paul

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.