
Re: [Xen-devel] [PATCH 1/4] x86/domctl: Implement XEN_DOMCTL_{get, set}_vcpu_msrs



>>> On 04.06.14 at 19:26, <andrew.cooper3@xxxxxxxxxx> wrote:
> Despite my 'Reviewed-by' tag on c/s 65e3554908 "x86/PV: support data
> breakpoint extension registers", I have re-evaluated my position as far as 
> the hypercall interface is concerned.
> 
> Previously, for the sake of not modifying the migration code in libxc,
> XEN_DOMCTL_get_ext_vcpucontext would jump through hoops to return -ENOBUFS if
> and only if MSRs were in use and no buffer was present.
> 
> This is fragile, and awkward from a toolstack point-of-view when actually
> sending MSR content in the migration stream.  It also complicates fixing a
> further race condition, between querying the number of MSRs for a vcpu, and
> the vcpu touching a new one.
> 
> As this code is still only in staging, take this opportunity to redesign the

s/staging/unstable/

> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -1341,6 +1341,132 @@ long arch_do_domctl(
>      }
>      break;
>  
> +    case XEN_DOMCTL_get_vcpu_msrs:
> +    case XEN_DOMCTL_set_vcpu_msrs:
> +    {
> +        struct xen_domctl_vcpu_msrs *vmsrs = &domctl->u.vcpu_msrs;
> +        struct xen_domctl_vcpu_msr msr;
> +        struct vcpu *v;
> +        uint32_t nr_msrs = 0;
> +
> +        ret = -ESRCH;
> +        if ( (vmsrs->vcpu >= d->max_vcpus) ||
> +             ((v = d->vcpu[vmsrs->vcpu]) == NULL) )
> +            break;
> +
> +        ret = -EINVAL;
> +        if ( (v == current) || /* no vcpu_pause() */
> +             !is_pv_domain(d) )
> +            break;
> +
> +        /* Count maximum number of optional msrs. */
> +        if ( boot_cpu_has(X86_FEATURE_DBEXT) )
> +            nr_msrs += 4;
> +
> +        if ( domctl->cmd == XEN_DOMCTL_get_vcpu_msrs )
> +        {
> +            /* NULL guest handle is a request for max size. */
> +            if ( guest_handle_is_null(vmsrs->msrs) ||
> +                 (vmsrs->msr_count < nr_msrs) )
> +            {
> +                vmsrs->msr_count = nr_msrs;
> +                ret = guest_handle_is_null(vmsrs->msrs) ? 0 : -ENOBUFS;

I don't think you should be failing "get" if there is enough space in
the provided buffer to store the actually used number of MSRs. That
way the caller may make a first call with a few (rather than none at
all) entries, and grow the buffer only if this wasn't sufficient.
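
Roughly (just a sketch reusing the names from your patch, not meant as
the final code), the "get" path could first count the MSRs actually in
use and only then decide whether the caller's buffer is big enough:

    uint32_t in_use = 0;

    vcpu_pause(v);

    if ( boot_cpu_has(X86_FEATURE_DBEXT) )
    {
        unsigned int j;

        for ( j = 0; j < 4; ++j )
            if ( v->arch.pv_vcpu.dr_mask[j] )
                ++in_use;
    }

    /* Fail only if the in-use count doesn't fit the supplied buffer. */
    if ( guest_handle_is_null(vmsrs->msrs) || (vmsrs->msr_count < in_use) )
    {
        vmsrs->msr_count = nr_msrs;   /* still report the maximum */
        ret = guest_handle_is_null(vmsrs->msrs) ? 0 : -ENOBUFS;
    }
    else
    {
        /* ... copy out the in-use entries as in your loop below ... */
    }

    vcpu_unpause(v);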

> +            }
> +            else
> +            {
> +                ret = i = 0;
> +
> +                vcpu_pause(v);
> +
> +                if ( boot_cpu_has(X86_FEATURE_DBEXT) )
> +                {
> +                    unsigned int j;
> +
> +                    if ( v->arch.pv_vcpu.dr_mask[0] )
> +                    {
> +                        msr.index = MSR_AMD64_DR0_ADDRESS_MASK;
> +                        msr.reserved = 0;
> +                        msr.value = v->arch.pv_vcpu.dr_mask[0];
> +                        if ( copy_to_guest_offset(vmsrs->msrs, i, &msr, 1) )
> +                            ret = -EFAULT;
> +                        else
> +                            ++i;
> +                    }
> +
> +                    for ( j = 0; j < 3; ++j )
> +                    {
> +                        if ( !v->arch.pv_vcpu.dr_mask[1 + j] )
> +                            continue;
> +                        if ( !ret )
> +                        {
> +                            msr.index = MSR_AMD64_DR1_ADDRESS_MASK + j;
> +                            msr.reserved = 0;
> +                            msr.value = v->arch.pv_vcpu.dr_mask[1 + j];
> +                            if ( copy_to_guest_offset(vmsrs->msrs, i, &msr, 1) )
> +                                ret = -EFAULT;
> +                            else
> +                                ++i;
> +                        }
> +                    }
> +

Stray blank line.

> +                }
> +
> +                vcpu_unpause(v);
> +
> +                /* Check we didn't lie to userspace then overflow the buffer */
> +                BUG_ON(i > nr_msrs);
> +                vmsrs->msr_count = i;
> +            }
> +
> +            copyback = 1;
> +        }
> +        else
> +        {
> +            ret = -EINVAL;
> +            if ( vmsrs->msr_count > nr_msrs )
> +                break;

Similarly I don't think you should fail the operation simply based on a
too large count. Who knows when or why it may make sense for the
caller to specify multiple entries with the same index.

But then again, the way you do it we have a statically determined
maximum, and don't need to worry about preemption here until a
really large number of MSRs would get handled.
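
(For reference, if the maximum ever stopped being statically bounded,
the usual domctl approach would be a preemption check inside the loop,
roughly along these lines; just a sketch using the generic continuation
helpers, not something this patch needs:)

    /* Sketch only: only relevant if the MSR count were no longer
     * bounded by a small static maximum. */
    if ( hypercall_preempt_check() )
    {
        /* Record progress where the continuation can find it,
         * then restart the domctl. */
        ret = hypercall_create_continuation(__HYPERVISOR_domctl,
                                            "h", u_domctl);
        break;
    }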

> +
> +            vcpu_pause(v);
> +
> +            for ( i = 0; i < vmsrs->msr_count; ++i )
> +            {
> +                ret = -EFAULT;
> +                if ( copy_from_guest_offset(&msr, vmsrs->msrs, i, 1) )
> +                    break;
> +
> +                ret = -EINVAL;
> +                if ( msr.reserved )
> +                    break;
> +
> +                switch ( msr.index )
> +                {
> +                case MSR_AMD64_DR0_ADDRESS_MASK:
> +                    if ( !boot_cpu_has(X86_FEATURE_DBEXT) ||
> +                         (msr.value >> 32) )
> +                        break;
> +                    v->arch.pv_vcpu.dr_mask[0] = msr.value;
> +                    continue;
> +
> +                case MSR_AMD64_DR1_ADDRESS_MASK ...
> +                    MSR_AMD64_DR3_ADDRESS_MASK:
> +                    if ( !boot_cpu_has(X86_FEATURE_DBEXT) ||
> +                         (msr.value >> 32) )
> +                        break;
> +                    msr.index -= MSR_AMD64_DR1_ADDRESS_MASK - 1;
> +                    v->arch.pv_vcpu.dr_mask[msr.index] = msr.value;
> +                    continue;
> +                }
> +                break;
> +            }
> +
> +            vcpu_unpause(v);
> +
> +            if ( i == vmsrs->msr_count )
> +                ret = 0;

else {
    vmsrs->msr_count = i;
    copyback = 1;
}

to at once give the caller an indication which slot failed?
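
Something like this on the toolstack side (purely illustrative caller
code, with made-up surrounding names) could then report exactly which
entry was rejected:

    /* Illustrative only: assumes the suggested copyback of msr_count
     * on failure; "msrs" is the caller's xen_domctl_vcpu_msr array. */
    if ( do_domctl(xch, &domctl) < 0 )
        fprintf(stderr, "set_vcpu_msrs: slot %u (MSR %#x) rejected\n",
                domctl.u.vcpu_msrs.msr_count,
                msrs[domctl.u.vcpu_msrs.msr_count].index);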

Jan

