[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v12 1/9] x86: add generic resource (e.g. MSR) access hypercall



> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@xxxxxxxxxx]
> Sent: Friday, July 11, 2014 5:25 PM
> To: Xu, Dongxiao; Jan Beulich
> Cc: Ian.Campbell@xxxxxxxxxx; George.Dunlap@xxxxxxxxxxxxx;
> Ian.Jackson@xxxxxxxxxxxxx; stefano.stabellini@xxxxxxxxxxxxx;
> xen-devel@xxxxxxxxxxxxx; konrad.wilk@xxxxxxxxxx; dgdegra@xxxxxxxxxxxxx;
> keir@xxxxxxx
> Subject: Re: [PATCH v12 1/9] x86: add generic resource (e.g. MSR) access
> hypercall
> 
> On 11/07/14 05:29, Xu, Dongxiao wrote:
> >> -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >> Sent: Friday, July 04, 2014 6:44 PM
> >> To: Xu, Dongxiao
> >> Cc: andrew.cooper3@xxxxxxxxxx; Ian.Campbell@xxxxxxxxxx;
> >> George.Dunlap@xxxxxxxxxxxxx; Ian.Jackson@xxxxxxxxxxxxx;
> >> stefano.stabellini@xxxxxxxxxxxxx; xen-devel@xxxxxxxxxxxxx;
> >> konrad.wilk@xxxxxxxxxx; dgdegra@xxxxxxxxxxxxx; keir@xxxxxxx
> >> Subject: Re: [PATCH v12 1/9] x86: add generic resource (e.g. MSR) access
> >> hypercall
> >>
> >>>>> On 04.07.14 at 10:34, <dongxiao.xu@xxxxxxxxx> wrote:
> >>> Add a generic resource access hypercall for tool stack or other
> >>> components, e.g., accessing MSR, port I/O, etc.
> >> Sigh - you're still allocating an unbounded amount of memory for
> >> passing around the input arguments, despite it being possible (and
> >> having been suggested) to read these from the original buffer on
> >> each iteration. You're still not properly checking for preemption
> >> between iterations. And you're still not making use of
> >> continue_hypercall_on_cpu(). Plus you now silently ignore the
> >> upper 32-bits of the passing in "idx" value as well as not
> >> understood XEN_RESOURCE_OP_* values.
> > continue_hypercall_on_cpu() is asynchronized, which requires the "data" 
> > field
> always points to the right place before the hypercall returns.
> > However in our function, we have a "for" loop to cover multiple operations, 
> > so
> the "data" field will be modified in each iteration, which cannot meet the
> continue_hypercall_on_cpu() requirements...
> 
> This is because you are still copying all resource data at once from the
> hypercall.
> 
> As Jan points out, this is an unbounded allocation in Xen which must be
> addresses.  If instead you were to copy each element one at a time, you
> would avoid this allocation entirely and be able to correctly use
> continue_hypercall_on_cpu().

I've accepted the idea to copy the element one by one, however it seems that it 
doesn't help on continue_hypercall_on_cpu().
The full code looks like the following, where the variable "ra" will be updated 
on every "for" loop, and couldn't be used in continue_hypercall_on_cpu().
Do you have idea on how to solve this issue and use continue_hypercall_on_cpu() 
here?

static int resource_access_helper(struct xenpf_resource_op *op)
{
    struct xen_resource_access ra;
    unsigned int i;
    int ret = 0;

    for ( i = 0; i < op->nr; i++ )
    {
        if ( copy_from_guest_offset(&ra.data, op->data, i, 1) )
        {
            ret = -EFAULT;
            break;
        }

        if ( ra.data.cpu == smp_processor_id() )
            resource_access_one(&ra);
        else
            on_selected_cpus(cpumask_of(ra.data.cpu),
                             resource_access_one, &ra, 1);

        if ( copy_to_guest_offset(op->data, i, &ra.data, 1) )
        {
            ret = -EFAULT;
            break;
        }
    }

    return ret;
}

> 
> 
> >
> > For the preemption check, what about the following? Here the preemption is
> checked within each resource_access_one() function.
> 
> None of this preemption works.
> 
> In the case a hypercall gets preempted, you need to increment the guest
> handle along to the next element to process, and decrement the count by
> the number of elements processed in *the guest context*.
> 
> That way, when the hypercall continues in Xen, it shall pick up with the
> next action to perform rather than restarting from the beginning.

Some actions (like CQM) requires the read/write the MSRs in a continuous way, 
if it is interrupted, this "continuity" couldn't be guaranteed. The RESTART 
return value indicates to re-run the operations.
BTW, I tested it in my box, and the "failure" case doesn't happen frequently.

Thanks,
Dongxiao

> 
> ~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.