
Re: [Xen-devel] [PATCH v11 1/9] x86: add generic MSR access hypercall



>>> On 23.06.14 at 08:27, <dongxiao.xu@xxxxxxxxx> wrote:
>>  -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>> Sent: Friday, June 20, 2014 11:01 PM
>> To: Xu, Dongxiao
>> Cc: andrew.cooper3@xxxxxxxxxx; Ian.Campbell@xxxxxxxxxx;
>> George.Dunlap@xxxxxxxxxxxxx; Ian.Jackson@xxxxxxxxxxxxx;
>> stefano.stabellini@xxxxxxxxxxxxx; xen-devel@xxxxxxxxxxxxx;
>> konrad.wilk@xxxxxxxxxx; dgdegra@xxxxxxxxxxxxx; keir@xxxxxxx 
>> Subject: Re: [PATCH v11 1/9] x86: add generic MSR access hypercall
>> 
>> >>> On 20.06.14 at 16:31, <dongxiao.xu@xxxxxxxxx> wrote:
>> > Add a generic MSR access hypercall for tool stack or other components to
>> > read or write certain system MSRs.
>> 
>> If you want this to be usable by components other than the tool stack
>> then this can't be a sysctl (would e.g. need to be a platform-op then).
> 
> Per my understanding, the platform-op hypercalls are used by the Dom0 
> kernel.
> In this CQM scenario, however, we want libvirt/libxl/libxc to issue such a 
> hypercall to access the MSR, and I didn't see any usage of platform_op on 
> the libxl/libxc side.
> 
> Could you explain a bit more for it?

Platform ops are for use by Dom0, yes, but note the explicit omission
of "kernel" in my reply. The fact that libxc currently doesn't use any
doesn't mean it mustn't do so. The only question here that matters is
what audience we see for the new functionality: If this is to be a
purely tools interface, then a sysctl is fine. If the kernel is intended to
also be able to use it (as I suggested), something other than a sysctl
needs to be used (and I was merely _hinting_ at platform-op).

>> > --- a/xen/arch/x86/sysctl.c
>> > +++ b/xen/arch/x86/sysctl.c
>> > @@ -66,10 +66,27 @@ void arch_do_physinfo(xen_sysctl_physinfo_t *pi)
>> >          pi->capabilities |= XEN_SYSCTL_PHYSCAP_hvm_directio;
>> >  }
>> >
>> > +static void msr_access_helper(void *data)
>> > +{
>> > +    struct msr_access_info *info = data;
>> > +    xen_sysctl_msr_data_t *msr_data;
>> > +    unsigned int i;
>> > +
>> > +    for ( i = 0; i < info->nr_ops; i++ )
>> > +    {
>> > +        msr_data = &info->msr_data[i];
>> > +        if ( msr_data->cmd == XEN_SYSCTL_MSR_read )
>> > +            rdmsrl(msr_data->msr, msr_data->val);
>> > +        else if ( msr_data->cmd == XEN_SYSCTL_MSR_write )
>> > +            wrmsrl(msr_data->msr, msr_data->val);
>> > +    }
>> 
>> Missing preemption check.
> 
> Do you mean hypercall_preemption_check() here? Or the preemption between the 
> MSR accesses?

Not sure what distinction you make between the two (or what the
second question is supposed to refer to). The list of operations can
be arbitrarily long, and individual operations may also take arbitrary
time (especially if this later gets extended to I/O port accesses,
which may incur SMM invocation). Hence preemption checks between
operations are needed (except in well-defined cases where the
caller may request such a check to be skipped).
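To make the point concrete, here is a minimal, self-contained sketch of the batched loop with a preemption check between sub-ops. It stands outside Xen: hypercall_preemption_check() is stubbed, and the MSR access is a placeholder, but the shape mirrors the usual Xen pattern of returning a resume index so the hypercall can be continued.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for Xen's hypercall_preemption_check(),
 * just for this sketch. */
static int preempt_pending;
static int hypercall_preemption_check(void) { return preempt_pending; }

struct msr_op { uint32_t cmd, msr; uint64_t val; };

/*
 * Process a batch of MSR sub-ops, checking for preemption between them.
 * Returns the index to resume from (nr when the batch is complete), so
 * the caller can encode a continuation and re-enter at that point.
 */
static unsigned int process_msr_ops(struct msr_op *ops, unsigned int nr,
                                    unsigned int start)
{
    unsigned int i;

    for ( i = start; i < nr; i++ )
    {
        /* A real implementation would rdmsrl()/wrmsrl() here. */
        ops[i].val = ops[i].msr;                /* placeholder "read" */

        if ( i + 1 < nr && hypercall_preemption_check() )
            return i + 1;                       /* resume point */
    }

    return nr;
}
```

The check is skipped after the final sub-op, since there is nothing left to defer at that point.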

>> Anyway, together with the comments on the interface itself (further
>> down) I think you'd be much better off basing this on
>> continue_hypercall_on_cpu().
> 
> The continue_hypercall_on_cpu() function schedules a tasklet to run the 
> specified function on a certain CPU.
> However, in our CQM case, the libxl/libxc callers want the result 
> immediately when the hypercall returns, which is why I didn't use 
> continue_hypercall_on_cpu().

The use of a tasklet doesn't mean the result would only be available in
a deferred manner: The tasklet runs _before_ control transfers back
to the caller.
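The semantics Jan describes can be illustrated with a toy model (not Xen code): the "deferred" function runs before control transfers back to the caller, so from the caller's perspective the result is available synchronously.

```c
#include <assert.h>
#include <stdint.h>

typedef long (*cont_fn)(void *);

/* Toy model of continue_hypercall_on_cpu() semantics: in Xen the work is
 * moved to a tasklet on the target CPU, but that tasklet runs before the
 * hypercall returns to the original caller. */
static long run_on_cpu_model(unsigned int cpu, cont_fn fn, void *data)
{
    (void)cpu;          /* a real implementation would target 'cpu' */
    return fn(data);    /* key point: completes before returning */
}

struct msr_read { uint32_t msr; uint64_t val; };

static long do_read(void *data)
{
    struct msr_read *r = data;
    r->val = 0xC0FFEEull;   /* placeholder for rdmsrl(r->msr, r->val) */
    return 0;
}
```

So a libxc caller issuing the sysctl would still see the MSR value filled in by the time the hypercall returns.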

>> > --- a/xen/include/public/sysctl.h
>> > +++ b/xen/include/public/sysctl.h
>> > @@ -636,6 +636,29 @@ struct xen_sysctl_coverage_op {
>> >  typedef struct xen_sysctl_coverage_op xen_sysctl_coverage_op_t;
>> >  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_coverage_op_t);
>> >
>> > +#define XEN_SYSCTL_MSR_read                0
>> > +#define XEN_SYSCTL_MSR_write               1
>> > +
>> > +struct xen_sysctl_msr_data {
>> > +    uint32_t cmd;       /* XEN_SYSCTL_MSR_* */
>> > +    uint32_t msr;
>> > +    uint64_t val;
>> > +};
>> > +typedef struct xen_sysctl_msr_data xen_sysctl_msr_data_t;
>> > +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_msr_data_t);
>> > +struct xen_sysctl_msr_op {
>> > +    uint32_t nr_ops;
>> 
>> Just "nr" would suffice.
> 
> Okay.
> 
>> 
>> > +    uint32_t cpu;
>> 
>> This is rather limiting - you should be able to specify a CPU for
>> each operation. That way you'll have another 32-bit field available
>> for flags: You indicated earlier on that you'd want certain
>> operations (like a write followed by a read) pairable without
>> potentially getting preempted in between. The field freed up
>> here should then be reserved for use as a flags field too (i.e. you'd
>> need to check that it's zero).
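A possible layout following Jan's suggestion might look like the sketch below. Note this is illustrative only: the per-op cpu and flags fields, the flag name, and the reserved field are assumptions about where the interface could go, not part of the posted patch.

```c
#include <assert.h>
#include <stdint.h>

#define XEN_SYSCTL_MSR_read   0
#define XEN_SYSCTL_MSR_write  1

/* Hypothetical flag: keep this op paired with the next one (no
 * preemption check in between); name is illustrative only. */
#define XEN_SYSCTL_MSR_no_preempt (1u << 0)

struct xen_sysctl_msr_data {
    uint32_t cmd;    /* XEN_SYSCTL_MSR_* */
    uint32_t msr;
    uint32_t cpu;    /* target CPU, now specified per operation */
    uint32_t flags;  /* XEN_SYSCTL_MSR_*; undefined bits must be zero */
    uint64_t val;
};

struct xen_sysctl_msr_op {
    uint32_t nr;     /* number of entries */
    uint32_t _rsvd;  /* field freed up by moving cpu: must be zero */
    /* in Xen, a guest handle to nr xen_sysctl_msr_data entries follows */
};
```

Checking that _rsvd and the undefined flags bits are zero keeps both fields available for future extension without breaking existing callers.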
> 
> Do you mean the following case?
> 
> Op1: CPU0, write MSR 0x100. Flag: 1 (indicating pairing)
> Op2: CPU1, read MSR 0x200. Flag: 0.
> Op3: CPU0, read MSR 0x101. Flag: 1 (pairing with the above op1)

No - CPU changes always imply preemption between individual
sub-ops.

> To avoid preemption between certain operations, my previous solution was to 
> pack all the operations targeting a certain CPU into a single IPI. When the 
> target CPU receives the IPI, it executes the operations one by one 
> (assuming no preemption happens during the execution of the loop).
> Now, if we specify a CPU for each operation, can we group the operations 
> by target CPU and issue one IPI per target CPU to avoid preemption? That 
> would get rid of the flag usage.

No, I don't think re-ordering operations can be permitted here.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
