
Re: [Xen-devel] [PATCH v12 for-xen-4.5 16/20] x86/VPMU: Handle PMU interrupts for PV guests



>>> On 30.09.14 at 17:07, <boris.ostrovsky@xxxxxxxxxx> wrote:

> On 09/30/2014 04:11 AM, Jan Beulich wrote:
>>>>> On 25.09.14 at 21:28, <boris.ostrovsky@xxxxxxxxxx> wrote:
>>> --- a/xen/arch/x86/hvm/vpmu.c
>>> +++ b/xen/arch/x86/hvm/vpmu.c
>>> @@ -80,44 +80,191 @@ static void __init parse_vpmu_param(char *s)
>>>   
>>>   void vpmu_lvtpc_update(uint32_t val)
>>>   {
>>> -    struct vpmu_struct *vpmu = vcpu_vpmu(current);
>>> +    struct vcpu *curr = current;
>>> +    struct vpmu_struct *vpmu = vcpu_vpmu(curr);
>>>   
>>>       vpmu->hw_lapic_lvtpc = PMU_APIC_VECTOR | (val & APIC_LVT_MASKED);
>>> -    apic_write(APIC_LVTPC, vpmu->hw_lapic_lvtpc);
>>> +
>>> +    /* Postpone APIC updates for PV(H) guests if PMU interrupt is pending */
>>> +    if ( is_hvm_domain(curr->domain) ||
>> Please don't open-code is_hvm_vcpu() (more instances elsewhere).
> 
> Why is this open-coded?

The above should really be is_hvm_vcpu(curr).

>>> +         !(vpmu->xenpmu_data && (vpmu->xenpmu_data->pmu_flags & PMU_CACHED)) )
>> I think readability would benefit if you resolved the !(&&) to !||!
>> (making it the proper inverse of what you do in vpmu_do_wrmsr()
>> and vpmu_do_rdmsr()).
>>
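To illustrate, with is_hvm_vcpu() used and the !(&&) resolved, the check
could read along these lines (a sketch only; I'm assuming the body is the
apic_write() the hunk above removes):

    if ( is_hvm_vcpu(curr) ||
         !vpmu->xenpmu_data ||
         !(vpmu->xenpmu_data->pmu_flags & PMU_CACHED) )
        apic_write(APIC_LVTPC, vpmu->hw_lapic_lvtpc);
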
>>> +static struct vcpu *choose_hwdom_vcpu(void)
>>> +{
>>> +    struct vcpu *v;
>>> +    unsigned idx = smp_processor_id() % hardware_domain->max_vcpus;
>>> +
>>> +    if ( hardware_domain->vcpu == NULL )
>>> +        return NULL;
>>> +
>>> +    v = hardware_domain->vcpu[idx];
>>> +
>>> +    /*
>>> +     * If index is not populated search downwards the vcpu array until
>>> +     * a valid vcpu can be found
>>> +     */
>>> +    while ( !v && idx-- )
>>> +        v = hardware_domain->vcpu[idx];
>> Each time I get here I wonder what case this is good for.
> 
> I thought we could have a case where the first hardware_domain->vcpu[idx] is
> NULL, so we walk the array down until we find the first non-NULL vcpu.
> Hot unplug, for example (we may be calling this from NMI context).

Hot unplug of a vCPU is a guest thing - this doesn't destroy the
vCPU in the hypervisor.
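
In other words, d->vcpu[idx] stays non-NULL once the vCPU has been
allocated, even if the guest offlines it, so the helper could presumably
reduce to something like this (sketch only, assuming there's no other
reason for the downward walk):

    static struct vcpu *choose_hwdom_vcpu(void)
    {
        unsigned int idx = smp_processor_id() % hardware_domain->max_vcpus;

        if ( hardware_domain->vcpu == NULL )
            return NULL;

        /* Allocated vCPUs aren't torn down by guest-side hot unplug. */
        return hardware_domain->vcpu[idx];
    }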

>>>   int vpmu_do_interrupt(struct cpu_user_regs *regs)
>>>   {
>>> -    struct vcpu *v = current;
>>> -    struct vpmu_struct *vpmu = vcpu_vpmu(v);
>>> +    struct vcpu *sampled = current, *sampling;
>>> +    struct vpmu_struct *vpmu;
>>> +
>>> +    /* dom0 will handle interrupt for special domains (e.g. idle domain) */
>>> +    if ( sampled->domain->domain_id >= DOMID_FIRST_RESERVED )
>>> +    {
>>> +        sampling = choose_hwdom_vcpu();
>>> +        if ( !sampling )
>>> +            return 0;
>>> +    }
>>> +    else
>>> +        sampling = sampled;
>>> +
>>> +    vpmu = vcpu_vpmu(sampling);
>>> +    if ( !is_hvm_domain(sampling->domain) )
>>> +    {
>>> +        /* PV(H) guest */
>>> +        const struct cpu_user_regs *cur_regs;
>>> +
>>> +        if ( !vpmu->xenpmu_data )
>>> +            return 0;
>>> +
>>> +        if ( vpmu->xenpmu_data->pmu_flags & PMU_CACHED )
>>> +            return 1;
>>> +
>>> +        if ( is_pvh_domain(sampled->domain) &&
>> Here and below - is this really the right condition? I.e. is the
>> opposite case (doing nothing here, but the one further down
>> having an else) really meant to cover both HVM and PV? The outer
>> !is_hvm_() doesn't count here as that acts on sampling, not
>> sampled.
> 
> This is a test for an error in do_interrupt() --- if it reported a failure,
> there is no reason to proceed further.

That's not the question. Why is this done only for PVH?

>>> +            {
>>> +                r->cs = cur_regs->cs;
>>> +                if ( sampled->arch.flags & TF_kernel_mode )
>>> +                    r->cs &= ~3;
>> And once again I wonder how the consumer of this data is to tell
>> apart guest kernel and hypervisor addresses.
> 
> Based on the RIP --- perf, for example, searches through various symbol 
> tables.

That doesn't help when profiling HVM/PVH guests - addresses are
ambiguous in that case.

> I suppose I can set xenpmu_data->domain_id below to either DOMID_SELF
> for the guest or DOMID_XEN for the hypervisor.

That's an option, but I really have reservations about simulating
ring-0 execution for PV guests here. It would certainly be better if we
could report reality, but I can see reservations on the consumer
(perf) side against us doing so.
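
Just to spell out the tagging you mention, it would presumably amount to
something like this when filling in the shared data (a sketch only, using
the xenpmu_data->domain_id field you refer to; the CPL test is mine and
covers the PV case):

    /*
     * A PV sample taken with CPL == 0 can only have interrupted Xen
     * itself, since PV kernels never run in ring 0.
     */
    if ( !(cur_regs->cs & 3) )
        vpmu->xenpmu_data->domain_id = DOMID_XEN;   /* hypervisor context */
    else
        vpmu->xenpmu_data->domain_id = DOMID_SELF;  /* guest context */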

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

