[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v14 for-xen-4.5 00/21] x86/PMU: Xen PMU PV(H) support



Hi Boris,

Am Freitag 17 Oktober 2014, 17:17:48 schrieb Boris Ostrovsky:
> Version 14 of PV(H) PMU patches.
> 
> Changes in v14:
> 
> * Moved struct xen_pmu_regs to pmu.h
> * Moved CHECK_pmu_* to an earlier patch (when structures are first introduced)
> * Added PMU_SAMPLE_REAL flag to indicate whether the sample was taken in real 
> mode
> * Simplified slightly setting rules for xenpmu_data flags
> * Rewrote vpmu_force_context_switch() to again use continuations. (Returning 
> EAGAIN
>   to user would mean that VPMU mode may get into inconsistent state (across 
> processors)
>   and dealing with that is more compicated than I'd like).
> * Fixed msraddr_to_bitpos() and converted it into an inline
> * Replaced address range check in vmpu_do_interrupt() with guest_mode()
> * No error returns from __initcall
> * Rebased on top of recent VPMU changes
> * Various cleanups

I did some tests with the Intel part and all went fine, so

Tested-by: Dietmar Hahn <dietmar.hahn@xxxxxxxxxxxxxx>

Btw. you use different names for the same structure in the linux kernel driver
part "struct xen_arch_pmu" and in the xen part "struct xen_pmu_arch". This is a
little bit confusing when searching through the code. So when possible ... ;-)
Many thanks!

Dietmar.

> 
> Changes in v13:
> 
> * Rearranged data in xenpf_symdata to eliminate a hole (no change in
>   structure size)
> * Removed unnecessary zeroing of last character in name string during
>   symbol read hypercall
> * Updated comment in access_vectors for pmu_use operation
> * Compute AMD MSR bank size at runtime
> * Moved a couple of BUILD_BUG_ON later, to when the structures they are
>   checking are first used
> * Added ss and eflags to xen_pmu_registers structure
> * vpmu_force_context_switch() uses per-cpu tasklet pointers.
> * Moved most of arch-specific VPMU initialization code into an __initcall()
>   to avoid re-calculating/re-checking things more than once (new patch, #12)
> * Replaced is_*_domain() with is_*_vcpu()
> * choose_hwdom_vcpu() will now assume that hardware_domain->vcpu[idx]
>   is always there (callers will still verify whether or not that's true)
> * Fixed a couple of sampled vs. sampling tests in vpmu_do_interrupt()
> * Pass CS to the guest unchanged, add pmu_flags's flag to indicate whether the
>   sample was for a user or kernel space. Move pmu_flags from xen_pmu_data into
>   xen_pmu_arch
> * Removed local msr_content variable from vpmu_do_wrmsr()
> * Re-arranged code in parse_vpmu_param()
> * Made vpmu_interrupt_type checks test for value, not individual bits
> * Moved PMU_SOFTIRQ definition into arch-specific header
> * Moved vpmu*.c files into xen/arch/x86/cpu/ instead of xen/arch/x86/
> * For hypervisor samples, report DOMID_XEN to the guest
> * Changed logic to select which registers to report to the guest (include RIP
>   check to see whether we are in the hypervisor)
> 
> Changes in v12:
> 
> * Added XSM support
> * Made a valifity check before writing MSR_CORE_PERF_GLOBAL_OVF_CTRL
> * Updated documentation for 'vpmu=nmi' option
> * Added more text to a bunch of commit messages (per Konrad's request)
> 
> Changes in v11:
> 
> * Replaced cpu_user_regs with new xen_pmu_regs (IP, SP, CS) in xen_pmu_arch.
>   - as part of this re-work noticed that CS registers were set in later patch 
> then
>     needed. Moved those changes to appropriate place
> * Added new VPMU mode (XENPMU_MODE_HV). Now XENPMU_MODE_SELF will only 
> provide dom0
>   with its own samples only (i.e. no hypervisor data) and XENPMU_MODE_HV will 
> be what
>   XENPMU_MODE_SELF used to be.
> * Kept  vmx_add_guest_msr()/vmx_add_host_load_msr() as wrappers around 
> vmx_add_msr()
> * Cleaned up VPMU context switch macros (moved  'if(prev!=next)' back to 
> context_switch())
> * Dropped hypercall continuation from vpmu_force_context_switch() and 
> replaced it with
>   -EAGAIN error if hypercall_preempt_check() is true after 2ms.
> * Kept vpmu_do_rdmsr()/vpmu_do_wrmsr as wrapperd for vpmu_do_msr()
> * Move context switching patch (#13) earlier in the series (for proper 
> bisection support)
> * Various comment updates and cleanups
> * Dropped a bunch of Reviewed-by and all Tested-by tags
> 
> Changes in v10:
> 
> * Swapped address and name fields of xenpf_symdata (to make it smaller on 
> 32-bit)
> * Dropped vmx_rm_guest_msr() as it requires refcountig which makes code more 
> complicated.
> * Cleaned up vlapic_reg_write()
> * Call vpmu_destroy() for both HVM and PVH VCPUs
> * Verify that (xen_pmu_data+PMU register bank) fit into a page
> * Return error codes from arch-specific VPMU init code
> * Moved VPMU-related context switch logic into inlines
> * vpmu_force_context_switch() changes:
>   o Avoid greater than page-sized allocations
>   o Prevent another VCPU from starting VPMU sync while the first sync is in 
> progress
> * Avoid stack leak in do_xenpmu_op()
> * Checked validity of Intel VPMU MSR values before they are committed
> * Fixed MSR handling in traps.c (avoid potential accesses to Intel MSRs on 
> AMD)
> * Fixed VCPU selection in interrupt handler for 32-bit dom0 (sampled => 
> sampling)
> * Clarified commit messages (patches 2, 13, 18) 
> * Various cleanups
> 
> Changes in v9:
> 
> * Restore VPMU context after context_saved() is called in
>   context_switch(). This is needed because vpmu_load() may end up
>   calling vmx_vmcs_try_enter()->vcpu_pause() and that needs is_running
>   to be correctly set/cleared. (patch 18, dropped review acks)
> * Added patch 2 to properly manage VPMU_CONTEXT_LOADED
> * Addressed most of Jan's comments.
>   o Keep track of time in vpmu_force_context_switch() to properly break
>     out of a loop when using hypercall continuations
>   o Fixed logic in calling vpmu_do_msr() in emulate_privileged_op()
>   o Cleaned up vpmu_interrupt() wrt vcpu variable names to (hopefully)
>     make it more clear which vcpu we are using
>   o Cleaned up vpmu_do_wrmsr()
>   o Did *not* replace sizeof(uint64_t) with sizeof(variable) in
>     amd_vpmu_initialise(): throughout the code registers are declared as
>     uint64_t and if we are to add a new type (e.g. reg_t) this should be
>     done in a separate patch, unrelated to this series.
>   o Various more minor cleanups and code style fixes
>   
> Changes in v8:
> 
> * Cleaned up a bit definitions of struct xenpf_symdata and xen_pmu_params
> * Added compat checks for vpmu structures
> * Converted vpmu flag manipulation macros to inline routines
> * Reimplemented vpmu_unload_all() to avoid long loops
> * Reworked PMU fault generation and handling (new patch #12)
> * Added checks for domain->vcpu[] non-NULLness
> * Added more comments, renamed some routines and macros, code style cleanup
> 
> 
> Changes in v7:
> 
> * When reading hypervisor symbols make the caller pass buffer length
>   (as opposed to having this length be part of the API). Make the
>   hypervisor buffer static, make xensyms_read() return zero-length
>   string on end-of-symbols. Make 'type' field of xenpf_symdata a char,
>   drop compat_pf_symdata definition.
> * Spread PVH support across patches as opposed to lumping it into a
>   separate patch
> * Rename vpmu_is_set_all() to vpmu_are_all_set()
> * Split VPMU cleanup patch in two
> * Use memmove when copying VMX guest and host MSRs
> * Make padding of xen_arch_pmu's context union a constand that does not
>   depend on arch context size.
> * Set interface version to 0.1
> * Check pointer validity in pvpmu_init/destroy()
> * Fixed crash in core2_vpmu_dump()
> * Fixed crash in vmx_add_msr()
> * Break handling of Intel and AMD MSRs in traps.c into separate cases
> * Pass full CS selector to guests
> * Add lock in pvpmu init code to prevent potential race
> 
> 
> Changes in v6:
> 
> * Two new patches:
>   o Merge VMX MSR add/remove routines in vmcs.c (patch 5)
>   o Merge VPMU read/write MSR routines in vpmu.c (patch 14)
> * Check for pending NMI softirq after saving VPMU context to prevent a 
> newly-scheduled
>   guest from overwriting sampled_vcpu written by de-scheduled VPCU.
> * Keep track of enabled counters on Intel. This was removed in earlier 
> patches and
>   was a mistake. As result of this change struct vpmu will have a pointer to 
> private
>   context data (i.e. data that is not exposed to a PV(H) guest). Use this 
> private pointer
>   on SVM as well for storing MSR bitmap status (it was unnecessarily exposed 
> to PV guests
>   earlier).
>   Dropped Reviewed-by: and Tested-by: tags from patch 4 since it needs to be 
> reviewed
>   agan (core2_vpmu_do_wrmsr() routine, mostly)
> * Replaced references to dom0 with hardware_domain (and is_control_domain with
>   is_hardware_domain for consistency)
> * Prevent non-privileged domains from reading PMU MSRs in VPMU_PRIV_MODE
> * Reverted unnecessary changes in vpmu_initialise()'s switch statement
> * Fixed comment in vpmu_do_interrupt
> 
> 
> Changes in v5:
> 
> * Dropped patch number 2 ("Stop AMD counters when called from 
> vpmu_save_force()")
>   as no longer needed
> * Added patch number 2 that marks context as loaded before PMU registers are
>   loaded. This prevents situation where a PMU interrupt may occur while 
> context
>   is still viewed as not loaded. (This is really a bug fix for exsiting VPMU
>   code)
> * Renamed xenpmu.h files to pmu.h
> * More careful use of is_pv_domain(), is_hvm_domain(, is_pvh_domain and
>   has_hvm_container_domain(). Also explicitly disabled support for PVH until
>   patch 16 to make distinction between usage of the above macros more clear.
> * Added support for disabling VPMU support during runtime.
> * Disable VPMUs for non-privileged domains when switching to privileged
>   profiling mode
> * Added ARM stub for xen_arch_pmu_t
> * Separated vpmu_mode from vpmu_features
> * Moved CS register query to make sure we use appropriate query mechanism for
>   various guest types.
> * LVTPC is now set from value in shared area, not copied from dom0
> * Various code and comments cleanup as suggested by Jan.
> 
> Changes in v4:
> 
> * Added support for PVH guests:
>   o changes in pvpmu_init() to accommodate both PV and PVH guests, still in 
> patch 10
>   o more careful use of is_hvm_domain
>   o Additional patch (16)
> * Moved HVM interrupt handling out of vpmu_do_interrupt() for NMI-safe 
> handling
> * Fixed dom0's VCPU selection in privileged mode
> * Added a cast in register copy for 32-bit PV guests cpu_user_regs_t in 
> vpmu_do_interrupt.
>   (don't want to expose compat_cpu_user_regs in a public header)
> * Renamed public structures by prefixing them with "xen_"
> * Added an entry for xenpf_symdata in xlat.lst
> * Fixed pv_cpuid check for vpmu-specific cpuid adjustments
> * Varios code style fixes
> * Eliminated anonymous unions
> * Added more verbiage to NMI patch description
> 
> 
> Changes in v3:
> 
> * Moved PMU MSR banks out from architectural context data structures to allow
> for future expansion without protocol changes
> * PMU interrupts can be either NMIs or regular vector interrupts (the latter
> is the default)
> * Context is now marked as PMU_CACHED by the hypervisor code to avoid certain
> race conditions with the guest
> * Fixed races with PV guest in MSR access handlers
> * More Intel VPMU cleanup
> * Moved NMI-unsafe code from NMI handler
> * Dropped changes to vcpu->is_running
> * Added LVTPC apic handling (cached for PV guests)
> * Separated privileged profiling mode into a standalone patch
> * Separated NMI handling into a standalone patch
> 
> 
> Changes in v2:
> 
> * Xen symbols are exported as data structure (as opoosed to a set of formatted
> strings in v1). Even though one symbol per hypercall is returned performance
> appears to be acceptable: reading whole file from dom0 userland takes on 
> average
> about twice as long as reading /proc/kallsyms
> * More cleanup of Intel VPMU code to simplify publicly exported structures
> * There is an architecture-independent and x86-specific public include files 
> (ARM
> has a stub)
> * General cleanup of public include files to make them more presentable (and
> to make auto doc generation better)
> * Setting of vcpu->is_running is now done on ARM in schedule_tail as well 
> (making
> changes to common/schedule.c architecture-independent). Note that this is not
> tested since I don't have access to ARM hardware.
> * PCPU ID of interrupted processor is now passed to PV guest
> 
> 
> The following patch series adds PMU support in Xen for PV(H)
> guests. There is a companion patchset for Linux kernel. In addition,
> another set of changes will be provided (later) for userland perf
> code.
> 
> This version has following limitations:
> * For accurate profiling of dom0/Xen dom0 VCPUs should be pinned.
> * Hypervisor code is only profiled on processors that have running dom0 VCPUs
> on them.
> * No backtrace support.
> 
> A few notes that may help reviewing:
> 
> * A shared data structure (xenpmu_data_t) between each PV VPCU and hypervisor
> CPU is used for passing registers' values as well as PMU state at the time of
> PMU interrupt.
> * PMU interrupts are taken by hypervisor either as NMIs or regular vector
> interrupts for both HVM and PV(H). The interrupts are sent as NMIs to HVM 
> guests
> and as virtual interrupts to PV(H) guests
> * PV guest's interrupt handler does not read/write PMU MSRs directly. 
> Instead, it
> accesses xenpmu_data_t and flushes it to HW it before returning.
> * PMU mode is controlled at runtime via 
> /sys/hypervisor/pmu/pmu/{pmu_mode,pmu_flags}
> in addition to 'vpmu' boot option (which is preserved for back compatibility).
> The following modes are provided:
>   * disable: VPMU is off
>   * enable: VPMU is on. Guests can profile themselves, dom0 profiles itself 
> and Xen
>   * priv_enable: dom0 only profiling. dom0 collects samples for everyone. 
> Sampling
>     in guests is suspended.
> * /proc/xen/xensyms file exports hypervisor's symbols to dom0 (similar to
> /proc/kallsyms)
> * VPMU infrastructure is now used for HVM, PV and PVH and therefore has been 
> moved
> up from hvm subtree
> 
> 
> Boris Ostrovsky (21):
>   common/symbols: Export hypervisor symbols to privileged guest
>   x86/VPMU: Manage VPMU_CONTEXT_SAVE flag in vpmu_save_force()
>   x86/VPMU: Set MSR bitmaps only for HVM/PVH guests
>   x86/VPMU: Make vpmu macros a bit more efficient
>   intel/VPMU: Clean up Intel VPMU code
>   vmx: Merge MSR management routines
>   x86/VPMU: Handle APIC_LVTPC accesses
>   intel/VPMU: MSR_CORE_PERF_GLOBAL_CTRL should be initialized to zero
>   x86/VPMU: Add public xenpmu.h
>   x86/VPMU: Make vpmu not HVM-specific
>   x86/VPMU: Interface for setting PMU mode and flags
>   x86/VPMU: Initialize AMD and Intel VPMU with __initcall
>   x86/VPMU: Initialize PMU for PV(H) guests
>   x86/VPMU: Save VPMU state for PV guests during context switch
>   x86/VPMU: When handling MSR accesses, leave fault injection to callers
>   x86/VPMU: Add support for PMU register handling on PV guests
>   x86/VPMU: Handle PMU interrupts for PV guests
>   x86/VPMU: Merge vpmu_rdmsr and vpmu_wrmsr
>   x86/VPMU: Add privileged PMU mode
>   x86/VPMU: NMI-based VPMU support
>   x86/VPMU: Move VPMU files up from hvm/ directory
> 
>  docs/misc/xen-command-line.markdown                |   8 +-
>  tools/flask/policy/policy/modules/xen/xen.te       |   7 +
>  xen/arch/x86/cpu/Makefile                          |   1 +
>  xen/arch/x86/cpu/vpmu.c                            | 881 
> +++++++++++++++++++++
>  xen/arch/x86/{hvm/svm/vpmu.c => cpu/vpmu_amd.c}    | 248 +++---
>  .../x86/{hvm/vmx/vpmu_core2.c => cpu/vpmu_intel.c} | 784 +++++++++---------
>  xen/arch/x86/domain.c                              |  23 +-
>  xen/arch/x86/hvm/Makefile                          |   1 -
>  xen/arch/x86/hvm/hvm.c                             |   3 +-
>  xen/arch/x86/hvm/svm/Makefile                      |   1 -
>  xen/arch/x86/hvm/svm/svm.c                         |  10 +-
>  xen/arch/x86/hvm/vlapic.c                          |   3 +
>  xen/arch/x86/hvm/vmx/Makefile                      |   1 -
>  xen/arch/x86/hvm/vmx/vmcs.c                        |  84 +-
>  xen/arch/x86/hvm/vmx/vmx.c                         |  28 +-
>  xen/arch/x86/hvm/vpmu.c                            | 263 ------
>  xen/arch/x86/oprofile/op_model_ppro.c              |   8 +-
>  xen/arch/x86/platform_hypercall.c                  |  28 +
>  xen/arch/x86/traps.c                               |  62 +-
>  xen/arch/x86/x86_64/compat/entry.S                 |   4 +
>  xen/arch/x86/x86_64/entry.S                        |   4 +
>  xen/common/event_channel.c                         |   1 +
>  xen/common/symbols.c                               |  54 ++
>  xen/include/Makefile                               |   2 +
>  xen/include/asm-x86/domain.h                       |   2 +
>  xen/include/asm-x86/hvm/vcpu.h                     |   3 -
>  xen/include/asm-x86/hvm/vmx/vmcs.h                 |  18 +-
>  xen/include/asm-x86/hvm/vmx/vpmu_core2.h           |  51 --
>  xen/include/asm-x86/softirq.h                      |   3 +-
>  xen/include/asm-x86/{hvm => }/vpmu.h               |  93 ++-
>  xen/include/public/arch-arm.h                      |   3 +
>  xen/include/public/arch-x86/pmu.h                  |  95 +++
>  xen/include/public/platform.h                      |  19 +
>  xen/include/public/pmu.h                           |  90 +++
>  xen/include/public/xen.h                           |   2 +
>  xen/include/xen/hypercall.h                        |   4 +
>  xen/include/xen/symbols.h                          |   3 +
>  xen/include/xlat.lst                               |   6 +
>  xen/include/xsm/dummy.h                            |  20 +
>  xen/include/xsm/xsm.h                              |   6 +
>  xen/xsm/dummy.c                                    |   1 +
>  xen/xsm/flask/hooks.c                              |  28 +
>  xen/xsm/flask/policy/access_vectors                |   6 +
>  43 files changed, 2028 insertions(+), 934 deletions(-)
>  create mode 100644 xen/arch/x86/cpu/vpmu.c
>  rename xen/arch/x86/{hvm/svm/vpmu.c => cpu/vpmu_amd.c} (68%)
>  rename xen/arch/x86/{hvm/vmx/vpmu_core2.c => cpu/vpmu_intel.c} (58%)
>  delete mode 100644 xen/arch/x86/hvm/vpmu.c
>  delete mode 100644 xen/include/asm-x86/hvm/vmx/vpmu_core2.h
>  rename xen/include/asm-x86/{hvm => }/vpmu.h (55%)
>  create mode 100644 xen/include/public/arch-x86/pmu.h
>  create mode 100644 xen/include/public/pmu.h
> 
> 

-- 
Company details: http://ts.fujitsu.com/imprint.html

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.