
Re: [Xen-devel] Performance Monitor and Profiling tools for ARM64





On 22/03/2019 07:25, Diego Alejandro Parra Guzman wrote:
Hi everyone,

Hi Diego,

My name is Diego. I'm very interested in extending XenOprof to ARM64-based architectures, and also in integrating some tools for hypervisor and application profiling and performance evaluation.

I read the documentation for OProfile and perf on the wiki page, and I noticed that Xen's profiling support doesn't cover ARM64 architectures. For this reason I have two ideas.

Thank you for your interest in adding perf support to Xen on Arm.


1. Add support for ARM64 architectures to XenOprof in its current implementation.

2. I found an interesting library called libpfm4, which also works with perf_event and supports the ARM64 and ARM32 architectures. I can try to use this library to profile DOM0 and DOMU-VP guests.

Personally I prefer option 2, since the library currently works on a normal Linux OS and I guess it could be easy to replicate on Xen.

Approaches:

(hypercalls) from DOM0-DOMU to xen

(direct pass-through from DOM0 to PMU counters), and VPMU in DOMU.

I think there are two (more or less distinct) use cases to take into account:
  1) A guest profiling itself
  2) The hypervisor profiling the system (i.e. guests and itself).

From my understanding, the latter is implemented using XenOprof. For the former, giving direct access to the PMU counters is probably a better approach than a PV solution.


Here are my questions:

I would like to know whether DOM0 and DOMU currently support perf_event, i.e., whether they can read the performance monitoring unit (PMU) counters directly: both of them, only DOM0, or neither?

Currently, there is no PMU support for Dom0 or DomU.


Should I implement some traps in the Xen hypervisor?

The registers are already trapped (see xen/arch/arm/arm64/vsysreg.c) and implemented as a NOP for now. Depending on your use case, trapping the registers may not be necessary (see more below).


Is someone currently working on this?

I am not aware of anyone working on it so far.


Which is the most efficient way to implement it?

I haven't fully thought it through. The two use cases I mentioned above are quite distinct in terms of implementation. It would probably be easy to implement them separately, but it would require more thought if you want to handle both at the same time.

In the case of a guest profiling itself, I think you could just disable the traps and context switch the registers. You will also need to forward the PMU interrupts (such as the overflow one) to the domains. For accuracy, you may also need to enable/disable the PMU counters (see PMCR_EL0) on each trap so you don't count events in the hypervisor, unless there is a way to ignore events when running at EL2 (I haven't explored the spec yet).
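
To make the context-switch idea a bit more concrete, here is a rough sketch in plain C. It is only a model, not Xen code: `hw_pmu`, `struct vcpu_pmu`, `vpmu_save()`, `vpmu_restore()`, `pmu_pause()` and `pmu_resume()` are hypothetical names, the "hardware" is an ordinary struct standing in for the real system registers, register widths are simplified to 64 bits, and the number of event counters is assumed. The only architectural detail relied on is that bit 0 of PMCR_EL0 (the E bit) globally enables the counters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_EVENT_COUNTERS 6            /* assumption; hardware reports it in PMCR_EL0.N */
#define PMCR_E            (1ULL << 0)  /* PMCR_EL0.E: global counter enable bit */

/* Stand-in for the physical PMU. Real code would use system register accessors. */
struct pmu_regs {
    uint64_t pmcr;                         /* PMCR_EL0 */
    uint64_t cntenset;                     /* PMCNTENSET_EL0 */
    uint64_t ccnt;                         /* PMCCNTR_EL0 */
    uint64_t evtyper[NR_EVENT_COUNTERS];   /* PMEVTYPER<n>_EL0 */
    uint64_t evcntr[NR_EVENT_COUNTERS];    /* PMEVCNTR<n>_EL0 */
};

static struct pmu_regs hw_pmu;             /* hypothetical "hardware" state */

/* Hypothetical per-vCPU copy of the PMU registers. */
struct vcpu_pmu {
    struct pmu_regs saved;
};

/* Called on entry to the hypervisor: stop counting, return the guest's PMCR. */
uint64_t pmu_pause(void)
{
    uint64_t old = hw_pmu.pmcr;
    hw_pmu.pmcr = old & ~PMCR_E;
    return old;
}

/* Called just before returning to the guest: put the guest's PMCR back. */
void pmu_resume(uint64_t old_pmcr)
{
    hw_pmu.pmcr = old_pmcr;
}

/* Context switch out: freeze the counters, then stash the guest's view. */
void vpmu_save(struct vcpu_pmu *v)
{
    assert(v != NULL);
    uint64_t guest_pmcr = pmu_pause();
    v->saved = hw_pmu;
    v->saved.pmcr = guest_pmcr;            /* remember E as the guest left it */
}

/* Context switch in: load the incoming guest's view with counting stopped. */
void vpmu_restore(const struct vcpu_pmu *v)
{
    assert(v != NULL);
    pmu_pause();
    /* Struct copy stands in for writing each register; real code would
     * write PMCR_EL0 last so counting resumes only once all is loaded. */
    hw_pmu = v->saved;
}
```

In this model `vpmu_save()`/`vpmu_restore()` would run on a vCPU context switch, while `pmu_pause()`/`pmu_resume()` would bracket each trap into the hypervisor. Forwarding the overflow interrupt is not covered here.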

In the case of the hypervisor profiling the system, you would need to implement a PMU driver in Xen. I don't know enough about xenoprofile to give more details here.


Is there a guideline to do this easily?

I am afraid there is no silver bullet. You would need to read the PMU section of the ARM ARM and look at the xenoprofile code. I would be happy to answer any specific questions if you do not understand some part of the code.

Cheers,

--
Julien Grall


 

