Re: [XenPPC] performance profiling current and future steps

some comments

On Mar 20, 2007, at 2:09 PM, Christian Ehrhardt wrote:

Hi,
this mail consists of two parts. Part I tries to summarize all the performance profiling related discussions of the past few weeks in a short item listing that is now on my todo list ;)
In Part II I want to encourage everyone to discuss the following steps, which are profiling xen and passive domains. I think we should discuss and shape the ideas for these steps while part I is developed over the next weeks.
Part I
== profiling xenppc - implementation subparts ==
completely independent domains - context_switch:
-load/save MMCR0, PMCs, MMCRA in context_switch()
->we have to load/save these if prev OR next do measuring, so we have to save/restore all in
 the end after one Dom starts to measure
->do it dependent on a variable, don't save/restore them always (slows down context switch)

Maybe not restore, but at a minimum turn off, and to be "correct" zero them.
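
As a rough illustration of the conditional save/restore discussed above, here is a minimal C sketch. It is not Xen code: the PMU registers are simulated by a plain struct instead of mfspr/mtspr, and the names struct pmu_regs, struct dom_pmu, hw and pmu_context_switch() are made up for this example. It assumes eight PMCs and uses the MMCR0 "freeze counters" bit value as Linux defines it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_PMCS 8                 /* assumption: PMC1..PMC8 */
#define MMCR0_FC 0x80000000ULL     /* freeze counters (value as in Linux) */

/* Stand-in for the hardware registers; real code would use mfspr/mtspr. */
struct pmu_regs {
    uint64_t mmcr0, mmcr1, mmcra;
    uint32_t pmc[NUM_PMCS];
};

/* Hypothetical per-domain PMU state. */
struct dom_pmu {
    bool profiling;                /* domain announced that it measures */
    struct pmu_regs saved;
};

static struct pmu_regs hw;         /* simulated hardware PMU */

void pmu_context_switch(struct dom_pmu *prev, struct dom_pmu *next)
{
    /* Fast path: nobody measures, so do not slow down the switch. */
    if (!prev->profiling && !next->profiling)
        return;

    if (prev->profiling)
        prev->saved = hw;          /* save the outgoing domain's counters */

    if (next->profiling) {
        hw = next->saved;          /* restore the incoming domain's set */
    } else {
        /* Not measuring: turn the PMU off and zero it, no restore. */
        memset(&hw, 0, sizeof(hw));
        hw.mmcr0 = MMCR0_FC;
    }
}
```

The non-profiling branch follows the "turn off and zero" suggestion rather than loading a stored initial set.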

Ensure MMCR0[FCH] for this first step:
-(ensure) set MMCR0[FCH] always in xen when entering xen space. This should prevent a domain
 messing up MMCR0[FCH]
->EXCEPTION_HEAD in exceptions.S set MMCR0[FCH] always


You need at least the following instructions:
    mfspr r0, SPRN_MMCR0
    ori r0, r0, 1 /* MMCR0_FCH */
    mtspr SPRN_MMCR0, r0

Unfortunately, there is not enough room in EXCEPTION_HEAD for that and you will get:


  exceptions.S: Assembler messages:
  exceptions.S:246: Error: attempt to .org/.space backwards? (-4)
  exceptions.S:253: Error: attempt to .org/.space backwards? (-4)
  exceptions.S:260: Error: attempt to .org/.space backwards? (-4)
  exceptions.S:267: Error: attempt to .org/.space backwards? (-4)
  exceptions.o: Bad value
  exceptions.S:626: FATAL: Can't write exceptions.o: Bad value

I suggest we start a new macro PMU_SAVE_STATE(save,scratch), which does the above (for now, using only scratch), and sprinkle it in all the code that EXCEPTION_HEAD branches to.
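
To make the intended effect of such a macro concrete, here is a small C model of the three instructions. It is only a sketch: the SPR is simulated by a variable, and sim_mmcr0 and pmu_enter_xen() are names invented for this example. The real thing would be an assembler macro in the exception path.

```c
#include <assert.h>
#include <stdint.h>

#define MMCR0_FCH 0x1ULL           /* low-order bit, as in "ori r0, r0, 1" */

static uint64_t sim_mmcr0;         /* simulated MMCR0, stands in for the SPR */

/* Model of the macro body: returns the guest's MMCR0 so a later version
 * could store it in "save"; for now only the freeze bit is set. */
uint64_t pmu_enter_xen(void)
{
    uint64_t old = sim_mmcr0;      /* mfspr r0, SPRN_MMCR0 */
    sim_mmcr0 = old | MMCR0_FCH;   /* ori r0, r0, 1; mtspr SPRN_MMCR0, r0 */
    return old;
}
```

Calling it twice is harmless, since setting an already set bit changes nothing.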

Inform xen about profiling:
->Hypercall use in setup of oprofile like Jimi suggested
->Store initial MMCR0/1, PMC, MMCRA to restore this one on the LAST Hypercall
 that says "end profiling" (refcount like)


I think off and zeroing is fine, no sense in loading a bunch of zeros.

->Use this initial set for non-profiling domains too if they have not yet
 stored their set on a context_switch to a profiling one
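
The refcount-like start/end handling could look roughly like the following C sketch. It is an illustration under assumptions, not an existing Xen interface: pmu_profiling_start(), pmu_profiling_stop(), profiling_refs and the simulated register struct are hypothetical names, and the last "end profiling" call turns the PMU off and zeroes it instead of restoring a stored initial set.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_PMCS 8                 /* assumption: PMC1..PMC8 */
#define MMCR0_FC 0x80000000ULL     /* freeze counters (value as in Linux) */

struct pmu_regs {
    uint64_t mmcr0, mmcr1, mmcra;
    uint32_t pmc[NUM_PMCS];
};

static struct pmu_regs hw;         /* simulated hardware PMU */
static int profiling_refs;         /* number of domains currently profiling */

/* "start profiling" hypercall body: returns the new reference count. */
int pmu_profiling_start(void)
{
    return ++profiling_refs;
}

/* "end profiling" hypercall body: returns the remaining count,
 * or -1 if nobody was profiling. */
int pmu_profiling_stop(void)
{
    if (profiling_refs == 0)
        return -1;
    if (--profiling_refs == 0) {
        /* Last user is gone: freeze and zero the PMU. */
        memset(&hw, 0, sizeof(hw));
        hw.mmcr0 = MMCR0_FC;
    }
    return profiling_refs;
}
```

A non-zero profiling_refs would be the variable that context_switch() checks before it saves or restores anything.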

IRQ Setup in Linux
->each Guest already sets up its pmc_irq handler in head32.S/head64.S, as it is done now, behind its address translation and therefore handles its "own" perf-irq's.

Correct, so we need our own ppc_md.enable_pmcs = xen_enable_pmcs, which should be a straight copy of pseries_lpar_enable_pmcs().

->The pmc_irq does not affect MSR[HV] so the irq is handled by the right linux guest
->xen sets up no handler for its own address space (first step)

MMCR0[FCM1] and MSR[PMM] usage:
-The linux implementation uses MMCR0[FCM1]=1 to sample only on PMM=0. No change here in the first step because we set MMCR0[FCH] anyway and save/restore everything on context switch
Even if this step 1 is not yet about profiling xen itself, it might help us in xenppc to:
a) understand how the virtualization changes the runtime behavior of a guest in our case
b) profile hotspots in new components not known to non-virt linux, e.g. *front drivers
Part II
== Thoughts about the way to step 2 - profiling xen ==
->the hypercall could now additionally pass a function pointer for a function in linux
 that handles xen perf interrupts

hmm, we generally do not have the ability to specify a function that Xen runs in a domain; we could, but I'm not sure it makes sense. I would think that we would use event channels exclusively for this and the domain activity would be wired to an IRQ.


Is that not how it works in the other architectures?
Please expand on this "function pointer"

->Only one Domain can set this up. Xen then sets up its own handler for the 0xf00 pmc_irq and passes the sampled data via a shared buffer/virtual irq to the domain
 (this part would be similar to xenoprof)
->in this case MMCR0[FCH] is set to zero in xen space as long as the profiling takes place
 in exceptions.S

Correct. This means that before we set MMCR0[FCH] we will need to save/restore guest/xen state. This would also mean that the context S&R code would have to use this memory area for guest switches if we are profiling Xen.

We also need to make sure that this path is similar to the 0x500 path in that we do not allow MSR[EE] to be set and we simply sample and hrfid. Make sure you add a BUG() that checks that MSR[HV] was set before the interrupt occurred.
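
The MSR[HV] check could be sketched in C as below. This is only a model under assumptions: xen_pmu_sample(), the sample buffer and its size are invented for illustration, the function is handed the MSR value that was saved when the interrupt was taken, and it returns -1 where real code would call BUG(). MSR_HV uses the bit position Linux defines (bit 60 counted from the least significant bit).

```c
#include <assert.h>
#include <stdint.h>

#define MSR_HV      (1ULL << 60)   /* hypervisor state bit, as in Linux */
#define MAX_SAMPLES 16             /* arbitrary size for this sketch */

static uint64_t xen_samples[MAX_SAMPLES];
static unsigned int xen_sample_count;

/* Record one sample taken in Xen space. saved_msr is the MSR from before
 * the interrupt, pc the interrupted address. Nothing here re-enables
 * external interrupts; the caller is expected to return right away. */
int xen_pmu_sample(uint64_t saved_msr, uint64_t pc)
{
    if (!(saved_msr & MSR_HV))
        return -1;                 /* real code: BUG(), a guest got here */

    if (xen_sample_count < MAX_SAMPLES)
        xen_samples[xen_sample_count++] = pc;
    return 0;
}
```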

->the handling of the sampled xen data in the "main"-domain could be very similar to the xenoprof approach, which also passes xen samples to the primary sampling domain. In this way we should be able to reuse a lot of code there.
->additionally our samples contain a clear flag in MMCRA[SAMPHV] if it was sampled in
hypervisor. This should allow us an early code unification
 without a lot of "magic"
-If this would work we would be able to profile each domain completely
independently of each other because each would have its own saved/restored perf counters. As an example - in the max stage of expansion this would enable our solution to
e.g. profile one domain per cycles and another per L2 misses.
The xen samples would be managed by a primary domain, e.g. the first one that demands it via
the hypercall - the later ones get an EBUSY or something like that
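
The "first one wins, later ones get EBUSY" rule could be sketched in C as follows. The names xen_profiler_claim(), xen_profiler_release() and DOMID_NONE are hypothetical and only serve this example. A real implementation would also need locking.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

typedef uint16_t domid_t;
#define DOMID_NONE 0xFFFFu         /* hypothetical "no owner" marker */

static domid_t xen_profiler = DOMID_NONE;  /* domain receiving xen samples */

/* Hypercall body: become the primary domain for xen samples. */
int xen_profiler_claim(domid_t dom)
{
    if (xen_profiler == DOMID_NONE) {
        xen_profiler = dom;        /* first requester becomes primary */
        return 0;
    }
    return (xen_profiler == dom) ? 0 : -EBUSY;
}

/* Hypercall body: give up the primary role; only the owner may do so. */
int xen_profiler_release(domid_t dom)
{
    if (xen_profiler != dom)
        return -EPERM;
    xen_profiler = DOMID_NONE;
    return 0;
}
```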

== Thoughts about step 2b - profiling passive domains ==
->because the solution to profile domains is so similar to the plain linux oprofile approach, it could be possible to enable the "normal" performance monitor usage in that domain as long as another domain, e.g. Admin on dom0, tells xen that there will be some profiling and it has to save/restore the performance
 SPR's. This is not fully passive but at least a solution for non
 virtualization aware domains whatever these might be in xenppc ;)
-To really discuss that step, step 1 has to shape up its final
implementation so we know it works the way we currently think

--

Grüsse / regards, Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization
+49 7031/16-3385
Ehrhardt@xxxxxxxxxxxxxxxxxxx
Ehrhardt@xxxxxxxxxx

IBM Deutschland Entwicklung GmbH
Vorsitzender des Aufsichtsrats: Johann Weihen
Geschäftsführung: Herbert Kircher
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294


_______________________________________________
Xen-ppc-devel mailing list
Xen-ppc-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ppc-devel


