xen-ppc-devel
[XenPPC] Profiling in xen – ppc considerations
Hi Folks,
Over the last two weeks I analyzed the oprofile/xenoprof code and tried
a simple-minded PowerPC mapping. As came up in a phone call on Monday, I
overlooked some possible issues that arise from mapping xenoprof
directly onto the Power architecture. In this mail I briefly describe
some background as well as my considerations so far.
I'm not sure I got all the Power and x86 specifics right, so feel free
to correct me. I'm open to any comments and ideas, and I hope that
together we can reach a realistic plan on whether and how this could be
implemented.
-- Background I - oprofile basic principles --
Oprofile is a common profiling tool in the Linux world. It consists of
two layers. The first is the kernel-space driver, which contains a
generic infrastructure and management part as well as an
architecture-dependent part that handles the hardware-specific tasks.
The second is the userspace component, which controls the kernel part
and turns the collected data into various reports.
-- Background II - xenoprof approach --
To use oprofile (http://oprofile.sourceforge.net/about/) in the Xen
environment, it was extended to xenoprof
(http://xenoprof.sourceforge.net/), which adds a third layer in the Xen
hypervisor. The Linux kernel-space driver now supports a new
“architecture” that represents Xen. This implementation uses hypercalls
instead of hardware-specific code. The data that the hardware usually
reports to the kernel via interrupts is now reported to Xen instead. Xen
evaluates some parameters of the sample and reports the data chunk to
the profiling domain via the virtual interrupt event notification
provided by Xen. This gets more complex with multiple domains etc. For
more, read the docs on the xenoprof web page.
The hardware-specific code that once was in the oprofile kernel drivers
is now located (adapted to the new environment) in the Xen source, where
the new hypercalls are mapped to the real hardware.
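To make the data flow above a bit more concrete, here is a minimal C sketch of how I picture it: Xen logs a sample into a buffer, raises the virtual interrupt, and the profiling domain drains the buffer. All names (xp_sample, xp_buffer, xp_log_sample, xp_drain) and the buffer layout are made up for illustration and do not mirror the actual xenoprof code. Tagging each sample with a domain id is also my assumption of what "evaluating some parameters" amounts to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define XP_SLOTS 8  /* illustrative ring size, must be a power of two */

/* Hypothetical sample record; the real xenoprof layout differs. */
struct xp_sample {
    uint16_t domain_id; /* assumed: domain running when the counter fired */
    uint64_t pc;        /* interrupted program counter */
};

/* Hypothetical buffer shared by Xen (producer) and the profiling
 * domain (consumer). */
struct xp_buffer {
    struct xp_sample slot[XP_SLOTS];
    unsigned head;      /* next slot Xen writes */
    unsigned tail;      /* next slot the profiling domain reads */
    unsigned lost;      /* samples dropped because the ring was full */
    int virq_pending;   /* stands in for the virtual interrupt */
};

/* Hypervisor side: store one sample and flag the virtual interrupt.
 * Returns 1 if the sample was stored, 0 if it had to be dropped. */
static int xp_log_sample(struct xp_buffer *b, uint16_t domain_id, uint64_t pc)
{
    assert(b != NULL);
    if (b->head - b->tail == XP_SLOTS) {
        b->lost++;
        return 0;
    }
    b->slot[b->head % XP_SLOTS] = (struct xp_sample){ domain_id, pc };
    b->head++;
    b->virq_pending = 1;
    return 1;
}

/* Profiling-domain side: copy out up to max samples and acknowledge
 * the virtual interrupt. Returns the number of samples copied. */
static size_t xp_drain(struct xp_buffer *b, struct xp_sample *out, size_t max)
{
    size_t n = 0;
    while (n < max && b->tail != b->head) {
        out[n++] = b->slot[b->tail % XP_SLOTS];
        b->tail++;
    }
    b->virq_pending = 0;
    return n;
}
```

In this sketch the interrupt handler in Xen would call xp_log_sample(), and the Dom kernel driver would call xp_drain() from its virtual interrupt handler. A real implementation would need per-CPU buffers and proper memory ordering, which are left out here.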
-- Mapping xenoprof to Power - simple approach --
This approach tries to reuse as much of the original xenoprof
architecture as possible by mapping the Power implementation onto the
x86-oriented xenoprof design. This would ease the implementation but
carries some risks, which I try to list here (the list is not complete;
there may be more issues I have not yet recognized).
The basic principle of these profiling implementations is a performance
counter (real time, cycles, special events, ...) that triggers an
interrupt. The interrupt handler then saves information about the
current point of execution. The oprofile implementation for Power
follows a similar scheme, so I thought this would be the easiest way.
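As a small illustration of this principle, the following C sketch simulates a counter that is preloaded so that it signals after a fixed number of events, plus a handler that records the current PC and re-arms the counter. It is a simulation under my own assumptions, not driver code: the struct and function names are hypothetical. The overflow condition (top bit of a 32-bit counter becomes set) is how I understand the Power performance monitor counters to work; other hardware differs in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SAMPLES 64

/* Hypothetical sample log; real oprofile uses per-CPU buffers. */
struct sample_log {
    uint64_t pc[MAX_SAMPLES];
    size_t count;
};

/* Simulated performance counter (names are illustrative only). */
struct perf_counter {
    uint32_t value;   /* signals once bit 31 becomes set */
    uint32_t period;  /* number of events between two samples */
};

/* Preload the counter so that it overflows after `period` events. */
static void counter_arm(struct perf_counter *c, uint32_t period)
{
    assert(period > 0 && period <= UINT32_C(0x80000000));
    c->period = period;
    c->value = UINT32_C(0x80000000) - period;
}

/* Stand-in for the interrupt handler: save the PC, then re-arm. */
static void overflow_handler(struct perf_counter *c, struct sample_log *log,
                             uint64_t pc)
{
    if (log->count < MAX_SAMPLES)
        log->pc[log->count++] = pc;
    counter_arm(c, c->period);
}

/* One counted event (e.g. a cycle) while executing at address `pc`. */
static void count_event(struct perf_counter *c, struct sample_log *log,
                        uint64_t pc)
{
    c->value++;
    if (c->value & UINT32_C(0x80000000))
        overflow_handler(c, log, pc);
}
```

With a period of 4, every fourth event produces a sample, so the collected PCs form a statistical picture of where time is spent. A shorter period gives more precise results at the cost of more interrupts.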
-- Possible issues and their background --
Please take a look at this graphic before or while reading the following
details (https://ltc.linux.ibm.com/wiki/XenPPC/profilingdiscussion). It
might also be useful to have a Power ISA document at hand to read about
the special registers and bits
(http://www.power.org/news/articles/new_brand/#isa).
Afaik, programming the hardware elements used by the x86 implementation
requires ring 0, and the Dom kernel runs in ring 1, so it can't
interfere with the NMI programming done by Xen in ring 0. The Power
architecture has three privilege levels, and the Linux kernel usually
runs in the second one. Afaik the Dom Linux kernel also runs at this
level in the xen-ppc implementation. Because of that, we could set up
the performance monitor registers correctly in Xen, but we could not be
sure that a Dom kernel does not change those registers without “asking”
the hypervisor.
-> Is there a way, still unknown to me, to protect those registers?
-- Other possible approaches --
After consulting the current Power ISA documents again, I found some
points that may allow other implementations of profiling in Xen.
a) Because the Dom kernel seems to be able to set up the performance
monitor without invoking the hypervisor, it could be possible to let a
domain just do the profiling on its own. But this way has other issues
too, e.g. how would samples from other domains show up, and would this
be a security breach?
b) The Power architecture provides a very capable performance monitor
with features that allow freezing the counters, e.g. freezing them while
execution is in hypervisor mode (MSR[HV PR] = 0b10, i.e. HV=1 and PR=0).
But such features would only help to distinguish vertically in the
graphic referenced above. Only the hypervisor is in a position to
distinguish horizontally between different domains.
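The "vertical" distinction in b) can be sketched as a decoding of the two MSR bits. The C fragment below is only an illustration: the bit positions are taken from what I believe the Linux headers use (MSR_HV_LG = 60, MSR_PR_LG = 14) and should be checked against the Power ISA, and the freeze_in_hv flag is a hypothetical stand-in for the corresponding performance monitor control bit.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions (as in Linux's asm/reg.h, to be verified). */
#define MSR_HV (UINT64_C(1) << 60)  /* hypervisor state */
#define MSR_PR (UINT64_C(1) << 14)  /* problem (user) state */

enum cpu_mode {
    MODE_HYPERVISOR,  /* HV=1, PR=0 */
    MODE_PRIVILEGED,  /* HV=0, PR=0: where the Dom kernel runs */
    MODE_USER         /* PR=1, regardless of HV */
};

/* Decode the privilege level from an MSR value. */
static enum cpu_mode classify_msr(uint64_t msr)
{
    if (msr & MSR_PR)
        return MODE_USER;
    if (msr & MSR_HV)
        return MODE_HYPERVISOR;
    return MODE_PRIVILEGED;
}

/* Hypothetical freeze policy: returns 1 if the counters keep counting
 * in the given state, 0 if they are frozen. */
static int counters_run(uint64_t msr, int freeze_in_hv)
{
    return !(freeze_in_hv && classify_msr(msr) == MODE_HYPERVISOR);
}
```

Note that nothing in the MSR identifies the running domain. Two different Dom kernels both decode to MODE_PRIVILEGED here, which is why the horizontal distinction has to come from the hypervisor.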
I'm planning to move the illustration I used to the public wiki after
the first round of review and keep the planned design up to date there.
I have more, but not yet mature, thoughts and ideas about this in mind.
Christian
--
Grüsse / regards,
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization
+49 7031/16-3385
Ehrhardt@xxxxxxxxxxxxxxxxxxx
Ehrhardt@xxxxxxxxxx
IBM Deutschland Entwicklung GmbH
Vorsitzender des Aufsichtsrats: Johann Weihen
Geschäftsführung: Herbert Kircher
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294
_______________________________________________
Xen-ppc-devel mailing list
Xen-ppc-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ppc-devel