xen-ppc-devel
[XenPPC] Profiling in xen – ppc considerations 
Hi Folks,
Over the last two weeks I analyzed the oprofile/xenoprof code and tried a 
simple-minded mapping to PowerPC. As came up in a phone call on Monday, 
I overlooked some possible issues that arise from mapping xenoprof 
directly to the Power architecture. In this mail I briefly describe some 
background as well as my considerations so far.
I'm not sure I got all the Power and x86 specifics right, so feel free 
to correct me. I'm open to any comments and ideas, and I hope that 
together we can reach a realistic plan for whether and how this could be 
implemented. 
-- Background I - oprofile basic principles --
Oprofile is a common profiling tool in the Linux world. It consists of 
two layers. The first is the kernel-space driver, which contains a 
generic infrastructure and management part as well as an 
architecture-dependent part that handles the hardware-specific tasks. 
The second is the userspace component, which controls the kernel part 
and processes the collected data into different reports. 
-- Background II - xenoprof approach --
To use oprofile (http://oprofile.sourceforge.net/about/) in the Xen 
environment, it was extended to xenoprof 
(http://xenoprof.sourceforge.net/), which adds a third layer in the Xen 
hypervisor. The Linux kernel-space driver now supports a new 
“architecture” that represents Xen. This implementation uses hypercalls 
instead of hardware-specific code. The data that is usually delivered to 
the kernel driver by interrupts is now delivered by the hardware to Xen. 
Xen evaluates some parameters of the sample and reports the data chunk 
to the profiling domain via the virtual interrupt event notification 
provided by Xen. This gets more complex with multiple domains etc. For 
more, read the docs on the xenoprof web page.
The hardware-specific code that once was in the oprofile kernel drivers 
now lives (adapted to the new environment) in the Xen source, where the 
new hypercalls are mapped to the real hardware. 
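To make the flow above a bit more concrete, here is a rough sketch in 
Python. It is not the xenoprof code: the class HypervisorSampler, its 
method names, and the fixed buffer limit are all hypothetical and only 
model the idea that the hypervisor takes the counter interrupt, files 
the sample under the domain that was running, and notifies the 
profiling domain once a chunk is ready.

```python
from collections import defaultdict


class HypervisorSampler:
    """Toy model of the hypervisor layer (hypothetical, not xenoprof's API)."""

    def __init__(self, buffer_limit):
        self.buffer_limit = buffer_limit   # samples per chunk (assumed value)
        self.buffers = defaultdict(list)   # one sample buffer per domain id
        self.notifications = []            # stands in for virtual interrupts

    def on_counter_interrupt(self, domain_id, pc, mode):
        """Called when the performance counter fires while domain_id runs."""
        buf = self.buffers[domain_id]
        buf.append((pc, mode))
        if len(buf) >= self.buffer_limit:
            # Hand the full chunk to the profiling domain and start over.
            self.notifications.append((domain_id, list(buf)))
            buf.clear()


sampler = HypervisorSampler(buffer_limit=2)
sampler.on_counter_interrupt(1, 0x10, "kernel")
sampler.on_counter_interrupt(1, 0x20, "user")
```

After the second call the chunk for domain 1 has been moved into 
sampler.notifications and its buffer is empty again.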
-- Mapping xenoprof to Power - simple approach --
This approach tries to reuse as much of the existing xenoprof 
architecture as possible by mapping the Power implementation onto the 
xenoprof architecture, which is technically x86-oriented. This would 
ease the implementation but carries some risks, which I try to list here 
(the list is not complete; there may be more issues I have not yet 
recognized).
The basic principle of these profiling implementations is a performance 
counter (real time, cycles, special events, ...) that triggers an 
interrupt. The interrupt handler then tries to save information about 
the current point of execution. The oprofile implementation for Power 
works in a similar way, so I thought this would be the easiest route. 
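The sampling principle just described can be sketched as a small 
simulation. This is only an illustration under simplifying assumptions: 
the function sample_profile, the (pc, cycles) trace format, and the 
period value are made up for this example and do not correspond to any 
oprofile interface.

```python
from collections import Counter


def sample_profile(pc_trace, period):
    """Simulate counter-overflow sampling.

    pc_trace: iterable of (pc, cycles) pairs, a hypothetical stand-in
              for "execution stayed at pc for this many counted events".
    period:   number of counted events between two overflow interrupts.
    """
    samples = Counter()
    counter = 0
    for pc, cycles in pc_trace:
        counter += cycles
        while counter >= period:
            samples[pc] += 1    # the "interrupt handler" records the pc
            counter -= period   # re-arm the counter for the next period
    return samples


trace = [(0x1000, 30), (0x2000, 50), (0x1000, 20)]
hist = sample_profile(trace, period=25)
```

The resulting histogram is what the userspace part would turn into a 
report: addresses where more counted events happen collect more samples.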
-- Possible issues and their background --
Please take a look at this graphic before/while reading the following 
details (https://ltc.linux.ibm.com/wiki/XenPPC/profilingdiscussion). It 
might also be useful to have a Power ISA document at hand to read about 
special registers and bits (http://www.power.org/news/articles/new_brand/#isa).
Programming the hardware elements used in the x86 implementation 
requires ring 0 afaik, and the Dom kernel runs in ring 1; because of 
that it cannot interfere with the NMI programming done by Xen in ring 0. 
The Power architecture has three privilege levels, and the Linux kernel 
usually runs in the second level. Afaik the Dom Linux kernel also runs 
at this level in the xen-ppc implementation. Because of that, we could 
set up the performance monitor registers correctly in Xen, but we could 
not really be sure that a Dom kernel does not change the related 
registers without “asking” the hypervisor. 
-> Is there a way, still unknown to me, to protect those registers?
-- Other possible approaches --
After consulting the current Power ISA documents again, I found some 
points that may allow other implementations of profiling in Xen.
a) Because the Dom kernel seems to be able to set up performance 
profiling without invoking the hypervisor, it could be possible to let a 
domain just do the profiling on its own. But this approach has other 
issues too, e.g. how would samples from other domains show up, and would 
that be a security breach? 
b) The Power architecture provides a very powerful performance monitor 
with features that allow freezing the counters, e.g. freezing them while 
execution is in hypervisor mode (MSR[HV PR] = 0b10, i.e. HV = 1 and 
PR = 0). But such features would only help to distinguish vertically in 
the graphic referenced above. Only the hypervisor is in a position to 
distinguish horizontally between different domains. 
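To illustrate point b), the following Python sketch decodes the 
privilege state from the HV and PR bits and models a counter that is 
frozen in hypervisor mode. The bit positions are an assumption taken 
from how 64-bit Linux defines them (HV at bit 60, PR at bit 14, counted 
from the least significant bit); the function names and the trace 
format are hypothetical. It also shows the limitation mentioned above: 
the state says nothing about which domain was running.

```python
# Assumed bit positions (as in 64-bit Linux: MSR_HV_LG = 60, MSR_PR_LG = 14).
MSR_HV = 1 << 60   # hypervisor state bit
MSR_PR = 1 << 14   # problem (user) state bit


def privilege_state(msr):
    """Map the HV and PR bits of an MSR value to a privilege state name."""
    if msr & MSR_PR:
        return "problem"        # PR = 1 is user mode, whatever HV says
    return "hypervisor" if msr & MSR_HV else "privileged"


def count_cycles(trace, freeze_in_hypervisor):
    """Sum cycles over (msr, cycles) pairs, optionally skipping HV mode."""
    total = 0
    for msr, cycles in trace:
        if freeze_in_hypervisor and privilege_state(msr) == "hypervisor":
            continue            # counter is frozen, nothing is counted
        total += cycles
    return total


trace = [(MSR_HV, 10), (0, 5), (MSR_PR, 7), (MSR_HV | MSR_PR, 3)]
guest_only = count_cycles(trace, freeze_in_hypervisor=True)
everything = count_cycles(trace, freeze_in_hypervisor=False)
```

With this made-up trace, only the first entry (HV = 1, PR = 0) is 
dropped when the counter is frozen in hypervisor mode.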
I'm planning to move the illustration I used to the public wiki after 
the first round of review and to keep the planned design up to date 
there. 
More, but not yet mature, thoughts and ideas on this are still in my mind,
Christian
--
Regards,
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization
+49 7031/16-3385
Ehrhardt@xxxxxxxxxxxxxxxxxxx
Ehrhardt@xxxxxxxxxx
IBM Deutschland Entwicklung GmbH
Chairman of the Supervisory Board: Johann Weihen
Management: Herbert Kircher
Registered office: Böblingen
Court of registration: Amtsgericht Stuttgart, HRB 243294
_______________________________________________
Xen-ppc-devel mailing list
Xen-ppc-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ppc-devel