
Re: [Xen-devel] xentrace, xenalyze



On 24/02/16 16:00, Paul Sujkov wrote:
>> Have a look at this series:
>> http://lists.xenproject.org/archives/html/xen-devel/2016-02/msg02233.html
> 
> Thanks a lot! Looking through it at the moment, looks very promising.
> 
>> And I've got another one that I'll send out asap (and I can Cc you).
> 
> Thanks in advance :)
> 
>> I usually enable a subset of them (one or more "classes") and try to
>> figure out if I see the problem in the resulting trace. If yes, I try
>> with a narrower subset. If not, I try with either a broader or a
>> different one.
> 
> Well, I have doubts about how to interpret the very basic info xenalyze
> gives me. E.g. how can I measure intra-vm latencies, both global
> (how much PCPU time did the hypervisor itself spend during all the testing
> time) and local (doing the same for specific interrupts)?

You need to add ARM-specific traces to xen and xenalyze to get this
information.

> Why domain 32767
> (default domain for cases when it's not clear what domain traces are about
> - according to documentation) is getting quite a lot of PCPU time (does
> this mean traces are incorrect or there is some significant problem in
> setup)? 

Domain 3276*8* is the "default domain".  32767 is the idle domain.  This
domain "getting pcpu time" means that the cpu is idle. :-)
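
To make the distinction concrete, here is a minimal sketch in Python. The
two ids are the ones named above; the helper names and the sample
per-domain times are hypothetical, made up purely for illustration.

```python
# Domain ids as described in the reply above.
IDLE_DOMAIN = 32767      # the idle domain: the pcpu had nothing to run
DEFAULT_DOMAIN = 32768   # the "default domain": records not tied to a domain


def describe_domain(domid: int) -> str:
    """Return a human-readable label for a domain id seen in a summary."""
    if domid == IDLE_DOMAIN:
        return "idle domain (pcpu was idle)"
    if domid == DEFAULT_DOMAIN:
        return "default domain (trace records with no clear owner)"
    return f"domain {domid}"


def idle_fraction(seconds_by_domain: dict) -> float:
    """Fraction of accounted pcpu time that was spent idle."""
    total = sum(seconds_by_domain.values())
    if total == 0:
        return 0.0
    return seconds_by_domain.get(IDLE_DOMAIN, 0.0) / total


# Hypothetical per-domain pcpu times in seconds, not taken from a real trace.
sample = {0: 0.5, 2: 1.5, IDLE_DOMAIN: 6.0}
for domid in sorted(sample):
    print(describe_domain(domid), sample[domid])
print("idle fraction:", idle_fraction(sample))
```

A large share of time under 32767 therefore just says the machine was
mostly idle during the trace, not that the traces are wrong.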


> What's concurrency_hazard, partial contention, full_contention, etc
> (these are from xenalyze summary)? How can I get number of context switches
> (overall or average)?
> 
> Adding some subtle questions, like, e.g. I have domain summary looking like
> this:
> 
> |-- Domain 2 --|
>  Runstates:
>    blocked:     273  0.35s   7908 {  2093|  9561| 47811}
>   partial run:    2284  1.27s   3420 {  6183|  6197|  6382}
>   full run:    1322  0.10s    479 {    95|  3772|  6164}
>   partial contention:     907  1.73s  11713 { 30655| 34266| 34305}
>   concurrency_hazard:    2474  0.18s    435 {    48|  5681|  6206}
>   full_contention:     381  0.02s    383 {    56| 36601| 36601}
> ...
> -- v0 --
>  Runstates:
>    running:    1981  1.36s   4217 {  6193|  6215|  6242}
>   runnable:     737  1.74s  14472 {   271| 36780| 38705}
>         wake:     430  0.04s    632 {    67| 26049| 35549}
>      preempt:     307  1.69s  33856 {   108| 36650| 39345}
>    blocked:     430  0.56s   7974 {  1189| 21758| 60893}
>  cpu affinity:     336  66914 {  3456| 52202|243760}
>    [0]:     167  66156 {  3650| 57926|216477}
>    [1]:     169  67663 {  3205| 44754|245733}
> -- v1 --
>  Runstates:
>    running:    2773  0.29s    649 {    54|  6382|  6382}
>   runnable:     874  0.22s   1520 {  5995| 36669| 36710}
>         wake:     845  0.09s    640 {   452| 25366| 26313}
>      preempt:      29  0.13s  27152 { 34413| 36708| 36710}
>    blocked:     845  3.14s  22856 {  2477| 61224| 61422}
>  cpu affinity:     391  57508 {  2788| 58686|128810}
>    [0]:     196  59685 {  2834| 58664|128810}
>    [1]:     195  55319 {  2770| 60622|130371}
> 
> It looks like Domain 2 had 0.10s of full run and 1.27s of partial run, but
> its VCPU v0 was running 1.36s and VCPU v1 was running 0.29s. How do
> these numbers relate, what exactly is partial run, can I get some insight
> from concurrency_hazard or full_contention numbers?

So the *real* thing is the per-vcpu runstates.  These correspond to
runstates inside of Xen.  In the above traces, vcpu 0 entered the
"running" state 1981 times, and vcpu 1 entered the "running" state 2773
times.  This counts the number of times the vcpu started executing and
then stopped (which is probably what you mean by "context switch").
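
If you want to pull those counts out of a summary automatically, something
like the following Python sketch would do.  It assumes only what is visible
in the output quoted above: each runstate line is "name: count seconds"
followed by further columns, which the sketch ignores.  The function name
and regular expression are illustrative, not part of xenalyze.

```python
import re

# Matches lines such as "   running:    1981  1.36s   4217 {...}".
# Only the name, the entry count and the total seconds are captured.
RUNSTATE_LINE = re.compile(
    r"^\s*(?P<name>[A-Za-z_ ]+?):\s+(?P<count>\d+)\s+(?P<secs>\d+\.\d+)s\b"
)


def parse_runstates(text: str) -> dict:
    """Map runstate name -> (times entered, total seconds)."""
    rows = {}
    for line in text.splitlines():
        m = RUNSTATE_LINE.match(line)
        if m:
            rows[m.group("name")] = (int(m.group("count")),
                                     float(m.group("secs")))
    return rows


# The v0 block from the quoted summary, with the "> " quoting removed.
V0 = """\
 Runstates:
   running:    1981  1.36s   4217 {  6193|  6215|  6242}
  runnable:     737  1.74s  14472 {   271| 36780| 38705}
        wake:     430  0.04s    632 {    67| 26049| 35549}
     preempt:     307  1.69s  33856 {   108| 36650| 39345}
   blocked:     430  0.56s   7974 {  1189| 21758| 60893}
 cpu affinity:     336  66914 {  3456| 52202|243760}
"""

rows = parse_runstates(V0)
entries, seconds = rows["running"]
print("times scheduled in:", entries)
print("average seconds per run:", seconds / entries)
```

The "cpu affinity" line has no seconds column, so the pattern skips it.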

The stuff for the domain is a concept I invented called "domain
runstates".  Probably the easiest thing to do is to look at the
description I made when I tried to submit the calculation of these into
Xen itself:

http://lists.xen.org/archives/html/xen-devel/2010-11/msg01325.html

(That patch series was rejected, but as I understand it the patches are
still carried by XenServer.)
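
As a rough cross-check of how the domain numbers relate to the per-vcpu
ones, here is a Python sketch.  The mapping from each domain runstate to
"how many vcpus are running / runnable / blocked" is an ASSUMPTION for a
2-vcpu domain, not something stated in this thread; the post linked above
is the authoritative description.  The times are copied from the quoted
summary.

```python
# Domain 2 runstate times in seconds, from the quoted summary.
domain = {
    "blocked": 0.35,
    "partial run": 1.27,
    "full run": 0.10,
    "partial contention": 1.73,
    "concurrency_hazard": 0.18,
    "full_contention": 0.02,
}

# Per-vcpu runstate times in seconds, from the quoted summary.
vcpus = {
    "v0": {"running": 1.36, "runnable": 1.74, "blocked": 0.56},
    "v1": {"running": 0.29, "runnable": 0.22, "blocked": 3.14},
}

# ASSUMED vcpu mix per domain runstate for a 2-vcpu domain:
# (vcpus running, vcpus runnable, vcpus blocked).
ASSUMED_MIX = {
    "blocked": (0, 0, 2),
    "partial run": (1, 0, 1),
    "full run": (2, 0, 0),
    "partial contention": (0, 1, 1),
    "concurrency_hazard": (1, 1, 0),
    "full_contention": (0, 2, 0),
}

STATES = ("running", "runnable", "blocked")

# Total vcpu-seconds per state implied by the domain runstates.
predicted = {
    state: sum(domain[name] * ASSUMED_MIX[name][i] for name in domain)
    for i, state in enumerate(STATES)
}

# Total vcpu-seconds per state actually reported for v0 and v1.
reported = {state: sum(v[state] for v in vcpus.values()) for state in STATES}

for state in STATES:
    print(f"{state:9s} predicted {predicted[state]:.2f}s"
          f"  reported {reported[state]:.2f}s")
```

Under that assumed mapping the two sets of totals agree to within the
rounding of the printed values, which is consistent with "partial run"
meaning one vcpu running while the other is blocked.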

> I am trying to build up some understanding using xenalyze sources mostly
> because documentation does not go into any details whatsoever, but it goes
> pretty slow.

Right -- there's a huge amount of functionality, and it was initially
just a tool that I used myself.  That said, I did write an html file
with a bit of documentation when xenalyze was in a separate tree; that
seems not to have been checked in with xenalyze:

...hmm, permissions seem to be borked; I'll see if I can get that sorted
and then send you a link to it.

 -George

