
Re: [Xen-devel] Xen, oprofile, perf, PEBS, event counters, PVHVM, PV



On 14/01/13 20:45, Konrad Rzeszutek Wilk wrote:
a). 32/64 compat is missing backtrace support. If you run with a 32-bit dom0
    and try to set the backtrace, the hypervisor sets it as -some-huge-number.
    It might be there are some other hypercalls that need some compat tweaks.

It's not clear to me whether this is the same issue, but there was some work to make xenoprof's callgraph work with 32-bit domains on a 64-bit Xen here:
http://lists.xen.org/archives/html/xen-devel/2012-01/msg01721.html
The patch should now be in Xen, but it requires a one-line change in the 32-bit dom0 kernel, matching this one in Xen:
http://lists.xen.org/archives/html/xen-devel/2012-01/txt4qZ7uGGPTc.txt

b). 32-bit dom0 oprofile toolstack truncates the EIP of 64-bit guests
    (or hypervisor). I am not really sure how to solve that except just
    not run 64-bit guests/hypervisor with a 32-bit dom0. Or make
    oprofile and its tools capable of doing 64-bit architecture.
    The vice-versa condition does not exist - so I can profile 32-bit
    guests using a 64-bit dom0.

As far as I can see, the 32-bit dom0 oprofile.ko module does receive the 64-bit EIPs: the XENOPROF_ESCAPE_CODE comparison is made as ULL in the kernel and seems to work. The truncation may therefore be happening in opcontrol, oprofiled or opreport. With the patches above, however, I obtained the following result in an idle 32-bit dom0, which seems to display the correct 64-bit addresses for hypervisor functions:


# opreport -lwc #(functions calling other functions):
CPU: Core 2, speed 2493.77 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 1000000
vma      samples  %        image name               app name                 symbol name
-------------------------------------------------------------------------------
  00000000000b9610 1         2.0000  libc-2.5.so              python                   getaddrinfo
  000000000000fad0 1         2.0000  libpthread-2.5.so        python                   _fini
  000000000000e790 3         6.0000  ld-2.5.so                python                   _dl_fini
  000000000002aec0 12       24.0000  libc-2.5.so              python                   msort_with_tmp
  0000000000004480 25       50.0000  _xslib.so                python                   init_xslib
0000000000000000 415746   22.5841  libpython2.4.so.1.0      python                   /usr/lib/libpython2.4.so.1.0
  0000000000000000 415746   100.000  libpython2.4.so.1.0      python                   /usr/lib/libpython2.4.so.1.0 [self]
...
-------------------------------------------------------------------------------
ffff82c480170470 36        0.1587  xen-syms                 qemu-dm                  send_IPI_mask_flat
  ffff82c480170470 36       100.000  xen-syms                 qemu-dm                  send_IPI_mask_flat [self]
-------------------------------------------------------------------------------
ffff82c480120d40 33        0.1455  xen-syms                 qemu-dm                  cpumask_raise_softirq
  ffff82c480120d40 33       100.000  xen-syms                 qemu-dm                  cpumask_raise_softirq [self]


This quote from http://lists.xen.org/archives/html/xen-devel/2012-01/msg01721.html may be useful:
"
A few comments from my tests with oprofile 0.9.6 in userspace:
- to obtain callgraphs of the xen code, you need to enable the CONFIG_FRAME_POINTER flag during compilation of the xen binary, eg. using "make" with "frame_pointer=y".
- if the oprofiled daemon is running in a 32-bit guest, it needs to receive the xen-range in 32-bits, eg. --xen-image=/boot/xen-syms-4.1.1 --xen-range=80100000,801fe5ee
"
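
To illustrate the "xen-range in 32-bits" remark, here is a minimal sketch. It assumes (my guess from the numbers, not something confirmed by the patch) that the 32-bit range is simply the low 32 bits of the 64-bit hypervisor addresses. The two addresses are taken from the opreport output above and the range from the quoted --xen-range example, which may come from a different Xen build; the helper names are made up for illustration.

```python
# Sketch only: reduce 64-bit hypervisor addresses to their low 32 bits
# and check them against a 32-bit --xen-range.
# Assumption: the 32-bit range is just the low half of the 64-bit address.

MASK32 = 0xFFFFFFFF

# Range copied from the quoted example: --xen-range=80100000,801fe5ee
XEN_RANGE = (0x80100000, 0x801FE5EE)

# Symbol addresses copied from the opreport output in this mail.
SAMPLES = {
    "send_IPI_mask_flat": 0xFFFF82C480170470,
    "cpumask_raise_softirq": 0xFFFF82C480120D40,
}


def low32(addr: int) -> int:
    """Return the low 32 bits of a 64-bit address (hypothetical helper)."""
    return addr & MASK32


def in_xen_range(addr: int, xen_range=XEN_RANGE) -> bool:
    """True if the truncated address falls inside the 32-bit xen-range."""
    start, end = xen_range
    return start <= low32(addr) <= end


for name, vma in SAMPLES.items():
    print(f"{name}: {vma:016x} -> {low32(vma):08x} in range: {in_xen_range(vma)}")
```

If the assumption holds, both hypervisor symbols land inside the 32-bit range even though their full addresses do not fit in 32 bits.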

  h). There are some counters in the hypervisor for the oprofile statistics, like
   lost samples, etc. It does not look like they are exported/printed anywhere. Perhaps
   a 'register_keyhandler' should be written to dump those (and also which domains
   are profiled).

I see some lost-sample information when I run 'opcontrol --start --verbose=all' followed by 'opcontrol --deinit' and then look at oprofiled.log. Are these the counters you are looking for?
# cat /var/lib/oprofile/samples/oprofiled.log
oprofiled started Tue Jan 15 16:02:00 2013
kernel pointer size: 4
Tue Jan 15 16:04:34 2013
-- OProfile Statistics --
Nr. sample dumps: 4
Nr. non-backtrace samples: 25508
Nr. kernel samples: 14344
Nr. lost samples (no kernel/user): 0
Nr. lost kernel samples: 0
Nr. incomplete code structs: 0
Nr. samples lost due to sample file open failure: 4569
Nr. samples lost due to no permanent mapping: 78
Nr. event lost due to buffer overflow: 0
Nr. samples lost due to no mapping: 20
Nr. backtraces skipped due to no file mapping: 0
Nr. samples lost due to no mm: 4727
---- Statistics for cpu : 3
Nr. samples lost cpu buffer overflow: 0
Nr. samples received: 11734
Nr. backtrace aborted: 0
Nr. samples lost invalid pc: 0
...
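
In case it helps to pull the non-zero "lost" counters out of such a log, a rough sketch follows. It assumes the "Nr. <name>: <count>" line format shown above and only reads the global section, stopping at the first per-cpu block. These are the daemon-side statistics from oprofiled.log, which may or may not be the hypervisor counters meant in h). The function names are made up for illustration.

```python
import re

# Matches lines such as "Nr. samples lost due to no mm: 4727".
STAT_LINE = re.compile(r"^Nr\. (?P<name>.+?): (?P<count>\d+)$")

# Excerpt copied from the oprofiled.log shown in this mail.
SAMPLE_LOG = """\
-- OProfile Statistics --
Nr. sample dumps: 4
Nr. non-backtrace samples: 25508
Nr. kernel samples: 14344
Nr. lost samples (no kernel/user): 0
Nr. lost kernel samples: 0
Nr. incomplete code structs: 0
Nr. samples lost due to sample file open failure: 4569
Nr. samples lost due to no permanent mapping: 78
Nr. event lost due to buffer overflow: 0
Nr. samples lost due to no mapping: 20
Nr. backtraces skipped due to no file mapping: 0
Nr. samples lost due to no mm: 4727
---- Statistics for cpu : 3
Nr. samples lost cpu buffer overflow: 0
Nr. samples received: 11734
"""


def parse_global_stats(text: str) -> dict:
    """Collect the global 'Nr. ...' counters, ignoring per-cpu sections."""
    stats = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("---- Statistics for cpu"):
            break  # per-cpu counters follow; this sketch skips them
        match = STAT_LINE.match(line)
        if match:
            stats[match.group("name")] = int(match.group("count"))
    return stats


def lost_counters(stats: dict) -> dict:
    """Keep only the counters that mention 'lost' and are non-zero."""
    return {k: v for k, v in stats.items() if "lost" in k and v > 0}


for name, count in lost_counters(parse_global_stats(SAMPLE_LOG)).items():
    print(f"{count:8d}  {name}")
```

To run it against a real log, read /var/lib/oprofile/samples/oprofiled.log and pass its contents to parse_global_stats() instead of SAMPLE_LOG.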


 i). opreport often tells me
        warning: /domain1-apps could not be found.
        warning: /domain1-modules could not be found.
        warning: /domain1-xen-unknown could not be found.
        warning: /domain2-apps could not be found.
        warning: /domain2-modules could not be found.
        warning: /domain2-xen-unknown could not be found.
        warning: /domain3-apps could not be found.
        warning: /domain3-modules could not be found.
        warning: /domain3-xen-unknown could not be found.
        warning: /vmlinux-unknown could not be found.
        warning: /xen-unknown could not be found.

These warnings remind me of the ones I was getting for the dom0 kernel modules. I fixed those by passing -p with the modules path to opreport:
# opreport -l -p/usr/lib/debug/lib/modules/`uname -r`
I guess opreport may need this parameter to point at the guest kernel symbols as well.

And it occurs to me it could be possible to make some inroads on making
performance monitoring easier:

  1). fix the glaring omissions in oprofile for the new CPUs
  2). Add a register keyhandle to get some debug info.
  3). piggyback on oprofile hypercalls and insert some bridge in perf (lots
      of handwaving here). Or perhaps emulate in the Linux kernel the
      wrmsrs (so xen_safe_wrmsrs) and have the pvops kernel, based on the MSRs,
      make the hypercalls to set up the buffers, etc.

     3a). new hypercalls? intercept rdmsr/wrmsrs and stuff the right data
      in the initial domain? Other thoughts?

  4). Extend perf to have '--xen' so it can also look at the xen-hypervisor
      ELF file.

  5). live event reports from xenoprof/opreport, a la perf top.
  6). ports of the oprofile kernel modules to other OSes (BSD, Windows, Mirage), so that these OSes can be used as active participants.

cheers,
Marcus




 

