
[Xen-devel] "right" way to gather domU stats in xen 3 & 4?


  • To: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
  • From: Florian Heigl <florian.heigl@xxxxxxxxx>
  • Date: Sat, 26 Feb 2011 15:34:15 +0100
  • Delivery-date: Sat, 26 Feb 2011 06:35:06 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Hi all,

I'm building a xen agent for nagios / check_mk.
Automatic inventory of VMs and the basic up / down reporting are
reliable now, and I'm looking at the next items on my list.

* Free memory. This seems easy at first: look at xm info and that's
mostly it. I can use a different color for the memory allocated to dom0
minus the dom0 lower balloon limit, but I'll also have a check that
will go to full alarm if anyone is crazy enough to use dom0 ballooning.
;)
  What I don't know is whether I also need to subtract something for the
Xen heap. Long ago it used to default to 32MB, I think. Can someone
clue me in about that - is it relevant to the free / total memory that
xm info reports?

* Per domU I also want to look at memory statistics.
  -  One thing is mem vs. mem-max, to show ballooning.
  -  The other thing is tmem: I don't know if I should spend the time
getting it right. I'm starting to get the impression that in the
two and a half years since Dan added it (and now tmem2 has been added
too) it has been considered implemented and working, but nobody bothers
to make it work for everyone, e.g. the recent remark that the direct
ballooning daemon was just a lab exercise ;) If you know of any people
who successfully run xen with tmem2 and such, I'd love to work with them
to build the nagios-style statistics. Otherwise I'll save myself the
headaches.

* per domU cpu percent (to show how much of the dom0 power the vm is
consuming)...
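
To make the memory and CPU items above concrete, here is a rough sketch of
the arithmetic I have in mind. It assumes the "key : value" layout that
xm info prints (with total_memory and free_memory in MB), per-domU memory
figures in KiB, and cumulative CPU seconds such as the Time(s) column of
xm list, sampled twice. The function names are made up for illustration,
and the sketch does not answer the Xen heap question.

```python
def parse_xm_info(text):
    """Turn 'key : value' lines, as printed by `xm info`, into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def balloon_fraction(current_kib, maximum_kib):
    """Fraction of a domU's maximum memory that is currently ballooned out."""
    if maximum_kib <= 0:
        raise ValueError("maximum must be positive")
    return max(0.0, 1.0 - current_kib / maximum_kib)


def cpu_percent(cpu_secs_before, cpu_secs_after, interval_secs, host_cpus=1):
    """Share of the host's CPU capacity used between two samples, in percent.

    host_cpus is an assumption: the number of physical CPUs to normalise by.
    """
    if interval_secs <= 0 or host_cpus <= 0:
        raise ValueError("interval and host_cpus must be positive")
    used = max(0.0, cpu_secs_after - cpu_secs_before)
    return 100.0 * used / (interval_secs * host_cpus)


# Made-up sample in the layout assumed above, not captured from a real host.
SAMPLE_XM_INFO = """\
host                   : xenhost
total_memory           : 8190
free_memory            : 1024
"""
```

Normalising by host CPUs is one choice; dividing by the domU's own VCPU
count instead would show how busy the guest is rather than how much of
the host it consumes.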


Speed issues:
Usually checks in check_mk are fired off every minute, so it would be
good if I could go directly via xenstore to collect and report my data
within 1-2 seconds or less. Speed seems to be an issue I have to worry
about - on my "top of the shelf" xen host it takes around
0.6 seconds to query a meager 5 VMs.
That's just a 1.5GHz VIA box, but I'll have to see how long it takes
for 100 VMs or more.
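
One way to keep this fast might be to dump the tree once with a single
xenstore-ls call and parse the result, instead of forking xenstore-read
for every node of every VM. A minimal sketch of such a parser follows. It
assumes the indented `key = "value"` output format; the sample text is
invented, and lines that do not match (for example truncated values) are
simply skipped.

```python
import re

# One line of `xenstore-ls` output: indentation, key, quoted value.
_LINE = re.compile(r'^(\s*)(\S+) = "(.*)"\s*$')


def parse_xenstore_ls(text):
    """Flatten indented xenstore-ls style output into {'a/b/c': value}."""
    flat = {}
    stack = []  # (indent, key) pairs describing the current path
    for line in text.splitlines():
        m = _LINE.match(line)
        if not m:
            continue  # skip anything that is not key = "value"
        indent, key, value = len(m.group(1)), m.group(2), m.group(3)
        # Drop path components that are at the same depth or deeper.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        stack.append((indent, key))
        flat["/".join(k for _, k in stack)] = value
    return flat


# Invented sample, shaped like a dump of /local/domain.
SAMPLE_TREE = '''\
1 = ""
 name = "web01"
 memory = ""
  target = "524288"
2 = ""
 name = "db01"
'''
```

With the whole tree in one dict, the per-VM checks become dictionary
lookups, so the cost should grow mostly with the size of the dump rather
than with the number of processes started.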

Documentation??
What I'm missing is some document that'd show all nodes in the
xenstore that are readable. I've poked around a lot already but the
statistics are hiding from me.
Also, I would try to use something that works in both xen4 and xen3. But
that's not mandatory; I can fall back from xl to xenstore-read to xm to
libvirt.
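
The fallback order could be sketched like this: try each tool in turn and
use the first one found on the PATH. The tuple below is my assumed
preference order, with virsh standing in for libvirt; finding the binary
does not prove that the matching toolstack is actually running, so a real
agent would need a further check.

```python
import shutil

# Assumed preference order; virsh is libvirt's command line client.
PREFERRED = ("xl", "xenstore-read", "xm", "virsh")


def pick_tool(candidates=PREFERRED, which=shutil.which):
    """Return the first candidate found on PATH, or None if none exists.

    `which` is injectable so the lookup can be replaced in tests.
    """
    for name in candidates:
        if which(name):
            return name
    return None
```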

Why you might want to help:
Using check_mk you can pull off all kinds of crazy stuff with the data
it collects:
    trend analysis on disk usage ("simple" example: get an alert if
your vm store is growing at a rate that will let it run out of space
in 3 days)
    if somebody feels they need it, use the block IO rates to trigger
an eventhandler that will put io & cpu caps on a VM. (hosters might
love that :)
    I think most of these features are not implemented in any nagios
checks so far
    If I just hack it in ksh, it *will work*, but be ugly and slow :)
    and of course you won't have to bother with any config files to add a VM!
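
To illustrate the disk usage trend idea from the list above, here is a
minimal sketch that fits a straight line through (day, used) samples and
extrapolates to the day the store fills up. This is plain least squares
written for illustration, not how check_mk computes its trends, and the
function name is hypothetical.

```python
def days_until_full(samples, capacity):
    """Estimate days from the last sample until `used` reaches `capacity`.

    samples: list of (day, used) pairs, in the same unit as capacity.
    Returns None when usage is flat or shrinking.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    if var == 0:
        raise ValueError("samples must span more than one point in time")
    # Least-squares slope: growth in `used` per day.
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var
    if slope <= 0:
        return None
    last_used = samples[-1][1]
    return max(0.0, (capacity - last_used) / slope)
```

For example, samples of 100, 110 and 120 GB on three consecutive days
against a 150 GB store give a growth of 10 GB per day and 3 days left,
which is the kind of threshold the alert would trigger on.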

Maybe someone likes xenstore *a lot* and can point me at the right spots.

Florian


p.s.:
Could interested parties consider spending a day to improve the xm list output?
It may technically make sense that a VM created using xm new has no ID
and no status instead of "-------", and that a VM that is running but didn't
use CPU during the microsecond we queried it is shown as blocked.
But it has made life harder for each and every xen user for 5 or 6 years
now, and technical reasons really don't cut it if they turn
information into worthless bytes. (I still feel you would get an
"-r-----" state most of the time back in Xen2...)
-- 
the purpose of libvirt is to provide an abstraction layer hiding all
xen features added since 2006 until they were finally understood and
copied by the kvm devs.



 

