[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH RFC] docs/designs: PV stats interface



On Wed, Nov 23, 2016 at 02:14:27PM +0000, Paul Durrant wrote:
> This patch introduces a design proposal for an interface to used by
> guest PV drivers or agents to convey statistics to a monitoring agent
> running in a toolstack domain.
> 

Hey Paul!

Thank you for posting this, adding on the CC some extra folks
from Oracle. We have also an internal driver:

https://oss.oracle.com/git/?p=linux-uek.git;a=blob;f=drivers/xen/ovmapi.c;h=672d33f46f7c62053d002fb48ef6ad9f187a69bb;hb=020026b088c778996dfa7617ccb632ddbf04e9fe

which could use a design document or some synergy.

But first of couple of questions:
 - What about qemu-agent? Why not use that?

 - What about using Microsoft key-value pair drivers? They function
   in Linux and Windows. Linux underlaying layers can be changed
   so that the protocol is an XenStore one instead of the ring
   the kvp driver is using..

> Signed-off-by: Paul Durrant <paul.durrant@xxxxxxxxxx>
> ---
> RFC at the moment since this is a first draft and it is not yet
> prototyped
> ---
>  docs/designs/pv_stats.markdown | 116 
> +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 116 insertions(+)
>  create mode 100755 docs/designs/pv_stats.markdown
> 
> diff --git a/docs/designs/pv_stats.markdown b/docs/designs/pv_stats.markdown
> new file mode 100755
> index 0000000..4189369
> --- /dev/null
> +++ b/docs/designs/pv_stats.markdown
> @@ -0,0 +1,116 @@
> +# Statistics Interface for PV Drivers/Agents
> +
> +## Background
> +
> +It is common for guest PV drivers or agents to communicate statistics to an
> +agent running in a toolstack domain so that these can be displayed via a UI,
> +or even influence guest placement etc. The mechanism for conveying these
> +statistics is currently ad-hoc, undocumented, and usually based on xenstore.
> +
> +Whilst xenstore does indeed provide a convenient mechanism, the lack of
> +documentation and standardisation of the protocols used creates
> +compatibility issues for PV drivers or agents not tied to a specific
> +product or environment. Also, the guest xenstore quota and the single
> +instance of xenstored can easily become scalability issues.
> +
> +## Proposal
> +
> +The proposed interface is intended to be used only for the purposes of
> +conveying statistics from guest PV drivers or agents to an agent or agents
> +running in a toolstack domain. It is not intended for bulk data transfer,
> +nor as another means of control of the PV drivers or agents by the
> +toolstack domain.
> +
> +PV drivers or agents typically publish multiple related sets of statistics.
> +For example, a PV network frontend may publish statistics relating to
> +received traffic and transmitted traffic. These sets are likely to be
> +updated asynchronously from each other and therefore it makes sense that
> +they can be separated such that a monitoring agent can refresh its view of
> +them asynchronously. It is therefore proposed that a two-level hierarchy of
> +xenstore keys is used to advertise sets of guest statistics.
> +
> +The toolstack will create a writeable top-level `stats` key in the guest
> +space. Under this each guest statistics *provider* creates a key its name,
> +e.g. `Network #0`. This key acts as a parent to keys that then name each
> +set of statistics that it provides, e.g. `Tx`. Under the key for a
> +particular set the provider then writes entries containing grant references
> +of pages containing the names and values of the statistics in that set, and
> +an event channel to be used for signalling e.g.:
> +
> +    name-type-ref0 = 10
> +    name-type-ref1 = 11
> +    val-ref0 = 12
> +    val-ref1 = 13
> +    event-channel = 10
> +
> +The provider must always write the `event-channel` key after all the other
> +keys have been written such that a monitoring agent with a watch on the
> +guest's top-level `stats` key can make the assumption that it is safe to
> +sample all other keys once that key has been written. Thus no explicit
> +`state` key is required.
> +
> +There are separate references for pages containing the names and types of
> +the statistics and the values of those statistics since it is required that
> +the names and types do not change and hence a monitoring agent need only
> +sample them once and can do so as soon as the as the `name-ref` keys are
> +valid. The format of each (*name, type*) tuple is as follows:
> +
> +    struct stats_name_type {
> +     uint8_t type;
> +     char name[63];
> +    };
> +
> +and the possible types are:
> +
> +    #define STATS_TYPE_S64           1
> +    #define STATS_TYPE_U64           2
> +    #define STATS_TYPE_DOUBLE        3
> +    #define STATS_TYPE_ASCII 4
> +
> +The `name` must be a NUL terminated ASCII string containing only
> +alphanumeric characters, printable non-alphanumeric characters or a space
> +character, i.e. the expression:
> +
> +    (isalnum(name[i]) || ispunct(name[i]) || name[i] == ' ')
> +
> +must be true for each value of `i` until `name[i] == '\0'`
> +
> +When iterating through the `stats_name_type` structures a monitoring agents
> +can determine that it has finished when it either encounters a `type` value
> +of 0, or it has iterated through all 64 structures in the granted page and
> +there are no further `name-type-ref` keys.
> +
> +A monitoring agent can find the value of a statistic by noting the
> +`name-type-ref` index and the offset into the page where the
> +`stats_name_type` was found and the looking at the same offset in the
> +corresponding `val-ref` page. Values are therefore also 64 octets
> +in length and contain:
> +
> +* `STATS_TYPE_S64` : A signed 64-bit integer in little endian form in
> +                  octets 0..7
> +* `STATS_TYPE_U64` : An unsigned 64-bit integer in little endian form in
> +                  octets 0..7
> +* `STATS_TYPE_DOUBLE` : A double precision floating point value in little
> +                     endian form in
> +                     octets 0..7
> +* `STATS_TYPE_ASCII` : A NUL terminated ASCII string meeting the same
> +                    criteria as `name`
> +
> +A statistics provider should not update any of the statistics in a set
> +until the `event-channel` is signalled indicating that a monitoring agent
> +wishes to sample them. Thus the rate of update is nominally under control
> +of the monitoring agent.
> +
> +When such a signal is received, all statistics in the set should then be
> +updated and a signal set back to the monitoring agent via the
> +`event-channel` to say that the update is complete. The monitoring agent
> +can then sample the whole set, knowing that it is self-consistent, as long
> +as the provider is not misbehaving.
> +
> +When a provider wishes to withdraw a set of statistics, e.g. when it is
> +shutting down, it notifies a monitoring agent by deleting the
> +`event-channel` key from xenstore. Thus a monitoring agent must maintain a
> +watch on that key and respond in a timely manner by closing its port. Once
> +the provider detects that channel is no longer bound then it can remove the
> +whole set of keys corresponding to the set. When the provider is no longer
> +advertising any sets it can then remove its top-level key.
> -- 
> 2.5.3
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> https://lists.xen.org/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.