
Re: [RFC PATCH 0/2] Add a new acquire resource to query vcpu statistics


  • To: Matias Ezequiel Vara Larsen <matiasevara@xxxxxxxxx>
  • From: George Dunlap <George.Dunlap@xxxxxxxxxx>
  • Date: Fri, 17 Jun 2022 19:54:32 +0000
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Matias Ezequiel Vara Larsen <matias.vara@xxxxxxxx>, Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Dario Faggioli <dfaggioli@xxxxxxxx>, Anthony Perard <anthony.perard@xxxxxxxxxx>
  • Delivery-date: Fri, 17 Jun 2022 19:55:10 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>


> On 17 May 2022, at 15:33, Matias Ezequiel Vara Larsen <matiasevara@xxxxxxxxx> wrote:
> 
> Hello all,
> 
> The purpose of this RFC is to get feedback about a new acquire resource
> that exposes vcpu statistics for a given domain. The current mechanism to
> get those statistics is to query the hypervisor. That mechanism relies on
> a hypercall and holds the domctl spinlock during its execution. When a pv
> tool like xcp-rrdd periodically samples these counters, it ends up
> affecting other paths that share that spinlock. By using acquire
> resources, the pv tool only requires a few hypercalls to set up the
> shared memory region, and samples are obtained without issuing any
> further hypercalls. The original idea was suggested by Andrew Cooper,
> with whom I have been discussing how to implement the current PoC. You
> can find the RFC patch series at [1]. The series is rebased on top of
> stable-4.15.
> 
> I am currently a bit blocked on 1) what to expose and 2) how to expose
> it. For 1), I decided to expose what xcp-rrdd is querying, i.e., what
> XEN_DOMCTL_getvcpuinfo returns. More precisely, xcp-rrdd gets
> runstate.time[RUNSTATE_running], which is a uint64_t counter. However,
> the time spent in the other states may be interesting too. Regarding 2),
> I am not sure whether simply using an array of uint64_t is enough or
> whether a different interface should be exposed. The remaining question
> is when to update the values. For the moment, I am updating this counter
> during vcpu_runstate_change().
> 
> The current series includes a simple pv tool that shows how this new
> interface is used. This tool maps the counter and periodically samples it.
> 
> Any feedback/help would be appreciated.

Hey Matias,

Sorry it’s taken so long to get back to you.  My day-to-day job has shifted 
away from technical things to community management; this has been on my radar 
but I never made time to dig into it.

There are some missing details I’ve had to try to piece together, so let me 
sketch things out a bit further and see whether I understand the situation:

* xcp-rrdd currently wants (at minimum) to record 
runstate.time[RUNSTATE_running] for each vcpu.  Currently that means calling 
XEN_DOMCTL_getvcpuinfo, which has to hold a single global domctl_lock (!) for 
the entire hypercall, and which of course must be issued for every vcpu in the 
system on every update.

* VCPUOP_get_runstate_info copies out a vcpu_runstate_info struct, which 
contains information on the other runstates as well (see the sketch after 
this list).  Additionally, VCPUOP_register_runstate_memory_area already does 
something similar to what you want: it passes a virtual address to Xen, which 
Xen maps and into which it copies the relevant runstate information (in 
update_runstate_area()).

* However, the above assumes a domain of “current->domain”: that is, a domain 
can call VCPUOP_get_runstate_info on one of its own vcpus, but dom0 cannot 
call it to get information about the vcpus of other domains.

* Additionally, VCPUOP_register_runstate_memory_area registers by *virtual 
address*; this is actually problematic even for guest kernels looking at their 
own vcpus; but would be completely inappropriate for a dom0 userspace 
application, which is what you’re looking at.
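
For reference, the struct that VCPUOP_get_runstate_info copies out looks 
roughly like this (paraphrased from xen/include/public/vcpu.h; check the 
header in your tree for the authoritative definition):

struct vcpu_runstate_info {
    /* VCPU's current state (RUNSTATE_*). */
    int      state;
    /* When was current state entered (system time, ns)? */
    uint64_t state_entry_time;
    /* Time spent in each RUNSTATE_* state (ns). */
    uint64_t time[4];
};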

Your solution is to expose things via the xenforeignmemory interface instead, 
modelled after the vmtrace_buf functionality.

Does that all sound right?

I think at a high level that’s probably the right way to go.

As you say, my default would be to expose information similar to what 
VCPUOP_get_runstate_info provides.  I’d even consider just using 
vcpu_runstate_info_t.
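
To make that concrete, here is a minimal sketch of how a dom0 tool might 
consume such a resource via xenforeignmemory.  It assumes, purely for 
illustration, that the new resource type ends up named something like 
XENMEM_resource_stats_table (as your series suggests) and that it exposes one 
vcpu_runstate_info-like entry per vcpu; error handling is trimmed:

#include <inttypes.h>
#include <stdio.h>
#include <sys/mman.h>
#include <xenctrl.h>           /* pulls in the Xen public headers */
#include <xen/vcpu.h>          /* struct vcpu_runstate_info, RUNSTATE_* */
#include <xenforeignmemory.h>

static int sample_runstate(domid_t domid, unsigned int nr_vcpus)
{
    xenforeignmemory_handle *fmem = xenforeignmemory_open(NULL, 0);
    xenforeignmemory_resource_handle *res;
    void *addr = NULL;

    if ( !fmem )
        return -1;

    /* Map the (hypothetical) stats resource read-only; the number of
     * frames depends on whatever layout we settle on. */
    res = xenforeignmemory_map_resource(fmem, domid,
                                        XENMEM_resource_stats_table /* assumed name */,
                                        0, 0, 1, &addr, PROT_READ, 0);
    if ( !res )
    {
        xenforeignmemory_close(fmem);
        return -1;
    }

    const struct vcpu_runstate_info *stats = addr;

    for ( unsigned int v = 0; v < nr_vcpus; v++ )
        printf("vcpu%u: running %"PRIu64" ns\n",
               v, stats[v].time[RUNSTATE_running]);

    xenforeignmemory_unmap_resource(fmem, res);
    xenforeignmemory_close(fmem);

    return 0;
}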

The other option would be to try to make the page a more general “foreign vcpu 
info” page, which we could expand with more information as we find it useful.

In this patch, you’re allocating 4k *per vcpu on the entire system* to hold a 
single 64-bit value; even if you decide to use vcpu_runstate_info_t, that’s 
still quite wasteful.  Would it make sense instead to have this pass back an 
array of MFNs designed to be mapped contiguously, with the vcpus listed as an 
array?  This seems to be what XENMEM_resource_ioreq_server does.
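
To make the contiguous-array idea concrete, a layout along these lines (all 
names purely illustrative, not a proposal for the final ABI) would let the 
toolstack map the frames back to back and index per-vcpu entries directly:

/* Illustrative only: one shared region per domain, sized for nr_entries
 * per-vcpu records, rather than one full page per vcpu. */
struct vcpu_stats_entry {
    struct vcpu_runstate_info runstate;
    /* Further per-vcpu counters can be appended here later. */
};

struct domain_stats_table {
    uint32_t version;     /* bumped when the layout changes */
    uint32_t nr_entries;  /* == d->max_vcpus */
    struct vcpu_stats_entry vcpu[];
};

/* Number of frames the toolstack would map contiguously, e.g.:
 * nr_frames = DIV_ROUND_UP(offsetof(struct domain_stats_table, vcpu) +
 *                          nr_entries * sizeof(struct vcpu_stats_entry),
 *                          PAGE_SIZE);
 */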

The advantage of making the structure extensible is that we wouldn’t need to 
add another interface, and potentially another full page, if we wanted to 
export more information later.  On the other hand, every new piece of 
functionality may require adding code to copy things into the page; adding 
such code bit by bit, as it’s actually requested, might be better.

I have some more comments I’ll give on the 1/2 patch.

 -George






> 
> Thanks, Matias.
> 
> [1] https://github.com/MatiasVara/xen/tree/feature_stats
> 
> Matias Ezequiel Vara Larsen (2):
>  xen/memory : Add stats_table resource type
>  tools/misc: Add xen-stats tool
> 
> tools/misc/Makefile         |  5 +++
> tools/misc/xen-stats.c      | 83 +++++++++++++++++++++++++++++++++++++
> xen/common/domain.c         | 42 +++++++++++++++++++
> xen/common/memory.c         | 29 +++++++++++++
> xen/common/sched/core.c     |  5 +++
> xen/include/public/memory.h |  1 +
> xen/include/xen/sched.h     |  5 +++
> 7 files changed, 170 insertions(+)
> create mode 100644 tools/misc/xen-stats.c
> 
> --
> 2.25.1
> 
