
[Xen-devel] Re: [RFC][PATCH] Per-cpu xentrace buffers


  • To: xen-devel@xxxxxxxxxxxxxxxxxxx, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
  • From: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
  • Date: Wed, 20 Jan 2010 17:38:31 +0000
  • Cc:
  • Delivery-date: Wed, 20 Jan 2010 09:38:55 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Keir, would you mind commenting on this new design in the next few
days?  If it looks like a good design, I'd like to do some more
testing and get this into our next XenServer release.

 -George

On Thu, Jan 7, 2010 at 3:13 PM, George Dunlap <dunlapg@xxxxxxxxx> wrote:
> In the current xentrace configuration, xentrace buffers are all
> allocated in a single contiguous chunk, and then divided among logical
> cpus, one buffer per cpu.  The size of an allocatable chunk is fairly
> limited, in my experience to about 128 pages (512KiB).  As the number
> of logical cpus increases, this means a much smaller maximum trace
> buffer per cpu; on my dual-socket quad-core Nehalem box with
> hyperthreading (16 logical cpus), that comes to 8 pages per logical
> cpu.
>
> The attached patch addresses this issue by allocating per-cpu buffers
> separately.  This allows larger trace buffers; however, it requires an
> interface change to xentrace, which is why I'm making a Request For
> Comments.  (I'm not expecting this patch to be included in the 4.0
> release.)
>
> The old interface to get trace buffers was fairly simple: you ask for
> the info, and it gives you:
> * the mfn of the first page in the buffer allocation
> * the total size of the trace buffer
>
> The tools then mapped [mfn,mfn+size), calculated where the per-pcpu
> buffers were, and went on to consume records from them.
>
> -- Interface --
>
> The proposed interface works as follows.
>
> * XEN_SYSCTL_TBUFOP_get_info still returns an mfn and a size (so no
> changes to the library).  However, the mfn now points to a trace
> buffer info area (t_info), allocated once at boot time.  The trace
> buffer info area contains the mfns of the per-pcpu buffers.
> * The t_info struct contains an array of "offset pointers", one per
> pcpu.  Each is the offset, within the t_info area, of that pcpu's
> array of mfns.  So logically, the layout looks like this:
> struct {
>  int16_t tbuf_size; /* Number of pages per cpu */
>  int16_t offset[NR_CPUS]; /* Offset into the t_info area of the array */
>  uint32_t mfn[NR_CPUS][TBUF_SIZE];
> };
>
> So if NR_CPUS was 16, and TBUF_SIZE was 32, we'd have:
> struct {
>  int16_t tbuf_size; /* Number of pages per cpu */
>  int16_t offset[16]; /* Offset into the t_info area of the array */
>  uint32_t p0_mfn_list[32];
>  uint32_t p1_mfn_list[32];
>  ...
>  uint32_t p15_mfn_list[32];
> };
> * So the new way to map trace buffers is as follows:
>  + Call TBUFOP_get_info to get the mfn and size of the t_info area, and map 
> it.
>  + Get the number of cpus
>  + For each cpu:
>  - Calculate the offset into the t_info area thus: unsigned long
> *mfn_list = ((unsigned long*)t_info)+(t_info->cpu_offset[cpu]))
>  - Map t_info->tbuf_size mfns from mfn_list using xc_map_foreign_batch()
>
> In the current implementation, the t_info size is fixed at 2 pages,
> allowing about 2000 pages total to be mapped.  For a 32-way system,
> this would allow up to 63 pages per cpu (~256KiB).  Bumping this up
> to 4 pages would allow even larger systems if required.
>
> The current implementation also allocates each trace buffer
> contiguously, since that's the easiest way to get contiguous virtual
> address space.  But this interface allows Xen the flexibility, in the
> future, to allocate buffers in several chunks if necessary, without
> having to change the interface again.
>
> -- Implementation notes --
>
> The t_info area is allocated once at boot.  Trace buffers are
> allocated either at boot (if a parameter is passed) or when
> TBUFOP_set_size is called.  Due to the complexity of tracking pages
> mapped by dom0, unmapping or resizing trace buffers is not supported.
>
> I introduced a new per-cpu spinlock guarding trace data and buffers.
> This allows per-cpu data to be safely accessed and modified without
> racing with concurrently generated trace events.  The per-cpu
> spinlock is grabbed whenever a trace event is generated; but in the
> (very very very) common case, the lock should be in the cache
> already.
>
> Feedback welcome.
>
>  -George
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

