[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4] xmalloc: add support for checking the pool integrity



On 16/12/14 19:33, Mihai DonÈu wrote:
> Implemented xmem_pool_check(), xmem_pool_check_locked() and
> xmem_pool_check_unlocked() to verity the integrity of the TLSF matrix.
>
> Signed-off-by: Mihai DonÈu <mdontu@xxxxxxxxxxxxxxx>
>
> ---
> Changes since v3:
>  - try harder to respect the 80 column limit
>  - use 'unsigned int' instead of 'int' where possible
>  - made the logged messages even shorter
>  - dropped the use of goto (didn't really make sense)
>
> Changes since v2:
>  - print the name of the corrupted pool
>  - adjusted the messages to better fit within 80 columns
>  - minor style change (a ? a : b -> a ?: b)
>
> Changes since v1:
>  - fixed the codingstyle
>  - swaped _locked/_unlocked naming
>  - reworked __xmem_pool_check_locked() a bit
>  - used bool_t where appropriate
>  - made xmem_pool_check() take a pool argument which can be NULL
> ---
>  xen/common/xmalloc_tlsf.c | 121 
> +++++++++++++++++++++++++++++++++++++++++++++-
>  xen/include/xen/xmalloc.h |   7 +++
>  2 files changed, 126 insertions(+), 2 deletions(-)
>
> diff --git a/xen/common/xmalloc_tlsf.c b/xen/common/xmalloc_tlsf.c
> index a5769c9..eca4f1c 100644
> --- a/xen/common/xmalloc_tlsf.c
> +++ b/xen/common/xmalloc_tlsf.c
> @@ -120,9 +120,122 @@ struct xmem_pool {
>      char name[MAX_POOL_NAME_LEN];
>  };
>  
> +static struct xmem_pool *xenpool;
> +
> +static inline void MAPPING_INSERT(unsigned long r, int *fl, int *sl);
> +
>  /*
>   * Helping functions
>   */
> +#ifndef NDEBUG
> +static bool_t xmem_pool_check_size(const struct xmem_pool *pool,
> +                                   int fl, int sl)
> +{
> +    const struct bhdr *b = pool->matrix[fl][sl];
> +
> +    while ( b )
> +    {
> +        int __fl;
> +        int __sl;
> +
> +        MAPPING_INSERT(b->size, &__fl, &__sl);
> +        if ( __fl != fl || __sl != sl )
> +        {
> +            printk(XENLOG_ERR
> +                   "xmem_pool: %s: misplaced block %p:%u ({%d,%d} -> 
> {%d,%d})\n",

Is it liable to get confusing with a hex number and a decimal number
combined with just a colon?

Is b even useful?  You have the pool name, indicies and size which seem
to be the salient information.

> +                   pool->name, b, b->size, fl, sl, __fl, __sl);
> +            return 0;
> +        }
> +        b = b->ptr.free_ptr.next;
> +    }
> +    return 1;
> +}
> +
> +/*
> + * This function must be called from a context where pool->lock is
> + * already acquired.
> + *
> + * Returns true if the pool has been corrupted, false otherwise
> + */
> +#define xmem_pool_check_locked(pool) \
> +    __xmem_pool_check_locked(__FILE__, __LINE__, pool)
> +static bool_t __xmem_pool_check_locked(const char *file, int line,
> +                                       const struct xmem_pool *pool)
> +{
> +    unsigned int i;
> +    static bool_t once = 1;

What is this static doing?  Surely corruption corruption in one pool has
no effect on corruption in a separate pool (other than the usual side
effects of general memory corruption, which tend to be bad).

It looks as if it wants to be an extra field in an xmem_pool.

> +
> +    if ( !once )
> +        return 1;
> +    for ( i = 0; i < REAL_FLI; i++ )
> +    {
> +        int fl = (pool->fl_bitmap & (1 << i)) ? i : -1;
> +
> +        if ( fl >= 0 )
> +        {
> +            unsigned int j;
> +
> +            if ( !pool->sl_bitmap[fl] )
> +            {
> +                printk(XENLOG_ERR
> +                       "xmem_pool: %s: TLSF bitmap corrupt (!empty FL, empty 
> SL)\n",
> +                       pool->name);
> +                __warn(file, line);

The __warn()s here will currently do a full register dump, as well as
stack dump and stack trace.  It occurs to me that only the stack trace
itself is useful at this point.

__bug and __warn() are currently unused (I am struggling to work out
exactly when they were orphaned; they do not form part of regular
BUG()/WARN() operations, which are handled in do_invalid_op()).  They
are also not fantastically useful as they will always identify
themselves in the printed information, not their caller.  I suspect they
can just be dropped completely.

I suspect you also would be better, and certainly more brief, with
"run_in_exception_handler(show_stack)" instead, which will just print a
stack trace, but nothing more.

> +                once = 0;
> +                break;
> +            }
> +            for ( j = 0; j < MAX_SLI; j++ )
> +            {
> +                int sl = (pool->sl_bitmap[fl] & (1 << j)) ? j : -1;
> +
> +                if ( sl < 0 )
> +                    continue;
> +                if ( !pool->matrix[fl][sl] )
> +                {
> +                    printk(XENLOG_ERR
> +                           "xmem_pool: %s: TLSF bitmap corrupt 
> (!matrix[%d][%d])\n",
> +                           pool->name, fl, sl);
> +                    __warn(file, line);
> +                    once = 0;
> +                    break;
> +                }
> +                if ( !xmem_pool_check_size(pool, fl, sl) )
> +                {
> +                    printk(XENLOG_ERR "xmem_pool: %s: TLSF chunk matrix 
> corrupt\n",
> +                           pool->name);
> +                    __warn(file, line);
> +                    once = 0;
> +                    break;
> +                }
> +            }
> +            if ( !once )
> +                break;
> +        }
> +    }
> +    return !once;
> +}
> +
> +#define xmem_pool_check_unlocked(pool) \
> +    __xmem_pool_check_unlocked(__FILE__, __LINE__, pool)
> +static bool_t __xmem_pool_check_unlocked(const char *file, int line,
> +                                         struct xmem_pool *pool)
> +{
> +    bool_t oops;
> +
> +    spin_lock(&pool->lock);
> +    oops = __xmem_pool_check_locked(file, line, pool);
> +    spin_unlock(&pool->lock);
> +    return oops;
> +}
> +
> +bool_t __xmem_pool_check(const char *file, int line, struct xmem_pool *pool)
> +{
> +    return __xmem_pool_check_unlocked(file, line, pool ?: xenpool);

Why should a NULL pool be tolerated here?  This is debug code only, so
we can require and trust that we are called appropriately.

> +}
> +#else
> +#define xmem_pool_check_locked(pool) ((void)(pool))
> +#define xmem_pool_check_unlocked(pool) ((void)(pool))
> +#endif
>  
>  /**
>   * Returns indexes (fl, sl) of the list used to serve request of size r
> @@ -381,6 +494,8 @@ void *xmem_pool_alloc(unsigned long size, struct 
> xmem_pool *pool)
>      int fl, sl;
>      unsigned long tmp_size;
>  
> +    xmem_pool_check_unlocked(pool);
> +
>      if ( pool->init_region == NULL )
>      {
>          if ( (region = pool->get_mem(pool->init_size)) == NULL )
> @@ -442,11 +557,13 @@ void *xmem_pool_alloc(unsigned long size, struct 
> xmem_pool *pool)
>  
>      pool->used_size += (b->size & BLOCK_SIZE_MASK) + BHDR_OVERHEAD;
>  
> +    xmem_pool_check_locked(pool);
>      spin_unlock(&pool->lock);
>      return (void *)b->ptr.buffer;
>  
>      /* Failed alloc */
>   out_locked:
> +    xmem_pool_check_locked(pool);
>      spin_unlock(&pool->lock);
>  
>   out:
> @@ -464,6 +581,7 @@ void xmem_pool_free(void *ptr, struct xmem_pool *pool)
>      b = (struct bhdr *)((char *) ptr - BHDR_OVERHEAD);
>  
>      spin_lock(&pool->lock);
> +    xmem_pool_check_locked(pool);
>      b->size |= FREE_BLOCK;
>      pool->used_size -= (b->size & BLOCK_SIZE_MASK) + BHDR_OVERHEAD;
>      b->ptr.free_ptr = (struct free_ptr) { NULL, NULL};
> @@ -500,6 +618,7 @@ void xmem_pool_free(void *ptr, struct xmem_pool *pool)
>      tmp_b->size |= PREV_FREE;
>      tmp_b->prev_hdr = b;
>   out:
> +    xmem_pool_check_locked(pool);
>      spin_unlock(&pool->lock);
>  }
>  
> @@ -512,8 +631,6 @@ int xmem_pool_maxalloc(struct xmem_pool *pool)
>   * Glue for xmalloc().
>   */
>  
> -static struct xmem_pool *xenpool;
> -
>  static void *xmalloc_pool_get(unsigned long size)
>  {
>      ASSERT(size == PAGE_SIZE);
> diff --git a/xen/include/xen/xmalloc.h b/xen/include/xen/xmalloc.h
> index 24a99ac..ad48930 100644
> --- a/xen/include/xen/xmalloc.h
> +++ b/xen/include/xen/xmalloc.h
> @@ -123,4 +123,11 @@ unsigned long xmem_pool_get_used_size(struct xmem_pool 
> *pool);
>   */
>  unsigned long xmem_pool_get_total_size(struct xmem_pool *pool);
>  
> +#ifndef NDEBUG
> +#define xmem_pool_check(pool) __xmem_pool_check(__FILE__, __LINE__, pool)
> +bool_t __xmem_pool_check(const char *file, int line, struct xmem_pool *pool);
> +#else
> +#define xmem_pool_check(pool) ((void)0)

This needs to be ((void)(pool)) to evaluate the potential side effects. 
However, I think you need to conditionally change __xmem_pool_check()
and move #define xmem_pool_check outside the conditional, to cover
future consumers who might choose to call __xmem_pool_check() directly.


What are the overheads of pool consistency checking?  It looks to be
moderately high, given a full inspection of the matrix on each
allocation and free.  Would it be possible to have one easy-to-alter
compile time knob so the default can be overridden in a single location?

Perhaps something like:

#ifndef NDEBUG
#define XMEM_POOL_CHECKS
#endif

#ifdef XMEM_POOL_CHECKS
...
#else
...
#endif

~Andrew


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.