
Re: [Xen-devel] [PATCH 1/2] rwlock: add per-cpu reader-writer locks



On 05/11/15 13:48, Marcos E. Matsunaga wrote:
> Hi Malcolm,
> 
> I tried your patches against staging yesterday, and as soon as I started a
> guest, it panicked. I have lock_profile enabled and applied your patches
> against:

I tested with a non-debug build of Xen (because I was analysing the
performance of Xen), and so those ASSERTs were never run.

The ASSERTs can be safely removed. The rwlock behaviour is slightly different
in that it's possible for a writer to hold the write lock while a reader is
progressing through the read critical section. This is safe because the writer
waits for the per-cpu variables to clear before actually progressing through
its own critical section.

I have an updated version of the patch series which fixes this. Do you want me
to post it, or are you happy to remove the ASSERTs yourself (or switch to a
non-debug build of Xen)?

Sorry for not catching this before it hit the list.

Malcolm

> 
> 6f04de658574833688c3f9eab310e7834d56a9c0 x86: cleanup of early cpuid handling
> 
> 
> 
> (XEN) HVM1 save: CPU
> (XEN) HVM1 save: PIC
> (XEN) HVM1 save: IOAPIC
> (XEN) HVM1 save: LAPIC
> (XEN) HVM1 save: LAPIC_REGS
> (XEN) HVM1 save: PCI_IRQ
> (XEN) HVM1 save: ISA_IRQ
> (XEN) HVM1 save: PCI_LINK
> (XEN) HVM1 save: PIT
> (XEN) HVM1 save: RTC
> (XEN) HVM1 save: HPET
> (XEN) HVM1 save: PMTIMER
> (XEN) HVM1 save: MTRR
> (XEN) HVM1 save: VIRIDIAN_DOMAIN
> (XEN) HVM1 save: CPU_XSAVE
> (XEN) HVM1 save: VIRIDIAN_VCPU
> (XEN) HVM1 save: VMCE_VCPU
> (XEN) HVM1 save: TSC_ADJUST
> (XEN) HVM1 restore: CPU 0
> [  394.163143] loop: module loaded
> (XEN) Assertion 'rw_is_locked(&t->lock)' failed at grant_table.c:215
> (XEN) ----[ Xen-4.7-unstable  x86_64  debug=y  Tainted:    C ]----
> (XEN) CPU:    0
> (XEN) RIP:    e008:[<ffff82d08011019e>] do_grant_table_op+0x63f/0x2e04
> (XEN) RFLAGS: 0000000000010246   CONTEXT: hypervisor (d0v0)
> (XEN) rax: 0000000000000000   rbx: ffff83400f9dc9e0   rcx: 0000000000000000
> (XEN) rdx: 0000000000000001   rsi: ffff82d080342b10   rdi: ffff83400819b784
> (XEN) rbp: ffff8300774ffef8   rsp: ffff8300774ffdf8   r8: 0000000000000002
> (XEN) r9:  0000000000000002   r10: 0000000000000002   r11: 00000000ffffffff
> (XEN) r12: 0000000000000000   r13: 0000000000000000   r14: ffff83400819b780
> (XEN) r15: ffff83400f9d0000   cr0: 0000000080050033   cr4: 00000000001526e0
> (XEN) cr3: 000001007f613000   cr2: ffff8800746182b8
> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
> (XEN) Xen stack trace from rsp=ffff8300774ffdf8:
> (XEN)    ffff8300774ffe08 ffff82d000000000 ffff8300774ffef8 ffff82d08017fc9b
> (XEN)    ffff82d080342b28 ffff83400f9d8600 ffff82d080342b10 0000000000000000
> (XEN)    ffff83400f9dca20 ffff832100000000 ffff834008188000 0000000000000001
> (XEN)    00000001772ee000 ffff8801e98d03e0 ffff8300774ffe88 0000000000000000
> (XEN)    0000000000000000 ffff8300774fff18 00000021d0269c10 000000000001001a
> (XEN)    ffffffff00000001 0000000000000000 0000000000000246 00007ff7de45a407
> (XEN)    0000000000000100 00007ff7de45a407 0000000000000033 ffff8300772ee000
> (XEN)    ffff8801eb0e3c00 ffff880004bf57e8 ffff8801e98d03e0 ffff8801eb0a5938
> (XEN)    00007cff88b000c7 ffff82d08023d952 ffffffff8100128a 0000000000000014
> (XEN)    0000000000000000 0000000000000001 ffff8801f6e18388 ffffffff81d3d740
> (XEN)    ffff8801efb7bd40 ffff88000542e780 0000000000000282 0000000000000000
> (XEN)    ffff8801e98d03a0 ffff8801efe07000 0000000000000014 ffffffff8100128a
> (XEN)    0000000000000001 ffff8801e98d03e0 0000000000000000 0001010000000000
> (XEN)    ffffffff8100128a 000000000000e033 0000000000000282 ffff8801efb7bce0
> (XEN)    000000000000e02b 0000000000000000 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 ffff8300772ee000 0000000000000000
> (XEN)    0000000000000000
> (XEN) Xen call trace:
> (XEN)    [<ffff82d08011019e>] do_grant_table_op+0x63f/0x2e04
> (XEN)    [<ffff82d08023d952>] lstar_enter+0xe2/0x13c
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) Assertion 'rw_is_locked(&t->lock)' failed at grant_table.c:215
> (XEN) ****************************************
> (XEN)
> (XEN) Manual reset required ('noreboot' specified)
> 
> 
> Thanks for your help.
> 
> On 11/03/2015 12:58 PM, Malcolm Crossley wrote:
>> Per-cpu read-write locks allow the fast-path read case to have low overhead
>> by only setting/clearing a per-cpu variable when using the read lock.
>> The per-cpu read fast path also avoids locked compare-and-swap operations,
>> which can be particularly slow on coherent multi-socket systems, especially
>> if there is heavy usage of the read lock itself.
>>
>> The per-cpu reader-writer lock uses a global variable to control the read
>> lock fast path. This allows a writer to disable the fast path and ensure
>> that readers use the underlying read-write lock implementation.
>>
>> Once the writer has taken the write lock and disabled the fast path, it must
>> poll the per-cpu variable of every CPU which has entered the critical
>> section for the specific read-write lock the writer is attempting to take.
>> This design allows a single per-cpu variable to be used for read/write locks
>> belonging to separate data structures, as long as multiple per-cpu read
>> locks are not simultaneously held by one particular CPU. This also means
>> per-cpu reader-writer locks are not recursion safe.
>>
>> Slow-path readers which are unblocked set the per-cpu variable and drop the
>> read lock. This simplifies the implementation and allows the fairness of the
>> underlying read-write lock to be taken advantage of.
>>
>> There may be slightly more overhead on the per-cpu write lock path due to
>> checking each CPU's fast-path read variable, but this overhead is likely to
>> be hidden by the required delay of waiting for readers to exit the critical
>> section. The loop is optimised to only iterate over the per-cpu data of
>> active readers of the rwlock.
>>
>> Signed-off-by: Malcolm Crossley <malcolm.crossley@xxxxxxxxxx>
>> ---
>>   xen/common/spinlock.c        | 32 ++++++++++++++++++++++++++++++++
>>   xen/include/asm-arm/percpu.h |  5 +++++
>>   xen/include/asm-x86/percpu.h |  6 ++++++
>>   xen/include/xen/percpu.h     |  4 ++++
>>   xen/include/xen/spinlock.h   | 37 +++++++++++++++++++++++++++++++++++++
>>   5 files changed, 84 insertions(+)
>>
>> diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
>> index 7f89694..a526216 100644
>> --- a/xen/common/spinlock.c
>> +++ b/xen/common/spinlock.c
>> @@ -492,6 +492,38 @@ int _rw_is_write_locked(rwlock_t *lock)
>>       return (lock->lock == RW_WRITE_FLAG); /* writer in critical section? */
>>   }
>>   +void percpu_write_lock(rwlock_t **per_cpudata, bool_t *writer_activating,
>> +        rwlock_t *rwlock)
>> +{
>> +    int cpu;
>> +    cpumask_t active_readers;
>> +
>> +    cpumask_copy(&active_readers, &cpu_online_map);
>> +    /* First take the write lock to protect against other writers */
>> +    write_lock(rwlock);
>> +
>> +    /* Now set the global variable so that readers start using read_lock */
>> +    *writer_activating = true;
>> +    smp_mb();
>> +
>> +    /* Check if there are any percpu readers in progress on our rwlock */
>> +    do
>> +    {
>> +        for_each_cpu(cpu, &active_readers)
>> +        {
>> +            /* Remove any percpu readers not contending
>> +             * from our check mask
>> +             */
>> +            if ( per_cpu_ptr(per_cpudata, cpu) != rwlock )
>> +                cpumask_clear_cpu(cpu, &active_readers);
>> +        }
>> +        /* Check if we've cleared all percpu readers */
>> +        if ( cpumask_empty(&active_readers) )
>> +            break;
>> +        /* Give the coherency fabric a break */
>> +        cpu_relax();
>> +    } while ( 1 );
>> +}
>> +
>>   #ifdef LOCK_PROFILE
>>     struct lock_profile_anc {
>> diff --git a/xen/include/asm-arm/percpu.h b/xen/include/asm-arm/percpu.h
>> index 71e7649..c308a56 100644
>> --- a/xen/include/asm-arm/percpu.h
>> +++ b/xen/include/asm-arm/percpu.h
>> @@ -27,6 +27,11 @@ void percpu_init_areas(void);
>>   #define __get_cpu_var(var) \
>>       (*RELOC_HIDE(&per_cpu__##var, READ_SYSREG(TPIDR_EL2)))
>>   +#define per_cpu_ptr(var, cpu)  \
>> +    (*RELOC_HIDE(&var, __per_cpu_offset[cpu]))
>> +#define __get_cpu_ptr(var) \
>> +    (*RELOC_HIDE(&var, READ_SYSREG(TPIDR_EL2)))
>> +
>>   #define DECLARE_PER_CPU(type, name) extern __typeof__(type) per_cpu__##name
>>     DECLARE_PER_CPU(unsigned int, cpu_id);
>> diff --git a/xen/include/asm-x86/percpu.h b/xen/include/asm-x86/percpu.h
>> index 604ff0d..51562b9 100644
>> --- a/xen/include/asm-x86/percpu.h
>> +++ b/xen/include/asm-x86/percpu.h
>> @@ -20,4 +20,10 @@ void percpu_init_areas(void);
>>   #define DECLARE_PER_CPU(type, name) extern __typeof__(type) per_cpu__##name
>>   +#define __get_cpu_ptr(var) \
>> +    (*RELOC_HIDE(var, get_cpu_info()->per_cpu_offset))
>> +
>> +#define per_cpu_ptr(var, cpu)  \
>> +    (*RELOC_HIDE(var, __per_cpu_offset[cpu]))
>> +
>>   #endif /* __X86_PERCPU_H__ */
>> diff --git a/xen/include/xen/percpu.h b/xen/include/xen/percpu.h
>> index abe0b11..c896863 100644
>> --- a/xen/include/xen/percpu.h
>> +++ b/xen/include/xen/percpu.h
>> @@ -16,6 +16,10 @@
>>   /* Preferred on Xen. Also see arch-defined per_cpu(). */
>>   #define this_cpu(var)    __get_cpu_var(var)
>>   +#define this_cpu_ptr(ptr)    __get_cpu_ptr(ptr)
>> +
>> +#define get_per_cpu_var(var)  (per_cpu__##var)
>> +
>>   /* Linux compatibility. */
>>   #define get_cpu_var(var) this_cpu(var)
>>   #define put_cpu_var(var)
>> diff --git a/xen/include/xen/spinlock.h b/xen/include/xen/spinlock.h
>> index fb0438e..f929f1b 100644
>> --- a/xen/include/xen/spinlock.h
>> +++ b/xen/include/xen/spinlock.h
>> @@ -3,6 +3,7 @@
>>     #include <asm/system.h>
>>   #include <asm/spinlock.h>
>> +#include <xen/stdbool.h>
>>     #ifndef NDEBUG
>>   struct lock_debug {
>> @@ -261,4 +262,40 @@ int _rw_is_write_locked(rwlock_t *lock);
>>   #define rw_is_locked(l)               _rw_is_locked(l)
>>   #define rw_is_write_locked(l)         _rw_is_write_locked(l)
>>   +static inline void percpu_read_lock(rwlock_t **per_cpudata, bool_t *writer_activating,
>> +        rwlock_t *rwlock)
>> +{
>> +    /* Indicate this cpu is reading */
>> +    this_cpu_ptr(per_cpudata) = rwlock;
>> +    smp_mb();
>> +    /* Check if a writer is waiting */
>> +    if ( unlikely(*writer_activating) )
>> +    {
>> +        /* Let the waiting writer know we aren't holding the lock */
>> +        this_cpu_ptr(per_cpudata) = NULL;
>> +        /* Wait using the read lock to keep the lock fair */
>> +        read_lock(rwlock);
>> +        /* Set the per-CPU data again and continue */
>> +        this_cpu_ptr(per_cpudata) = rwlock;
>> +        /* Drop the read lock because we don't need it anymore */
>> +        read_unlock(rwlock);
>> +    }
>> +}
>> +
>> +static inline void percpu_read_unlock(rwlock_t **per_cpudata)
>> +{
>> +    this_cpu_ptr(per_cpudata) = NULL;
>> +    smp_wmb();
>> +}
>> +
>> +/* Don't inline percpu write lock as it's a complex function */
>> +void percpu_write_lock(rwlock_t **per_cpudata, bool_t *writer_activating,
>> +        rwlock_t *rwlock);
>> +
>> +static inline void percpu_write_unlock(bool_t *writer_activating, rwlock_t *rwlock)
>> +{
>> +    *writer_activating = false;
>> +    write_unlock(rwlock);
>> +}
>> +
>>   #endif /* __SPINLOCK_H__ */
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

