[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2] rwlock: allow recursive read locking when already locked in write mode
On Fri, Feb 21, 2020 at 04:15:39PM +0100, Jan Beulich wrote: > On 21.02.2020 15:58, Roger Pau Monné wrote: > > On Fri, Feb 21, 2020 at 03:52:28PM +0100, Jan Beulich wrote: > >> On 21.02.2020 15:49, Roger Pau Monné wrote: > >>> On Fri, Feb 21, 2020 at 03:41:59PM +0100, Jan Beulich wrote: > >>>> On 21.02.2020 15:26, Roger Pau Monné wrote: > >>>>> On Fri, Feb 21, 2020 at 02:36:52PM +0100, Jan Beulich wrote: > >>>>>> On 21.02.2020 10:10, Roger Pau Monné wrote: > >>>>>>> On Thu, Feb 20, 2020 at 07:20:06PM +0000, Julien Grall wrote: > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>> On 20/02/2020 17:31, Roger Pau Monne wrote: > >>>>>>>>> Allow a CPU already holding the lock in write mode to also lock it > >>>>>>>>> in > >>>>>>>>> read mode. There's no harm in allowing read locking a rwlock that's > >>>>>>>>> already owned by the caller (ie: CPU) in write mode. Allowing such > >>>>>>>>> accesses is required at least for the CPU maps use-case. > >>>>>>>>> > >>>>>>>>> In order to do this reserve 14bits of the lock, this allows to > >>>>>>>>> support > >>>>>>>>> up to 16384 CPUs. Also reduce the write lock mask to 2 bits: one to > >>>>>>>>> signal there are pending writers waiting on the lock and the other > >>>>>>>>> to > >>>>>>>>> signal the lock is owned in write mode. Note the write related data > >>>>>>>>> is using 16bits, this is done in order to be able to clear it (and > >>>>>>>>> thus release the lock) using a 16bit atomic write. > >>>>>>>>> > >>>>>>>>> This reduces the maximum number of concurrent readers from 16777216 > >>>>>>>>> to > >>>>>>>>> 65536, I think this should still be enough, or else the lock field > >>>>>>>>> can be expanded from 32 to 64bits if all architectures support > >>>>>>>>> atomic > >>>>>>>>> operations on 64bit integers. > >>>>>>>> > >>>>>>>> FWIW, arm32 is able to support atomic operations on 64-bit integers. > >>>>>>>> > >>>>>>>>> static inline void _write_unlock(rwlock_t *lock) > >>>>>>>>> { > >>>>>>>>> - /* > >>>>>>>>> - * If the writer field is atomic, it can be cleared directly. > >>>>>>>>> - * Otherwise, an atomic subtraction will be used to clear it. > >>>>>>>>> - */ > >>>>>>>>> - atomic_sub(_QW_LOCKED, &lock->cnts); > >>>>>>>>> + /* Since the writer field is atomic, it can be cleared > >>>>>>>>> directly. */ > >>>>>>>>> + ASSERT(_is_write_locked_by_me(atomic_read(&lock->cnts))); > >>>>>>>>> + BUILD_BUG_ON(_QR_SHIFT != 16); > >>>>>>>>> + write_atomic((uint16_t *)&lock->cnts, 0); > >>>>>>>> > >>>>>>>> I think this is an abuse to cast an atomic_t() directly into a > >>>>>>>> uint16_t. You > >>>>>>>> would at least want to use &lock->cnts.counter here. > >>>>>>> > >>>>>>> Sure, I was wondering about this myself. > >>>>>>> > >>>>>>> Will wait for more comments, not sure whether this can be fixed upon > >>>>>>> commit if there are no other issues. > >>>>>> > >>>>>> It's more than just adding another field specifier here. A cast like > >>>>>> this one is endianness-unsafe, and hence a trap waiting for a big > >>>>>> endian port attempt to fall into. At the very least this should cause > >>>>>> a build failure on big endian systems, even better would be if it was > >>>>>> endianness-safe. > >>>>> > >>>>> Why don't we leave the atomic_dec then? > >>>> > >>>> Because you need to invoke smp_processor_id() to calculate the value > >>>> to use in the subtraction. I'm not meaning to say I'm entirely > >>>> opposed (seeing how much of a discussion we're having), but the > >>>> "simple write of zero" approach is certainly appealing. > >>> > >>> Well, we could avoid the smp_processor_id() call and instead use: > >>> > >>> atomic_sub(atomic_read(&lock->cnts) & 0xffff, &lock->cnts); > >> > >> Which would make me suggest atomic_and() again. > > > > I'm certainly not opposed to that, but in order to get this regression > > fixed I would argue that such atomic_sub is no worse than what's > > currently done. > > It's one more read of a potentially heavily contended memory location. > > > I can look into adding an atomic_and operation to all arches, but this > > is likely to take time and I would like to get this sorted sooner > > rather than later. > > Because of it presumably taking time it was that I also brought up to > consider reverting. Then again, the Arm atomic_and() can, afaict, be > trivially cloned from their "add" counterparts, by replacing the > middle 'd' both in the function names and in the insn mnemonics. It's > just that there shouldn't be atomic_and_return() (for not being > useful, as it doesn't allow reconstructing the original value). I've added atomic_and in a pre-patch and I'm currently waiting for gitlab to finish testing to assert that I'm not breaking the build. Will post once that's finished if successful. FWIW, the branch is at: http://xenbits.xen.org/gitweb/?p=people/royger/xen.git;a=shortlog;h=refs/heads/rwlock3 Thanks, Roger. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |