
Re: [Xen-devel] Question about sharing spinlock_t among VMs in Xen



On Tue, Jun 14, 2016 at 12:01 PM, Andrew Cooper
<andrew.cooper3@xxxxxxxxxx> wrote:
>
> On 14/06/16 03:13, Meng Xu wrote:
> > On Mon, Jun 13, 2016 at 6:54 PM, Andrew Cooper
> > <andrew.cooper3@xxxxxxxxxx> wrote:
> >> On 13/06/2016 18:43, Meng Xu wrote:
> >>> Hi,
> >>>
> >>> I have a quick question about using the Linux spin_lock() in a Xen
> >>> environment to protect some host-wide shared (memory) resource among
> >>> VMs.
> >>>
> >>> *** The question is as follows ***
> >>> Suppose I have two Linux VMs sharing the same spinlock_t lock (through
> >>> shared memory) on the same host. Suppose we have one process in
> >>> each VM. Each process uses the Linux function spin_lock(&lock) [1] to
> >>> grab & release the lock.
> >>> Will these two processes in the two VMs have race on the shared lock?
> >> "Race" is debatable.  (After all, the point of a lock is to have
> >> serialise multiple accessors).  But yes, this will be the same lock.
> >>
> >> The underlying cache coherency fabric will perform atomic locked
> >> operations on the same physical piece of RAM.
> > The experiment we did was on a computer that is not NUMA.
>
> Why do you think this makes any difference?  Unless you have a
> uni-processor system from ages ago, there will be cache coherency being
> done in hardware.
>
> > So it should not be caused by a synchronization issue in hardware.
>
> I do not understand what you are trying to say here.


I was wondering whether the x86 memory consistency model, i.e. TSO, could
cause any issue here. Should we use memory barriers to synchronize the
memory operations?
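
(To make the question concrete: below is a minimal sketch, using C11
atomics rather than the kernel's spinlock_t, of the kind of lock we are
talking about. My understanding is that the LOCK-prefixed
read-modify-write used for the acquire already acts as a full barrier on
x86, so no extra explicit fence should be needed, but please correct me
if that reasoning is wrong. The names here are made up for illustration.)

    #include <stdatomic.h>

    typedef struct { atomic_flag locked; } tas_lock_t;
    #define TAS_LOCK_INIT { ATOMIC_FLAG_INIT }

    static inline void tas_lock(tas_lock_t *l)
    {
        /* On x86 this compiles to a LOCK-prefixed exchange, which is a
         * full barrier; the acquire ordering keeps the critical section
         * from being reordered above the lock acquisition. */
        while (atomic_flag_test_and_set_explicit(&l->locked,
                                                 memory_order_acquire))
            ;   /* spin */
    }

    static inline void tas_unlock(tas_lock_t *l)
    {
        /* Release ordering publishes the critical-section stores before
         * the lock is observed as free. */
        atomic_flag_clear_explicit(&l->locked, memory_order_release);
    }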

>
>
> >
> >> The important question is whether the two different VMs have an
> >> identical idea of what a spinlock_t is.  If not, this will definitely fail.
> > I see the key point here now. However, I'm not so sure whether the
> > two VMs have an *identical idea* of what a spinlock_t is.
>
> If you are not sure, then the answer is almost certainly no.


Fair enough...

>
>
> > In other words, how do we tell whether "two VMs have an identical idea
> > of what a spinlock_t is"?
>
> Is struct spinlock_t, and all functions which modify it, identical
> between all VMs trying to participate in the use of this shared memory
> spinlock?


Yes, the spinlock_t and all functions which modify it are identical
across all VMs.
Does this mean they have an identical idea of what a spinlock_t is?
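
(To make it concrete, what I take "identical idea" to mean is something
like the sketch below: every VM must agree byte-for-byte on the layout
of the lock placed in the shared page *and* on the algorithm that
manipulates it. The type and function names are made up for
illustration; this is not the actual Linux ticket lock.)

    #include <stdatomic.h>
    #include <stdint.h>

    /* Fixed-layout ticket lock that every participating VM must agree
     * on, field widths and algorithm included. */
    struct shared_ticket_lock {
        _Atomic uint16_t next;   /* ticket handed to the next acquirer */
        _Atomic uint16_t owner;  /* ticket currently allowed to run    */
    };

    static inline void shared_ticket_lock_acquire(struct shared_ticket_lock *l)
    {
        uint16_t my = atomic_fetch_add_explicit(&l->next, 1,
                                                memory_order_relaxed);
        while (atomic_load_explicit(&l->owner, memory_order_acquire) != my)
            ;   /* spin until our ticket comes up */
    }

    static inline void shared_ticket_lock_release(struct shared_ticket_lock *l)
    {
        atomic_fetch_add_explicit(&l->owner, 1, memory_order_release);
    }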

>
>
> >
> > The current situation is as follows:
> > Both VMs are using the same memory area for the spinlock_t variable.
> > The spin_lock() in both VMs operates on the same spinlock_t
> > variable. So IMHO, the spinlock_t should be identical for these two
> > VMs?
> > Please correct me if I'm wrong. (I guess my understanding of the
> > "identical idea of spinlock_t" may probably be incorrect. :-( )
> >
> >>> My speculation is that there should be a race on the shared lock when
> >>> the spin_lock() functions in the *two VMs* operate on the same lock.
> >>>
> >>> We did a quick experiment on this and found that one VM sometimes sees
> >>> a soft lockup on the lock. But we want to make sure our
> >>> understanding is correct.
> >>>
> >>> We are exploring whether we can use the spin_lock to protect shared
> >>> resources among VMs, instead of using the PV drivers. If the
> >>> spin_lock() in Linux can provide host-wide atomicity (which would
> >>> surprise me, though), that would be great. Otherwise, we probably have
> >>> to expose the spinlock in Xen to Linux?
> >> What are you attempting to protect like this?
> > For example, if two VMs are sharing a chunk of memory with both read
> > and write permissions, a VM has to grab the lock before it can operate
> > on the shared memory.
> > If we want a VM to operate directly on the shared resource, instead of
> > using the PV device model, we may need to use a spinlock to protect the
> > access to the shared resource. That's why we are looking at the
> > spinlock.
> >
> >> Anything which a guest can spin on like this is a recipe for disaster,
> >> as you observe, as the guest which holds the lock will get scheduled out
> >> in favour of the guest attempting to take the lock.
> > It is true in general. The reason we chose to let it spin is that
> > some people in academia have proposed protocols for accessing a
> > shared resource through a spinlock. In order to apply their theory, we
> > may need to follow the system model they assumed. The theory does
> > consider the situation where a guest/VCPU that is spinning on a lock is
> > scheduled out; it has to account for the extra delay caused by
> > this situation. [OK, this is the reason we did it like this. But we
> > are also thinking about whether we can do better in terms of the overall
> > system performance.]
> >
> > BTW, I agree with you that letting a guest spin like this could be a
> > problem for the overall system performance.
> >
> >> Alternatively, two
> >> different guests with a different idea of how to manage the memory
> >> backing a spinlock_t.
> > Just to confirm:
> > Did you mean that different guests will use different policies to
> > handle the same spinlock_t?
> > This may mean that we need to have some special locking protocol,
> > instead of the ticket_lock to handle the spin_lock?
> >
> > For example, a very simple and probably naive idea is that we may let
> > a guest not be scheduled out before it releases the lock. I just want
> > to use this simple example to make sure I understood the "alternative"
> > idea here. :-)
>
> A guest is not in control of when it gets descheduled, and you can't yank
> a lock while the guest is in a critical region.


Unless we don't commit the change until the end of the critical
region. (But that would make this look like a transaction. OK, let's
avoid that for now.)
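
(Roughly, the "commit at the end" idea I had in mind looks like the
sketch below: the writer prepares its update in the inactive slot of the
shared region and commits it with a single atomic store at the very end,
so a reader never observes a half-finished update even if the writer is
descheduled in the middle. The names are made up, it assumes a single
writer, and it ignores reuse of a slot under a very slow reader; it is
only an illustration, not something we have implemented.)

    #include <stdatomic.h>
    #include <stdint.h>

    struct shared_region {
        struct { int a, b; } slot[2];   /* two copies of the state     */
        _Atomic uint32_t active;        /* index of the committed slot */
    };

    static void writer_update(struct shared_region *r, int a, int b)
    {
        uint32_t cur  = atomic_load_explicit(&r->active, memory_order_relaxed);
        uint32_t next = cur ^ 1;
        r->slot[next].a = a;            /* uncommitted work ...        */
        r->slot[next].b = b;
        /* ... becomes visible only at this single release store.      */
        atomic_store_explicit(&r->active, next, memory_order_release);
    }

    static void reader_snapshot(struct shared_region *r, int *a, int *b)
    {
        uint32_t cur = atomic_load_explicit(&r->active, memory_order_acquire);
        *a = r->slot[cur].a;
        *b = r->slot[cur].b;
    }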

>
> If you want to proceed down this route, you will want to look at the
> PVspinlock implementation where you block on an event channel while
> waiting for a lock held by a different vcpu, which frees up execution
> resource for the holder of the lock to complete.


I will have a look at the pvspinlock then.
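
(Just to check that I understand the shape of it: I imagine the path
looks roughly like the sketch below, except that the real pvspinlock
code blocks on a Xen event channel and is kicked by the unlocker; here a
Linux futex stands in for the event channel, and the names and the spin
threshold are made up for illustration.)

    #include <stdatomic.h>
    #include <stdint.h>
    #include <linux/futex.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    #define SPIN_THRESHOLD 1024   /* arbitrary bound on busy-waiting */

    static void futex_wait(_Atomic uint32_t *addr, uint32_t val)
    {
        syscall(SYS_futex, addr, FUTEX_WAIT, val, NULL, NULL, 0);
    }

    static void futex_wake(_Atomic uint32_t *addr)
    {
        syscall(SYS_futex, addr, FUTEX_WAKE, 1, NULL, NULL, 0);
    }

    /* lock word: 0 = unlocked, 1 = locked */
    static void pv_style_lock(_Atomic uint32_t *lock)
    {
        for (;;) {
            /* Spin for a bounded number of attempts first, hoping the
             * holder is actually running on another (v)CPU. */
            for (int i = 0; i < SPIN_THRESHOLD; i++) {
                uint32_t expected = 0;
                if (atomic_compare_exchange_weak_explicit(lock, &expected, 1,
                        memory_order_acquire, memory_order_relaxed))
                    return;
            }
            /* Still contended: give up the CPU and sleep until the
             * holder wakes us (the "kick" over the event channel in the
             * Xen case).  If the lock was already released, FUTEX_WAIT
             * returns immediately and we simply retry. */
            futex_wait(lock, 1);
        }
    }

    static void pv_style_unlock(_Atomic uint32_t *lock)
    {
        atomic_store_explicit(lock, 0, memory_order_release);
        futex_wake(lock);   /* wake one waiter, if any */
    }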

Thank you very much for your suggestions and advice, Andrew! :-)

Best Regards,

Meng

-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

