
Re: [Xen-devel] Question about sharing spinlock_t among VMs in Xen



On 14/06/16 03:13, Meng Xu wrote:
> On Mon, Jun 13, 2016 at 6:54 PM, Andrew Cooper
> <andrew.cooper3@xxxxxxxxxx> wrote:
>> On 13/06/2016 18:43, Meng Xu wrote:
>>> Hi,
>>>
>>> I have a quick question about using the Linux spin_lock() in a Xen
>>> environment to protect some host-wide shared (memory) resource among
>>> VMs.
>>>
>>> *** The question is as follows ***
>>> Suppose I have two Linux VMs sharing the same spinlock_t lock (through
>>> shared memory) on the same host. Suppose we have one process in
>>> each VM. Each process uses the Linux function spin_lock(&lock) [1] to
>>> grab & release the lock.
>>> Will these two processes in the two VMs have a race on the shared lock?
>> "Race" is debatable.  (After all, the point of a lock is to have
>> serialise multiple accessors).  But yes, this will be the same lock.
>>
>> The underlying cache coherency fabric will perform atomic locked
>> operations on the same physical piece of RAM.
> The experiment we did was on a computer that is not NUMA.

Why do you think this makes any difference?  Unless you have a
uni-processor system from ages ago, cache coherency is being done in
hardware.
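
For what it's worth, a quick way to convince yourself that the coherency
fabric, not the NUMA topology, is what matters is to have each guest
hammer an atomic read-modify-write on a word in the shared page and check
that no increments are lost. A minimal sketch using the GCC/Clang atomic
builtins (the shared_counter word and how the page gets mapped are
placeholders, not anything Xen or Linux provides):

    #include <stdint.h>

    /* Both guests map the same page and call this in a loop.
     * "shared_counter" is a hypothetical word inside that page. */
    static inline void bump(volatile uint64_t *shared_counter)
    {
        /* Compiles to a LOCK-prefixed add on x86; the cache coherency
         * fabric serialises concurrent updates from both VMs, whether
         * or not the machine is NUMA. */
        __atomic_fetch_add(shared_counter, 1, __ATOMIC_SEQ_CST);
    }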

> So it should not be caused by a synchronisation issue in the hardware.

I do not understand what you are trying to say here.

>
>> The important question is whether the two difference VMs have an
>> identical idea of what a spinlock_t is.  If not, this will definitely fail.
> I see the key point here now. However, I'm not that sure whether the
> two VMs have an *identical idea* of what a spinlock_t is.

If you are not sure, then the answer is almost certainly no.

> In other words, how do we tell whether "two VMs have an identical idea
> of what a spinlock_t is"?

Are struct spinlock_t, and all the functions which modify it, identical
across all VMs trying to participate in the use of this shared-memory
spinlock?
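
To make that concrete: the safe approach is not to share Linux's
spinlock_t at all (its layout and algorithm change with kernel version
and configuration: ticket vs. queued locks, debug fields, and so on), but
to put a lock whose definition both sides compile identically into the
shared page. A minimal sketch using only the C11 atomics (this is not
something Linux or Xen provides, just an illustration):

    #include <stdatomic.h>
    #include <stdint.h>

    /* A layout-stable ticket lock for a shared page.  Both VMs must
     * compile exactly this definition; it deliberately avoids Linux's
     * spinlock_t, whose internals differ between kernel configs. */
    struct shared_ticket_lock {
        _Atomic uint32_t next;   /* next ticket to hand out */
        _Atomic uint32_t owner;  /* ticket currently being served */
    };

    static inline void shared_lock(struct shared_ticket_lock *l)
    {
        uint32_t ticket = atomic_fetch_add_explicit(&l->next, 1,
                                                    memory_order_relaxed);
        while (atomic_load_explicit(&l->owner,
                                    memory_order_acquire) != ticket)
            ;  /* spin; see the caveats about spinning guests below */
    }

    static inline void shared_unlock(struct shared_ticket_lock *l)
    {
        atomic_fetch_add_explicit(&l->owner, 1, memory_order_release);
    }

Even then, the scheduling problem discussed below still applies; a common
lock definition only removes the layout mismatch, not the spinning.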

>
> The current situation is as follows:
> Both VMs are using the same memory area for the spinlock_t variable.
> The spin_lock() in both VMs is operating on the same spinlock_t
> variable. So IMHO, the spinlock_t should be identical for these two
> VMs?
> Please correct me if I'm wrong. (I guess my understanding of the
> "identical idea of spinlock_t" is probably incorrect. :-( )
>
>>> My speculation is that there should be a race on the shared lock when
>>> the spin_lock() functions in the *two VMs* operate on the same lock.
>>>
>>> We did a quick experiment on this and found that one VM sometimes sees
>>> a soft lockup on the lock. But we want to make sure our
>>> understanding is correct.
>>>
>>> We are exploring whether we can use spin_lock() to protect the shared
>>> resources among VMs, instead of using the PV drivers. If the
>>> spin_lock() in Linux can provide host-wide atomicity (which would
>>> surprise me, though), that would be great. Otherwise, we would
>>> probably have to expose the spinlock in Xen to Linux?
>> What are you attempting to protect like this?
> For example, if two VMs are sharing a chunk of memory with both read
> and write permissions, a VM has to grab the lock before it can operate
> on the shared memory.
> If we want a VM to operate directly on the shared resource, instead of
> using the PV device model, we may need to use a spinlock to protect
> access to the shared resource. That's why we are looking at the
> spinlock.
>
>> Anything which a guest can spin on like this is a recipe for disaster,
>> as you observe, as the guest which holds the lock will get scheduled out
>> in favour of the guest attempting to take the lock.
> It is true in general. The reason we chose to let it spin is that
> some people in academia have proposed protocols for accessing a
> shared resource through a spinlock. In order to apply their theory, we
> may need to follow the system model they assumed. The theory does
> consider the situation where a guest/VCPU that is spinning on a lock is
> scheduled out, and it has to account for the extra delay caused by
> this situation. [OK, this is the reason we did it like this. But we
> are also considering whether we can do better in terms of overall
> system performance.]
>
> BTW, I agree with you that letting a guest spin like this could be a
> problem for overall system performance.
>
>> Alternatively, two
>> different guests may have different ideas of how to manage the memory
>> backing a spinlock_t.
> Just to confirm:
> Did you mean that different guests may use different policies to
> handle the same spinlock_t?
> This may mean that we need some special locking protocol, instead of
> the ticket lock, to handle the spin_lock?
>
> For example, a very simple and probably naive idea is to not let a
> guest be scheduled out before it releases the lock. I just want
> to use this simple example to make sure I understood the "alternative"
> idea here. :-)

A guest is not in control of when it gets descheduled, and you can't yank
a lock away while the guest is in a critical region.

If you want to proceed down this route, you will want to look at the
pv-spinlock implementation, where you block on an event channel while
waiting for a lock held by a different vcpu, which frees up execution
resources for the holder of the lock to complete.
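
Very roughly, the shape of that pattern is: spin for a bounded number of
iterations, and if the lock still hasn't been released, assume the holder
has been descheduled and block on an event channel until the unlock path
kicks you. The following is a sketch only, reusing the ticket lock from
the earlier snippet; block_on_evtchn() and notify_evtchn() are
placeholders for the real event-channel plumbing, and this is not the
actual pv-spinlock code:

    #define SPIN_THRESHOLD  1024

    /* Placeholders for the event-channel plumbing a real
     * implementation would need; not real Xen or Linux APIs. */
    extern void block_on_evtchn(uint32_t ticket);
    extern void notify_evtchn(uint32_t ticket);

    void pv_style_lock(struct shared_ticket_lock *l)
    {
        uint32_t ticket = atomic_fetch_add_explicit(&l->next, 1,
                                                    memory_order_relaxed);
        for (;;) {
            int spins;
            for (spins = 0; spins < SPIN_THRESHOLD; spins++) {
                if (atomic_load_explicit(&l->owner,
                                         memory_order_acquire) == ticket)
                    return;
            }
            /* The holder is probably descheduled: stop burning the
             * pcpu and sleep until the unlock path wakes us. */
            block_on_evtchn(ticket);
        }
    }

    void pv_style_unlock(struct shared_ticket_lock *l)
    {
        uint32_t next = atomic_fetch_add_explicit(&l->owner, 1,
                                                  memory_order_release) + 1;
        notify_evtchn(next);   /* wake whoever holds the next ticket */
    }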

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

