Re: [PATCH 12/12] evtchn: convert domain event lock to an r/w one
On Mon, Sep 28, 2020 at 01:02:43PM +0200, Jan Beulich wrote:
> Especially for the use in evtchn_move_pirqs() (called when moving a vCPU
> across pCPU-s) and the ones in EOI handling in PCI pass-through code,
> serializing perhaps an entire domain isn't helpful when no state (which
> isn't e.g. further protected by the per-channel lock) changes.
>
> Unfortunately this implies dropping of lock profiling for this lock,
> until r/w locks may get enabled for such functionality.
>
> While ->notify_vcpu_id is now meant to be consistently updated with the
> per-channel lock held for writing, an extension applies to ECS_PIRQ: The
> field is also guaranteed to not change with the per-domain event lock
> held. Therefore the unlink_pirq_port() call from evtchn_bind_vcpu() as
> well as the link_pirq_port() one from evtchn_bind_pirq() could in
> principle be moved out of the per-channel locked regions, but this
> further code churn didn't seem worth it.
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> RFC:
> * In evtchn_bind_vcpu() the question is whether limiting the use of
> write_lock() to just the ECS_PIRQ case is really worth it.
IMO I would just use write_lock() at the top of the function in
place of the current spin_lock(). The more fine-grained change could be
done as a follow-up patch if it turns out to be worth it. TBH, event
channels shouldn't change vCPU frequently enough for a more fine-grained
approach to matter much.
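IOW just a plain conversion of the existing critical section (rough
sketch only, not based on the actual patched code), i.e. at the top of
evtchn_bind_vcpu():

    write_lock(&d->event_lock);
    ...
    write_unlock(&d->event_lock);

placed just like the current spin_lock()/spin_unlock() pair.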
> * In flask_get_peer_sid() the question is whether we wouldn't better
> switch to using the per-channel lock.
>
> --- a/xen/arch/x86/hvm/vmsi.c
> +++ b/xen/arch/x86/hvm/vmsi.c
> @@ -465,7 +465,7 @@ int msixtbl_pt_register(struct domain *d
> int r = -EINVAL;
>
> ASSERT(pcidevs_locked());
> - ASSERT(spin_is_locked(&d->event_lock));
> + ASSERT(rw_is_write_locked(&d->event_lock));
FWIW, we could switch rw_is_write_locked to use
_is_write_locked_by_me (or introduce rw_is_write_locked_by_me, albeit
I think all users of rw_is_write_locked care about the lock being
taken by them).
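Untested sketch of the latter, assuming the existing
_is_write_locked_by_me() helper and the cnts field of the common rwlock
implementation can be reused as-is:

    #define rw_is_write_locked_by_me(l) \
        _is_write_locked_by_me(atomic_read(&(l)->cnts))

That would then also allow the ASSERT()s here to check that the lock is
held by the current CPU.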
> @@ -1098,7 +1108,7 @@ int evtchn_reset(struct domain *d, bool
> if ( d != current->domain && !d->controller_pause_count )
> return -EINVAL;
>
> - spin_lock(&d->event_lock);
> + read_lock(&d->event_lock);
>
> /*
> * If we are resuming, then start where we stopped. Otherwise, check
> @@ -1109,7 +1119,7 @@ int evtchn_reset(struct domain *d, bool
> if ( i > d->next_evtchn )
> d->next_evtchn = i;
Using the read lock to write to d->next_evtchn here...
>
> - spin_unlock(&d->event_lock);
> + read_unlock(&d->event_lock);
>
> if ( !i )
> return -EBUSY;
> @@ -1121,14 +1131,14 @@ int evtchn_reset(struct domain *d, bool
> /* NB: Choice of frequency is arbitrary. */
> if ( !(i & 0x3f) && hypercall_preempt_check() )
> {
> - spin_lock(&d->event_lock);
> + write_lock(&d->event_lock);
> d->next_evtchn = i;
... but the write lock here instead seems inconsistent.
> - spin_unlock(&d->event_lock);
> + write_unlock(&d->event_lock);
> return -ERESTART;
> }
> }
>
> - spin_lock(&d->event_lock);
> + write_lock(&d->event_lock);
>
> d->next_evtchn = 0;
>
> @@ -1557,7 +1568,7 @@ static void domain_dump_evtchn_info(stru
> "Polling vCPUs: {%*pbl}\n"
> " port [p/m/s]\n", d->domain_id, d->max_vcpus, d->poll_mask);
>
> - spin_lock(&d->event_lock);
> + read_lock(&d->event_lock);
Since this is a debug key, I would suggest using read_trylock in order
to prevent blocking if a CPU is stuck while holding the event_lock in
write mode.
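E.g. (sketch, assuming the keyhandler can simply bail out in that case):

    if ( !read_trylock(&d->event_lock) )
    {
        printk("event_lock busy, skipping event channel dump\n");
        return;
    }
    ...
    read_unlock(&d->event_lock);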
> --- a/xen/drivers/passthrough/io.c
> +++ b/xen/drivers/passthrough/io.c
> @@ -105,7 +105,7 @@ static void pt_pirq_softirq_reset(struct
> {
> struct domain *d = pirq_dpci->dom;
>
> - ASSERT(spin_is_locked(&d->event_lock));
> + ASSERT(rw_is_write_locked(&d->event_lock));
>
> switch ( cmpxchg(&pirq_dpci->state, 1 << STATE_SCHED, 0) )
> {
> @@ -162,7 +162,7 @@ static void pt_irq_time_out(void *data)
> const struct hvm_irq_dpci *dpci;
> const struct dev_intx_gsi_link *digl;
>
> - spin_lock(&irq_map->dom->event_lock);
> + read_lock(&irq_map->dom->event_lock);
Is it fine to use the lock in read mode here? The timeout handler
modifies the flags field by adding HVM_IRQ_DPCI_EOI_LATCH, and hence
should likely take the lock in write mode, since AFAICT that's the lock
that's supposed to protect changes to the flags field.
> static void hvm_dirq_assist(struct domain *d, struct hvm_pirq_dpci
> *pirq_dpci)
> @@ -893,7 +893,7 @@ static void hvm_dirq_assist(struct domai
> return;
> }
>
> - spin_lock(&d->event_lock);
> + read_lock(&d->event_lock);
It's also not clear to me that a read lock can be used here, since you
increment a couple of counters in hvm_pirq_dpci which don't seem to be
protected by any other lock.
> if ( test_and_clear_bool(pirq_dpci->masked) )
> {
> struct pirq *pirq = dpci_pirq(pirq_dpci);
> @@ -947,7 +947,7 @@ static void hvm_dirq_assist(struct domai
> }
>
> out:
> - spin_unlock(&d->event_lock);
> + read_unlock(&d->event_lock);
> }
>
> static void hvm_pirq_eoi(struct pirq *pirq,
> @@ -1012,7 +1012,7 @@ void hvm_dpci_eoi(struct domain *d, unsi
>
> if ( is_hardware_domain(d) )
> {
> - spin_lock(&d->event_lock);
> + read_lock(&d->event_lock);
> hvm_gsi_eoi(d, guest_gsi, ent);
> goto unlock;
> }
> @@ -1023,7 +1023,7 @@ void hvm_dpci_eoi(struct domain *d, unsi
> return;
> }
>
> - spin_lock(&d->event_lock);
> + read_lock(&d->event_lock);
> hvm_irq_dpci = domain_get_irq_dpci(d);
>
> if ( !hvm_irq_dpci )
> @@ -1033,7 +1033,7 @@ void hvm_dpci_eoi(struct domain *d, unsi
> __hvm_dpci_eoi(d, girq, ent);
__hvm_dpci_eoi() will call hvm_pirq_eoi(), which seems to require the
write lock, as it modifies pirq_dpci.
>
> unlock:
> - spin_unlock(&d->event_lock);
> + read_unlock(&d->event_lock);
> }
>
> /*
> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -883,7 +883,7 @@ static int pci_clean_dpci_irqs(struct do
> if ( !is_hvm_domain(d) )
> return 0;
>
> - spin_lock(&d->event_lock);
> + write_lock(&d->event_lock);
> hvm_irq_dpci = domain_get_irq_dpci(d);
> if ( hvm_irq_dpci != NULL )
> {
> @@ -901,14 +901,14 @@ static int pci_clean_dpci_irqs(struct do
> ret = pt_pirq_iterate(d, pci_clean_dpci_irq, NULL);
> if ( ret )
> {
> - spin_unlock(&d->event_lock);
> + read_unlock(&d->event_lock);
This should be a write_unlock AFAICT.
> return ret;
> }
>
> hvm_domain_irq(d)->dpci = NULL;
> free_hvm_irq_dpci(hvm_irq_dpci);
> }
> - spin_unlock(&d->event_lock);
> + write_unlock(&d->event_lock);
> return 0;
> }
>
> --- a/xen/drivers/passthrough/vtd/x86/hvm.c
> +++ b/xen/drivers/passthrough/vtd/x86/hvm.c
> @@ -54,7 +54,7 @@ void hvm_dpci_isairq_eoi(struct domain *
> if ( !is_iommu_enabled(d) )
> return;
>
> - spin_lock(&d->event_lock);
> + read_lock(&d->event_lock);
I think this also needs to be a write lock, as you modify pirq_dpci
bits in _hvm_dpci_isairq_eoi.
>
> dpci = domain_get_irq_dpci(d);
>
> @@ -63,5 +63,5 @@ void hvm_dpci_isairq_eoi(struct domain *
> /* Multiple mirq may be mapped to one isa irq */
> pt_pirq_iterate(d, _hvm_dpci_isairq_eoi, (void *)(long)isairq);
> }
> - spin_unlock(&d->event_lock);
> + read_unlock(&d->event_lock);
> }
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -373,7 +373,7 @@ struct domain
> unsigned int xen_evtchns;
> /* Port to resume from in evtchn_reset(), when in a continuation. */
> unsigned int next_evtchn;
> - spinlock_t event_lock;
> + rwlock_t event_lock;
It would be nice to add a comment describing which fields event_lock
protects. It's a rather generic lock name that I think has been abused
a bit.
Not that it needs to be done in this patch.
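Something like the below would already help (rough wording only, to be
refined while auditing the existing users):

    /*
     * Protects per-domain event channel state not covered by the
     * per-channel lock, the PIRQ <-> event channel bindings, and the
     * passthrough (hvm_irq_dpci) state hanging off the domain.
     */
    rwlock_t event_lock;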
Thanks, Roger.