|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2] x86/vpt: Do not take pt_migrate rwlock in some cases
On 29.03.2021 12:40, Roger Pau Monné wrote:
> On Mon, Mar 29, 2021 at 11:56:56AM +0200, Jan Beulich wrote:
>> On 27.03.2021 02:51, Boris Ostrovsky wrote:
>>> Commit 8e76aef72820 ("x86/vpt: fix race when migrating timers between
>>> vCPUs") addressed XSA-336 by introducing a per-domain rwlock that was
>>> intended to protect periodic timer during VCPU migration. Since such
>>> migration is an infrequent event no performance impact was expected.
>>>
>>> Unfortunately this turned out not to be the case: on a fairly large
>>> guest (92 VCPUs) we've observed as much as 40% TPCC performance
>>> regression with some guest kernels. Further investigation pointed to
>>> pt_migrate read lock taken in pt_update_irq() as the largest contributor
>>> to this regression. With large number of VCPUs and large number of VMEXITs
>>> (from where pt_update_irq() is always called) the update of an atomic in
>>> read_lock() is thought to be the main cause.
>>>
>>> Stephen Brennan analyzed locking pattern and classified lock users as
>>> follows:
>>>
>>> 1. Functions which read (maybe write) all periodic_time instances
>>> attached to a particular vCPU. These are functions which use pt_vcpu_lock()
>>> after the commit, such as pt_restore_timer(), pt_save_timer(), etc.
>>> 2. Functions which want to modify a particular periodic_time object.
>>> These guys lock whichever vCPU the periodic_time is attached to, but
>>> since the vCPU could be modified without holding any lock, they are
>>> vulnerable to the bug. Functions in this group use pt_lock(), such as
>>> pt_timer_fn() or destroy_periodic_time().
>>> 3. Functions which not only want to modify the periodic_time, but also
>>> would like to modify the =vcpu= fields. These are create_periodic_time()
>>> or pt_adjust_vcpu(). They create the locking imbalance bug for group 2,
>>> but we can't simply hold 2 vcpu locks due to the deadlock risk.
>>>
>>> Roger Monné then pointed out that group 1 functions don't really need
>>> to hold the pt_migrate rwlock and that group 3 should be hardened by
>>> holding appropriate vcpu's tm_lock when adding or deleting a timer
>>> from its list.
>>
>> I'm struggling some with the latter aspect: Is this to mean there is
>> something wrong right now?
>
> There's nothing wrong right now AFAICT , because per-vcpu users (ie:
> type 1) also hold the rw lock in read mode when iterating over the
> per-vcpu list.
>
>> Or does "harden" really mean "just to be
>> on the safe side" here?
>
> If the per-domain rw lock is no longer read-locked by type 1 users
> then type 2 and type 3 users need to make sure the per-vcpu lock is
> taken before adding or removing items from the per-vcpu lists, or else
> a type 1 user could see list corruption as a result of modifications
> done by type 2 or 3 without holding the per-vcpu lock.
Ah, right. Boris, may I then ask to avoid the somewhat ambiguous word
"harden" here then, and instead make clear that the new locking is in
fact "balancing" removal of locks elsewhere?
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |