
Re: [PATCH] xen/smp: Speed up on_selected_cpus()


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
  • Date: Mon, 7 Feb 2022 17:06:41 +0000
  • Cc: Roger Pau Monne <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, "Juergen Gross" <jgross@xxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, "Julien Grall" <julien@xxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 07 Feb 2022 17:06:50 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: [PATCH] xen/smp: Speed up on_selected_cpus()

On 07/02/2022 08:11, Jan Beulich wrote:
> On 04.02.2022 21:31, Andrew Cooper wrote:
>> cpumask_weight() is a horribly expensive way to find if no bits are set, made
>> worse by the fact that the calculation is performed with the global call_lock
>> held.
>>
>> Switch to using cpumask_empty() instead, which will short circuit as soon as
>> it finds any set bit in the cpumask.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> May I suggest to drop "horribly"? How expensive one is compared to the other
> depends on the number of CPUs actually enumerated in the system.

In absolute terms perhaps, but they both scale as O(nr_cpus).  Hamming
weight has a far larger constant.
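
To illustrate the difference in constant, here is a minimal standalone sketch
in C.  It is not Xen's actual bitmap code; every sketch_* name is hypothetical,
and it assumes any bits beyond nbits in the last word are zero.  Both loops are
linear in the number of words, but the weight loop must do a population count
on every word, while the empty check does one compare per word and stops at the
first nonzero one.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: number of words needed to hold nbits bits. */
static size_t sketch_words(size_t nbits)
{
    return (nbits + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG;
}

/* Software population count: one loop iteration per set bit in the word. */
static unsigned int sketch_hweight(unsigned long w)
{
    unsigned int n = 0;

    for ( ; w; w &= w - 1 )
        n++;

    return n;
}

/* Weight: always visits every word and counts the bits in each one. */
static unsigned int sketch_weight(const unsigned long *map, size_t nbits)
{
    unsigned int total = 0;

    assert(map != NULL);

    for ( size_t i = 0; i < sketch_words(nbits); i++ )
        total += sketch_hweight(map[i]);

    return total;
}

/* Empty: one compare per word, returning as soon as a nonzero word is seen. */
static bool sketch_empty(const unsigned long *map, size_t nbits)
{
    assert(map != NULL);

    for ( size_t i = 0; i < sketch_words(nbits); i++ )
        if ( map[i] )
            return false;

    return true;
}
```

A caller that only wants to know whether the mask has any bits set can use
sketch_empty(map, nbits) in place of sketch_weight(map, nbits) == 0 and get the
same answer.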

>  (And of
> course I still have that conversion to POPCNT alternatives patching pending,
> where Roger did ask for some re-work in reply to v2, but where it has
> remained unclear whether investing time into that wouldn't be in vain,
> considering some of your replies on v1. This would have further shrunk the
> difference, without me meaning to say the change here isn't a good one.)

There is a perfectly clear and simple way forward.  It's the one which
doesn't fight the optimiser and actively regress the code generation in
the calling functions, and add an unreasonable quantity of technical debt
into the marginal paths.

I will ack a version where you're not adding complexity for negative gains.

~Andrew

 

