[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Recent upgrade of 4.13 -> 4.14 issue


  • To: Juergen Gross <JGross@xxxxxxxx>, "frederic.pierret@xxxxxxxxxxxx" <frederic.pierret@xxxxxxxxxxxx>, "George.Dunlap@xxxxxxxxxx" <George.Dunlap@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Dario Faggioli <dfaggioli@xxxxxxxx>
  • Date: Sat, 31 Oct 2020 02:34:32 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=O8GQhi0UJ3LHYdAnrkP/toqkgeVbB7a547rgTaiTSe4=; b=V3DtZL4IgV6tjefX4n9cnBKFXvbKHJfwTVYeGT4WVjjKQFck6/mplScKT1EhMWZHaagNUjssnLbegGJc+KlnyE3fnGbVRcwqnMgbjw4F52Hl58MQrP0uAVhjT3PhPtyGjWscve9+7q56XrGDYOSEuZCTthxdrjeHZr8iZZFSV2yElgC04xK8nofG7/hE/eLsJFNHyVouNGc0oGrwMNEW0Fv84tf8aRLXlrcIx0z+LrhEz6vzNCM2wjyzj6anYy122DadbVlfmEnorKzqQORtg0+wMzxhrGoBRGHX5H+PL740x//PdQotTaKdnJYTkQptcerTDNOqTM+RzFyqmICXJw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=ey/S6+iz7PKSfPM7M9hB9To23tgGQayqNm2SFauW4X8FWKbYHGLFY8QdnzkZrAuULvRKVeYWB8xeaDHctG46pwTx4I3x5qt4fqLdumXz7D3c5qs7QxPe+rzxGFg12FDyHrtghmethJR5iwL+E3lavHa3UCtoZpDXCoZ4JjKDH5Adn0C9ef4pO9vUNfBJLbmW45db1N+wMNDcjuFWkWcDL1M0a2BQMoajd1L7SjFYzNyxGoouGfKyVBFCwuptw/9vuQFqDp7cGQTkLFtP1tesLDYNf5UKJGbqYBpR4kPqqtgeC5+xxZ0xG7D/aAR4o6wtyKJ+5eGWH7rCd3d4FjrCZw==
  • Authentication-results: suse.com; dkim=none (message not signed) header.d=none;suse.com; dmarc=none action=none header.from=suse.com;
  • Cc: "marmarek@xxxxxxxxxxxxxxxxxxxxxx" <marmarek@xxxxxxxxxxxxxxxxxxxxxx>, "andrew.cooper3@xxxxxxxxxx" <andrew.cooper3@xxxxxxxxxx>
  • Delivery-date: Sat, 31 Oct 2020 02:34:58 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHWq6SD/uyafOnyXU2hV/eab0QUF6mq9TSAgADbCLP//87fAIAFZkwA
  • Thread-topic: Recent upgrade of 4.13 -> 4.14 issue

On Tue, 2020-10-27 at 17:06 +0100, Frédéric Pierret wrote:
> 
> Ok the server got frozen just few minutes after my mail and I got
> now:
> 'r': https://gist.github.com/fepitre/78541f555902275d906d627de2420571
>
From the scheduler point of view, things seems fine:

(XEN) sched_smt_power_savings: disabled
(XEN) NOW=770188952085
(XEN) Online Cpus: 0-15
(XEN) Cpupool 0:
(XEN) Cpus: 0-15
(XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
(XEN) Scheduler: SMP Credit Scheduler rev2 (credit2)
(XEN) Active queues: 2
(XEN)   default-weight     = 256
(XEN) Runqueue 0:
(XEN)   ncpus              = 8
(XEN)   cpus               = 0-7
(XEN)   max_weight         = 256
(XEN)   pick_bias          = 1
(XEN)   instload           = 7
(XEN)   aveload            = 2021119 (~770%)
(XEN)   idlers: 00000000,00000000
(XEN)   tickled: 00000000,00000000
(XEN)   fully idle cores: 00000000,00000000
(XEN) Runqueue 1:
(XEN)   ncpus              = 8
(XEN)   cpus               = 8-15
(XEN)   max_weight         = 256
(XEN)   pick_bias          = 9
(XEN)   instload           = 8
(XEN)   aveload            = 2097259 (~800%)
(XEN)   idlers: 00000000,00000000
(XEN)   tickled: 00000000,00000200
(XEN)   fully idle cores: 00000000,00000000

The system is pretty busy, but not in overload.

Below we see that CPU 3 is running the idle vCPU, but it's not marked
as neither idle nor tickled.

It may be running a tasklet (the one that dumps the debug key output, I
guess).

Credits are fine, I don't see any strange values that may indicate
anomalies or something.

All the CPUs are executing a vCPU, and there should be nothing that
prevent them making progresses.

There is one vCPU which apparetnly want to run at 100% in pretty much
all guests, and more than one in dom0.

And I think I saw some spin_lock() in the call stacks, in the partial
report of '*' debug-key?

Maybe they're stuck in the kernel, not in Xen? Thoughs ?


(XEN) Domain info:
(XEN)   Domain: 0 w 256 c 0 v 16
(XEN)     1: [0.0] flags=0 cpu=5 credit=10553147 [w=256] load=17122
(~6%)
(XEN)     2: [0.1] flags=0 cpu=4 credit=10570606 [w=256] load=13569
(~5%)
(XEN)     3: [0.2] flags=0 cpu=7 credit=10605188 [w=256] load=13465
(~5%)
(XEN)     4: [0.3] flags=2 cpu=11 credit=9998469 [w=256] load=262144
(~100%)
(XEN)     5: [0.4] flags=0 cpu=0 credit=10533686 [w=256] load=13619
(~5%)
(XEN)     6: [0.5] flags=a cpu=9 credit=1101 [w=256] load=0 (~0%)
(XEN)     7: [0.6] flags=2 cpu=2 credit=10621802 [w=256] load=13526
(~5%)
(XEN)     8: [0.7] flags=2 cpu=1 credit=10670607 [w=256] load=13453
(~5%)
(XEN)     9: [0.8] flags=2 cpu=7 credit=10649858 [w=256] load=13502
(~5%)
(XEN)    10: [0.9] flags=0 cpu=3 credit=10550566 [w=256] load=13477
(~5%)
(XEN)    11: [0.10] flags=2 cpu=4 credit=10644321 [w=256] load=13539
(~5%)
(XEN)    12: [0.11] flags=2 cpu=1 credit=10602374 [w=256] load=13471
(~5%)
(XEN)    13: [0.12] flags=0 cpu=6 credit=10617262 [w=256] load=13801
(~5%)
(XEN)    14: [0.13] flags=2 cpu=8 credit=9998664 [w=256] load=262144
(~100%)
(XEN)    15: [0.14] flags=0 cpu=3 credit=10603305 [w=256] load=17020
(~6%)
(XEN)    16: [0.15] flags=0 cpu=5 credit=10591312 [w=256] load=13523
(~5%)
(XEN)   Domain: 1 w 256 c 0 v 2
(XEN)    17: [1.0] flags=2 cpu=6 credit=4916769 [w=256] load=262144
(~100%)
(XEN)    18: [1.1] flags=0 cpu=13 credit=10500000 [w=256] load=0 (~0%)
(XEN)   Domain: 2 w 256 c 0 v 2
(XEN)    19: [2.0] flags=2 cpu=5 credit=4982064 [w=256] load=262144
(~100%)
(XEN)    20: [2.1] flags=0 cpu=14 credit=10500000 [w=256] load=0 (~0%)
(XEN)   Domain: 3 w 256 c 0 v 2
(XEN)    21: [3.0] flags=2 cpu=1 credit=5200781 [w=256] load=262144
(~100%)
(XEN)    22: [3.1] flags=0 cpu=5 credit=10500000 [w=256] load=0 (~0%)
(XEN)   Domain: 4 w 256 c 0 v 2
(XEN)    23: [4.0] flags=12 cpu=0 credit=5395149 [w=256] load=262144
(~100%)
(XEN)    24: [4.1] flags=0 cpu=8 credit=10500000 [w=256] load=0 (~0%)
(XEN)   Domain: 5 w 256 c 0 v 2
(XEN)    25: [5.0] flags=2 cpu=2 credit=5306461 [w=256] load=262144
(~100%)
(XEN)    26: [5.1] flags=0 cpu=15 credit=10500000 [w=256] load=0 (~0%)
(XEN)   Domain: 6 w 256 c 0 v 8
(XEN)    27: [6.0] flags=12 cpu=10 credit=7915602 [w=256] load=262144
(~100%)
(XEN)    28: [6.1] flags=0 cpu=10 credit=10500000 [w=256] load=0 (~0%)
(XEN)    29: [6.2] flags=0 cpu=15 credit=10500000 [w=256] load=0 (~0%)
(XEN)    30: [6.3] flags=0 cpu=8 credit=10500000 [w=256] load=0 (~0%)
(XEN)    31: [6.4] flags=0 cpu=0 credit=10500000 [w=256] load=0 (~0%)
(XEN)    32: [6.5] flags=0 cpu=6 credit=10500000 [w=256] load=0 (~0%)
(XEN)    33: [6.6] flags=0 cpu=11 credit=10500000 [w=256] load=0 (~0%)
(XEN)    34: [6.7] flags=0 cpu=9 credit=10500000 [w=256] load=0 (~0%)
(XEN)   Domain: 7 w 256 c 0 v 2
(XEN)    35: [7.0] flags=2 cpu=4 credit=5297013 [w=256] load=262144
(~100%)
(XEN)    36: [7.1] flags=0 cpu=11 credit=10500000 [w=256] load=0 (~0%)
(XEN)   Domain: 8 w 256 c 0 v 4
(XEN)    37: [8.0] flags=0 cpu=14 credit=10500000 [w=256] load=0 (~0%)
(XEN)    38: [8.1] flags=2 cpu=7 credit=5240630 [w=256] load=262144
(~100%)
(XEN)    39: [8.2] flags=0 cpu=13 credit=10500000 [w=256] load=0 (~0%)
(XEN)    40: [8.3] flags=0 cpu=8 credit=10500000 [w=256] load=0 (~0%)
(XEN)   Domain: 9 w 256 c 0 v 2
(XEN)    41: [9.0] flags=0 cpu=0 credit=10500000 [w=256] load=0 (~0%)
(XEN)    42: [9.1] flags=12 cpu=13 credit=7910266 [w=256] load=262144
(~100%)
(XEN)   Domain: 10 w 256 c 0 v 2
(XEN)    43: [10.0] flags=12 cpu=12 credit=8045458 [w=256] load=262144
(~100%)
(XEN)    44: [10.1] flags=0 cpu=8 credit=10500000 [w=256] load=0 (~0%)
(XEN)   Domain: 11 w 256 c 0 v 2
(XEN)    45: [11.0] flags=12 cpu=14 credit=7575284 [w=256] load=262144
(~100%)
(XEN)    46: [11.1] flags=0 cpu=12 credit=10500000 [w=256] load=0
(~0%)
(XEN)   Domain: 12 w 256 c 0 v 2
(XEN)    47: [12.0] flags=2 cpu=15 credit=8014099 [w=256] load=262144
(~100%)
(XEN)    48: [12.1] flags=0 cpu=6 credit=10500000 [w=256] load=0 (~0%)
(XEN)   Domain: 13 w 256 c 0 v 2
(XEN)    49: [13.0] flags=0 cpu=7 credit=10500000 [w=256] load=0 (~0%)
(XEN)    50: [13.1] flags=0 cpu=15 credit=10500000 [w=256] load=0
(~0%)
(XEN) Runqueue 0:
(XEN) CPU[00] runq=0, sibling={0}, core={0-7}
(XEN)   run: [4.0] flags=2 cpu=0 credit=5255200 [w=256] load=262144
(~100%)
(XEN) CPU[01] runq=0, sibling={1}, core={0-7}
(XEN)   run: [3.0] flags=2 cpu=1 credit=5057668 [w=256] load=262144
(~100%)
(XEN) CPU[02] runq=0, sibling={2}, core={0-7}
(XEN)   run: [5.0] flags=12 cpu=2 credit=5180785 [w=256] load=262144
(~100%)
(XEN) CPU[03] runq=0, sibling={3}, core={0-7}
(XEN) CPU[04] runq=0, sibling={4}, core={0-7}
(XEN)   run: [7.0] flags=2 cpu=4 credit=5215323 [w=256] load=262144
(~100%)
(XEN) CPU[05] runq=0, sibling={5}, core={0-7}
(XEN)   run: [2.0] flags=2 cpu=5 credit=4816142 [w=256] load=262144
(~100%)
(XEN) CPU[06] runq=0, sibling={6}, core={0-7}
(XEN)   run: [1.0] flags=2 cpu=6 credit=4755772 [w=256] load=262144
(~100%)
(XEN) CPU[07] runq=0, sibling={7}, core={0-7}
(XEN)   run: [8.1] flags=12 cpu=7 credit=5175342 [w=256] load=262144
(~100%)
(XEN) RUNQ:
(XEN) Runqueue 1:
(XEN) CPU[08] runq=1, sibling={8}, core={8-15}
(XEN)   run: [0.13] flags=2 cpu=8 credit=9998664 [w=256] load=262144
(~100%)
(XEN) CPU[09] runq=1, sibling={9}, core={8-15}
(XEN)   run: [0.5] flags=a cpu=9 credit=1101 [w=256] load=0 (~0%)
(XEN) CPU[10] runq=1, sibling={10}, core={8-15}
(XEN)   run: [6.0] flags=2 cpu=10 credit=7764532 [w=256] load=262144
(~100%)
(XEN) CPU[11] runq=1, sibling={11}, core={8-15}
(XEN)   run: [0.3] flags=2 cpu=11 credit=9998469 [w=256] load=262144
(~100%)
(XEN) CPU[12] runq=1, sibling={12}, core={8-15}
(XEN)   run: [10.0] flags=2 cpu=12 credit=7967846 [w=256] load=262144
(~100%)
(XEN) CPU[13] runq=1, sibling={13}, core={8-15}
(XEN)   run: [9.1] flags=12 cpu=13 credit=7832232 [w=256] load=262144
(~100%)
(XEN) CPU[14] runq=1, sibling={14}, core={8-15}
(XEN)   run: [11.0] flags=2 cpu=14 credit=7509378 [w=256] load=262144
(~100%)
(XEN) CPU[15] runq=1, sibling={15}, core={8-15}
(XEN)   run: [12.0] flags=2 cpu=15 credit=7971164 [w=256] load=262144
(~100%)
(XEN) RUNQ:
(XEN) CPUs info:
(XEN) CPU[00] current=d4v0, curr=d4v0, prev=NULL
(XEN) CPU[01] current=d3v0, curr=d3v0, prev=NULL
(XEN) CPU[02] current=d5v0, curr=d5v0, prev=NULL
(XEN) CPU[03] current=d[IDLE]v3, curr=d[IDLE]v3, prev=NULL
(XEN) CPU[04] current=d7v0, curr=d7v0, prev=NULL
(XEN) CPU[05] current=d2v0, curr=d2v0, prev=NULL
(XEN) CPU[06] current=d1v0, curr=d1v0, prev=NULL
(XEN) CPU[07] current=d8v1, curr=d8v1, prev=NULL
(XEN) CPU[08] current=d0v13, curr=d0v13, prev=NULL
(XEN) CPU[09] current=d0v5, curr=d0v5, prev=NULL
(XEN) CPU[10] current=d6v0, curr=d6v0, prev=NULL
(XEN) CPU[11] current=d0v3, curr=d0v3, prev=NULL
(XEN) CPU[12] current=d10v0, curr=d10v0, prev=NULL
(XEN) CPU[13] current=d9v1, curr=d9v1, prev=NULL
(XEN) CPU[14] current=d11v0, curr=d11v0, prev=NULL
(XEN) CPU[15] current=d12v0, curr=d12v0, prev=NULL

-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)

Attachment: signature.asc
Description: This is a digitally signed message part


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.