Re: MTRR init sequence in Xen
- To: Jürgen Groß <jgross@xxxxxxxx>
- From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
- Date: Thu, 22 Jan 2026 18:18:45 +0100
- Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
- Delivery-date: Thu, 22 Jan 2026 17:19:03 +0000
- List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
On Thu, Jan 22, 2026 at 04:56:24PM +0100, Jürgen Groß wrote:
> Just as a heads up: a hardware partner of SUSE has seen hard lockups
> of the Linux kernel during boot on a new machine. This machine has
> 8 NUMA nodes and 960 CPUs. The hang occurs in roughly 1.5% of the boot
> attempts in MTRR initialization of the APs.
Do you know why you get hard lockups? Is some watchdog triggering on
Linux? Otherwise I think it should just be slow, but ultimately
succeed?
> I have sent a small patch series to LKML which seems to fix the problem:
> https://lore.kernel.org/lkml/20260121141106.755458-1-jgross@xxxxxxxx/
>
> As Xen MTRR handling is taken from the Linux kernel, I guess the same
> problem could happen in Xen, too.
>
> As the hang always occurred while waiting for the lock that
> serializes MTRR initialization across the individual CPUs, my solution
> was to eliminate the lock, allowing all APs to init MTRRs in parallel.
>
> Maybe we want to do the same in Xen.
Hm, yes, I think Xen would be equally affected with regard to
contention on a lock while updating MTRRs. The MTRR initialization is
deferred until all APs are up, and serialized on the
set_atomicity_lock lock.
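
To illustrate the contention pattern, here is a rough userspace sketch,
not Xen or Linux code. Only the name set_atomicity_lock comes from the
discussion above; NR_CPUS, mtrr_state, mtrr_ap_init() and init_all_aps()
are hypothetical stand-ins, a pthread mutex replaces the spinlock, and a
plain array write replaces the MSR accesses. It assumes each CPU only
touches its own state, which is what would make dropping the lock safe.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 8 /* hypothetical; the reported machine has 960 CPUs */

/* Stand-in for set_atomicity_lock: one global lock shared by all CPUs. */
static pthread_mutex_t set_atomicity_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical per-CPU MTRR state; real code would write MSRs instead. */
static unsigned int mtrr_state[NR_CPUS];

struct cpu_arg {
    unsigned int cpu;
    bool use_lock;
};

/* One simulated AP performing its MTRR initialization. */
static void *mtrr_ap_init(void *data)
{
    const struct cpu_arg *arg = data;

    /* Serialized variant: every AP queues up on the same lock. */
    if (arg->use_lock)
        pthread_mutex_lock(&set_atomicity_lock);

    /* Assumption: the update only touches this CPU's own state. */
    mtrr_state[arg->cpu] = 1;

    if (arg->use_lock)
        pthread_mutex_unlock(&set_atomicity_lock);

    return NULL;
}

/* Run the init on all simulated APs; return how many completed it. */
static unsigned int init_all_aps(bool use_lock)
{
    pthread_t threads[NR_CPUS];
    struct cpu_arg args[NR_CPUS];
    unsigned int i, done = 0;

    for (i = 0; i < NR_CPUS; i++)
        mtrr_state[i] = 0;

    for (i = 0; i < NR_CPUS; i++) {
        int rc;

        args[i].cpu = i;
        args[i].use_lock = use_lock;
        rc = pthread_create(&threads[i], NULL, mtrr_ap_init, &args[i]);
        assert(rc == 0);
    }

    for (i = 0; i < NR_CPUS; i++)
        pthread_join(threads[i], NULL);

    for (i = 0; i < NR_CPUS; i++)
        done += mtrr_state[i];

    return done;
}
```

Calling init_all_aps(true) models the current serialized behaviour and
init_all_aps(false) the lock-free variant; both should report all
NR_CPUS as initialized, the difference being only how long APs wait.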
Regards, Roger.