
Re: MTRR init sequence in Xen


  • To: Jürgen Groß <jgross@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Thu, 22 Jan 2026 18:18:45 +0100
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Delivery-date: Thu, 22 Jan 2026 17:19:03 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu, Jan 22, 2026 at 04:56:24PM +0100, Jürgen Groß wrote:
> Just as a heads up: a hardware partner of SUSE has seen hard lockups
> of the Linux kernel during boot on a new machine. This machine has
> 8 NUMA nodes and 960 CPUs. The hang occurs in roughly 1.5% of the boot
> attempts in MTRR initialization of the APs.

Do you know why you get hard lockups?  Is some watchdog triggering on
Linux?  Otherwise I think it should just be slow, but ultimately
succeed?

> I have sent a small patch series to LKML which seems to fix the problem:
> https://lore.kernel.org/lkml/20260121141106.755458-1-jgross@xxxxxxxx/
> 
> As Xen MTRR handling is taken from the Linux kernel, I guess the same
> problem could happen in Xen, too.
> 
> As the hang always occurred while waiting for the lock, which is
> serializing the single CPUs doing MTRR initialization, my solution was
> to eliminate the lock, allowing all APs to init MTRRs in parallel.
> 
> Maybe we want to do the same in Xen.

Hm, yes, I think Xen would be equally affected with regard to
contending on a lock while updating MTRRs.  The MTRR initialization is
deferred until all APs are up, and is then serialized on
set_atomicity_lock.
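
As a rough illustration of why the serialization matters at this CPU
count, here is a minimal back-of-the-envelope model.  It is only a
sketch: the function names and the fixed per-CPU cost are assumptions
made for illustration, and it is not taken from the Xen or Linux
code.

```python
# Toy model of MTRR setup on many CPUs.  All names and the constant
# per-CPU cost are hypothetical; this is not Xen or Linux code.

def worst_case_wait(n_cpus: int, per_cpu_cost: float, serialized: bool) -> float:
    """Time the unluckiest CPU waits before starting its own MTRR setup.

    With a single lock serializing the work, the last CPU to get the
    lock waits for all the others.  Without the lock, nobody waits.
    """
    if n_cpus < 1:
        raise ValueError("need at least one CPU")
    if not serialized:
        return 0.0
    return (n_cpus - 1) * per_cpu_cost


def total_init_time(n_cpus: int, per_cpu_cost: float, serialized: bool) -> float:
    """Wall-clock time until every CPU has finished its MTRR setup."""
    return worst_case_wait(n_cpus, per_cpu_cost, serialized) + per_cpu_cost


if __name__ == "__main__":
    # 960 CPUs as in the report; the cost unit is arbitrary.
    print(total_init_time(960, 1.0, serialized=True))
    print(total_init_time(960, 1.0, serialized=False))
```

Under these assumptions the serialized case grows linearly with the
number of CPUs, while the parallel case stays at the cost of a single
CPU.  The model ignores lock handover overhead and any slowdown from
contention itself, so the real serialized figure would likely be
worse.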

Regards, Roger.



 

