[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] x86/msr: expose MSR_FAM10H_MMIO_CONF_BASE on AMD



On Fri, Feb 21, 2025 at 01:44:00PM +0100, Jan Beulich wrote:
> On 21.02.2025 13:04, Roger Pau Monne wrote:
> > The MMIO_CONF_BASE reports the base of the MCFG range on AMD systems.
> > Currently Linux is unconditionally attempting to read the MSR without a
> > safe MSR accessor, and since Xen doesn't allow access to it Linux reports
> > the following error:
> > 
> > unchecked MSR access error: RDMSR from 0xc0010058 at rIP: 
> > 0xffffffff8101d19f (xen_do_read_msr+0x7f/0xa0)
> > Call Trace:
> >  <TASK>
> >  ? ex_handler_msr+0x11e/0x150
> >  ? fixup_exception+0x81/0x300
> >  ? exc_general_protection+0x138/0x410
> >  ? asm_exc_general_protection+0x22/0x30
> >  ? xen_do_read_msr+0x7f/0xa0
> >  xen_read_msr+0x1e/0x30
> >  amd_get_mmconfig_range+0x2b/0x80
> >  quirk_amd_mmconfig_area+0x28/0x100
> >  ? quirk_system_pci_resources+0x2b/0x150
> >  pnp_fixup_device+0x39/0x50
> >  __pnp_add_device+0xf/0x150
> >  pnp_add_device+0x3d/0x100
> >  ? __pfx_pnpacpi_allocated_resource+0x10/0x10
> >  ? __pfx_pnpacpi_allocated_resource+0x10/0x10
> >  ? acpi_walk_resources+0xbb/0xd0
> >  pnpacpi_add_device_handler+0x1f9/0x280
> >  acpi_ns_get_device_callback+0x104/0x1c0
> >  ? _raw_spin_unlock_irqrestore+0x18/0x20
> >  ? down_timeout+0x3a/0x60
> >  ? _raw_spin_lock_irqsave+0x14/0x40
> >  acpi_ns_walk_namespace+0x1d0/0x260
> >  ? _raw_spin_unlock_irqrestore+0x18/0x20
> >  ? __pfx_acpi_ns_get_device_callback+0x10/0x10
> >  acpi_get_devices+0x8a/0xb0
> >  ? __pfx_pnpacpi_add_device_handler+0x10/0x10
> >  ? __pfx_pnpacpi_init+0x10/0x10
> >  pnpacpi_init+0x50/0x80
> >  do_one_initcall+0x46/0x2e0
> >  kernel_init_freeable+0x1da/0x2f0
> >  ? __pfx_kernel_init+0x10/0x10
> >  kernel_init+0x16/0x1b0
> >  ret_from_fork+0x30/0x50
> >  ? __pfx_kernel_init+0x10/0x10
> >  ret_from_fork_asm+0x1b/0x30
> >  </TASK>
> 
> Across all the halfway recent Linux versions I've never seen this. The MSR
> access therefore can't be entirely unconditional, I expect. Or is this new
> in 6.14, which I haven't tried yet?

Hm, no, the above report is from 6.6.  It is gated on the presence of
a device with PnP ID "PNP0c01", which triggers the execution of the
quirk_amd_mmconfig_area() function.  Such PnP ID seems to be "System
Board".  The system is currently busy, so I can't gather more
information about the device that triggers it.  The issue was
originally seem on a Dell system with AMD Naples (Zen) CPU (possibly
other CPUs also, that's the one I tested the fix on).

> > Fix by allowing access to the MSR on AMD systems, returning 0 for
> > unprivileged domains (MMIO configuration space disabled), and the native
> > value for the hardware domain.
> > 
> > The non hardware domain logic will need to be adjusted if in the future we
> > expose an MCFG region to such domains.
> > 
> > Write attempts to the MSR will still result in #GP for all domain types.
> > 
> > Fixes: 84e848fd7a16 ('x86/hvm: disallow access to unknown MSRs')
> > Fixes: 322ec7c89f66 ('x86/pv: disallow access to unknown MSRs')
> 
> Hmm, if we consider this a bug fix, then perhaps we'd need to go quite a bit
> farther with what MSRs we permit at least read access for. More generally in
> this event I'd wonder whether for any MSR that's in principle writable we
> shouldn't silently ignore same-value writes.

Yeah, I also wondered whether to silently ignore writes of the same
value.  I would initially leave as-is initially.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.