Re: Problems in PV dom0 on recent x86 hardware


  • To: Jason Andryuk <jason.andryuk@xxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>
  • From: Juergen Gross <jgross@xxxxxxxx>
  • Date: Fri, 12 Jul 2024 15:46:56 +0200
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Fri, 12 Jul 2024 13:47:14 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 12.07.24 12:35, Jürgen Groß wrote:
On 09.07.24 15:08, Jason Andryuk wrote:
On 2024-07-09 06:56, Jürgen Groß wrote:
On 09.07.24 09:01, Jan Beulich wrote:
On 09.07.2024 08:36, Jürgen Groß wrote:
On 09.07.24 08:24, Jan Beulich wrote:
On 08.07.2024 23:30, Jason Andryuk wrote:
   From the backtrace, it looks like the immediate case is just trying to
read a 4-byte version:

   >>>> [   44.575541]  ucsi_acpi_dsm+0x53/0x80
   >>>> [   44.575546]  ucsi_acpi_read+0x2e/0x60
   >>>> [   44.575550]  ucsi_register+0x24/0xa0
   >>>> [   44.575555]  ucsi_acpi_probe+0x162/0x1e3

int ucsi_register(struct ucsi *ucsi)
{
           int ret;

           ret = ucsi->ops->read(ucsi, UCSI_VERSION, &ucsi->version,
                                 sizeof(ucsi->version));

->read being ucsi_acpi_read()

However, the driver also appears to write to adjacent addresses.

There are also corresponding write functions in the driver, yes, but
ucsi_acpi_async_write() (used directly or indirectly) similarly calls
ucsi_acpi_dsm(), which wires through to acpi_evaluate_dsm(). That's
ACPI object evaluation, and without seeing the AML involved it isn't
obvious whether it might write to said memory region.

I guess an ACPI dump would help here?

Perhaps, yes.

It is available in the bug report:

https://bugzilla.opensuse.org/show_bug.cgi?id=1227301

After acpixtract & iasl:

$ grep -ir FEEC *
dsdt.dsl:   OperationRegion (ECMM, SystemMemory, 0xFEEC2000, 0x0100)
ssdt16.dsl: OperationRegion (SUSC, SystemMemory, 0xFEEC2100, 0x30)


From the DSDT:
     Scope (\_SB.PCI0.LPC0.EC0)
     {
         OperationRegion (ECMM, SystemMemory, 0xFEEC2000, 0x0100)
         Field (ECMM, AnyAcc, Lock, Preserve)
         {
             TWBT,   2048
         }

         Name (BTBF, Buffer (0x0100)
         {
              0x00                                             // .
         })
         Method (BTIF, 0, NotSerialized)
         {
             BTBF = TWBT /* \_SB_.PCI0.LPC0.EC0_.TWBT */
             Return (BTBF) /* \_SB_.PCI0.LPC0.EC0_.BTBF */
         }
     }

From SSDT16:
DefinitionBlock ("", "SSDT", 2, "LENOVO", "UsbCTabl", 0x00000001)
{
     External (_SB_.PCI0.LPC0.EC0_, DeviceObj)

     Scope (\_SB)
     {
         OperationRegion (SUSC, SystemMemory, 0xFEEC2100, 0x30)
         Field (SUSC, ByteAcc, Lock, Preserve)
         {


This embedded controller (?) seems to live at 0xfeec2xxx.
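
To make the layout concrete, here is a minimal sketch that works out which bytes the two OperationRegions cover and whether they share a page. The base addresses and lengths are copied from the DSDT and SSDT16 excerpts above; the struct and helper names (opregion, opregion_last, page_base, opregion_in_one_page) are made up for illustration and are not taken from Linux, Xen or ACPICA. A 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_4K 0x1000u   /* assumption: 4 KiB pages */

/* A SystemMemory OperationRegion: name, base address, length in bytes. */
struct opregion {
    const char *name;
    uint64_t base;
    uint64_t len;
};

/* Values copied from the disassembled DSDT and SSDT16 quoted above. */
static const struct opregion ecmm = { "ECMM", 0xFEEC2000u, 0x0100u };
static const struct opregion susc = { "SUSC", 0xFEEC2100u, 0x30u };

/* Address of the last byte covered by the region (inclusive). */
static uint64_t opregion_last(const struct opregion *r)
{
    assert(r->len > 0);
    return r->base + r->len - 1;
}

/* Base address of the 4 KiB page containing addr. */
static uint64_t page_base(uint64_t addr)
{
    return addr & ~(uint64_t)(PAGE_SIZE_4K - 1);
}

/* True if the whole region lies within a single 4 KiB page. */
static bool opregion_in_one_page(const struct opregion *r)
{
    return page_base(r->base) == page_base(opregion_last(r));
}
```

With these values ECMM spans 0xFEEC2000-0xFEEC20FF and SUSC spans 0xFEEC2100-0xFEEC212F, so both regions sit back to back in the single page at 0xFEEC2000.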

What is the takeaway from that?

Is this a firmware bug (if yes, pointers to a specification saying that
this is an illegal configuration would be nice), or do we need a way to
map this page from dom0?

I've found the following in the AMD IOMMU spec [1]:

  Received DMA requests without PASID in the 0xFEEx_xxxx address range are
  treated as MSI interrupts and are processed using interrupt remapping rather
  than address translation.

To me this sounds as if there wouldn't be a major risk letting dom0 map
physical addresses in this area, as long as "normal" I/Os to this area would
result in DMA requests with a PASID. OTOH I'm not familiar with Xen IOMMU
handling, so I might be completely wrong.
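
For illustration only, the address check implied by the quoted spec text could be sketched as below. It reads "0xFEEx_xxxx" as the 1 MiB window 0xFEE00000-0xFEEFFFFF, which is my interpretation of the notation; the macro and function names are hypothetical and do not correspond to any existing Xen or Linux helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed reading of "0xFEEx_xxxx": every address whose bits 31:20 are 0xFEE. */
#define FEEX_WINDOW_FIRST 0xFEE00000u
#define FEEX_WINDOW_LAST  0xFEEFFFFFu

static_assert(FEEX_WINDOW_LAST - FEEX_WINDOW_FIRST + 1 == 0x100000u,
              "the window is exactly 1 MiB");

/* True if the physical address falls inside the 0xFEEx_xxxx window. */
static bool in_feex_window(uint64_t addr)
{
    return (addr >> 20) == (FEEX_WINDOW_FIRST >> 20);
}
```

Both OperationRegion bases from the ACPI tables, 0xFEEC2000 and 0xFEEC2100, satisfy this check, which is why the page in question overlaps the range the spec describes.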

Another question would be whether a device having resources in this area can
even work through an IOMMU.


Juergen

[1]: https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/specifications/48882_IOMMU.pdf
