
Re: [PATCH 00/20] Add SMMUv3 Stage 1 Support for XEN guests


  • To: Julien Grall <julien.grall.oss@xxxxxxxxx>
  • From: Milan Djokic <milan_djokic@xxxxxxxx>
  • Date: Wed, 13 Aug 2025 12:04:57 +0200
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Rahul Singh <rahul.singh@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Nick Rosbrook <enr0n@xxxxxxxxxx>, George Dunlap <gwd@xxxxxxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Delivery-date: Wed, 13 Aug 2025 10:05:23 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 8/7/25 19:58, Julien Grall wrote:
Hi Milan,

On Thu, 7 Aug 2025 at 17:55, Milan Djokic <milan_djokic@xxxxxxxx> wrote:

    This patch series represents a rebase of an older patch series
    implemented and submitted by Rahul Singh as an RFC:
    https://patchwork.kernel.org/project/xen-devel/cover/cover.1669888522.git.rahul.singh@xxxxxxx/
    Original patch series content is aligned with the latest xen
    structure in terms of common/arch-specific code structuring.
    Some minor bugfixes are also applied:
    - Sanity checks / error handling
    - Non-pci devices support for emulated iommu



    Overall description of stage-1 support is available in the original
    patch series cover letter. Original commits structure with detailed
    explanation for each commit
    functionality is maintained.


I am a bit surprised not much has changed. Last time we asked for a document to explain the overall design of the vSMMU, including some details on the security posture. I can’t remember if this was ever posted.

If not, then you need to start with that. Otherwise, it is going to be pretty difficult to review this series.

Cheers,

Hello Julien,

We have prepared a design document, and it will be part of the updated patch series (added in docs/design). I'll also extend the cover letter with details on the implementation structure to make review easier. The content of the design document, as it will be provided in the updated patch series, follows:

Design Proposal: Add SMMUv3 Stage-1 Support for XEN Guests
==========================================================

Author: Milan Djokic <milan_djokic@xxxxxxxx>
Date:   2025-08-07
Status: Draft

Introduction
------------

The SMMUv3 supports two stages of translation. Each stage of translation
can be independently enabled. An incoming address is logically
translated from VA to IPA in stage 1, then the IPA is input to stage 2
which translates the IPA to the output PA. Stage 1 translation support
is required to provide isolation between different devices within the OS.
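
As an illustration of the two-stage flow described above, the following
is a minimal sketch in C. It models each stage as a flat array of
page-granular mappings, which is an assumption made for brevity (real
SMMUv3 hardware walks multi-level page tables). All `toy_*` names are
hypothetical and do not exist in Xen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_MASK  ((UINT64_C(1) << TOY_PAGE_SHIFT) - 1)

/* One page-granular mapping: input page number -> output page number. */
struct toy_map {
    uint64_t in_pfn;
    uint64_t out_pfn;
};

/* Translate through a single stage; returns false on a translation fault. */
static bool toy_stage_lookup(const struct toy_map *tbl, size_t n,
                             uint64_t in, uint64_t *out)
{
    assert(tbl != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (tbl[i].in_pfn == (in >> TOY_PAGE_SHIFT)) {
            /* Keep the page offset, swap the page number. */
            *out = (tbl[i].out_pfn << TOY_PAGE_SHIFT) | (in & TOY_PAGE_MASK);
            return true;
        }
    }
    return false;
}

/*
 * Nested translation: VA -> IPA through stage 1 (owned by the guest),
 * then IPA -> PA through stage 2 (owned by the hypervisor).
 */
static bool toy_nested_translate(const struct toy_map *s1, size_t n1,
                                 const struct toy_map *s2, size_t n2,
                                 uint64_t va, uint64_t *pa)
{
    uint64_t ipa;

    if (!toy_stage_lookup(s1, n1, va, &ipa))
        return false;                         /* stage-1 fault */
    return toy_stage_lookup(s2, n2, ipa, pa); /* false means stage-2 fault */
}
```

The point of the sketch is that a stage-1 mapping chosen by the guest can
only ever produce an IPA, and that IPA still has to pass through the
stage-2 table before it reaches memory.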

Xen already supports Stage 2 translation but there is no support for
Stage 1 translation. This design proposal outlines the introduction of
Stage-1 SMMUv3 support in Xen for ARM guests.

Motivation
----------

ARM systems utilizing SMMUv3 require Stage-1 address translation to
ensure correct and secure DMA behavior inside guests.

This feature enables:
- Stage-1 translation in guest domains
- Safe device passthrough under secure memory translation

Design Overview
---------------

These changes provide emulated SMMUv3 support:

- SMMUv3 Stage-1 Translation: stage-1 and nested translation support in
  SMMUv3 driver
- vIOMMU Abstraction: virtual IOMMU framework for guest Stage-1 handling
- Register/Command Emulation: SMMUv3 register emulation and command
  queue handling
- Device Tree Extensions: adds iommus and virtual SMMUv3 nodes to
  device trees for dom0 and dom0less scenarios
- Runtime Configuration: introduces a 'viommu' boot parameter for
  dynamic enablement

Security Considerations
-----------------------

viommu security benefits:
- Stage-1 translation ensures guest devices cannot perform unauthorized
  DMA
- Emulated SMMUv3 for domains removes dependency on host hardware while
  maintaining isolation

Observations and Potential Risks
--------------------------------

1. Observation:
Support for Stage-1 translation introduces new data structures
(s1_cfg and s2_cfg) and logic to write both Stage-1 and Stage-2 entries
in the Stream Table Entry (STE), including an abort field for partial
config states.

Risk:
A partially applied Stage-1 configuration might leave guest DMA
mappings in an inconsistent state, enabling unauthorized access or
cross-domain interference.

Mitigation (Handled by design):
Both s1_cfg and s2_cfg are written atomically. The abort field ensures
Stage-1 config is only used when fully applied. Incomplete configs are
ignored by the hypervisor.
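
A minimal sketch of how the abort field could gate the STE
configuration, assuming that a guest stream always needs a valid
stage-2 and that stage-1 is only honoured once it is complete. The
`STE.Config` encodings are the ones defined by the SMMUv3 architecture;
the structures and the `toy_*` names are hypothetical and are not taken
from the patch series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* STE.Config encodings from the SMMUv3 architecture. */
enum toy_ste_config {
    TOY_STE_CFG_ABORT  = 0x0, /* terminate all transactions */
    TOY_STE_CFG_BYPASS = 0x4,
    TOY_STE_CFG_S1     = 0x5,
    TOY_STE_CFG_S2     = 0x6,
    TOY_STE_CFG_NESTED = 0x7,
};

/* Hypothetical per-stream state; field names are illustrative only. */
struct toy_s1_cfg { uint64_t ctx_desc_ipa; };
struct toy_s2_cfg { uint16_t vmid; uint64_t vttbr; };

struct toy_stream {
    const struct toy_s1_cfg *s1_cfg; /* NULL until the guest config is complete */
    const struct toy_s2_cfg *s2_cfg; /* NULL if there is no stage-2 for the device */
    bool abort;                      /* set while stage-1 is being (re)configured */
};

/* Decide which Config value would be written into the STE. */
static enum toy_ste_config toy_ste_pick_config(const struct toy_stream *s)
{
    assert(s != NULL);

    /* Assumption: a guest stream is never allowed to run without stage-2. */
    if (s->abort || s->s2_cfg == NULL)
        return TOY_STE_CFG_ABORT;
    if (s->s1_cfg != NULL)
        return TOY_STE_CFG_NESTED;
    return TOY_STE_CFG_S2;
}
```

With this shape, a half-written stage-1 configuration can never be
observed by the device: the stream either aborts, runs stage-2 only, or
runs fully nested.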

2. Observation:
Guests can now issue Stage-1 cache invalidations.

Risk:
Failure to propagate invalidations could leave stale mappings, enabling
data leakage or misrouting.

Mitigation (Handled by design):
Guest invalidations are forwarded to the hardware to ensure IOMMU
coherency.
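
One possible shape for the forwarding step, sketched in C. It assumes
that only stage-1 TLB invalidations are forwarded and that the VMID
field of the guest command is always overwritten with the VMID assigned
to the domain, so a guest cannot invalidate (or target) another
domain's entries. The opcode values and the VMID field position follow
the SMMUv3 command encoding as I understand it and should be checked
against the specification; the `toy_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CMDQ_OP_MASK          UINT64_C(0xff)
#define TOY_CMDQ_OP_TLBI_NH_ASID  0x11
#define TOY_CMDQ_OP_TLBI_NH_VA    0x12
#define TOY_CMDQ_TLBI_VMID_SHIFT  32
#define TOY_CMDQ_TLBI_VMID_MASK   (UINT64_C(0xffff) << TOY_CMDQ_TLBI_VMID_SHIFT)

/*
 * Build the host command for a guest stage-1 TLB invalidation.
 * Returns false if the guest command is not one we forward.
 */
static bool toy_forward_tlbi(const uint64_t guest_cmd[2], uint16_t domain_vmid,
                             uint64_t host_cmd[2])
{
    unsigned int op;

    assert(guest_cmd != NULL && host_cmd != NULL);

    op = (unsigned int)(guest_cmd[0] & TOY_CMDQ_OP_MASK);
    if (op != TOY_CMDQ_OP_TLBI_NH_ASID && op != TOY_CMDQ_OP_TLBI_NH_VA)
        return false;

    /* Discard whatever VMID the guest wrote and tag with the domain's VMID. */
    host_cmd[0] = (guest_cmd[0] & ~TOY_CMDQ_TLBI_VMID_MASK) |
                  ((uint64_t)domain_vmid << TOY_CMDQ_TLBI_VMID_SHIFT);
    host_cmd[1] = guest_cmd[1];
    return true;
}
```

In a real implementation the resulting command would then be queued on
the physical command queue and followed by a sync; that part is omitted
here.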

3. Observation:
The feature introduces large functional changes including the vIOMMU
framework, vsmmuv3 devices, command queues, event queues, domain
handling, and Device Tree modifications.

Risk:
Increased attack surface with risk of race conditions, malformed
commands, or misconfiguration via the device tree.

Mitigation:
- Improved sanity checks and error handling
- Feature is marked as Tech Preview and self-contained to reduce risk
  to unrelated code

4. Observation:
The implementation supports nested and standard translation modes,
using guest command queues (e.g. CMD_CFGI_STE) and events.

Risk:
Malicious commands could bypass validation and corrupt SMMUv3 state or
destabilize dom0.

Mitigation (Handled by design):
Command queues are validated, and only permitted configuration changes
are accepted. Handled in vsmmuv3 and cmdqueue logic.
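
The kind of validation meant here can be sketched as an opcode
allowlist combined with a StreamID ownership check. This is an
illustration under assumptions, not the vsmmuv3 code: the set of
accepted opcodes, the `toy_*` names, and the idea of a per-domain list
of owned StreamIDs are all hypothetical. The opcode values and the
StreamID position in the first command word follow the SMMUv3 command
encoding as I understand it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CMDQ_OP_CFGI_STE      0x03
#define TOY_CMDQ_OP_CFGI_CD       0x05
#define TOY_CMDQ_OP_TLBI_NH_ASID  0x11
#define TOY_CMDQ_OP_TLBI_NH_VA    0x12
#define TOY_CMDQ_OP_CMD_SYNC      0x46

/* True if 'sid' is one of the StreamIDs assigned to the domain. */
static bool toy_sid_owned(const uint32_t *sids, size_t n, uint32_t sid)
{
    for (size_t i = 0; i < n; i++)
        if (sids[i] == sid)
            return true;
    return false;
}

/* Decide whether a guest command may be acted upon at all. */
static bool toy_cmd_allowed(const uint64_t cmd[2],
                            const uint32_t *sids, size_t n)
{
    assert(cmd != NULL);

    switch (cmd[0] & 0xff) {
    case TOY_CMDQ_OP_CFGI_STE:
    case TOY_CMDQ_OP_CFGI_CD:
        /* Configuration commands carry a StreamID in bits [63:32]. */
        return toy_sid_owned(sids, n, (uint32_t)(cmd[0] >> 32));
    case TOY_CMDQ_OP_TLBI_NH_ASID:
    case TOY_CMDQ_OP_TLBI_NH_VA:
    case TOY_CMDQ_OP_CMD_SYNC:
        return true;
    default:
        /* Anything not explicitly listed is rejected. */
        return false;
    }
}
```

Rejecting by default keeps the emulation safe when a guest issues a
command that the emulation does not know about.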

5. Observation:
Device Tree changes inject iommus and vsmmuv3 nodes via libxl.

Risk:
Malicious or incorrect DT fragments could result in wrong device
assignments or hardware access.

Mitigation:
Only vetted and sanitized DT fragments are allowed. libxl limits what
guests can inject.

6. Observation:
The feature is enabled per-guest via the viommu setting.

Risk:
Guests without viommu may behave differently, potentially causing
confusion, privilege drift, or accidental exposure.

Mitigation:
Ensure downgrade paths are safe. Perform isolation audits in
multi-guest environments to ensure correct behavior.

Performance Impact
------------------

Hardware-managed translations are expected to have minimal overhead.
Emulated vIOMMU may introduce some latency during initialization or
event processing.

Testing
-------

- QEMU-based testing for Stage-1 and nested translation
- Hardware testing on Renesas SMMUv3-enabled ARM systems
- Unit tests for translation accuracy (not yet implemented)

Migration and Compatibility
---------------------------

This feature is optional and disabled by default (viommu="") to ensure
backward compatibility.

References
----------

- Original implementation by Rahul Singh:
https://patchwork.kernel.org/project/xen-devel/cover/cover.1669888522.git.rahul.singh@xxxxxxx/
- ARM SMMUv3 architecture documentation
- Existing vIOMMU code in Xen


BR,
Milan





 

