
Re: [PATCH] x86/HVM: restrict use of pinned cache attributes as well as associated flushing


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Fri, 16 May 2025 09:55:46 +0200
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Fri, 16 May 2025 07:55:58 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 16.05.2025 09:43, Roger Pau Monné wrote:
> On Fri, May 16, 2025 at 08:54:52AM +0200, Jan Beulich wrote:
>> On 15.05.2025 12:04, Roger Pau Monné wrote:
>>> On Wed, Mar 22, 2023 at 07:50:09AM +0100, Jan Beulich wrote:
>>>> We don't permit use of uncachable memory types elsewhere unless a domain
>>>> meets certain criteria. Enforce this also during registration of pinned
>>>> cache attribute ranges.
>>>>
>>>> Furthermore restrict cache flushing to just uncachable range registration.
>>>> While there, also
>>>> - take CPU self-snoop as well as IOMMU snoop into account (albeit the
>>>>   latter still is a global property rather than a per-domain one),
>>>> - avoid flushes when the domain isn't running yet (which ought to be the
>>>>   common case).
>>>>
>>>> Reported-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>>>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>>>> ---
>>>> At the expense of a yet larger diff it would be possible to get away
>>>> without any "goto", by moving the whole "new entry" handling into the
>>>> switch(). Personally I'd prefer that, but the larger diff may be
>>>> unwelcome.
>>>>
>>>> I have to admit that I can't spot which part of epte_get_entry_emt() the
>>>> comment being deleted refers to. The function does use
>>>> hvm_get_mem_pinned_cacheattr(), yes, but there's nothing there that talks
>>>> about cache flushes (and avoiding them) in any way.
>>>>
>>>> Is it really sensible to add/remove ranges once the guest is already
>>>> running? (If it is, limiting the scope of the flush would be nice, but
>>>> that would require knowing the domain's dirtiness wrt the caches, which
>>>> we currently don't track.)
>>>>
>>>> This is kind of amending XSA-428.
>>>>
>>>> --- a/xen/arch/x86/hvm/mtrr.c
>>>> +++ b/xen/arch/x86/hvm/mtrr.c
>>>> @@ -589,6 +589,7 @@ int hvm_set_mem_pinned_cacheattr(struct
>>>>  {
>>>>      struct hvm_mem_pinned_cacheattr_range *range, *newr;
>>>>      unsigned int nr = 0;
>>>> +    bool flush = false;
>>>>      int rc = 1;
>>>>  
>>>>      if ( !is_hvm_domain(d) )
>>>> @@ -612,31 +613,35 @@ int hvm_set_mem_pinned_cacheattr(struct
>>>>  
>>>>                  type = range->type;
>>>>                  call_rcu(&range->rcu, free_pinned_cacheattr_entry);
>>>> -                p2m_memory_type_changed(d);
>>>>                  switch ( type )
>>>>                  {
>>>> -                case X86_MT_UCM:
>>>> +                case X86_MT_WB:
>>>> +                case X86_MT_WP:
>>>> +                case X86_MT_WT:
>>>>                      /*
>>>> -                     * For EPT we can also avoid the flush in this case;
>>>> -                     * see epte_get_entry_emt().
>>>> +                     * Flush since we don't know what the cachability is going
>>>> +                     * to be.
>>>>                       */
>>>> -                    if ( hap_enabled(d) && cpu_has_vmx )
>>>> -                case X86_MT_UC:
>>>> -                        break;
>>>> -                    /* fall through */
>>>> -                default:
>>>> -                    flush_all(FLUSH_CACHE);
>>>> +                    if ( is_iommu_enabled(d) || cache_flush_permitted(d) )
>>>> +                        flush = true;
>>>>                      break;
>>>>                  }
>>>> -                return 0;
>>>> +                rc = 0;
>>>> +                goto finish;
>>>>              }
>>>>          domain_unlock(d);
>>>>          return -ENOENT;
>>>>  
>>>>      case X86_MT_UCM:
>>>>      case X86_MT_UC:
>>>> -    case X86_MT_WB:
>>>>      case X86_MT_WC:
>>>> +        /* Flush since we don't know what the cachability was. */
>>>> +        if ( !is_iommu_enabled(d) && !cache_flush_permitted(d) )
>>>> +            return -EPERM;
>>>> +        flush = true;
>>>> +        break;
>>>> +
>>>> +    case X86_MT_WB:
>>>>      case X86_MT_WP:
>>>>      case X86_MT_WT:
>>>>          break;
>>>> @@ -689,8 +694,12 @@ int hvm_set_mem_pinned_cacheattr(struct
>>>>  
>>>>      xfree(newr);
>>>>  
>>>> + finish:
>>>>      p2m_memory_type_changed(d);
>>>> -    if ( type != X86_MT_WB )
>>>> +
>>>> +    if ( flush && d->creation_finished &&
>>>> +         (!boot_cpu_has(X86_FEATURE_XEN_SELFSNOOP) ||
>>>> +          (is_iommu_enabled(d) && !iommu_snoop)) )
>>>>          flush_all(FLUSH_CACHE);
>>>
>>> I think it would be better if we could add those checks to
>>> memory_type_changed() rather than open-coding them here, and just call
>>> memory_type_changed() then, which would also avoid the goto AFAICT.
>>
>> Hmm, with this last remark, what does "those checks" cover then?
> 
> I have a patch I was going to send today (it has gone through some
> overnight testing) that does:
> 
>     if ( cache_flush_permitted(d) &&
>          d->vcpu && d->vcpu[0] && p2m_memory_type_changed(d) &&
>          /*
>           * Do the p2m type-change, but skip the cache flush if the domain is
>           * not yet running.  The check for creation_finished must strictly be
>           * done after the call to p2m_memory_type_changed().
>           */
>          d->creation_finished &&
>          /*
>           * The cache flush should be done if either: CPU doesn't have
>           * self-snoop in which case there could be aliases left in the cache,
>           * or IOMMUs cannot force all DMA device accesses to be snooped.
>           */
>          (!boot_cpu_has(X86_FEATURE_XEN_SELFSNOOP) ||
>           (is_iommu_enabled(d) && !iommu_snoop)) )
>     {
>         flush_all(FLUSH_CACHE);
>     }
> 
> This is an attempt to limit the cache flushing done in
> memory_type_changed().
> 
>> I first
>> read it as meaning the conditions in just this if(), but the "goto" is
>> needed for a different reason.
> 
> Maybe I'm missing something, but couldn't
> hvm_set_mem_pinned_cacheattr() just call memory_type_changed() (with
> the proposed checks above added) and then return, instead of doing a
> goto?

As per later replies to your patch - yes, looks like that's possible.
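
Something along these lines, I suppose - strictly an untested sketch,
and assuming p2m_memory_type_changed() would be adjusted to return
whether any entry was actually re-typed (as its use in your proposed
condition chain implies):

    void memory_type_changed(struct domain *d)
    {
        /*
         * Sketch only: centralize the flush gating here, so callers need
         * neither a local "flush" tracking variable nor a goto.
         */
        if ( cache_flush_permitted(d) &&
             d->vcpu && d->vcpu[0] &&
             /* Do the p2m type-change; false would mean nothing changed. */
             p2m_memory_type_changed(d) &&
             /*
              * Skip the flush while the domain isn't running yet.  This
              * check must strictly follow the p2m type-change above.
              */
             d->creation_finished &&
             /*
              * Flush only if the CPU lacks self-snoop (aliases may be left
              * in the cache), or if an IOMMU is in use which cannot enforce
              * snooping of DMA accesses.
              */
             (!boot_cpu_has(X86_FEATURE_XEN_SELFSNOOP) ||
              (is_iommu_enabled(d) && !iommu_snoop)) )
            flush_all(FLUSH_CACHE);
    }

The tail of hvm_set_mem_pinned_cacheattr() could then shrink to

        xfree(newr);

        memory_type_changed(d);

        return rc;

with the removal path likewise calling memory_type_changed() and
returning 0 right away, doing away with both the local "flush" and the
goto.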

Jan
