
Re: [PATCH 2/4] VT-d / x86: re-arrange cache syncing


  • To: Andrew Cooper <amc96@xxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 2 Dec 2021 10:19:04 +0100
  • Cc: Paul Durrant <paul@xxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Kevin Tian <kevin.tian@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Thu, 02 Dec 2021 09:19:28 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 01.12.2021 15:39, Andrew Cooper wrote:
> On 01/12/2021 09:40, Jan Beulich wrote:
>> The actual function should always have lived in core x86 code; move it
>> there, replacing get_cache_line_size() by readily available (except very
>> early during boot; see the code comment) data.
>>
>> Drop the respective IOMMU hook, (re)introducing a respective boolean
>> instead. Replace one true and one almost open-coded instance of
>> iommu_sync_cache().
> 
> Coherency (or not) is a per-IOMMU property, not a global platform
> property.  iGPU IOMMUs are very different to those in the uncore, and
> there's no reason to presume that IOMMUs in the PCH would have the same
> properties as those in the uncore.
> 
> Given how expensive sync_cache() is, we cannot afford to be using it for
> coherent IOMMUs in a mixed system.
> 
> This wants to be a boolean in arch_iommu.

That's a valid consideration, but it may not be as easy as it seems on
the surface. It's certainly not something I could promise to find time
for soon, and it's definitely separate from the specific change here.

>> ---
>> Placing the function next to flush_area_local() exposes a curious
>> asymmetry between the SFENCE placements: sync_cache() has it after the
>> flush, while flush_area_local() has it before it. I think the latter one
>> is misplaced.
> 
> Wow this is a mess.
> 
> First and foremost, AMD state that on pre-CLFLUSHOPT parts, CLFLUSH is
> unordered with ~anything (including SFENCE), and needs bounding with
> MFENCE on both sides.  We definitely aren't doing this correctly right now.
> 
> 
> AMD explicitly states that SFENCE drains the store and WC buffers (i.e.
> make stuff instantaneously globally visible).  Intel does not, and
> merely guarantees ordering.
> 
> A leading SFENCE would only make sense if there were WC concerns, but
> both manuals say that the memory type doesn't matter, so I can't see a
> justification for it.
> 
> What does matter, from the IOMMU's point of view, is that the line has
> been written back (or evicted on pre-CLWB parts) before the IOTLB
> invalidation occurs.  The invalidation will be a write to a different
> address, which is why the trailing SFENCE is necessary, as CLFLUSHOPT
> isn't ordered with respect to unaliasing writes.

IOW for the purposes of this change all is correct, and everything else
will require taking care of separately.

> On a more minor note, both Intel and AMD say that CLFLUSH* are permitted
> to target an execute-only code segment, so we need a fix in
> hvmemul_cache_op()'s use of hvmemul_virtual_to_linear(...,
> hvm_access_read, ...) which will currently #GP in this case.

Hmm, this specific case would seem to require simply using hvm_access_none
(like hvmemul_tlb_op() already does), except for CLWB.

But then hvm_vcpu_virtual_to_linear() also doesn't appear to handle
hvm_access_insn_fetch correctly for data segments. Changing that would in
turn require first auditing all use sites, to make sure we don't break
anything.

I'll see about doing both.

>> --- a/xen/arch/x86/flushtlb.c
>> +++ b/xen/arch/x86/flushtlb.c
>> @@ -11,6 +11,7 @@
>>  #include <xen/sched.h>
>>  #include <xen/smp.h>
>>  #include <xen/softirq.h>
>> +#include <asm/cache.h>
>>  #include <asm/flushtlb.h>
>>  #include <asm/invpcid.h>
>>  #include <asm/nops.h>
>> @@ -265,6 +266,57 @@ unsigned int flush_area_local(const void
>>      return flags;
>>  }
>>  
>> +void sync_cache(const void *addr, unsigned int size)
> 
> Can we name this cache_writeback()?  sync is very generic, and it really
> doesn't want to be confused with cache_flush().

Oh, sure, can do. As long as you don't insist on also changing
iommu_sync_cache(): While purely mechanical, this would bloat the
patch by quite a bit.

Bottom line: This last item is the only change to the actual patch;
everything else will require further work yielding separate patches.

Jan
