
Re: [Xen-devel] [PATCH 3/4] xen/x86: Replace incorrect mandatory barriers with SMP barriers



>>> On 05.12.16 at 14:29, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 05/12/16 11:47, Jan Beulich wrote:
>>>>> On 05.12.16 at 11:05, <andrew.cooper3@xxxxxxxxxx> wrote:
>>> --- a/xen/arch/x86/acpi/cpu_idle.c
>>> +++ b/xen/arch/x86/acpi/cpu_idle.c
>>> @@ -391,9 +391,9 @@ void mwait_idle_with_hints(unsigned int eax, unsigned int ecx)
>>>  
>>>      if ( boot_cpu_has(X86_FEATURE_CLFLUSH_MONITOR) )
>>>      {
>>> -        mb();
>>> +        smp_mb();
>>>          clflush((void *)&mwait_wakeup(cpu));
>>> -        mb();
>>> +        smp_mb();
>>>      }
>> Both need to stay as they are afaict: In order for the clflush() to do
>> what we want we have to order it wrt earlier as well as later writes,
>> regardless of SMP-ness. Or wait - the SDM has changed in that
>> respect (and a footnote describes the earlier specified behavior now).
>> AMD, otoh, continues to require MFENCE for ordering purposes.
> 
> mb() == smp_mb().  They are both mfence instructions.

Of course. But smp_mb() would still be less correct from an
abstract perspective, as here we care only about the local CPU.
That said, ...

> However, if AMD specifically requires mfence, we should explicitly use
> that rather than relying on the implementation details of smp_mb().

... I'd be fine with this.
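
For illustration, a minimal sketch of what an explicit variant might
look like (mfence_for_clflush() is a hypothetical name, not an
existing Xen helper):

    /* Hypothetical helper making the AMD requirement explicit. */
    static inline void mfence_for_clflush(void)
    {
        asm volatile ( "mfence" ::: "memory" );
    }

    if ( boot_cpu_has(X86_FEATURE_CLFLUSH_MONITOR) )
    {
        mfence_for_clflush();
        clflush((void *)&mwait_wakeup(cpu));
        mfence_for_clflush();
    }

This would keep the required fence on AMD even if smp_mb() were ever
relaxed to a plain compiler barrier.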

>>> --- a/xen/arch/x86/cpu/mcheck/amd_nonfatal.c
>>> +++ b/xen/arch/x86/cpu/mcheck/amd_nonfatal.c
>>> @@ -175,7 +175,7 @@ static void mce_amd_work_fn(void *data)
>>>                     /* Counter enable */
>>>                     value |= (1ULL << 51);
>>>                     mca_wrmsr(MSR_IA32_MCx_MISC(4), value);
>>> -                   wmb();
>>> +                   smp_wmb();
>>>             }
>>>     }
>>>  
>>> @@ -240,7 +240,7 @@ void amd_nonfatal_mcheck_init(struct cpuinfo_x86 *c)
>>>                     value |= (1ULL << 51);
>>>                     wrmsrl(MSR_IA32_MCx_MISC(4), value);
>>>                     /* serialize */
>>> -                   wmb();
>>> +                   smp_wmb();
>>>                     printk(XENLOG_INFO "MCA: Use hw thresholding to adjust polling frequency\n");
>>>             }
>>>     }
>> These will need confirming by AMD engineers.
> 
> I was uncertain whether these were necessary at all, but as identified
> in the commit message, this is no functional change as Xen currently has
> rmb/wmb as plain barriers, not fence instructions.

And may hence be subtly broken, if this code was lifted from Linux?
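
For context, the distinction at issue is roughly the following (a
sketch only; these are not the exact Xen macro definitions, and
wmb_fence() is an illustrative name):

    /* Compiler-only barrier: prevents the compiler from reordering or
     * caching accesses across it, but emits no instruction. */
    #define barrier()     asm volatile ( "" ::: "memory" )

    /* Fence instruction: additionally constrains the CPU itself, which
     * matters e.g. for write-combining or non-temporal stores. */
    #define wmb_fence()   asm volatile ( "sfence" ::: "memory" )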

>>> @@ -433,7 +433,7 @@ mctelem_cookie_t mctelem_consume_oldest_begin(mctelem_class_t which)
>>>     }
>>>  
>>>     mctelem_processing_hold(tep);
>>> -   wmb();
>>> +   smp_wmb();
>>>     spin_unlock(&processing_lock);
>> Don't we imply unlocks to be write barriers?
> 
> They are, as an unlock is necessarily a write, combined with x86's
> ordering guarantees.
> 
> Then again, I am not sure how this would interact with TSX, so am not
> sure if we should assume or rely on such behaviour.

Isn't architectural state at the end of a transactional region
indistinguishable as to whether TSX was actually used or the
abort path taken (assuming the two code paths don't differ
in their actions)?
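
For reference, the pattern under discussion boils down to something
like this (field and flag names are illustrative, not the real
mctelem ones):

    tep->mcte_flags |= MCTE_F_PROCESSING;  /* ordinary store            */
    smp_wmb();                             /* compiler barrier on x86   */
    spin_unlock(&processing_lock);         /* the unlock's own store
                                            * acts as a release on TSO  */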

>>> --- a/xen/arch/x86/io_apic.c
>>> +++ b/xen/arch/x86/io_apic.c
>>> @@ -1486,7 +1486,7 @@ static int __init timer_irq_works(void)
>>>      unsigned long t1, flags;
>>>  
>>>      t1 = pit0_ticks;
>>> -    mb();
>>> +    smp_mb();
>>>  
>>>      local_save_flags(flags);
>>>      local_irq_enable();
>>> @@ -1501,7 +1501,7 @@ static int __init timer_irq_works(void)
>>>       * might have cached one ExtINT interrupt.  Finally, at
>>>       * least one tick may be lost due to delays.
>>>       */
>>> -    mb();
>>> +    smp_mb();
>>>      if (pit0_ticks - t1 > 4)
>>>          return 1;
>> I think these two can be barrier().
> 
> These were originally in the previous patch, but I wasn't so certain and
> moved them back here.
> 
> Thinking about it further, pit0_ticks is only ever updated by the
> current cpu, so as long as the compiler doesn't elide the read, it
> should be fine.

That was exactly my thinking when giving the comment.
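
A sketch of what the barrier() variant would look like (same
structure as the quoted hunks, with the intervening code elided):

    t1 = pit0_ticks;
    barrier();          /* stop the compiler caching pit0_ticks    */

    local_save_flags(flags);
    local_irq_enable();
    /* ... wait for the PIT to tick ... */
    barrier();          /* force a fresh read of pit0_ticks below   */
    if (pit0_ticks - t1 > 4)
        return 1;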

>>> @@ -124,7 +124,7 @@ static void synchronize_tsc_master(unsigned int slave)
>>>      for ( i = 1; i <= 5; i++ )
>>>      {
>>>          tsc_value = rdtsc_ordered();
>>> -        wmb();
>>> +        smp_wmb();
>>>          atomic_inc(&tsc_count);
>> Same question as above wrt the following LOCKed instruction.
> 
> I'm not sure the locked instruction is relevant here.  C's
> ordering model is sufficient to make this correct.

I don't follow - the two involved variables are distinct, so I don't
see how C's ordering model helps here at all. We need (at the
machine level) the write to tsc_value to precede the increment
of tsc_count, and I don't think C alone guarantees any such
ordering.
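
In outline, the producer/consumer pairing that needs machine-level
ordering looks roughly like this (simplified; the loop condition and
variable names are illustrative):

    /* Master (producer), simplified: */
    tsc_value = rdtsc_ordered();
    smp_wmb();                  /* tsc_value must be visible ...        */
    atomic_inc(&tsc_count);     /* ... before the counter the slave polls */

    /* Slave (consumer), simplified: */
    while ( atomic_read(&tsc_count) != i )
        cpu_relax();
    smp_rmb();                  /* pairs with the master's smp_wmb()    */
    slave_tsc = tsc_value;      /* 'slave_tsc' is an illustrative name  */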

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

