
Re: [Xen-devel] [PATCH v4 0/7] unsafe big.LITTLE support



On Thu, Mar 08, 2018 at 03:13:50PM +0000, Julien Grall wrote:
>Hi,
>
>On 08/03/18 12:43, Peng Fan wrote:
>>>>>>I am not sure whether this issue causes DomU big/LITTLE to not work.
>>>>>
>>>>>Well, I would recommend speaking with NXP about whether this erratum
>>>>>affects TLB flushes for the hypervisor page-tables or the stage-2 page-tables.
>>>>
>>>>I tried the following, but it did not help. I am not sure my patch is
>>>>correct. I think it affects the stage-2 TLB.
>>>>
>>>>--- a/xen/include/asm-arm/arm64/flushtlb.h
>>>>+++ b/xen/include/asm-arm/arm64/flushtlb.h
>>>>@@ -6,7 +6,7 @@ static inline void flush_tlb_local(void)
>>>>   {
>>>>       asm volatile(
>>>>           "dsb sy;"
>>>>-        "tlbi vmalls12e1;"
>>>>+        "tlbi alle1;"
>>>>           "dsb sy;"
>>>>           "isb;"
>>>>           : : : "memory");
>>>>@@ -17,7 +17,7 @@ static inline void flush_tlb(void)
>>>>   {
>>>>       asm volatile(
>>>>           "dsb sy;"
>>>>-        "tlbi vmalls12e1is;"
>>>>+        "tlbi alle1;"
>>>
>>>I am not sure why you dropped the inner-shareable variant here?
>>I just wanted to invalidate all the TLB entries; the inner-shareable variant
>>could be kept. This is not a formal patch, just my attempt to narrow down the issue.
>
>alle1 will only flush the TLBs of the local processor. The flush will
>not get propagated to the other CPUs of the system. So you definitely
>want this to be inner-shareable, to avoid the other processors holding
>stale TLB entries.
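
Understood. For the record, the broadcast version of my debugging hack
would have looked like this (a sketch only; the function name is made up
and this is not a proposed patch):

    /* Sketch: invalidate all stage-1/stage-2 TLB entries on every CPU
     * in the inner-shareable domain, not just on the local core. */
    static inline void flush_tlb_all_is_debug(void)
    {
        asm volatile(
            "dsb sy;"
            "tlbi alle1is;"   /* the "is" suffix broadcasts over DVM */
            "dsb sy;"
            "isb;"
            : : : "memory");
    }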
>
>>>
>>>>           "dsb sy;"
>>>>           "isb;"
>>>>           : : : "memory");
>>>>@@ -39,7 +39,7 @@ static inline void flush_tlb_all(void)
>>>>   {
>>>>       asm volatile(
>>>>           "dsb sy;"
>>>>-        "tlbi alle1is;"
>>>>+        "tlbi alle1;"
>>>
>>>Ditto.
>>>
>>>>           "dsb sy;"
>>>>           "isb;"
>>>>           : : : "memory");
>>>>--- a/xen/include/asm-arm/arm64/page.h
>>>>+++ b/xen/include/asm-arm/arm64/page.h
>>>>@@ -74,14 +74,16 @@ static inline void flush_xen_data_tlb_local(void)
>>>>   /* Flush TLB of local processor for address va. */
>>>>   static inline void  __flush_xen_data_tlb_one_local(vaddr_t va)
>>>>   {
>>>>-    asm volatile("tlbi vae2, %0;" : : "r" (va>>PAGE_SHIFT) : "memory");
>>>>+       flush_xen_data_tlb_local();
>>>>+    //asm volatile("tlbi vae2, %0;" : : "r" (va>>PAGE_SHIFT) : "memory");
>>>>   }
>>>>
>>>>   /* Flush TLB of all processors in the inner-shareable domain for
>>>>    * address va. */
>>>>   static inline void __flush_xen_data_tlb_one(vaddr_t va)
>>>>   {
>>>>-    asm volatile("tlbi vae2is, %0;" : : "r" (va>>PAGE_SHIFT) : "memory");
>>>>+       flush_xen_data_tlb_local();
>>>
>>>Why do you replace an inner-shareable call with a local call? Is it part of
>>>the errata?
>>
>>No, just my attempt to narrow things down.
>
>Then you should keep the inner-shareable variant. See above.
>
>>>
>>>>+    //asm volatile("tlbi vae2is, %0;" : : "r" (va>>PAGE_SHIFT) : "memory");
>>>>   }
>>>>
>>>>>
>>>>>>So I wonder, has this patchset been tested on big/LITTLE hardware?
>>>>>
>>>>>This series only adds the facility to report the correct MIDR to the guest.
>>>>>If your platform requires more, then it would be necessary to send a patch
>>>>>for Xen.
>>>>
>>>>Do you have any suggestions? Besides MIDR/ACTLR/cache-line, is there
>>>>more needed?
>>>
>>>Having a bit more detail from your side would be helpful. At the moment,
>>>I have no clue what's going on.
>>
>>Quoting the Linux kernel commit:
>>     On i.MX8QM TO1.0, there is an issue: the bus width between A53-CCI-A72
>>     is limited to 36 bits. For TLB maintenance through DVM messages over
>>     the AR channel, some bits will be forced (truncated) to zero as
>>     follows:
>>
>>     ASID[15:12] is forced to 0
>>     VA[48:45] is forced to 0
>>     VA[44:41] is forced to 0
>>     VA[39:36] is forced to 0
>>
>>     This issue will result in the TLB maintenance across the clusters not
>>     working as expected, because some VA and ASID bits get truncated and
>>     forced to zero.
>>
>>     The SW workaround is: use vmalle1is if the VA is larger than 36 bits
>>     or ASID[15:12] is not zero; otherwise, use the original TLB
>>     maintenance path.
>>
>>When doing TLB maintenance through DVM from the A53 to the A72, some bits
>>are forced to 0; this means the TLB may not really be invalidated from the
>>A72's perspective.
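
For clarity, the condition the workaround checks (as described in the
commit message above) boils down to something like the sketch below; the
helper name is made up for illustration and is not the actual kernel code:

    #include <stdbool.h>
    #include <stdint.h>

    /* Illustrative only: would a per-VA/per-ASID TLB operation lose
     * bits over the 36-bit A53-CCI-A72 link?  If yes, the workaround
     * falls back to a full "tlbi vmalle1is" instead. */
    static inline bool imx8qm_tlbi_would_truncate(uint64_t va, uint16_t asid)
    {
        return (va >> 36) != 0 || (asid & 0xf000U) != 0;
    }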
>>
>>Currently I am trying a DomU with big/LITTLE capability, but not allowing
>>big/LITTLE vCPU migration.
>>
>>I am not sure whether this hardware issue impacts DomU or not, or whether
>>it is a software issue. As you can see, Dom0 has 6 vCPUs; I did a stress
>>test and found no issue on Dom0.
>
>There is a major difference between Dom0 and DomU in your setup.
>Each Dom0 vCPU is pinned to a specific pCPU, so they can't move around.
>For DomU, each vCPU is pinned to a set of pCPUs, so they can move
>around.
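
Right. Roughly, the difference looks like this (illustrative values, not
my exact setup):

    # Dom0: vCPUs pinned 1:1, e.g. via "dom0_vcpus_pin" on the Xen
    # command line, so a vCPU never crosses clusters.
    #
    # DomU cfg: vCPUs are only restricted to a set of pCPUs, so they
    # can move within that set, including across clusters:
    vcpus = 4
    cpus = "2-5"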
>
>But did you check that the DomU has the workaround enabled? I am asking
>because it looks to me like the way to detect the workaround is based
>on a device (the SCU) and not on the processor. So I am not convinced
>that the DomU is actually using your workaround.

Just checked this. The Xen toolstack creates the device tree with
'compatible = "xen,xenvm-4.10", "xen,xenvm";', but the Linux code uses
"fsl,imx8qm" to detect the SoC and then calls the SCU to get the chip
revision.

After adding an entry on the Linux side, '{ .compatible = "xen,xenvm",
.data = &imx8qm_soc_data, },', it seems to work. It passed a map/unmap
stress test that easily fails without the TLB workaround.
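
Concretely, the hack is just adding the Xen machine compatible to the SoC
match table; a sketch below (the table and data names are my assumption
of what the NXP tree uses):

    #include <linux/mod_devicetable.h>   /* struct of_device_id */

    /* Testing hack only, in the i.MX SoC-id driver (names assumed): */
    static const struct of_device_id imx8_soc_match[] = {
        { .compatible = "fsl,imx8qm", .data = &imx8qm_soc_data, },
        /* Also match the Xen-generated machine compatible so a DomU
         * picks up the i.MX8QM erratum workaround. */
        { .compatible = "xen,xenvm",  .data = &imx8qm_soc_data, },
        { /* sentinel */ },
    };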

I wonder, is it OK to specify a machine compatible in domu.cfg and have
the Xen toolstack use that machine compatible instead of "xen,xenvm"?
Would this be acceptable to the community?
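
I mean something like the following in the guest cfg (a hypothetical key
name; no such option exists today):

    # Hypothetical domu.cfg option: override the top-level machine
    # compatible the toolstack writes into the guest device tree.
    machine_compatible = "fsl,imx8qm"
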
Also, during DomU kernel boot, there is a warning:
[    0.201323] Invalid sched_group_energy for CPU3
[    0.201341] Invalid sched_group_energy for Cluster3
[    0.201353] Invalid sched_group_energy for CPU2
[    0.201365] Invalid sched_group_energy for Cluster2
[    0.201376] Invalid sched_group_energy for CPU1
[    0.201387] Invalid sched_group_energy for Cluster1
[    0.201398] Invalid sched_group_energy for CPU0
[    0.201409] Invalid sched_group_energy for Cluster0

This is because cpu0/1/2/3 are not under cluster nodes in the DTS. As I
am using a big/LITTLE guest, I think I need to create two cluster nodes,
one for vcpu0-1 and the other for vcpu2-3 (sketch below). But this also
needs a Xen toolstack change (:
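
Something like the following cpu-map is what I have in mind (a sketch
only; the exact node and phandle names depend on what the toolstack
generates):

    cpus {
        cpu-map {
            cluster0 {                /* little cluster: vcpu0-1 */
                core0 { cpu = <&cpu0>; };
                core1 { cpu = <&cpu1>; };
            };
            cluster1 {                /* big cluster: vcpu2-3 */
                core0 { cpu = <&cpu2>; };
                core1 { cpu = <&cpu3>; };
            };
        };
    };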

Thanks,
Peng.

>
>Cheers,
>
>-- 
>Julien Grall

-- 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel