Re: [Xen-devel] Problem with Xen 4.5 failing XTF tests on old AMD cpus ?
Ian Jackson writes ("Re: Problem with Xen 4.5 failing XTF tests on old AMD cpus ?"):
> Andrew Cooper writes ("Re: Problem with Xen 4.5 failing XTF tests on old AMD cpus ?"):
> > It will be because of Gen1 SVM which doesn't have NRIP support.  This
> > case requires emulation of the invlpg instruction, rather than just
> > using the information provided by the intercept.
>
> So it seems that the xtf test is not effective at detecting the Xen
> bug except on old hardware ?  Is there some way it could be improved ?
>
> It's obviously not desirable that we should have tests which pass in
> the production colo and fail in the ancient Citrix Cambridge instance.

Andrew and I discussed this IRL. I thought it worth writing down what was said so that we can refer to it later.

This test failure is due to genuine bug(s) in Xen 4.5, in that it doesn't have various fixes (see the rest of the thread). The bugs are only exposed on old hardware, which uses different codepaths in Xen. On new hardware Xen takes a different approach. This is why the test failure appears in the Citrix (Cambridge) osstest but not in the Xen Project (Massachusetts) instance.

Xen decides which approach to take based on hardware features. There is not currently any way to tell Xen not to use these hardware features (at least, not in this case - the AMD SVM NextRIP feature) if they are available. Andrew has a long-term plan to add more of such a facility - but that is not going to be available any time soon.

In this particular case, the old hardware uses the Xen instruction emulator where newer hardware uses hardware support. (Andrew tells me that without NextRIP support, Xen must use the instruction emulator when handling `invlpg` instructions on behalf of the guest, to calculate how many bytes to move the instruction pointer forward by. And it is the emulator which has the bug here.)
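
To illustrate the distinction described above, here is a minimal C sketch. It is not Xen's code: `struct intercept_info`, `toy_invlpg_length` and `next_guest_rip` are hypothetical names invented for this example, and the toy decoder assumes 32/64-bit addressing with no instruction prefixes. With NextRIP the next instruction pointer is simply read from what the hardware saved; without it, the instruction bytes have to be decoded to find the length.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of an intercept (not Xen's real structures). */
struct intercept_info {
    bool           has_nrip;   /* CPU provides the NextRIP value           */
    uint64_t       rip;        /* guest RIP of the intercepted instruction */
    uint64_t       next_rip;   /* only meaningful when has_nrip is true    */
    const uint8_t *insn;       /* guest instruction bytes, for decoding    */
    size_t         insn_avail; /* number of bytes available at insn        */
};

/*
 * Toy length decoder for `invlpg m` (0F 01 /7).
 * Assumes 32/64-bit addressing and no prefixes; returns -1 otherwise.
 */
static int toy_invlpg_length(const uint8_t *b, size_t n)
{
    if (b == NULL || n < 3 || b[0] != 0x0f || b[1] != 0x01)
        return -1;

    uint8_t modrm = b[2];
    uint8_t mod = modrm >> 6, reg = (modrm >> 3) & 7, rm = modrm & 7;

    if (reg != 7 || mod == 3)       /* not the memory form of /7 */
        return -1;

    int len = 3;                    /* two opcode bytes + ModRM */
    if (rm == 4) {                  /* SIB byte follows */
        if (n < 4)
            return -1;
        len += 1;
        if (mod == 0 && (b[3] & 7) == 5)
            len += 4;               /* SIB with no base: disp32 */
    }
    if (mod == 1)
        len += 1;                   /* disp8 */
    else if (mod == 2 || (mod == 0 && rm == 5))
        len += 4;                   /* disp32 */
    return len;
}

/* Compute where the guest should resume; returns 0 on success, -1 on failure. */
static int next_guest_rip(const struct intercept_info *i, uint64_t *out)
{
    if (i->has_nrip) {              /* newer hardware: no decoding needed */
        *out = i->next_rip;
        return 0;
    }
    int len = toy_invlpg_length(i->insn, i->insn_avail);
    if (len < 0)
        return -1;
    *out = i->rip + (uint64_t)len;  /* older hardware: decode to find length */
    return 0;
}
```

The second branch is the one that stands in for the emulator path; a bug there would only ever be reached on hardware where `has_nrip` is false.
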
So FEP (the Forced Emulation Prefix) could be used to cause the bug to manifest even on new hardware, and indeed where FEP is available, XTF does then use FEP to run exactly the same set of tests. However, FEP is not available in Xen 4.5 and there are good reasons for not backporting it there.

It would be possible to backport the bugfixes to Xen 4.5. However, the fixes address only very rare problems. Andrew thinks the bugs are, insofar as they are bugs which might cause lossage, more likely to be roughly "crashes obscure or very oddly-behaved guests" than "crashes commonly used guests but only with very low probability". The latter kind of bug would be worth a backport; the former much less so (especially in a very old stable release, and especially when the fixes involve behavioural changes).

The fixes would also provide an unquantified performance improvement on AMD hardware, due to avoiding extraneous TLB flushes, but Andrew says he doubts that's worth caring about.

We discussed host stickiness, host-specific bug detection, and regression detection, in osstest. I reassured Andrew that I think the current osstest algorithms will deal with this situation tolerably well (if not perfectly).

The conclusion is that there is nothing to be done, at least in the short term. There are good reasons for the bug to persist in 4.5 and good reasons for it being hard to detect on newer hardware.

Ian.

(Thanks to Andrew for the IRL explanation and for review of this email.)

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel