[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH V2] arch: arm: vgic-v3: fix GICD_ISACTIVER range



On 28.11.19 09:14, Julien Grall wrote:
Hi Stefano,

On 28/11/2019 01:12, Stefano Stabellini wrote:
On Wed, 27 Nov 2019, Jürgen Groß wrote:
On 27.11.19 10:31, Peng Fan wrote:
Subject: Re: [Xen-devel] [PATCH V2] arch: arm: vgic-v3: fix GICD_ISACTIVER
range

On 27.11.19 01:01, Julien Grall wrote:
Hi,

On 26/11/2019 23:17, Stefano Stabellini wrote:
On Tue, 26 Nov 2019, Julien Grall wrote:
Hi,

On 26/11/2019 20:43, Stefano Stabellini wrote:
+ Juergen

I missed that you weren't in CC to the original patch, sorry.
I think this patch should go in, as otherwise Linux 5.4 could run
into problems. It is also a pretty straightforward 4 lines patch.

5.5 (or 5.6) is not going to run on Xen for other reasons (still in
the vGIC)... So I would not view this as critical.

5.5 is not out yet, in fact, the dev window has just opened. Isn't
your statement a bit premature?

The GICv4.1 work [1] is going to prevent Linux booting on all current
versions of Xen. While I can't confirm this is going to be merged in
5.5, I can tell you this will break.


In any case, even if potential future Linux releases could have other additional issues, I don't think it should change our current view on
this specific issue which affects 5.4, just released.

The patch is definitely not as straightforward as you may think.
Please refer to the discussion we had on the first version. I voiced
concern about this approach and gave point what could go wrong with
happen.

This patch may be better than the current state (i.e crashing), but
this wasn't tested enough to confirm this is the correct things to do
and no other bug will appear (I don't believe reading I*ACTIVER was
ever tested before).

It is an annoying bug, but this is only affecting 5.4 which has just
been released. It feels to me this is a fairly risky choice to merge
it qutie late in the release without a good graps of the problem (see
above).

So I would definitly, prefer if this patch is getting through backport
once we get more testing.

We can still document the bug in the release note and point people to
the patch.

Anyway, this is Juergen choice here. But at least now he has the full
picture...

Cheers,

[1]
https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flwn.

net%2FArticles%2F800494%2F&data=02%7C01%7Cpeng.fan%40nxp.co
m%7Cdca

dfb39240749ee675e08d772fcd3ba%7C686ea1d3bc2b4c6fa92cd99c5c30163
5%7C0%7

C0%7C637104302519996592&sdata=7Jv2IhI8HZgBTSuYzkEplFyhX1lzmv
d73xb5
2d6ERVQ%3D&reserved=0


Thanks, Julien, for sharing your opinion.

With that statement I'd like to defer this patch to 4.14.

But without this patch, 5.4 kernel will crash. So you prefer
we develop the solution as Julien suggested for 4.13?

I certainly won't take a patch for 4.13 when a maintainer of the
related code has reservations against it.

I think the best thing to do is to develop a proper patch the
maintainers are happy with and don't try to force it into 4.13 now.
Such a patch can still be backported to 4.13 later.

I chatted with Juergen and he explained to me something I didn't know
before. The release manager can only *block* a patch from being
committed, he/she cannot actually decide if the patch should be
committed or not for a given release. He/she cannot overrule a
maintainer either.

In this case, Juergen cannot make the decision on whether the patch
should go in 4.13 or not.

Although I couldn't reproduce the problem on Xilinx boards, I have to
take the community angle on this, and I would like to make sure our
releases work properly on any hardware, including NXP. Thus, I'll make
the case one more time, hoping that Julien might change his mind :-)

We had promise that patches to support NXP will be upstreamed, but this was never done. If you look at [1], there are a lot of patches on top of it.

So I don't think NXP boot out-of-box and therefore I don't think we should make the decision based on this.


We know that the bug fix won't introduce any regressions because, as
Julien wrote, this code path was never used before. Also because of
that, waiting for the backport and more OSSTest runs won't make much of
a difference because OSSTest won't exercise this code path.

Conversely, we don't know how many regression this is going to be introducing for Linux 5.4 because this was only reproduced once and we know the implementation is incorrect.

But OSSTest is also testing different version of Linux and  5.4 should be tested soon (if not already), so we should also be able to exercise this code path.


It is true that the original code handling GICD_ISACTIVER was never spec
compliant, and it should be fixed properly. However, that is not what
this patch addresses. That code, in addition from not being spec
compliant by design, it also happens to have a typo. Fixing the typo at
this stage of the release is appropriate at least to get a consistent
behavior in the handling of GICD_ISACTIVER*, and also for Linux 5.4 as
guest. Not to give a false impression to users, the warning ensures that
the underlying Xen behavior is flagged appropriately.

Again, you have no promise that this will make the right thing for Linux 5.4.

This has been my point for the past week and ignored on the ground that some of the I*ACTIVER registers were implemented like that so we must continue to spread the false.


In short, I think the patch should go in now and there are no downsides
to it. That's it, I rest my case. Julien, I hope you'll reconsider.
I don't really see the point to try to allow Linux 5.4 booting on Xen 4.13 without knowing whether we are not going to uncovered more BUG around I*ACTIVER.

Sorry, but this is a rather weird statement.

IIUC you are saying that a typo blocking boot of Linux 5.4 should not be
fixed as you are not sure there are no other bugs?

If you really want this patch in Xen 4.13, then you should read my mail on the first version and trying to answer me why this 5.4 is going to be safe running on Xen 4.13.

Or do you think that with the typo fixed and running Linux 5.4 guests
will destabilize the host?

If yes, what about meeting in the middle?

Add a Kconfig option to enable the typo fix and declare it as not
security supported when enabled. This will make it possible to evaluate
whether Linux 5.4 can be run on top of Xen without doing any harm.

If you can't answer that, then there are no reason to get this patch Xen 4.13.

For the initial 4.13 release it is too late anyway.

I would have been happy to get this patch in next (not Xen 4.13), but this thread doesn't give me the confidence that someone sat done and fully understood the problem. So until someone is able to explain me whether this patch is safe:

NAcked-by: Julien Grall <julien@xxxxxxx>

Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.