
Re: [Xen-devel] [PATCH 0/2] MMIO emulation fixes



>>> On 10.08.18 at 18:37, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 10/08/18 17:30, George Dunlap wrote:
>> Sorry, what exactly is the issue here?  Linux has a function called
>> load_unaligned_zeropad() which is reading into a ballooned region?

Yes.

>> Fundamentally, a ballooned page is one which has been allocated to a
>> device driver.  I'm having a hard time coming up with a justification
>> for having code which reads memory owned by B in the process of reading
>> memory owned by A.  Or is there some weird architectural reason that I'm
>> not aware of?

Well, they do this no matter who owns the following page (or,
at a smaller granularity, perhaps also the following allocation).
I guess their goal is to have just a single MOV in the common
case (with the caller ignoring the high bytes of no interest to it),
while recovering gracefully from #PF should one occur.
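
To illustrate the idea, here is a minimal user-space sketch of a
zero-padded unaligned load. It is a toy model, not the Linux
implementation: the kernel issues the plain load and relies on its
exception fixup path when #PF occurs, whereas this sketch asks a
hypothetical callback whether the following page is present. All names
(sketch_load_zeropad, sketch_present_fn, sketch_first_page_only,
SKETCH_PAGE_SIZE) are made up for the example, and little-endian byte
order is assumed for the result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical stand-in for "would touching this page fault?". */
typedef int (*sketch_present_fn)(size_t page_base);

/* Example policy for the toy model: only the page at address 0 is present. */
int sketch_first_page_only(size_t page_base)
{
    return page_base == 0;
}

/*
 * Return the 8 bytes starting at 'addr' as a little-endian value.
 * Bytes that would come from a non-present following page read as zero.
 * 'mem' backs the toy address space, indexed from address 0.
 */
uint64_t sketch_load_zeropad(const uint8_t *mem, size_t addr,
                             sketch_present_fn present)
{
    size_t page_end = (addr | (SKETCH_PAGE_SIZE - 1)) + 1;
    size_t avail = page_end - addr;      /* bytes left in the first page */
    size_t n = sizeof(uint64_t);
    uint8_t bytes[sizeof(uint64_t)] = { 0 };
    uint64_t val = 0;

    /* The page holding the first byte is assumed to be present. */
    assert(present(addr - addr % SKETCH_PAGE_SIZE));

    /* Common case: copy all 8 bytes. Otherwise stop at the page end. */
    if (avail < n && !present(page_end))
        n = avail;
    memcpy(bytes, mem + addr, n);

    /* Assemble little-endian, independent of host byte order. */
    for (size_t i = sizeof(bytes); i-- > 0; )
        val = (val << 8) | bytes[i];
    return val;
}
```

With only the first page present, a load starting 3 bytes before the
page end yields those 3 bytes in the low part of the result and zeros
above them.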

> The underlying issue is that the emulator can't cope with a single
> misaligned access which crosses RAM and MMIO.  It gives up and
> presumably throws #UD back.

We wouldn't have observed any problem if #UD were raised in
such a case, as Linux's fault recovery code doesn't care what
kind of fault has occurred. We're getting back a result of all
ones, even for the part of the read that has actually hit the
last few bytes of the present page.

> One longstanding Xen bug is that simply ballooning a page out shouldn't
> be able to trigger MMIO emulation to begin with.  It is a side effect of
> mixed p2m types, and the fix for this is to have Xen understand the guest
> physmap layout.

And hence the consideration of mapping in an all zeros page
instead. This is because of the way __hvmemul_read() /
__hvm_copy() work: The latter doesn't tell its caller how many
bytes it was able to read, and hence the former considers the
entire range MMIO (and forwards the request for emulation).
Of course all of this is an issue only because
hvmemul_virtual_to_linear() sees no need to split the request
at the page boundary, due to the balloon driver having left in
place the mapping of the ballooned out page.
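
As a rough illustration of why the missing byte count matters, the
following toy model contrasts the two behaviours. It is not Xen code:
toy_copy(), toy_read_all_or_nothing(), toy_read_split() and the
toy_guest structure are hypothetical, and the 0xff fill merely mirrors
the all-ones result reported above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096u

/* TOY_HOLE models a ballooned-out page that ends up treated as MMIO. */
enum toy_page_type { TOY_RAM, TOY_HOLE };

struct toy_guest {
    const uint8_t *mem;               /* backing bytes, indexed by address */
    const enum toy_page_type *type;   /* one entry per page */
    size_t nr_pages;
};

/*
 * Copy up to 'len' bytes page by page, stopping at the first non-RAM
 * page. Unlike the copy routine described above, this one reports how
 * many bytes it managed to read.
 */
size_t toy_copy(const struct toy_guest *g, size_t addr,
                uint8_t *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        size_t cur = addr + done;
        size_t page = cur / TOY_PAGE_SIZE;
        size_t chunk = TOY_PAGE_SIZE - cur % TOY_PAGE_SIZE;

        if (page >= g->nr_pages || g->type[page] != TOY_RAM)
            break;
        if (chunk > len - done)
            chunk = len - done;
        memcpy(buf + done, g->mem + cur, chunk);
        done += chunk;
    }
    return done;
}

/* Behaviour described above: any failure turns the whole range into MMIO. */
void toy_read_all_or_nothing(const struct toy_guest *g, size_t addr,
                             uint8_t *buf, size_t len)
{
    if (toy_copy(g, addr, buf, len) != len)
        memset(buf, 0xff, len);
}

/* Alternative: keep the bytes that were read, fill only the remainder. */
void toy_read_split(const struct toy_guest *g, size_t addr,
                    uint8_t *buf, size_t len)
{
    size_t done = toy_copy(g, addr, buf, len);

    memset(buf + done, 0xff, len - done);
}
```

For an 8-byte read starting 3 bytes before the end of a RAM page that is
followed by a hole, the first variant returns all ones, while the second
preserves the 3 RAM bytes. The sketch does nothing for an access that
starts inside the hole.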

Obviously the opposite case (access starting in a ballooned
out page and crossing into an "ordinary" one) would have a
similar issue, which is presumably even harder to fix without
going the map-a-zero-page route (or Paul's suggested
null_handler hack).

> However, the real bug is Linux making such a misaligned access into a
> ballooned out page in the first place.  This is a Linux kernel bug which
> (presumably) manifests in a very obvious way due to shortcomings in
> Xen's emulation handling.

I wouldn't dare to judge whether this is a bug, especially in
light of them recovering gracefully from the #PF that might result in
the native case. Arguably the caller has to have some knowledge
about what might live in the following page, so as not to inadvertently
hit an MMIO page rather than a non-present mapping. But I'd
leave such judgment to them; our business is to get a case working
that already works without Xen underneath.

Jan



