
Re: [Xen-devel] xc_hvm_inject_trap() races



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: 2 November, 2016 10:50
> To: rcojocaru@xxxxxxxxxxxxxxx; Andrei Vlad LUTAS
> <vlutas@xxxxxxxxxxxxxxx>
> Cc: andrew.cooper3@xxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxxx;
> tamas@xxxxxxxxxxxxx
> Subject: RE: RE: [Xen-devel] xc_hvm_inject_trap() races
>
> >>> On 01.11.16 at 23:17, <vlutas@xxxxxxxxxxxxxxx> wrote:
> > From: Jan Beulich [mailto:jbeulich@xxxxxxxx]
> > Sent: 1 November, 2016 18:40
> >>>> Andrei Vlad LUTAS <vlutas@xxxxxxxxxxxxxxx> 11/01/16 5:13 PM >>>
> >>>First of all, to answer your original question: the injection
> >>>decision is made when the introspection logic needs to inspect a page
> >>>that is not present in the physical memory. We don't really care if
> >>>the current instruction triggers multiple faults or not (and here I'm
> >>>not sure what you mean by that - multiple exceptions, or multiple EPT
> >>>violations - but the answer is still the same), and removing the page
> >>>restrictions after the #PF injection is introspection specific logic
> >>>- the address for which we inject the #PF doesn't have to be related
> >>>in any way to the current instruction.
> >
> >>Ah, that's this non-architectural behavior again.
> >
> > I don't think the HVI #PF injection internals or how the #PF is
> > handled by the OS are relevant here. We are using an existing API that
> > does not seem to work quite correctly under certain circumstances, and
> > we were curious whether any of you could shed some light on this and
> > maybe point us in the right direction for cooking up a fix.
> >
> >>What if the OS doesn't fully carry out the page-in, relying on the #PF
> >>to retrigger once the insn for which it got reported has been restarted?
> >
> > Can you be more specific?
>
> Well, perhaps with the answer you gave further down that's not that
> relevant anymore, but consider a #PF handler which handles just the top
> most not-present page table level each time it gets invoked. I.e.
> for a not-present L4 entry it would take 4 re-invocations of the same original
> instruction to resolve all 4 levels.

I see what you're referring to. As I explained to Andrew in a previous mail,
the #PF injection logic is indeed OS-specific, and in our particular case
(since VM introspection already has to handle a lot of OS-specific details) we
don't have to deal with such behavior on the supported operating systems.
The scheme in your example would also add a significant performance penalty,
and I don't see why an OS would do that (nor have I heard of any that does),
but I understand your concern.
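
To make the scenario being discussed concrete, here is a minimal sketch in C of
a #PF handler that repairs only the topmost not-present page table level per
invocation. Everything in it (walk_model, lazy_pf_handler,
faults_until_resolved) is a hypothetical model made up for illustration; it is
not code from any real OS or from Xen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LEVELS 4  /* L4, L3, L2, L1 with 4-level paging */

/* Hypothetical model: present[i] tells whether the entry at level i
 * (index 0 = L4, the topmost) is present for the accessed address. */
struct walk_model {
    bool present[LEVELS];
};

/* Index of the first not-present level, or -1 if the walk succeeds. */
static int first_missing_level(const struct walk_model *w)
{
    for (int i = 0; i < LEVELS; i++)
        if (!w->present[i])
            return i;
    return -1;
}

/* A #PF handler that repairs only one level per invocation. */
static void lazy_pf_handler(struct walk_model *w, int level)
{
    w->present[level] = true;
}

/* Restarts the faulting access until it succeeds; returns the #PF count. */
int faults_until_resolved(struct walk_model *w)
{
    assert(w != NULL);
    int faults = 0;
    int level;

    while ((level = first_missing_level(w)) >= 0) {
        faults++;                  /* the access faults again */
        lazy_pf_handler(w, level); /* only this level gets fixed */
    }
    return faults;
}
```

With all four levels not present, the model takes one fault per level before
the access completes, which is the repeated re-invocation described in the
quoted text.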

>
> >> Or what if the page gets paged out again before the insn actually gets
> >> to execute (e.g. because a re-schedule happened inside the guest on the
> >> way out of the #PF handler)? All of this suggests that you really can't
> >> lift any restrictions _before_ seeing what you need to see.
> >
> > We don't really care when and how the #PF is handled. We don't care if
> > the page is paged out at some random point. What we do know is that at
> > a certain point in the future, the page will be swapped in; how do we
> > know when? The OS will write the guest page tables, at which point we
> > can inspect the physical page itself (so you can see here why we don't
> > care about the page being swapped out sometime in the future). So we
> > really _can_ lift any restriction we want at that point.
>
> Hmm, I'm having difficulty seeing the supposedly broken flow of events
> here: Earlier it was said that #PF injection would be a result of EPT event
> processing. Here you say that the lifting of the restrictions would be a
> result of seeing the guest modify its page tables (which would in turn be a
> result of the #PF actually having arrived in the guest). So if (with this,
> and as you say above) you don't care when the #PF gets handled, where's the
> original problem?

That's not what I meant, sorry if it was unclear. What I'm trying to say is
that the decision to inject a #PF can be made while handling an EPT violation,
and the accessed page need not be related in any way to the page for which we
decide to inject the #PF. For example, we intercept writes to a list that
describes the loaded modules. Whenever a new module is loaded, an entry is
inserted into that list, which generates an EPT write violation. The
introspection logic can then analyze which module was loaded and where, and it
may find that the module headers (which the protection logic needs) are not
present in memory; it would therefore inject a #PF to force the OS to swap in
said headers. At the same time, the HVI logic may decide that it no longer
needs to watch for module loads (for example, because all the interesting
modules have been loaded), so it removes the write hook from the list of
loaded modules. Both actions (injecting the #PF and removing the EPT write
protection) happen in the same event handler, so we can't rely on the event
being re-generated in this case. Hopefully this example makes it clearer.
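
A minimal sketch in C of the handler shape described above, to show why the
original event cannot be counted on to fire again. The types and names
(hvi_state, module_event, on_module_list_write) are assumptions made up for
this illustration; they are not the HVI implementation or any Xen API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical introspection state for one guest. */
struct hvi_state {
    bool     module_list_write_hook; /* EPT write protection on the list */
    bool     pf_requested;           /* a #PF injection has been requested */
    uint64_t pf_gla;                 /* address the #PF is requested for */
};

/* Hypothetical summary of what the handler learned from the list write. */
struct module_event {
    uint64_t headers_gla;             /* where the module headers live */
    bool     headers_present;         /* are the headers in memory? */
    bool     last_interesting_module; /* nothing left to watch for? */
};

/* Handles one EPT write violation on the loaded-modules list. */
void on_module_list_write(struct hvi_state *s, const struct module_event *ev)
{
    assert(s != NULL && ev != NULL);

    /* Headers swapped out: ask for a #PF on an address unrelated to
     * the page whose write triggered this event. */
    if (!ev->headers_present) {
        s->pf_requested = true;
        s->pf_gla = ev->headers_gla;
    }

    /* Done watching: drop the hook in the very same handler, so this
     * event will never be generated again. */
    if (ev->last_interesting_module)
        s->module_list_write_hook = false;
}
```

After a single call with the headers missing and the last interesting module
loaded, the #PF is requested and the hook is already gone, so a lost
injection cannot be recovered by waiting for the same violation.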

>
> >>>Assuming that we wouldn't remove the restrictions and we would rely
> >>>on re-generating the event - that is not acceptable: first of all
> >>>because the instruction would normally be emulated anyway before
> >>>re-entering the guest,
> >
> >>How would that be a problem?
> >
> > I thought it was obvious without further clarification: how can we
> > expect the exact same event to be generated, if the instruction that
> > triggered it in the first place was emulated or single stepped?
>
> Neither emulation nor single stepping should result in architectural events
> (exceptions) to be missed (or else there's a bug somewhere).
> Non-architectural #PF like you're using of course can't (currently) be
> guaranteed to arrive at any particular point in time.
>
> The fact that {vmx,svm}_inject_trap() combine the new exception with an
> already injected one (and blindly discard events other than hw exceptions),
> otoh, indeed looks like it wants to be controllable by the caller: When the
> event comes from the outside (the hypercall), it would clearly seem better to
> simply tell the caller that no injection happened and the event needs to be
> kept pending. The main question then is how to make certain injection gets
> retried at the right point in time (read: once the other interrupt handler
> IRETs back to original context).

Yes, this is basically our problem. Right now, the #PF would overwrite other
interrupts, which is very bad. On the other hand, the call can't return an
error (if I understand the code correctly), since it can't know whether another
event will be scheduled for injection. As I told Andrew, at least returning an
error indicating that the #PF cannot be injected would help us a lot here (I'm
sure making the injected trap take precedence over other events would not be
acceptable).
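
A minimal sketch in C of the behaviour asked for here: refuse the injection
and report an error when another event is already queued, instead of
overwriting or merging it. The structure and function names
(vcpu_event_state, try_inject_pf) and the choice of -EBUSY are assumptions
for illustration only, not Xen's data structures or hypercall interface. The
sketch also assumes the pending state is visible when the decision is made,
which, as noted above, is not the case at hypercall time today.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TRAP_PAGE_FAULT 14 /* x86 #PF vector */

/* Hypothetical, simplified per-vCPU event slot. */
struct vcpu_event_state {
    bool     pending;    /* an event is already queued for injection */
    uint8_t  vector;
    uint32_t error_code;
    uint64_t cr2;        /* faulting address reported to the guest */
};

/* Queues a #PF only if nothing else is pending.
 * Returns 0 on success, -EBUSY if the caller must keep the #PF
 * pending and retry later. The queued event is left untouched. */
int try_inject_pf(struct vcpu_event_state *v, uint32_t error_code,
                  uint64_t cr2)
{
    assert(v != NULL);

    if (v->pending)
        return -EBUSY;

    v->pending = true;
    v->vector = TRAP_PAGE_FAULT;
    v->error_code = error_code;
    v->cr2 = cr2;
    return 0;
}
```

The caller would then hold on to the request and retry at a later point; when
exactly to retry is the open question raised in the quoted text.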

>
> Jan
>
> ________________________
> This email was scanned by Bitdefender

Best regards,
Andrei.



 

