[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 5/5] xen: Write CR0, CR3 and CR4 in arch_set_info_guest()



On 05/16/2015 02:13 AM, Andrew Cooper wrote:
> On 15/05/2015 21:45, Razvan Cojocaru wrote:
>> On 05/15/2015 06:57 PM, Jan Beulich wrote:
>>>>>> On 06.05.15 at 19:12, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>> Arch_set_info_guest() doesn't set CR0, CR3 or CR4. Added code
>>>> that does that.
>>> But you should also say a word on why this is needed, since things
>>> worked fine so far without, and enabling the functions to run
>>> outside of their own vCPU context is not immediately obviously
>>> correct.
>> This is a way to undo malicious CR writes. This is achieved for MSR
>> writes with the deny vm_event response flag patch in this series, but
>> the CR events are being send after the actual write. In such cases,
>> while the VCPU is paused before I put a vm_response in the ring, I can
>> simply write the old value back.
>>
>> I've brought up the issue in the past, and the consensus, IIRC, was that
>> I should not alter existing behaviour (post-write events) - so the
>> alternatives were either to add a new pre-write CR event (which seemed
>> messy), or this (which seemed less intrusive).
> 
> I always found it curious that some events where post and some pre.
> 
> I though I suggested (or at least considered suggesting) at one point
> that it would be sensible to select pre/post for event notification. 
> Pre comes with an ability to modify the outcome, whereas post is simply
> a notification that something happened.
> 
> Once again, if you deem this sensible then now is very definitely the
> time to do something about it :)

You did suggest it here:

http://lists.xen.org/archives/html/xen-devel/2014-10/msg00434.html

In fact, that thread spawned the whole vm_event redesign discussion:

http://lists.xen.org/archives/html/xen-devel/2014-10/threads.html#00438

I took the suggestion to mean at the time that we should have something
like EVENT_CR3_PRE and EVENT_CR3_POST, where basically all we needed was
for all events for which this applicable to be pre-write events. IMHO
that's simpler and sufficient: just send out an event when you know
that, unless you deny it, simply responding to it / unblocking the VCPU
will perform the write. So you know that the write is about to happen,
and by no denying it, that it will happen (or at least, that it's
extremely likely to happen - since some HV check can still fail somewhere).

So, again, just IMHO, the simpler modification would just be to turn all
events where this is applicable into pre-write events.

>> Of course, if it has now become acceptable to reconsider having the CR
>> vm_events consistently pre-write, the deny patch could be extended to them.
>>
>>>> -int hvm_set_cr0(unsigned long value)
>>>> +int hvm_set_cr0(struct vcpu *v, unsigned long value, bool_t with_vm_event)
>>>>  {
>>>> -    struct vcpu *v = current;
>>> This change is covered by neither the title nor the description, but
>>> considering it's you who sends this likely is the meat of the change.
>>> However, considering that the three calls you add to
>>> arch_set_info_guest() pass this in as zero, I even more wonder why
>>> what the title says is needed in the first place.
>>>
>>> I further wonder whether you wouldn't want an event if and only
>>> if v == current (in which case the flag parameter could be dropped).
>> It just seemed useless to send out a vm_event in the case you mention,
>> since presumably the application setting them is very likely the same
>> one receiving the events (though, granted, it doesn't need to be). So in
>> that case, it would be pointless to notify itself that it has done what
>> it knows it's done.
>>
>>>> @@ -3328,12 +3330,11 @@ int hvm_set_cr3(unsigned long value)
>>>>      return X86EMUL_UNHANDLEABLE;
>>>>  }
>>>>  
>>>> -int hvm_set_cr4(unsigned long value)
>>>> +int hvm_set_cr4(struct vcpu *v, unsigned long value, bool_t with_vm_event)
>>>>  {
>>>> -    struct vcpu *v = current;
>>>>      unsigned long old_cr;
>>>>  
>>>> -    if ( value & hvm_cr4_guest_reserved_bits(v, 0) )
>>>> +    if ( value & hvm_cr4_guest_reserved_bits(v, with_vm_event ? 0 : 1) )
>>> Why does this depend on with_vm_event? And if indeed correct,
>>> please simplify to just !with_vm_event.
>> hvm_cr4_guest_reserved_bits(v, 1) has an ASSERT(v != current) that
>> crashes the hypervisor in debug mode (surely for a very good reason). If
>> vm_event is true, then v != current, so there I've tried to avert the crash.
> 
> (As a side note, attempting to work around an ASSERT is never a valid
> option. There will be a reason why it is there in the first place, and
> that reason will most likely invalidate whatever train of logic
> attempted to work around it in the first place.  The other option is
> that the ASSERT is wrong, in which case it should be removed.)
> 
> This ASSERT is muddled up with the cpuid handling.  The second parameter
> flips between querying the domain cpuid policy, and the host cpuid
> policy, as a stopgap solution for some validity checking on domain restore.
> 
> I am in the middle (well - more like just started, too many high
> priority interrupts) of rewriting cpuid handling from scratch in an
> effort to make heterogeneous feature levelling work in a moderately sane
> fashion.
> 
> As it stands, altering this 0 to some function of with_vm_event is not
> valid, as it will change the calculation of which CR4 bits are permitted
> to be modified.  Unfortunately, my best suggestion here is to wire up as
> much logic as you can, and leave it short circuited with a /*TODO fixme
> when domain cpuid handling works properly */ comment.

Thank you, I was hoping that the ASSERT code will get discussed during
the review. I certainly didn't think that the ASSERT was wrong, but my
code did throw a new path to setting CR4 into the mix, so a discussion
was certainly needed here.

Do you mean wire up as much logic as possible inside
hvm_cr4_guest_reserved_bits(), or simply changing the condition to
if ( v == current && value & hvm_cr4_guest_reserved_bits(v, 0) ) ?


Thanks,
Razvan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.