
Re: [Xen-devel] Locking on vm-event operations (monitor)



On 7/22/2016 1:29 PM, Razvan Cojocaru wrote:
On 07/22/2016 01:17 PM, Corneliu ZUZU wrote:
On 7/22/2016 12:55 PM, Razvan Cojocaru wrote:
On 07/22/2016 12:27 PM, Corneliu ZUZU wrote:
Hi,

I've been inspecting parts of the vm-event code to try and understand
when and why domain pausing/locking is done. If I understood correctly,
domain pausing is done solely to force all the vCPUs of that domain to
see a configuration update and act upon it (e.g. in the case of a
XEN_DOMCTL_MONITOR_EVENT_WRITE_CTRLREG that enables CR3 monitoring,
domain pausing/unpausing ensures immediate enabling of CR3 load-exiting
for all vCPUs), not to synchronize concurrent operations
(lock-behavior).
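
For reference, that pattern looks roughly like this (a simplified
sketch of the CR3 case, not the literal Xen code; field and helper
names are abbreviated):

    domain_pause(d);    /* force every vCPU out of guest context */

    d->arch.monitor.write_ctrlreg_enabled |=
        monitor_ctrlreg_bitmask(VM_EVENT_X86_CR3);

    domain_unpause(d);  /* on re-entry each vCPU reloads its execution
                         * controls, so CR3 load-exiting takes effect
                         * on all vCPUs at once */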

As for locking, I see that for example vm_event_init_domain(),
vm_event_cleanup_domain() and monitor_domctl() are all protected by the
domctl-lock, but I don't think that's enough.
Here are a few code-paths that led me to believe that:
* do_hvm_op()->monitor_guest_request() reads the
d.monitor.guest_request_* resources, but it doesn't seem to take the
domctl lock, so it seems possible for that to happen _while_ those
resources are being initialized/cleaned up
* monitor_guest_request() also calls
monitor_traps()->...->vm_event_wait_slot()->...->vm_event_grab_slot(),
which attempts vm_event_ring_lock(ved); that could likewise run _while_
the ring is being initialized (vm_event_enable()) or cleaned up
(vm_event_disable())
* hvm_monitor_cr() - e.g. on the code-path
vmx_vmexit_handler(EXIT_REASON_CR_ACCESS)->vmx_cr_access(VMX_CONTROL_REG_ACCESS_TYPE_MOV_TO_CR)->hvm_mov_to_cr()->hvm_set_cr0()->hvm_monitor_crX()
- the possibility of a concurrent
monitor_init_domain()/monitor_cleanup_domain() doesn't seem to be taken
into account (see the interleaving sketched below)
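
To make the window concrete, here is the kind of interleaving I have in
mind (a hypothetical timeline, not literal code; the two columns run on
different pCPUs and nothing orders one against the other):

    /*
     *  guest vCPU of domain d               toolstack pCPU
     *  ----------------------               --------------
     *  monitor_guest_request()
     *    reads d->monitor.guest_request_*
     *    monitor_traps()                    domctl: vm_event_disable(d)
     *      ...                                vm_event_ring_lock(ved)
     *      vm_event_grab_slot()               frees/tears down ring state
     *        vm_event_ring_lock(ved) <----    the lock itself is going away
     */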

Am I missing something with these conclusions?
Your conclusions look correct, but I assume that the reason why this has
not been addressed in the past is that introspection applications are
expected to be well-behaved.
I wouldn't think that was the rationale (considering that the risk is
crashing the hypervisor as a whole; also see below); I'd rather think
this simply went unnoticed.

Specifically, in a codebase where the
choice between uint64_t and long int matters speed-wise, and where
unlikely()s also matter, an extra lock may be an issue.
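
(For reference, unlikely() is just the usual branch-prediction hint,
from xen/include/xen/compiler.h, so the existing checks cost almost
nothing while monitoring is off:)

    #define likely(x)     __builtin_expect(!!(x),1)
    #define unlikely(x)   __builtin_expect(!!(x),0)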

The typical flow of an introspection application is:

1. Initialize everything.
2. Subscribe to relevant events.
3. Event processing loop.
4. Unsubscribe from events.
5. Do a last run of event processing (for events already queued in the
ring buffer).
6. Uninitialize everything (no events are possible here because of steps
4-5).
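
In libxc terms that maps roughly onto the following (a sketch against
the current API; signatures may differ between versions, and the error
handling plus the ring/event-channel plumbing of tools/tests/xen-access
are omitted):

    #include <xenctrl.h>

    void monitor_lifecycle(xc_interface *xch, domid_t domid)
    {
        uint32_t port;
        void *ring_page;

        /* 1. Initialize everything: enable vm_event, map the ring. */
        ring_page = xc_monitor_enable(xch, domid, &port);

        /* 2. Subscribe to relevant events. */
        xc_monitor_guest_request(xch, domid, 1 /* enable */, 1 /* sync */);

        /* 3. Event processing loop: wait on the event channel bound to
         *    'port', consume vm_event_request_t entries from ring_page,
         *    push back vm_event_response_t entries. */

        /* 4. Unsubscribe from events. */
        xc_monitor_guest_request(xch, domid, 0, 0);

        /* 5. Last run of event processing (drain the ring). */

        /* 6. Uninitialize everything. */
        xc_monitor_disable(xch, domid);
    }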
And even if an introspection application behaves as sanely as you say,
the current layout of the code still doesn't guarantee bug-free
behavior. For one, there are in-guest events (which trigger access to
vm-events resources) that are completely asynchronous in relation to
steps 2-6, for example a HVMOP_guest_request_vm_event (which ends up
calling monitor_guest_request()), sketched below.
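
(For concreteness, raising that event from inside the guest is
essentially a bare hypercall; a rough sketch, with the constants taken
from Xen's public headers, assuming a 64-bit HVM guest on Intel
executing at CPL0 - AMD would need vmmcall instead:)

    #define __HYPERVISOR_hvm_op           34
    #define HVMOP_guest_request_vm_event  24

    static inline long guest_request_event(void)
    {
        long rc;
        register long _op  asm("rdi") = HVMOP_guest_request_vm_event;
        register long _arg asm("rsi") = 0;   /* no argument structure */

        asm volatile ( "vmcall"
                       : "=a" (rc)
                       : "a" (__HYPERVISOR_hvm_op), "r" (_op), "r" (_arg)
                       : "memory" );
        return rc;
    }

Nothing synchronizes the guest's decision to issue this against the
toolstack's steps 2-6.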
Not true (re: guest-requested events being asynchronous to steps 2-6).
As part of step 4, d->monitor.guest_request_enabled will become 0, so
no sensitive code will run after that:

I wasn't being rigorous with steps 2-6 above; I only meant to say
'there can't be any vm-events before everything is initialized'. And it
seems true that, at least in the monitor_guest_request() case, this
might not be able to produce a hypervisor crash (although there are a
number of ugly inconsistencies on that code path). But nonetheless a
hypervisor crash can _still_ happen even with a _sane_ toolstack user.

Look at hvm_do_resume():

1. You, as a toolstack user, get to sane step no. 6: "Uninitialize
everything (no events are possible here because of steps 4-5)."

2. But just before you do that, an hvm_do_resume() happens on an
arbitrary vCPU of the domain and gets to this point:

    if ( unlikely(v->arch.vm_event) )
    {
// ---> let's say you are now at this point, just after the above non-NULL check
        struct monitor_write_data *w = &v->arch.vm_event->write_data;

3. And before proceeding further, the uninitialization sequence
_completes_, i.e. this is done in vm_event_cleanup_domain():

    for_each_vcpu ( d, v )
    {
        xfree(v->arch.vm_event);  /* frees what the other vCPU is */
        v->arch.vm_event = NULL;  /* about to dereference         */
    }

4. And then this gets executed in hvm_do_resume():

        /* v->arch.vm_event is NULL by now: */
        struct monitor_write_data *w = &v->arch.vm_event->write_data;

        [...]

        if ( w->do_write.msr )
        {
            hvm_msr_write_intercept(w->msr, w->value, 0);
            w->do_write.msr = 0;
        }

then you'll have a beautiful NULL-pointer access hypervisor crash.
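
The pattern is trivial to reproduce outside the hypervisor. Here is a
userspace analogy (a sketch, not Xen code): thread A plays
hvm_do_resume() doing the check-then-use, thread B plays
vm_event_cleanup_domain(); run it long enough and it dies on exactly
this kind of NULL/dangling dereference:

    #include <pthread.h>
    #include <stdlib.h>

    struct write_data { int do_write_msr; };
    static struct write_data *vm_event;     /* plays v->arch.vm_event */

    static void *resume_path(void *arg)     /* plays hvm_do_resume() */
    {
        (void)arg;
        if ( vm_event )                     /* the non-NULL check */
        {
            /* cleanup can complete right here */
            struct write_data *w = vm_event;
            if ( w->do_write_msr )          /* NULL/dangling dereference */
                w->do_write_msr = 0;
        }
        return NULL;
    }

    static void *cleanup_path(void *arg)    /* plays vm_event_cleanup_domain() */
    {
        (void)arg;
        free(vm_event);
        vm_event = NULL;
        return NULL;
    }

    int main(void)
    {
        for ( ;; )                          /* retry until the race hits */
        {
            pthread_t a, b;

            vm_event = calloc(1, sizeof(*vm_event));
            vm_event->do_write_msr = 1;
            pthread_create(&a, NULL, resume_path, NULL);
            pthread_create(&b, NULL, cleanup_path, NULL);
            pthread_join(a, NULL);
            pthread_join(b, NULL);
        }
    }

Whatever the fix ends up looking like, it has to either make the
check-and-use atomic with respect to cleanup (a lock shared by both
paths) or guarantee that no vCPU can be inside that window while the
structures are freed (e.g. keeping the domain paused for the whole
cleanup) - just sketching options here.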


void monitor_guest_request(void)
{
    struct vcpu *curr = current;
    struct domain *d = curr->domain;

    if ( d->monitor.guest_request_enabled )
    {
        vm_event_request_t req = {
            .reason = VM_EVENT_REASON_GUEST_REQUEST,
            .vcpu_id = curr->vcpu_id,
        };

        monitor_traps(curr, d->monitor.guest_request_sync, &req);
    }
}

The guest _may_ try to request an event after that, but the request will
be ignored.

Also, that "subsys_lock" sounds obscure to me, although I admit that I
can't think of a good name at the moment.
I'm referring to monitor, share & paging as "subsystems" of the whole
"vm-events" subsystem (in that sense the 3 aforementioned are
"sub-subsystems").
To me it's acceptable, but I'm open to better names.
If Tamas agrees with you I'll consider myself outvoted.


Thanks,
Razvan

Corneliu.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

