
Re: [PATCH-for-4.14] ioreq: handle pending emulation racing with ioreq server destruction



On 08.06.2020 13:04, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@xxxxxxxx>
>> Sent: 08 June 2020 11:58
>> To: Paul Durrant <paul@xxxxxxx>
>> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx; Paul Durrant <pdurrant@xxxxxxxxxx>; Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
>> Subject: Re: [PATCH-for-4.14] ioreq: handle pending emulation racing with ioreq server destruction
>>
>> On 08.06.2020 11:46, Paul Durrant wrote:
>>> From: Paul Durrant <pdurrant@xxxxxxxxxx>
>>>
>>> When an emulation request is initiated in hvm_send_ioreq() the guest vcpu is
>>> blocked on an event channel until that request is completed. If, however,
>>> the emulator is killed whilst that emulation is pending then the ioreq
>>> server may be destroyed. Thus when the vcpu is awoken the code in
>>> handle_hvm_io_completion() will find no pending request to wait for, but
>>> will leave the internal vcpu io_req.state set to IOREQ_READY and the vcpu
>>> shutdown deferral flag in place (because hvm_io_assist() will never be
>>> called). The emulation request is then completed anyway. This means that
>>> any subsequent call to hvmemul_do_io() will find an unexpected value in
>>> io_req.state and will return X86EMUL_UNHANDLEABLE, which in some cases
>>> will result in continuous re-tries.
>>>
>>> This patch fixes the issue by moving the setting of io_req.state and
>>> clearing of shutdown deferral (as well as MSI-X write completion) out of
>>> hvm_io_assist() and directly into handle_hvm_io_completion().
>>>
>>> Reported-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
>>> Signed-off-by: Paul Durrant <pdurrant@xxxxxxxxxx>
>>
>> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
>> with a question:
>>
>>> --- a/xen/arch/x86/hvm/ioreq.c
>>> +++ b/xen/arch/x86/hvm/ioreq.c
>>> @@ -109,15 +109,7 @@ static void hvm_io_assist(struct hvm_ioreq_vcpu *sv, uint64_t data)
>>>      ioreq_t *ioreq = &v->arch.hvm.hvm_io.io_req;
>>>
>>>      if ( hvm_ioreq_needs_completion(ioreq) )
>>> -    {
>>> -        ioreq->state = STATE_IORESP_READY;
>>>          ioreq->data = data;
>>> -    }
>>> -    else
>>> -        ioreq->state = STATE_IOREQ_NONE;
>>> -
>>> -    msix_write_completion(v);
>>> -    vcpu_end_shutdown_deferral(v);
>>>
>>>      sv->pending = false;
>>>  }
>>
>> With this, is the function worth keeping at all?
> 
> I did think about that, but it is called in more than one place. So, in the 
> interest of trying to make back-porting straightforward, I thought it best to 
> keep it simple.
> 
>>
>> Also I assume the patch has your implied R-a-b?
>>
> 
> Since it has your R-b now, yes :-)

Thanks. As said in the other thread, I think the change here will mask
a problem elsewhere (presumably in qemu). I'm therefore unsure whether
we want to apply it right away, rather than first understanding the
root cause of Marek's specific problem.

Jan
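
For reference, the hunk quoted above shows only the removal side; the
counterpart addition into handle_hvm_io_completion() is not visible in this
message. A minimal sketch of what the relocated logic might look like,
inferred from the commit message rather than quoted from the patch (the
placement within the function and the local variable name "vio" are
assumptions), is:

    struct hvm_vcpu_io *vio = &v->arch.hvm.hvm_io;

    /*
     * Sketch only: run unconditionally once any pending ioreq wait has
     * returned, so the request state is reset and shutdown deferral ended
     * even if the ioreq server was destroyed and hvm_io_assist() never ran.
     */
    vio->io_req.state = hvm_ioreq_needs_completion(&vio->io_req)
                        ? STATE_IORESP_READY : STATE_IOREQ_NONE;

    msix_write_completion(v);
    vcpu_end_shutdown_deferral(v);

Performing this in handle_hvm_io_completion() rather than hvm_io_assist()
covers the case the commit message describes: the vcpu is awoken with no
pending request because the ioreq server has already been destroyed, so
hvm_io_assist() is never reached.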
