[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3] CODING_STYLE: Document how to handle unexpected conditions



On 06.01.2020 17:19, George Dunlap wrote:
> On 12/10/19 1:46 PM, Jan Beulich wrote:
>> On 10.12.2019 11:56, George Dunlap wrote:
>>> On 12/9/19 1:50 PM, Jan Beulich wrote:
>>>> On 09.12.2019 12:29, George Dunlap wrote:
>>>>> --- a/CODING_STYLE
>>>>> +++ b/CODING_STYLE
>>>>> @@ -133,3 +133,97 @@ the end of files.  It should be:
>>>>>   * indent-tabs-mode: nil
>>>>>   * End:
>>>>>   */
>>>>> +
>>>>> +Handling unexpected conditions
>>>>> +------------------------------
>>>>> +
>>>>> +GUIDELINES:
>>>>> +
>>>>> +Passing errors up the stack should be used when the caller is already
>>>>> +expecting to handle errors, and the state when the error was
>>>>> +discovered isn’t broken, or isn't too hard to fix.
>>>>> +
>>>>> +domain_crash() should be used when passing errors up the stack is too
>>>>> +difficult, and/or when fixing up state of a guest is impractical, but
>>>>> +where fixing up the state of Xen will allow Xen to continue running.
>>>>> +This is particularly appropriate when the guest is exhibiting behavior
>>>>> +well-behaved guest should.
>>>>
>>>> DYM "shouldn't"?
>>>
>>> Indeed.
>>
>> (Btw, noticing only now - there's also either an "a" missing, or it
>> wants to be "guests".)
>>
>>>>> +- domain_crash() is similar to BUG_ON(), but with a more limited
>>>>> +effect: it stops that domain immediately.  In situations where
>>>>> +continuing might cause guest or hypervisor corruption, but destroying
>>>>> +the guest allows the hypervisor to continue, this can change a more
>>>>> +serious bug into a guest denial-of-service.  But in situations where
>>>>> +returning an error might be safe, then domain_crash() can change a
>>>>> +benign failure into a guest denial-of-service.
>>>>
>>>> Perhaps further put emphasis on the call tree still getting unwound
>>>> normally, which may imply further actions on the (now dying) domain
>>>> taken. Unfortunately it's not unusual for people to forget this; I
>>>> think the IOMMU code in particular was (hopefully isn't so much
>>>> anymore) a "good" example of this.
>>>
>>> Can you expand on this?  Do you mean to advise that care should be taken
>>> when returning up the callstack that the domain which was running before
>>> may now be dying, and to behave appropriately?
>>
>> One issue is with functions returning void, where the caller won't
>> even know that something may have gone wrong. Another is that
>> typically error paths are less commonly used, and crashing a
>> domain would typically be accompanied by indicating an error to
>> the upper layers. Hence such crashing may trigger unrelated bugs.
>> A third aspect is that, indeed, dying guests may need special
>> treatment (see the already existing ->is_dying checks we have).
>>
>> I mentioned the call tree unwinding in particular because earlier
>> on we had domain_crash_synchronous(), which was there specifically
>> to avoid issues with errors (and the changed state) bubbling back
>> up. But this model had other issues, hence our movement away from
>> it.
> 
> How about a paragraph like this:
> 
> ---
> Note however that domain_crash() has its own traps: callers far up the
> call stack may not realize that the domain is now dying as a result of
> an innocuous-looking operation, particularly if somewhere between the
> failure and the operation, no error is returned.  Using it requires
> careful inspection and documentation of the code to make sure all
> callers at the stack handle a newly-dead domain gracefully.
> ---

Sounds good, thanks.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.