Re: NetBSD dom0 PVH: hardware interrupts stalls



On 27.11.2020 14:31, Manuel Bouyer wrote:
> On Fri, Nov 27, 2020 at 02:18:54PM +0100, Jan Beulich wrote:
>> On 27.11.2020 14:13, Manuel Bouyer wrote:
>>> On Fri, Nov 27, 2020 at 12:29:35PM +0100, Jan Beulich wrote:
>>>> On 27.11.2020 11:59, Roger Pau Monné wrote:
>>>>> --- a/xen/arch/x86/hvm/irq.c
>>>>> +++ b/xen/arch/x86/hvm/irq.c
>>>>> @@ -187,6 +187,10 @@ void hvm_gsi_assert(struct domain *d, unsigned int gsi)
>>>>>       * to know if the GSI is pending or not.
>>>>>       */
>>>>>      spin_lock(&d->arch.hvm.irq_lock);
>>>>> +    if ( gsi == TRACK_IRQ )
>>>>> +        debugtrace_printk("hvm_gsi_assert irq %u trig %u assert count %u\n",
>>>>> +                          gsi, trig, hvm_irq->gsi_assert_count[gsi]);
>>>>
>>>> This produces
>>>>
>>>> 81961 hvm_gsi_assert irq 34 trig 1 assert count 1
>>>>
>>>> Since the logging occurs ahead of the call to assert_gsi(), it
>>>> means we don't signal anything to Dom0, because according to our
>>>> records there's still an IRQ in flight. Unfortunately we only
>>>> see the tail of the trace, so it's not possible to tell how / when
>>>> we got into this state.
>>>>
>>>> Manuel - is this the only patch you have in place? Or did you keep
>>>> any prior ones? Iirc there once was one where Roger also suppressed
>>>> some de-assert call.
>>>
>>> Yes, I have some of the previous patches (otherwise Xen panics).
>>> Attached are the diffs I currently have.
>>
>> I think you want to delete the hunk dropping the call to
>> hvm_gsi_deassert() from pt_irq_time_out(). Iirc it was that
>> addition which changed the behavior to just a single IRQ ever
>> making it into Dom0. And it ought to be only the change to
>> msix_write() which is needed to avoid the panic.
> 
> Yes, I did keep the hvm_gsi_deassert() patch because I expected it
> to make things easier, as it allows interacting with Xen without changing
> interrupt states.

Right, but then we'd need to see the beginning of the trace,
rather than it starting at (in this case) about 95,000. Yet ...

> I removed it, here's a new trace
> 
> http://www-soc.lip6.fr/~bouyer/xen-log12.txt

... hmm, odd - no change at all:

95572 hvm_gsi_assert irq 34 trig 1 assert count 1

I was sort of expecting that this might be where we fail to
set the assert count back to zero. This will need further
thinking, if nothing else about how to turn down the verbosity
without hiding crucial information. Or maybe Roger has got
some idea ...
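
To spell out the state I suspect we are stuck in, here is a minimal
standalone model of the bookkeeping. It is only a sketch, not the code
in xen/arch/x86/hvm/irq.c: every model_* name, NR_GSIS, and the TRIG_*
values are made up for illustration, and I am assuming that "trig 1" in
the trace denotes a level-triggered line. Once the count for a level
GSI is 1, further asserts are swallowed until something resets it:

```c
#include <assert.h>
#include <stdbool.h>

#define NR_GSIS 48                 /* hypothetical size, illustration only */

enum trigger { TRIG_EDGE = 0, TRIG_LEVEL = 1 };   /* assumed encoding */

struct model_irq {
    unsigned int assert_count[NR_GSIS];  /* 1 while a level IRQ is in flight */
    unsigned int delivered[NR_GSIS];     /* times the guest was signalled */
};

/* Returns true if the assert was actually forwarded to the guest. */
static bool model_gsi_assert(struct model_irq *irq, unsigned int gsi,
                             enum trigger trig)
{
    assert(gsi < NR_GSIS);

    /* Level-triggered and already recorded as asserted: nothing is sent. */
    if (trig == TRIG_LEVEL && irq->assert_count[gsi] != 0)
        return false;

    if (trig == TRIG_LEVEL)
        irq->assert_count[gsi] = 1;

    irq->delivered[gsi]++;         /* stands in for the assert_gsi() call */
    return true;
}

/* Clears the in-flight record so the next assert is forwarded again. */
static void model_gsi_deassert(struct model_irq *irq, unsigned int gsi)
{
    assert(gsi < NR_GSIS);
    irq->assert_count[gsi] = 0;
}
```

In this model, if model_gsi_deassert() is never reached for GSI 34,
every later model_gsi_assert() for it returns false, which would match
a trace where the logged count is already 1 on entry.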

Jan



 

