
Re: [Xen-devel] [BUG] panic: "IO-APIC + timer doesn't work" - several people have reproduced



On 17.03.2020 14:48, Jason Andryuk wrote:
> On Wed, Mar 4, 2020 at 11:06 AM Jason Andryuk <jandryuk@xxxxxxxxx> wrote:
>>
>> On Wed, Feb 19, 2020 at 3:25 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>>>
>>> On 18.02.2020 22:45, Andrew Cooper wrote:
>>>> On 18/02/2020 18:43, Jason Andryuk wrote:
>>>>> On Mon, Feb 17, 2020, 8:22 PM Andrew Cooper <andrew.cooper3@xxxxxxxxxx> 
>>>>> wrote:
>>>>>> On 17/02/2020 20:41, Jason Andryuk wrote:
>>>>>>> On Mon, Feb 17, 2020 at 2:46 PM Andrew Cooper 
>>>>>>> <andrew.cooper3@xxxxxxxxxx> wrote:
>>>>>>>> We have multiple bugs.
>>>>>>>>
>>>>>>>> First and foremost, Xen seems totally broken when running in ExtINT
>>>>>>>> mode.  This needs addressing, and ought to be sufficient to let Xen
>>>>>>>> boot, at which point we can try to figure out why it is trying to fall
>>>>>>>> back into 486(ish) compatibility mode.
>>>>> Xen has "enabled ExtINT on CPU#0" while linux has "masked ExtINT on
>>>>> CPU#0" so linux isn't using ExtINT?
>>>>
>>>> It would appear not.  Even more concerningly, on my Kabylake box,
>>>>
>>>> # xl dmesg | grep ExtINT
>>>> (XEN) enabled ExtINT on CPU#0
>>>> (XEN) masked ExtINT on CPU#1
>>>> (XEN) masked ExtINT on CPU#2
>>>> (XEN) masked ExtINT on CPU#3
>>>> (XEN) masked ExtINT on CPU#4
>>>> (XEN) masked ExtINT on CPU#5
>>>> (XEN) masked ExtINT on CPU#6
>>>> (XEN) masked ExtINT on CPU#7
>>>>
>>>> which at first glance suggests that we have something asymmetric being
>>>> set up.
>>>
>>> That's perfectly normal - ExtINT may be enabled on just one CPU,
>>> and that's CPU0 in our case (until such time that we would want
>>> to be able to offline CPU0).
>>
>> Thanks, Jan.  Linux prints masked ExtINT for all 8 CPU threads.
>>
>> I inserted __print_IO_APIC() before the "IO-APIC + timer doesn't work" panic.
>>
>> Using vector-based indexing
>> IRQ to pin mappings:
>> IRQ240 -> 0:2
>>
>> where Linux prints
>> IRQ0 -> 0:2
>>
>> That may just be the difference between Xen printing the Vector vs.
>> Linux printing the IRQ number.
>>
>> Any pointers to what I should investigate?
> 
> I got it to boot past "IO-APIC + timer doesn't work".  I programmed
> the HPET to provide a periodic timer in hpet_resume() on T0.  When I
> actually got it programmed properly, it worked to increment
> pit0_ticks.  I also made timer_interrupt() unconditionally
> pit0_ticks++ though that may not matter.

Hmm, at first glance I would infer that the system gets handed to Xen
with an HPET state that we don't (and probably also shouldn't) expect.
Could you provide HPET_CFG as well as all HPET_Tn_CFG and
HPET_Tn_ROUTE values as hpet_resume() finds them, before it makes any
adjustments to them? Which components / parties are involved in
getting Xen loaded and started?
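
For illustration only, a rough sketch of what such a dump could look like.
It is a standalone program fragment, not a patch against hpet_resume(): the
register offsets and bit positions follow the HPET specification, while the
reader callback type, the helper names and sample_read32() with its values
are assumptions invented for this example. In Xen the callback would be
replaced by whatever 32-bit HPET accessor hpet_resume() already uses.

```c
#include <stdint.h>
#include <stdio.h>

/* Register offsets as laid out in the HPET specification. */
#define HPET_ID             0x000
#define HPET_CFG            0x010
#define HPET_Tn_CFG(n)      (0x100 + (n) * 0x20)
#define HPET_Tn_ROUTE(n)    (0x110 + (n) * 0x20)

/* Bit fields as laid out in the HPET specification. */
#define HPET_ID_NUMBER        0x00001f00u /* index of the last timer */
#define HPET_ID_NUMBER_SHIFT  8
#define HPET_CFG_ENABLE       0x001u      /* main counter running */
#define HPET_CFG_LEGACY       0x002u      /* legacy replacement routing */
#define HPET_TN_ENABLE        0x004u      /* timer interrupt enabled */
#define HPET_TN_PERIODIC      0x008u      /* periodic mode selected */
#define HPET_TN_ROUTE         0x00003e00u /* I/O APIC routing field */
#define HPET_TN_ROUTE_SHIFT   9

/* Hypothetical accessor type: reads one 32-bit HPET register. */
typedef uint32_t (*hpet_read32_fn)(unsigned int offset);

/* Number of timers implemented, derived from the ID register. */
static unsigned int hpet_num_timers(uint32_t id)
{
    return ((id & HPET_ID_NUMBER) >> HPET_ID_NUMBER_SHIFT) + 1;
}

/* I/O APIC input a timer is routed to, taken from its Tn_CFG value. */
static unsigned int hpet_tn_ioapic_route(uint32_t tn_cfg)
{
    return (tn_cfg & HPET_TN_ROUTE) >> HPET_TN_ROUTE_SHIFT;
}

/* Print the registers asked for above; returns the timer count. */
static unsigned int hpet_dump_state(hpet_read32_fn rd)
{
    uint32_t cfg = rd(HPET_CFG);
    unsigned int i, n = hpet_num_timers(rd(HPET_ID));

    printf("HPET_CFG=%08x (enable=%u legacy=%u)\n", (unsigned int)cfg,
           !!(cfg & HPET_CFG_ENABLE), !!(cfg & HPET_CFG_LEGACY));

    for ( i = 0; i < n; i++ )
    {
        uint32_t tn = rd(HPET_Tn_CFG(i));

        printf("HPET_T%u_CFG=%08x (enable=%u periodic=%u route=%u) "
               "HPET_T%u_ROUTE=%08x\n",
               i, (unsigned int)tn,
               !!(tn & HPET_TN_ENABLE), !!(tn & HPET_TN_PERIODIC),
               hpet_tn_ioapic_route(tn),
               i, (unsigned int)rd(HPET_Tn_ROUTE(i)));
    }

    return n;
}

/* Made-up register contents, only so the sketch can be run standalone. */
static uint32_t sample_read32(unsigned int offset)
{
    switch ( offset )
    {
    case HPET_ID:        return 0x00000200u; /* three timers */
    case HPET_CFG:       return HPET_CFG_ENABLE | HPET_CFG_LEGACY;
    case HPET_Tn_CFG(0): return HPET_TN_ENABLE | HPET_TN_PERIODIC;
    default:             return 0;
    }
}
```

Calling hpet_dump_state(sample_read32) prints one line for HPET_CFG and one
line per timer. The point of doing the dump before anything is written is
that it shows the state the firmware / boot loader left behind.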

> Now it panics in pv_destroy_gdt() when it fails "ASSERT(v == current
> || !vcpu_cpu_dirty(v));" when building dom0.  I haven't investigated
> that yet.

This seems entirely unrelated to me.

Jan
