
Re: [Xen-devel] Re: system freeze when processor.ko is loaded during boot



On 03/31/2011 08:23 AM, Haitao Shan wrote:

> I have checked your dump info via the debug key. I saw that the EIPs
> remained the same between two successive dumps. However, without the
> symbols I could not identify which code the kernel was hanging in. Is it
> possible for you to find this information by disassembling the kernel
> binaries (with symbols)?

I think Jan did just that already; I am attaching his analysis again.

> Or could you please repeat your test using
> upstream Xen and kernel so that I can compile the same kernel
> you would be using?

Can do that but it needs some time.

> And I see your CPU is a very old model, UP without 64-bit support and no
> PAE? Right?

It has PAE, but it is UP and has neither 64-bit support nor VT-x.

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 13
model name      : Intel(R) Pentium(R) M processor 2.13GHz
stepping        : 8
cpu MHz         : 800.000
cache size      : 2048 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss tm pbe up bts est tm2
bogomips        : 1596.45
clflush size    : 64
cache_alignment : 64
address sizes   : 32 bits physical, 32 bits virtual
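
The flags line above can be checked mechanically. This is a minimal sketch, assuming the usual Linux /proc/cpuinfo flag names (pae for PAE, lm for 64-bit long mode, vmx for Intel VT-x, up for an SMP kernel running on a uniprocessor); the helper name cpu_features is hypothetical and not part of any tool mentioned in this thread.

```python
# Flags copied from the /proc/cpuinfo output quoted in this message.
FLAGS = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov "
    "clflush dts acpi mmx fxsr sse sse2 ss tm pbe up bts est tm2"
)


def cpu_features(flags_line):
    """Report a few features of interest from a cpuinfo flags string."""
    flags = set(flags_line.split())
    return {
        "PAE": "pae" in flags,                 # physical address extension
        "64-bit (long mode)": "lm" in flags,   # assumed flag name for 64-bit
        "VT-x": "vmx" in flags,                # assumed flag name for Intel VT-x
        "UP": "up" in flags,                   # SMP kernel running on one CPU
    }


if __name__ == "__main__":
    for name, present in cpu_features(FLAGS).items():
        print(f"{name}: {'yes' if present else 'no'}")
```

For the flags listed here, this reports PAE and UP as present and 64-bit and VT-x as absent, which matches the summary given above.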

Martin

--- Begin Message ---
>>> On 29.03.11 at 00:48, Martin Wilck <mwilck@xxxxxxxx> wrote:
> Here is one more capture. It shows that (unfortunately) clocksource=pit
> doesn't help here, and that the xen watchdog hits if I configure it
> (just that the reboot doesn't work, and I can only see the output since
> I've been using the serial console).

The stack evaluates to:

logarithmic_accumulation
update_wall_time
do_timer(0x898d7)
tick_do_update_jiffies64
tick_sched_timer
__run_hrtimer
hrtimer_interrupt
timer_interrupt

(matches the previously sent one, except that there the tick count
passed to do_timer() was "only" 0x179ab).

So the kernel, afaict, is busy recovering from the time jump in Xen.
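
To get a feel for the size of the jump, the tick counts passed to do_timer() can be converted to wall-clock time. This is a rough sketch only: the kernel's HZ setting is not stated in this thread, so the values tried below (100, 250, 1000) are assumptions, and ticks_to_seconds is a hypothetical helper.

```python
def ticks_to_seconds(ticks, hz):
    """Convert a tick (jiffies) count to seconds for a given tick rate."""
    return ticks / hz


# Tick counts seen as the argument to do_timer() in the two captures.
CAPTURES = {"this capture": 0x898d7, "previous capture": 0x179ab}

if __name__ == "__main__":
    for label, ticks in CAPTURES.items():
        for hz in (100, 250, 1000):  # assumed HZ values; actual config unknown
            secs = ticks_to_seconds(ticks, hz)
            print(f"{label}: {ticks} ticks at HZ={hz} is about {secs:.0f} s")
```

Even under the HZ=1000 assumption, the larger count corresponds to more than nine minutes' worth of ticks for the kernel to catch up on.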

It is clearly also a bad sign that the NMI hit while Dom0 was
executing, as that guarantees interrupts aren't disabled (and
hence timer interrupts can occur, and timers would not be
prevented from running - presumably the time jump suppressed
the invocation of, among others, the NMI timer).

Jan


--- End Message ---