xen-ppc-devel
[XenPPC] [RFC] 'xm restore' following boot
'xm restore' immediately following boot usually wedges the CPU.
However, 'xm save' followed by 'xm restore' works fine (even when the
guest domain and htab are relocated to new memory areas).
^AAA shows the following, with .plpar_hcall_norets @ c00000000003af78
and .HYPERVISOR_sched_op @ c00000000004415c:
(XEN) *** Dumping CPU3 state: ***
(XEN) ----[ Xen-3.0-unstable ]----
(XEN) CPU: 00000003 DOMID: 00000001
(XEN) pc c00000000003af88 msr 8000000000009032
(XEN) lr c000000000044210 ctr c000000000044238
(XEN) srr0 ffffffffffffffff srr1 ffffffffffffffff
(XEN) r00: 0000000024555548 c00000000065bcb0 c000000000656630 0000000000000000
(XEN) r04: 0000000000000001 0000000000000000 0000000024555542 c00000000000fc24
(XEN) r08: 00000000ecf515a8 c000000000044238 0000000000989680 c0000000000441a4
(XEN) r12: 0000000001a9f9f8 c00000000052e300 5555555555555555 5555555555555555
(XEN) r16: 5555555555555555 5555555555555555 5555555555555555 5555555555555555
(XEN) r20: 5555555555555555 5555555555555555 5555555555555555 5555555555555555
(XEN) r24: 5555555555555555 5555555555555555 4000000000000000 c000000000000000
(XEN) r28: 0000000000000000 0000000000000010 c00000000053d3c8 0000000000000001
(XEN) reprogram_timer[00] Timeout in the past 0x0000004332DBA479 > 0x00000042C2424DF3
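As a sanity check on the dump above, the pc and lr values can be compared against the two symbol addresses quoted before it. This is only a minimal sketch in Python; the helper name `offset_from` is made up for illustration. It also computes the gap between the two values in the reprogram_timer message, without assuming what unit they are in.

```python
# Values copied from the CPU3 dump and from the symbol addresses quoted above.
PLPAR_HCALL_NORETS = 0xC00000000003AF78
HYPERVISOR_SCHED_OP = 0xC00000000004415C
PC = 0xC00000000003AF88
LR = 0xC000000000044210


def offset_from(symbol_addr, addr):
    """Return addr - symbol_addr (hypothetical helper, illustration only)."""
    return addr - symbol_addr


pc_off = offset_from(PLPAR_HCALL_NORETS, PC)
lr_off = offset_from(HYPERVISOR_SCHED_OP, LR)
print(f".plpar_hcall_norets+{pc_off:#x}")
print(f".HYPERVISOR_sched_op+{lr_off:#x}")

# Gap between the two values printed in the "Timeout in the past" message.
timer_gap = 0x0000004332DBA479 - 0x00000042C2424DF3
print(f"timer gap: {timer_gap:#x} ({timer_gap} units)")
```

The resulting offsets can be compared with the "Exception: 501" frames in the traces further down. If the timer values are nanoseconds (an assumption, the message itself does not say), the gap would be a little under two seconds.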
Here are typical console logs with debug prints and exceptions.
If 'xm restore' is run several times, it will often start working,
though the exceptions still occur. (The user domain has a ramdisk and networking.)
At the bottom is the code at the addresses given by a couple of the exceptions.
1. 'xm restore' following xm save:
cso84:~ # xm console 6
mfdec: -12
TIMEBASE_FREQ: 71592390
Here we're resuming
hid4: 0x6200120000000042
arch_gnttab_map: grant table at d000080080000000
irq_resume()
switch_idle_mm()
mfdec: 14315899
__sti()
xencons_resume()
xenbus_resume()
smp_resume()
mfdec: 63024
returning
netfront: device eth0 has copying receive path.
[user@bringup /]#
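To make the mfdec values in these logs easier to compare, they can be converted to time using the TIMEBASE_FREQ line. This is a rough sketch, assuming the decrementer counts at the rate printed as TIMEBASE_FREQ; the helper name `dec_to_ms` is hypothetical.

```python
TIMEBASE_FREQ = 71592390  # Hz, from the TIMEBASE_FREQ line in the log


def dec_to_ms(ticks, freq=TIMEBASE_FREQ):
    """Convert decrementer ticks to milliseconds (hypothetical helper).

    A negative value means the decrementer has already passed zero.
    """
    return ticks * 1000.0 / freq


# The three mfdec prints from example 1, in the order they appear.
for label, ticks in [("first mfdec", -12),
                     ("second mfdec", 14315899),
                     ("third mfdec", 63024)]:
    print(f"{label:13s} {ticks:9d} ticks  {dec_to_ms(ticks):10.4f} ms")
```

The same conversion can be applied to the mfdec lines of the other examples to see how far away the next decrementer interrupt is when each log stops.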
2. reboot with 'xm restore' that worked the first time:
cso84:~ # xm console 1
mfdec: -14
TIMEBASE_FREQ: 71592390
Here we're resuming
hid4: 0x6000120000000041
arch_gnttab_map: grant table at d000080080000000
irq_resume()
switch_idle_mm()
mfdec: 14315924
__sti()
xencons_resume()
xenbus_resume()
BUG: soft lockup detected on CPU#0!
Call Trace:
[C00000000065B090] [C00000000001062C] .show_stack+0x50/0x1cc (unreliable)
[C00000000065B140] [C00000000008956C] .softlockup_tick+0x100/0x128
[C00000000065B200] [C000000000065BC0] .run_local_timers+0x1c/0x30
[C00000000065B280] [C000000000023C60] .timer_interrupt+0x108/0x4f0
[C00000000065B3B0] [C0000000000034EC] decrementer_common+0xec/0x100
--- Exception: 901 at .handle_IRQ_event+0x4c/0x13c
LR = .__do_IRQ+0x1ac/0x2b4
[C00000000065B6A0] [C0000000005AB7B0] 0xc0000000005ab7b0 (unreliable)
[C00000000065B740] [C000000000089FC8] .__do_IRQ+0x1ac/0x2b4
[C00000000065B800] [C0000000002B7134] .evtchn_do_upcall+0x128/0x1a4
[C00000000065B8C0] [C000000000043664] .xen_get_irq+0x10/0x28
[C00000000065B940] [C00000000000BD7C] .do_IRQ+0x7c/0x100
[C00000000065B9C0] [C0000000000041EC] hardware_interrupt_entry+0xc/0x10
--- Exception: 501 at .plpar_hcall_norets+0x10/0x1c
LR = .HYPERVISOR_sched_op+0xb4/0x10c
[C00000000065BCB0] [C0000000000BDA74] .kmem_cache_free+0xe4/0x2f4 (unreliable)
[C00000000065BD60] [C0000000000455CC] .xen_power_save+0x80/0x98
[C00000000065BDE0] [C0000000000120E4] .cpu_idle+0x14c/0x154
[C00000000065BE70] [C000000000009174] .rest_init+0x44/0x5c
[C00000000065BEF0] [C0000000004E58D8] .start_kernel+0x2a0/0x308
[C00000000065BF90] [C0000000000084FC] .start_here_common+0x50/0x54
smp_resume()
mfdec: 90178
returning
netfront: device eth0 has copying receive path.
[user@bringup /]#
3. reboot with typical wedge:
cso84:~ # xm console 1
mfdec: -12
TIMEBASE_FREQ: 71592390
Here we're resuming
hid4: 0x6000120000000041
arch_gnttab_map: grant table at d000080080000000
irq_resume()
switch_idle_mm()
mfdec: 14315903
__sti()
xencons_resume()
xenbus_resume()
smp_resume()
mfdec: 14218880
returning
BUG: soft lockup detected on CPU#0!
Call Trace:
[C00000000065B090] [C00000000001062C] .show_stack+0x50/0x1cc (unreliable)
[C00000000065B140] [C00000000008956C] .softlockup_tick+0x100/0x128
[C00000000065B200] [C000000000065BC0] .run_local_timers+0x1c/0x30
[C00000000065B280] [C000000000023C60] .timer_interrupt+0x108/0x4f0
[C00000000065B3B0] [C0000000000034EC] decrementer_common+0xec/0x100
--- Exception: 901 at .handle_IRQ_event+0x4c/0x13c
LR = .__do_IRQ+0x1ac/0x2b4
[C00000000065B6A0] [C0000000005AB7B0] 0xc0000000005ab7b0 (unreliable)
[C00000000065B740] [C000000000089FC8] .__do_IRQ+0x1ac/0x2b4
[C00000000065B800] [C0000000002B7134] .evtchn_do_upcall+0x128/0x1a4
[C00000000065B8C0] [C000000000043664] .xen_get_irq+0x10/0x28
[C00000000065B940] [C00000000000BD7C] .do_IRQ+0x7c/0x100
[C00000000065B9C0] [C0000000000041EC] hardware_interrupt_entry+0xc/0x10
--- Exception: 501 at .plpar_hcall_norets+0x10/0x1c
LR = .HYPERVISOR_sched_op+0xb4/0x10c
[C00000000065BCB0] [C0000000000BDA74] .kmem_cache_free+0xe4/0x2f4 (unreliable)
[C00000000065BD60] [C0000000000455CC] .xen_power_save+0x80/0x98
[C00000000065BDE0] [C0000000000120E4] .cpu_idle+0x14c/0x154
[C00000000065BE70] [C000000000009174] .rest_init+0x44/0x5c
[C00000000065BEF0] [C0000000004E58D8] .start_kernel+0x2a0/0x308
[C00000000065BF90] [C0000000000084FC] .start_here_common+0x50/0x54
cso84:~ #
4. reboot with another wedge:
cso84:~ # xm console 1
mfdec: -12
TIMEBASE_FREQ: 71592390
Here we're resuming
hid4: 0x6000120000000041
arch_gnttab_map: grant table at d000080080000000
irq_resume()
switch_idle_mm()
mfdec: 14315908
__sti()
xencons_resume()
xenbus_resume()
BUG: soft lockup detected on CPU#0!
Call Trace:
[C000000001AA3650] [C00000000001062C] .show_stack+0x50/0x1cc (unreliable)
[C000000001AA3700] [C00000000008956C] .softlockup_tick+0x100/0x128
[C000000001AA37C0] [C000000000065BC0] .run_local_timers+0x1c/0x30
[C000000001AA3840] [C000000000023C60] .timer_interrupt+0x108/0x4f0
[C000000001AA3970] [C0000000000034EC] decrementer_common+0xec/0x100
--- Exception: 901 at .plpar_hcall_norets+0x10/0x1c
LR = .HYPERVISOR_event_channel_op+0x34/0x50
[C000000001AA3C60] [C0000000000442E4] .HYPERVISOR_event_channel_op+0x1c/0x50 (unreliable)
[C000000001AA3CF0] [C0000000002BD1F0] .xb_read+0x190/0x2ac
[C000000001AA3E30] [C0000000002BEFD4] .xenbus_thread+0x84/0x278
[C000000001AA3EE0] [C000000000074D08] .kthread+0x158/0x1a8
[C000000001AA3F90] [C000000000028310] .kernel_thread+0x4c/0x68
cso84:~ #
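For reading the "--- Exception: NNN" lines in the traces above: 0x500 and 0x900 are the architected PowerPC external-interrupt and decrementer vectors. The sketch below pulls the code out of a trace line and names the vector. Masking off the low bits is an assumption about how the kernel encodes the trap value (the low bit is taken to be a flag, not part of the vector), and `decode_exception` is a made-up helper name.

```python
import re

# Architected PowerPC interrupt vectors that appear in these traces.
VECTORS = {
    0x500: "external interrupt",
    0x900: "decrementer",
}


def decode_exception(line):
    """Return (vector, name) for a '--- Exception: NNN' trace line, else None.

    Assumption: the low four bits of the printed value are flags and are
    masked off to obtain the vector.
    """
    match = re.search(r"Exception:\s*([0-9a-fA-F]+)", line)
    if match is None:
        return None
    vector = int(match.group(1), 16) & ~0xF
    return vector, VECTORS.get(vector, "unknown")


for text in ["--- Exception: 901 at .handle_IRQ_event+0x4c/0x13c",
             "--- Exception: 501 at .plpar_hcall_norets+0x10/0x1c"]:
    print(decode_exception(text))
```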
Some code for example 3:
--- Exception: 901 at .handle_IRQ_event+0x4c/0x13c : c000000000089d2c
0:mon> di c000000000089d20
c000000000089d20 7c0000a6 mfmsr r0
c000000000089d24 60008000 ori r0,r0,32768
c000000000089d28 7c010164 mtmsrd r0,1
c000000000089d2c 7c7d07b4 extsw r29,r3
c000000000089d30 48000010 b c000000000089d40 # .handle_IRQ_event+0x60/0x13c
c000000000089d34 ebff0028 ld r31,40(r31)
c000000000089d38 2fbf0000 cmpdi cr7,r31,0
c000000000089d3c 419e005c beq cr7,c000000000089d98 # .handle_IRQ_event+0xb8/0x13c
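The mfmsr / ori r0,r0,32768 / mtmsrd sequence in the listing above ORs 0x8000 into the MSR, and the 901 exception address (c000000000089d2c) is the instruction immediately after it. The sketch below names a subset of the 64-bit PowerPC MSR bits so that the immediate and the msr value from the CPU3 dump can be read off. Only a few bits are listed, and the helper name `msr_flags` is hypothetical.

```python
# A subset of 64-bit PowerPC MSR bits (not the complete register layout).
MSR_BITS = {
    "SF": 1 << 63,  # 64-bit mode
    "HV": 1 << 60,  # hypervisor state
    "EE": 0x8000,   # external interrupts enabled
    "PR": 0x4000,   # problem (user) state
    "FP": 0x2000,   # floating point available
    "ME": 0x1000,   # machine check enabled
    "IR": 0x20,     # instruction relocation
    "DR": 0x10,     # data relocation
    "RI": 0x2,      # recoverable interrupt
}


def msr_flags(msr):
    """Return the names of the known bits set in msr (illustration only)."""
    return [name for name, mask in MSR_BITS.items() if msr & mask]


# Immediate used by "ori r0,r0,32768" in the listing.
print(msr_flags(32768))
# msr value from the CPU3 dump.
print(msr_flags(0x8000000000009032))
```

So the ori sets EE, and the dump's msr likewise has EE set with PR and HV clear.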
--- Exception: 501 at .plpar_hcall_norets+0x10/0x1c : c00000000003af88
0:mon> di c00000000003af78
c00000000003af78 7c421378 mr r2,r2
c00000000003af7c 7c000026 mfcr r0
c00000000003af80 90010008 stw r0,8(r1)
c00000000003af84 44000022 svca 8
c00000000003af88 80010008 lwz r0,8(r1)
c00000000003af8c 7c0ff120 mtcr r0
c00000000003af90 4e800020 blr
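The word 44000022 at c00000000003af84 is shown by the disassembler as "svca 8". Decoded by the sc instruction format (primary opcode 17, LEV in instruction bits 20-26, bit 30 set), it is a system call with LEV = 1, and the pc in the CPU3 dump (c00000000003af88) is the instruction right after it. A small sketch of that decode follows; the function name `decode_sc` is made up for illustration.

```python
def decode_sc(word):
    """Return the LEV field if word is an sc instruction, else None.

    Uses IBM bit numbering (bit 0 is the most significant bit of the
    32-bit word): opcode in bits 0-5, LEV in bits 20-26, bit 30 set.
    """
    opcode = (word >> 26) & 0x3F
    lev = (word >> 5) & 0x7F
    bit30 = (word >> 1) & 1
    if opcode != 17 or bit30 != 1:
        return None
    return lev


SC_ADDR = 0xC00000000003AF84   # address of the 44000022 word in the listing
DUMP_PC = 0xC00000000003AF88   # pc from the CPU3 dump

print("LEV =", decode_sc(0x44000022))
print("pc is next instruction:", DUMP_PC == SC_ADDR + 4)
```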
_______________________________________________
Xen-ppc-devel mailing list
Xen-ppc-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ppc-devel
- [XenPPC] [RFC] 'xm restore' following boot,
poff <=