Hi Dan,
Some time ago, you reported that Anthony's "fully virtualize psr and
ipsr" patch caused the system to crash during Linux compilation.
This panic seems to be solved by applying the vcpu_translate patch.
However, the original gcc segmentation fault still occurs on
this system and remains to be solved. Details below.
psr.patch:
http://lists.xensource.com/archives/html/xen-ia64-devel/2005-11/msg00312.html
vcpu_translate.patch:
http://lists.xensource.com/archives/html/xen-ia64-devel/2006-03/msg00328.html
The panic occurs while handling a TLB miss following an "itc"
instruction. The console output is below:
(XEN) vcpu_translate: bad physical address: a00000010000a090
(XEN) translate_domain_pte: bad mpa=000000010000a090 (> 0000000018000000),
vadr=a00000010000a090,pteval=001000010000a761,itir=0000000000000038
(XEN) lookup_domain_mpa: bad mpa 000000010000a090 (> 0000000018000000
(XEN) handle_op: can't handle privop at 0xa00000010000a090
(op=0x000001a7a7a7a7a7)
slot 0 (type=5), ipsr=0000101208026010
(XEN) priv_emulate: priv_handle_op fails, isr=0000000000000000
(XEN) $$$$$ PANIC in domain 1 (k6=f000000007f98000): psr.dt off,
trying to deliver nested dtlb!
(XEN)
(XEN) CPU 0
(XEN) psr : 0000101208026010 ifs : 800000000000040e ip : [<a00000010000a090>]
(XEN) ip is at ???
(XEN) unat: 0000000000000000 pfs : c00000000000040e rsc : 000000000000000f
(XEN) rnat: 0000000000000000 bsps: 60000fff7fffc160 pr : 000000000555a261
(XEN) ldrs: 0000000000700000 ccv : 0010000001c585a1 fpsr: 0009804c8a70033f
(XEN) csd : 0000000000000000 ssd : 0000000000000000
(XEN) b0 : a00000010000a070 b6 : 20000000001f8780 b7 : 0000000000000000
(XEN) f6 : 000000000000000000000 f7 : 000000000000000000000
(XEN) f8 : 000000000000000000000 f9 : 000000000000000000000
(XEN) f10 : 000000000000000000000 f11 : 000000000000000000000
(XEN) r1 : 60000000000021f0 r2 : 0000000000000000 r3 : 0000000000000308
(XEN) r8 : 0000000000000000 r9 : 20000000002c64a0 r10 : 0000000000000000
(XEN) r11 : c00000000000040e r12 : 60000fffffaa7610 r13 : 20000000002d06a0
(XEN) r14 : 0000000000000030 r15 : 6000000000100000 r16 : 6000000000100000
(XEN) r17 : 0000000001bf4200 r18 : 0010000001c585a1 r19 : 0001800000000040
(XEN) r20 : 000000001613c000 r21 : 0000000000000000 r22 : 5fffff0000000000
(XEN) r23 : 000000001613c000 r24 : 0000000000000038 r25 : 0010000001c585e1
(XEN) r26 : 0010000001c585a1 r27 : 0000000000000038 r28 : 0000000000000000
(XEN) r29 : 4000000000001870 r30 : a00000010000a070 r31 : 000000000555a2a1
(XEN) vcpu_translate: bad physical address: 60000fff7fffc1d0
(XEN) translate_domain_pte: bad mpa=00000fff7fffc1d0 (> 0000000018000000),
vadr=60000fff7fffc1d0,pteval=00100fff7fffc761,itir=0000000000000038
(XEN) lookup_domain_mpa: bad mpa 00000fff7fffc1d0 (> 0000000018000000
(XEN) r32 : f0000000f0000000 r33 : f0000000f0000000 r34 : f0000000f0000000
(XEN) r35 : f0000000f0000000 r36 : f0000000f0000000 r37 : f4f4f4f4f4f4f4f4
(XEN) r38 : f4f4f4f4f4f4f4f4 r39 : f4f4f4f4f4f4f4f4 r40 : f4f4f4f4f4f4f4f4
(XEN) r41 : f4f4f4f4f4f4f4f4 r42 : f4f4f4f4f4f4f4f4 r43 : f4f4f4f4f4f4f4f4
(XEN) r44 : f4f4f4f4f4f4f4f4 r45 : f4f4f4f4f4f4f4f4
(XEN) BUG at domain.c:339
(XEN) bad hyperprivop; ignored
(XEN) iim=0, iip=f0000000040203d0
(XEN) bad hyperprivop; ignored
(XEN) iim=0, iip=f0000000040203d0
One of the messages above is:
(XEN) vcpu_translate: bad physical address: a00000010000a090
The address "a00000010000a090" points to the instruction below.
a00000010000a090: cb 00 64 00 2e 04 [MMI] (p06) itc.d r25;;
When the VMM tries to fetch the opcode in order to call priv_handle_op(),
the fetch seems to trigger a TLB miss, which causes domU to hang.
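As a cross-check on the disassembly above, the leading bytes of the bundle can be decoded by hand. The following is a minimal sketch, not Xen code: the helper name and the partial template table are hypothetical, and the bit layout (template in bits 4:0, slot 0 starting at bit 5 with its qualifying predicate in the low 6 bits) is taken from my reading of the Itanium architecture manual.

```python
# Sketch only: decode the template and slot-0 predicate from the leading
# bytes of an IA-64 bundle. Helper names are hypothetical, not Xen code.

# Partial template table (MMI family only); "|" marks an
# instruction-group stop. Assumed from the Itanium architecture manual.
MMI_TEMPLATES = {
    0x08: "M M I",
    0x09: "M M I |",
    0x0A: "M | M I",
    0x0B: "M | M I |",
}


def decode_bundle_prefix(raw: bytes):
    """Return (template, slot0_qp) from at least the first 2 bundle bytes."""
    low = int.from_bytes(raw[:2], "little")  # bundles are little-endian
    template = low & 0x1F         # bits 4:0 of the bundle
    slot0_qp = (low >> 5) & 0x3F  # slot 0 starts at bit 5; qp is its low 6 bits
    return template, slot0_qp


prefix = bytes.fromhex("cb0064002e04")  # bytes shown in the disassembly
template, qp = decode_bundle_prefix(prefix)
print(hex(template), MMI_TEMPLATES.get(template), f"p{qp:02d}")
```

The decoded predicate is p06 and the template has a stop after slot 0, which agrees with the "(p06) itc.d r25;;" shown by the disassembler.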
From the message, the domain appears to be in metaphysical mode
after executing an "rsm psr.dt" instruction, and the faulting address is
in region 5.
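The region number and the "bad mpa" values in the log can be reproduced with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and treating the metaphysical address as the virtual address with its three region bits cleared is an assumption that happens to match the values printed in the log, not a statement about how Xen computes it.

```python
# Sketch only: reproduce the address arithmetic visible in the console log.
# Function names are hypothetical and are not the Xen implementation.

REGION_SHIFT = 61                       # IA-64: VA bits 63:61 select the region
OFFSET_MASK = (1 << REGION_SHIFT) - 1
MPA_LIMIT = 0x0000000018000000          # limit printed in the "bad mpa" lines


def region_of(vadr: int) -> int:
    """Region number (0-7) of a 64-bit virtual address."""
    return vadr >> REGION_SHIFT


def metaphysical_mpa(vadr: int) -> int:
    # Assumption: in metaphysical mode the region bits are simply dropped.
    return vadr & OFFSET_MASK


def page_size_from_itir(itir: int) -> int:
    """Page size in bytes from the itir.ps field (bits 7:2)."""
    ps = (itir >> 2) & 0x3F
    return 1 << ps


for vadr in (0xA00000010000A090, 0x60000FFF7FFFC1D0):
    mpa = metaphysical_mpa(vadr)
    print(f"vadr={vadr:016x} region={region_of(vadr)} "
          f"mpa={mpa:016x} bad={mpa > MPA_LIMIT}")
print("page size:", page_size_from_itir(0x38))
```

With the two addresses from the log, the first falls in region 5 and the second in region 3, and both resulting mpa values exceed the printed limit, which is why both are reported as bad.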
This situation is similar to the problem that the vcpu_translate patch
tries to solve. The patch fixes vcpu_translate() so that
the guest OS does not operate in metaphysical mode in such a case.
We have run the same test program on Xen 3.0-unstable with CSet#9395
(which includes the vcpu_translate patch), and it ran throughout the
weekend without causing any panic. In contrast, the original Xen without
the patch crashes within 2 hours. However, the original gcc
segmentation faults still occurred on the system, so neither CSet#8671
nor #9395 seems to solve the original problem.
Thanks,
Shuji
>Hi Anthony --
>
>Since things have stabilized, I decided to give this patch
>some testing, primarily to see if it might fix the gcc
>segmentation faults that Fujita and I have been seeing.
>Without this patch, I am able to compile Linux 20 times
>on domU; generally 1 or 2 of the compiles fails because
>of the gcc segfault. With the patch, Xen *crashed*
>on the sixth Linux compile (first try) and halfway
>through the first Linux compile (second try). This
>is on a Tiger4, but I currently am not able to get console
>output so I don't have any information about the crash --
>other than that the machine didn't reboot. Could you see
>if you could reproduce this?
>
>As an aside, turning off the FAST_BREAK, FAST_ACCESS_REFLECT,
>and FAST_RFI features (which your patch turns off) slowed
>down the benchmark by about 4%.
>
>Thanks,
>Dan
>
>> -----Original Message-----
>> From: Xu, Anthony [mailto:anthony.xu@xxxxxxxxx]
>> Sent: Sunday, November 27, 2005 8:22 PM
>> To: Magenheimer, Dan (HP Labs Fort Collins)
>> Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>> Subject: [Xen-ia64-devel] [PATCH] fully virtualize psr and
>> ipsr on non-VTI domain
>>
>> Dan,
>> This patch is intended to fully virtualize psr and ipsr on non-VTI
>> domain.
>> Following things are done in this patch.
>> 1, previously when guest reads psr, it always get psr dt rt it equal
>> to 1. that is because HV doesn't restore these information,
>> metaphysical_mode can't present all these information. I save these
>> information into privregs->vpsr. Thus guest can get correct
>> information about dt, rt and it.
>> 2, when guest reads psr, we should only return low 32bits and 35 and
>> 36 bits, previously return all bits.
>> 3, when guest rsm and ssm psr, HV rsm and ssm some bits of current psr
>> which is used by HV, that is not correct, guest rsm and ssm should
>> only impact guest psr(that is regs->ipsr).
>> 4, mistakenly uses guest DCR, guest DCR should impact guest psr when
>> injecting interruption into guest, but not impact guest ipsr.
>> When injecting interruption into guest,The current implementation is
>> Guest ipsr.be=guest dcr.be
>> Guest ipsr.pp=guest dcr.pp
>> Correct implementation should be,
>> Guest psr.be=guest dcr.be
>> Guest psr.pp=guest dcr.pp.
>>
>> Because of above modifications, I turn off FAST_RFI, FAST_BREAK and
>> FAST_ACCESS_REFLECT.
>>
>> Signed-off-by Anthony Xu < anthony.xu@xxxxxxxxx>
>>
>> One question, why do we need to virtualize guest psr.pp and always set
>> guest psr.pp to 1?
>>
>> Thanks
>> -Anthony
_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel