Hi Dan,
Yes, we also got a segmentation fault in 1 run out of 30.
Could you please try this new patch?
Thanks,
-Anthony
>-----Original Message-----
>From: Magenheimer, Dan (HP Labs Fort Collins) [mailto:dan.magenheimer@xxxxxx]
>Sent: April 28, 2006 22:49
>To: Xu, Anthony; Tristan Gingold; xen-ia64-devel@xxxxxxxxxxxxxxxxxxx;
>Williamson, Alex (Linux Kernel Dev)
>Subject: RE: [Xen-ia64-devel] PATCH: slightly improve stability
>
>Hi Anthony --
>
>I tried your patch overnight and still got a segmentation
>fault in 1 run out of 50. I didn't try Tristan's patch yet,
>so will try both at the same time next... perhaps there
>are two different problems that show up as the segmentation
>fault.
>
>Dan
>
>> -----Original Message-----
>> From: Xu, Anthony [mailto:anthony.xu@xxxxxxxxx]
>> Sent: Thursday, April 27, 2006 9:19 PM
>> To: Xu, Anthony; Tristan Gingold;
>> xen-ia64-devel@xxxxxxxxxxxxxxxxxxx; Magenheimer, Dan (HP Labs
>> Fort Collins); Williamson, Alex (Linux Kernel Dev)
>> Subject: RE: [Xen-ia64-devel] PATCH: slightly improve stability
>>
>> Hi Tristan,
>> Could you please check whether this patch addresses the RSE issue?
>>
>> Yes, Intel QA team is doing the test in the meantime.
>>
>>
>> Thanks,
>> -Anthony
>>
>> >-----Original Message-----
>> >From: xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx
>> >[mailto:xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Xu, Anthony
>> >Sent: April 28, 2006 9:48
>> >To: Tristan Gingold; xen-ia64-devel@xxxxxxxxxxxxxxxxxxx; Magenheimer, Dan
>> >(HP Labs Fort Collins); Alex Williamson
>> >Subject: RE: [Xen-ia64-devel] PATCH: slightly improve stability
>> >
>> >>From: xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx
>> >>[mailto:xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Tristan Gingold
>> >>Sent: April 27, 2006 23:14
>> >>To: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx; Magenheimer, Dan (HP Labs Fort
>> >>Collins); Alex Williamson
>> >>Subject: [Xen-ia64-devel] PATCH: slightly improve stability
>> >>
>> >>Hi,
>> >>
>> >>as reported earlier, this patch seems to improve stability: crashes are
>> >>at least more coherent and maybe less frequent.
>> >>
>> >>RSE handling seems to have a bug: crashes are now due to either a bad
>> >>value in a stacked register or a use of an invalid stacked register
>> >>(although cfm seems correct in gdb!)
>> >
>> >I'm looking at this too.
>> >Yes, there is a bug in handle_lazy_cover.
>> >
>> >void ia64_do_page_fault (unsigned long address, unsigned long isr,
>> >                         struct pt_regs *regs, unsigned long itir)
>> >{
>> > unsigned long iip = regs->cr_iip, iha;
>> > // FIXME should validate address here
>> > unsigned long pteval;
>> > unsigned long is_data = !((isr >> IA64_ISR_X_BIT) & 1UL);
>> > IA64FAULT fault;
>> >
>> > if ((isr & IA64_ISR_IR) && handle_lazy_cover(current, isr, regs))
>> >     return;
>> >
>> >This code sequence is intended to handle the following scenario.
>> >
>> >1. The Guest executes br.ret. This may cause a mandatory RSE load, and
>> >this load may cause a TLB miss.
>> >2. The VMM gets control, but the VMM can't handle this TLB miss itself,
>> >so the VMM injects the TLB miss into the Guest TLB miss handler. When the
>> >VMM executes "rfi" to jump to the Guest TLB miss handler, this TLB miss
>> >happens again.
>> >3. At this time, interrupt_collection_enabled is 0, so handle_lazy_cover
>> >executes "cover" on behalf of the Guest and returns to the Guest TLB miss
>> >handler again. This time there is no TLB miss.
>> >
>> >
>> >The following code sequence is in the ia64_leave_kernel path, with psr.ic
>> >and psr.i off. When br.ret.dptk.many b0 is executed, there may be a
>> >mandatory load, and thus there may be a TLB miss. According to the above
>> >description, handle_lazy_cover executes "cover" on behalf of the Guest and
>> >returns to the Guest, which is not correct in this scenario.
>> >
>> >I didn't find an easy way to fix this bug.
>> >
>> >
>> > mov loc6=0
>> > mov loc7=0
>> >(pRecurse) br.call.dptk.few b0=rse_clear_invalid
>> > ;;
>> > mov loc8=0
>> > mov loc9=0
>> > cmp.ne pReturn,p0=r0,in1 // if recursion count != 0, we need to do a br.ret
>> > mov loc10=0
>> > mov loc11=0
>> >(pReturn) br.ret.dptk.many b0
>> >#endif /* !CONFIG_ITANIUM */
>> ># undef pRecurse
>> ># undef pReturn
>> > ;;
>> > alloc r17=ar.pfs,0,0,0,0 // drop current register frame
>> > ;;
>> > loadrs
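
The ambiguity described above can be modelled in a few lines of C. This is only a sketch under stated assumptions, not Xen code: the struct, the enum and both functions are hypothetical names invented for illustration, and the `in_vmm_exit_path` flag stands for information that, per the message, the real check does not have. It also assumes the guest's interrupt_collection_enabled reads as 0 in both cases, as the description implies.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model only; none of these names exist in Xen. */
struct fault_ctx {
    bool isr_ir;            /* isr.ir set: fault taken during a mandatory RSE load */
    bool guest_ic_enabled;  /* the Guest's virtual interrupt_collection_enabled */
    bool in_vmm_exit_path;  /* hypothetical: fault came from the VMM's own br.ret */
};

enum action { COVER_FOR_GUEST, HANDLE_NORMALLY };

/* The check as the message describes it: it looks only at isr.ir and
 * the Guest's collection bit, so it cannot tell who caused the fault. */
enum action check_as_described(const struct fault_ctx *c)
{
    if (c->isr_ir && !c->guest_ic_enabled)
        return COVER_FOR_GUEST;
    return HANDLE_NORMALLY;
}

/* The same check if the origin of the fault were known (an assumption). */
enum action check_with_origin(const struct fault_ctx *c)
{
    if (c->in_vmm_exit_path)
        return HANDLE_NORMALLY;
    return check_as_described(c);
}

/* Walks through the two cases from the message. */
void check_model(void)
{
    struct fault_ctx guest_rfi  = { true, false, false }; /* steps 1-3 */
    struct fault_ctx vmm_br_ret = { true, false, true };  /* ia64_leave_kernel */

    assert(check_as_described(&guest_rfi) == COVER_FOR_GUEST);
    /* Both cases look identical to the described check: the reported bug. */
    assert(check_as_described(&vmm_br_ret) == COVER_FOR_GUEST);
    assert(check_with_origin(&vmm_br_ret) == HANDLE_NORMALLY);
}
```

The second function is not a proposed fix; it only shows which extra piece of information would separate the two cases, and the message itself notes that no easy fix was found.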
>> >
>> >Thanks,
>> >Anthony
>> >
>> >
>> >>
>> >>Tested by doing many Linux kernel compilations (> 100) in an SMP domU.
>> >>
>> >>Tristan.
>> >
>> >_______________________________________________
>> >Xen-ia64-devel mailing list
>> >Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>> >http://lists.xensource.com/xen-ia64-devel
>>
RSE_incomplete_cfm.patch