>With CONFIG_IA64_SPLIT_CACHE on, a new user may encounter
>the problem on a shipping machine and the symptom is that
>the machine immediately crashes when a domU is launched.
Dan,
That means dom0 can boot with CONFIG_IA64_SPLIT_CACHE on, and PAL_CACHE_FLUSH
was invoked successfully during dom0 boot. So this is not a PAL_CACHE_FLUSH
issue; there must be some other problem. Could you provide more information
about the crash? We can't reproduce it here.
Thanks.
-Anthony
>-----Original Message-----
>From: Magenheimer, Dan (HP Labs Fort Collins) [mailto:dan.magenheimer@xxxxxx]
>Sent: December 22, 2005 21:26
>To: Yang, Fred; Xu, Anthony; Tian, Kevin; xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>Subject: RE: CONFIG_IA64_SPLIT_CACHE was: [Xen-ia64-devel] Console problem on
>domU on tip?
>
>With CONFIG_IA64_SPLIT_CACHE on, a new user may encounter
>the problem on a shipping machine and the symptom is that
>the machine immediately crashes when a domU is launched.
>
>With CONFIG_IA64_SPLIT_CACHE off, a developer may encounter
>a different problem on an unreleased machine.
>
>I know that you are focused primarily on the unreleased machine,
>but in this case, I think we should be cautious for the new user
>as the developer knows to change the option when running
>on the unreleased machine.
>
>I will spend some more time on this when I have a chance.
>I think it is a real bug (probably PAL accessing some address
>which isn't pinned) that occurs only on some boxes due
>to some factor like memory configuration.
>
>Thanks,
>Dan
>
>P.S. The debug output just before the crash was:
>ia64_fault: General Exception: IA-64 Reserved Register/Field fault (data access): reflecting
>
>> -----Original Message-----
>> From: Yang, Fred [mailto:fred.yang@xxxxxxxxx]
>> Sent: Wednesday, December 21, 2005 10:34 PM
>> To: Magenheimer, Dan (HP Labs Fort Collins); Xu, Anthony;
>> Tian, Kevin; xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>> Subject: CONFIG_IA64_SPLIT_CACHE was: [Xen-ia64-devel]
>> Console problem on domU on tip?
>>
>> Dan,
>>
>> Can we suggest always turning on CONFIG_IA64_SPLIT_CACHE as
>> the default build configuration? People may not be aware of
>> this build flag and miss it on each new build.
>>
>> All the newer-generation ia64 processors will come with a
>> split I/D cache, as discussed in the previous mail thread,
>> and the Itanium architecture documents a possible split
>> cache for future implementations. With the option off by
>> default, this is a potential bug for all Tiger4 systems
>> used for daily development and for future platforms to come.
>>
>> Your mail also indicates that only the HP rx2620 system has
>> the issue, not the other HP boxes. Can you track down this
>> issue rather than putting in a kludge for the rx2620 box?
>>
>> Thanks,
>>
>> -Fred
>>
>>
>> Magenheimer, Dan (HP Labs Fort Collins) wrote:
>> > Committed (but without removal of ifdefs until we
>> > track down this problem).
>> >
>> >> -----Original Message-----
>> >> From: Xu, Anthony [mailto:anthony.xu@xxxxxxxxx]
>> >> Sent: Monday, December 19, 2005 7:15 PM
>> >> To: Magenheimer, Dan (HP Labs Fort Collins); Tian, Kevin;
>> >> xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>> >> Subject: RE: [Xen-ia64-devel] Console problem on domU on tip?
>> >>
>> >> I guess the firmware on your machine may not implement
>> >> this PAL call because there was no split I/D cache at that
>> >> time, so when you make this PAL call it returns
>> >> PAL_STATUS_UNIMPLEMENTED. Could you please turn on
>> >> CONFIG_IA64_SPLIT_CACHE and try this new patch to see
>> >> whether your machine can boot domain0?
>> >> If this patch works, could you please remove all the
>> >> CONFIG_IA64_SPLIT_CACHE macros?
>> >>
>> >> Thanks
>> >> -Anthony
>> >>
>> >>> -----Original Message-----
>> >>> From: Magenheimer, Dan (HP Labs Fort Collins)
>> >> [mailto:dan.magenheimer@xxxxxx]
>> >>> Sent: December 19, 2005 23:48
>> >>> To: Xu, Anthony; Tian, Kevin; xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>> >>> Subject: RE: [Xen-ia64-devel] Console problem on domU on tip?
>> >>>
>> >>> I have been distracted tracking another bug...
>> >>>
>> >>> Here's where I got:
>> >>>
>> >>> The machine is a new (April 2005) HP rx2620 so it is
>> >>> not old firmware. I can't reproduce it on a machine
>> >>> with an ITP (which does have older firmware).
>> >>>
>> >>> This PAL call is never used in Linux, though there is a
>> >>> routine coded for it. It is the only
>> >>> PAL call coded in Linux that occurs with psr.ic off.
>> >>>
>> >>> The crash I am seeing occurs either during the PAL call or
>> >>> immediately upon return.
>> >>>
>> >>> Is it OK to
>> >>>
>> >>>
>> >>>> -----Original Message-----
>> >>>> From: Xu, Anthony [mailto:anthony.xu@xxxxxxxxx]
>> >>>> Sent: Monday, December 19, 2005 2:02 AM
>> >>>> To: Tian, Kevin; Magenheimer, Dan (HP Labs Fort Collins);
>> >>>> xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>> >>>> Subject: RE: [Xen-ia64-devel] Console problem on domU on tip?
>> >>>>
>> >>>> Dan,
>> >>>> Have you had time to verify the discussion below?
>> >>>>
>> >>>> Thanks
>> >>>> -Anthony
>> >>>>
>> >>>>> -----Original Message-----
>> >>>>> From: Tian, Kevin
>> >>>>> Sent: December 16, 2005 10:16
>> >>>>> To: Xu, Anthony; 'Magenheimer, Dan (HP Labs Fort Collins)';
>> >>>>> 'xen-ia64-devel@xxxxxxxxxxxxxxxxxxx'
>> >>>>> Subject: RE: [Xen-ia64-devel] Console problem on domU on tip?
>> >>>>>
>> >>>>>> From: Xu, Anthony
>> >>>>>> Sent: December 16, 2005 9:54
>> >>>>>>
>> >>>>>>> Also, why panic if it fails?
>> >>>>>>>
>> >>>>>
>> >>>>> Panic is not required here; we could just print a warning
>> >>>>> message. The panic was kept there to help our debugging in
>> >>>>> the early stage.
>> >>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>> Does the problem happen only on VTI? Or both VTI and non-VTI
>> >>>>>>> on split-cache machines?
>> >>>>>>
>> >>>>>> Sometimes it makes domain0 crash at the very beginning of the
>> >>>>>> domain0 boot process, especially on MP machines.
>> >>>>>>
>> >>>>>>
>> >>>>>> Thanks
>> >>>>>> -Anthony
>> >>>>>
>> >>>>> One addition: that problem definitely exists on new
>> >>>>> split-cache processors, for dom0/domU. For a VTI domain, we have
>> >>>>> logic within the device model to ensure consistency.
>> >>>>>
>> >>>>> Thanks,
>> >>>>> Kevin
>> >>>>>>
>> >>>>>>
>> >>>>>>> -----Original Message-----
>> >>>>>>> From: Magenheimer, Dan (HP Labs Fort Collins)
>> >>>>>> [mailto:dan.magenheimer@xxxxxx]
>> >>>>>>> Sent: December 16, 2005 1:39
>> >>>>>>> To: Tian, Kevin; xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
>> >>>>>>> Cc: Xu, Anthony
>> >>>>>>> Subject: RE: [Xen-ia64-devel] Console problem on domU on tip?
>> >>>>>>>
>> >>>>>>>>> Is this code fragment necessary for VTI to boot domU
>> >>>>>>>>> or is it OK to remove?
>> >>>>>>>>
>> >>>>>>>> The comment is inaccurate; it should say domU. That I/D
>> >>>>>>>> cache sync step is mandatory to boot domU on new IA64
>> >>>>>>>> processors, which have a split L2 I/D cache. Without such an
>> >>>>>>>> I/D cache sync, the control panel loads domU's kernel image,
>> >>>>>>>> which only affects the D-side cache. If there are stale
>> >>>>>>>> entries in the I-side cache within the same range as the
>> >>>>>>>> dom0 image, people will see the machine going weird.
>> >>>>>>>
>> >>>>>>> I don't understand... how can there be stale entries in the
>> >>>>>>> I-cache? The instructions have just been written to memory
>> >>>>>>> (through D-cache) and no instructions in this domain have yet
>> >>>>>>> been executed.
>> >>>>>>> I do see that the D-cache needs to be flushed so that memory
>> >>>>>>> is coherent, but are there better ways to do that without a
>> >>>>>>> pal call?
>> >>>>>>>
>> >>>>>>>> Normally an I/D cache sync shouldn't cause any problem.
>> >>>>>>>> Possibly there's some problem with the pal calling code,
>> >>>>>>>> like an incorrect ITLB mapping for pal or a similar issue...
>> >>>>>>>
>> >>>>>>> Although the ia64_pal_cache_flush routine is defined in
>> >>>>>>> linux's pal.h, it doesn't appear to be used anywhere in Linux
>> >>>>>>> so there is no use model to copy. I suspect there is some use
>> >>>>>>> model for the call that we don't understand, for example maybe
>> >>>>>>> it should only be called with physical &progress? It
>> >>>>>>> definitely fails every time on one of my (newer) machines and
>> >>>>>>> disabling the pal call makes the problem go away.
>> >>>>>>>
>> >>>>>>>> Though it's intermittent, please
>> >>>>>>>> keep this code
>> >>>>>>>> there for correctness.
>> >>>>>>>
>> >>>>>>> Since the call is definitely failing under some circumstances
>> >>>>>>> that we don't understand, I'm inclined to at least put the
>> >>>>>>> code in an #ifdef CONFIG_SPLIT_CACHE
>> >>>>>>>
>> >>>>>>> Does the problem happen only on VTI? Or both VTI and non-VTI
>> >>>>>>> on split-cache machines?
>> >>>>>>>
>> >>>>>>> Thanks,
>> >>>>>>> Dan
>> >>>>>>>
>> >>>>>>> P.S. I tried Anthony's patch (which moves the PAL call after
>> >>>>>>> new_thread()) but it still crashes.
>>
>>
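
For readers following the thread, here is a minimal sketch of the
"warn instead of panic" handling suggested above, including the case
where firmware reports the call as unimplemented. It is not the Xen
code. The status constants, the cache_flush_fn type, and the fake_*
helpers are all hypothetical names invented for this illustration,
and the flush routine is passed in as a function pointer so that the
sketch runs in user space without any real PAL call.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed status values for this sketch only (0 = success, negative =
 * error). They are stand-ins, not taken from any header in the thread. */
#define SKETCH_PAL_STATUS_SUCCESS        0L
#define SKETCH_PAL_STATUS_UNIMPLEMENTED (-1L)

/* Hypothetical stand-in for a wrapper around the cache flush call. */
typedef long (*cache_flush_fn)(void);

/* Try to sync the caches. On failure, print a warning and carry on
 * instead of panicking. Returns 1 if the flush succeeded, 0 otherwise. */
int sync_icache_or_warn(cache_flush_fn flush)
{
    long status = flush();

    if (status == SKETCH_PAL_STATUS_SUCCESS)
        return 1;

    if (status == SKETCH_PAL_STATUS_UNIMPLEMENTED)
        fprintf(stderr, "warning: cache flush not implemented by firmware, skipping\n");
    else
        fprintf(stderr, "warning: cache flush failed, status %ld\n", status);
    return 0;
}

/* Fake flush routines used only to exercise the three paths. */
long fake_flush_ok(void)            { return SKETCH_PAL_STATUS_SUCCESS; }
long fake_flush_unimplemented(void) { return SKETCH_PAL_STATUS_UNIMPLEMENTED; }
long fake_flush_error(void)         { return -3L; }

/* Small self-check showing the intended behaviour of each path. */
void self_check(void)
{
    assert(sync_icache_or_warn(fake_flush_ok) == 1);
    assert(sync_icache_or_warn(fake_flush_unimplemented) == 0);
    assert(sync_icache_or_warn(fake_flush_error) == 0);
}
```

Whether skipping the flush is safe on a given machine is exactly the
open question in the thread; the sketch only shows the control flow.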
_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel