xen-ia64-devel

Re: [Xen-ia64-devel] EFI Mapping Windows Install Crash Bug

To: Simon Horman <horms@xxxxxxxxxxxx>
Subject: Re: [Xen-ia64-devel] EFI Mapping Windows Install Crash Bug
From: Isaku Yamahata <yamahata@xxxxxxxxxxxxx>
Date: Mon, 14 Jul 2008 12:22:21 +0900
Cc: Alex Williamson <alex.williamson@xxxxxx>, xen-ia64-devel <xen-ia64-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Sun, 13 Jul 2008 20:22:27 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20080701010326.GE10877@xxxxxxxxxxxx>
References: <20080701010326.GE10877@xxxxxxxxxxxx>
Sender: xen-ia64-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.6i

Hi Simon-san.

I have attached four patches; please review them and integrate
them into your patchset if appropriate.
Although I haven't split them up yet, I'm posting them now because
I think they help to show the issues.

  remove-warning-and-vpd-itr-fix.patch
  cleanup-pal-vaddr.patch
  dont-map-pal-code.patch
  purge-pal-code-after-firmware-call.patch


I reviewed the kexec patches and tried them on my test machine.
The Xen/IA64 VMM failed to boot, even on an initial (non-kexec) boot.
I created some patches to fix this, and with those patches
Xen/IA64 can boot. (I'm going to test kexec next.)

The main issue is this:
  The rule for how PAL code is pinned down was changed.
  Until now, PAL code was always pinned down by
  itr[IA64_TR_PALCODE]. That has been changed so that
  PAL code is pinned down right before calling firmware,
  by set_one_rr_efi() with region 7.
  However, without the attached patch, PAL code is always
  pinned down at a wrong address calculated by efi_get_pal_addr().

  The attached patch fixes the calculation in efi_get_pal_addr()
  and makes the functions which switch rr7 not pin down PAL code.

thanks.

Signed-off-by: Isaku Yamahata <yamahata@xxxxxxxxxxxxx>


On Tue, Jul 01, 2008 at 11:03:28AM +1000, Simon Horman wrote:
> Hi,
> 
> I'm a bit hesitant to jump the gun, but I think that I might have
> isolated the cause of win2k3-sp2 crashing during install when my EFI
> Mapping patches are applied. Well, perhaps not the cause, but I think I
> know where it is dying.
> 
>     Quickly as background, the EFI Mapping patches move the mapping
>     that EFI is taught at boot time to map memory where Linux places
>     it ( basically pa + (0xe<<60) ) instead of where Xen usually
>     places it ( basically pa + (0xf<<60) ). In order to protect this
>     mapping from HVM domains a special region id is used. The
>     hypervisor switches to that region id just before making any
>     PAL, SAL or EFI calls, and switches back to the previous region
>     id once the call completes.  As region 7 has to be changed,
>     entries that are pinned into the TLB have to be repinned. And
>     that is roughly where the fun begins.
> 
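
The address arithmetic in the background paragraph above can be sketched as
follows. This is only an illustration: the macro and function names are
hypothetical and are not taken from the Xen or Linux sources. The one
architectural fact it relies on is that an IA-64 virtual address carries its
region number in bits 63:61, so both offsets fall in region 7.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets named in the text; the macro names are hypothetical. */
#define LINUX_EFI_OFFSET (UINT64_C(0xe) << 60)
#define XEN_EFI_OFFSET   (UINT64_C(0xf) << 60)

/* Where Linux places the EFI mapping: pa + (0xe << 60). */
static uint64_t efi_va_linux(uint64_t pa)
{
    assert(pa < (UINT64_C(1) << 60)); /* keep the sum inside the offset's range */
    return pa + LINUX_EFI_OFFSET;
}

/* Where Xen usually places the EFI mapping: pa + (0xf << 60). */
static uint64_t efi_va_xen(uint64_t pa)
{
    assert(pa < (UINT64_C(1) << 60));
    return pa + XEN_EFI_OFFSET;
}

/* IA-64 virtual region number: bits 63:61 of the virtual address. */
static unsigned region_of(uint64_t va)
{
    return (unsigned)(va >> 61);
}
```

Both efi_va_linux() and efi_va_xen() yield addresses for which region_of()
returns 7, which is consistent with the remark that region 7 has to be
changed when the mapping is moved.
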
> As for the problem? It seems to be caused by ia64_mca_cpe_int_caller()
> calling ia64_log_queue() which calls ia64_sal_get_state_info(). I
> believe that the hypervisor dies in ia64_log_queue() somewhere after
> ia64_sal_get_state_info() completes. That is, I am suspecting that the
> call to ia64_sal_get_state_info() is returning bogus data.
> 
> Furthermore, my traces seem to indicate that the problem arises when the
> call to ia64_log_queue() and in turn to ia64_sal_get_state_info() is
> made when the region id is already switched to make some other PAL, SAL
> or EFI call (though I doubt it is particularly important which one).
> 
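
The nesting described above (a firmware call made while the region id is
already switched for another firmware call) can be modelled with a small toy
program. This is a sketch under stated assumptions, not Xen code: rr7 is a
plain variable, the rid values are arbitrary, and every name here is
hypothetical. It only shows how a depth counter could confirm from traces that
such a nested call happened.

```c
#include <assert.h>

/* Toy stand-ins for region 7 register contents; the values are arbitrary. */
enum { RID_NORMAL = 1, RID_EFI = 2 };

static int rr7 = RID_NORMAL;  /* simulated region register 7 */
static int fw_depth = 0;      /* firmware calls currently in progress */
static int max_fw_depth = 0;  /* deepest nesting observed so far */

typedef void (*fw_fn)(void);

/* Switch to the EFI rid, run the call, then switch back to the previous rid. */
static void firmware_call(fw_fn fn)
{
    int saved = rr7;          /* each call keeps its own saved value */

    rr7 = RID_EFI;
    if (++fw_depth > max_fw_depth)
        max_fw_depth = fw_depth;
    fn();
    fw_depth--;
    rr7 = saved;
}

/* Stand-in for a SAL call; it expects to run under the EFI rid. */
static void sal_get_state_info_stub(void)
{
    assert(rr7 == RID_EFI);
}

/* Stand-in for a firmware call interrupted by a handler that makes a SAL call. */
static void efi_call_interrupted_stub(void)
{
    firmware_call(sal_get_state_info_stub);
}
```

Calling firmware_call(efi_call_interrupted_stub) leaves max_fw_depth at 2,
which is the situation the traces point to. Whether the real switching code
tolerates that nesting depends on details this model does not capture.
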
> This seems to make sense to me as ia64_mca_cpe_int_caller() is
> "Triggered by sw interrupt from CPE polling routine.".
> 
> I am unsure about what to do about this problem, but for testing
> purposes I simply removed the call to ia64_log_queue() from
> ia64_mca_cpe_int_caller() and things seem to work.
> 
> When I say seem to work, this bug does not manifest every time I install
> win2k3-sp2. So it can be hard to tell if a change has improved things or
> not. But for now, I have not seen a crash occur with this hack in place
> (+ various other changes which may or may not be relevant, but this one
> seems to be particularly important).
> 
> I will investigate my theory that things die in ia64_log_queue()
> further. But I wonder if there might be a way to permanently remove/move
> the call to ia64_log_queue() out of ia64_mca_cpe_int_caller() and
> possibly other PAL, SAL or EFI calls inside other MCA code.
> 

-- 
yamahata

Attachment: 01-remove-warning-and-vpd-itr-fix.patch
Description: Text Data

Attachment: 02-cleanup-pal-vaddr.patch
Description: Text Data

Attachment: 03-dont-map-pal-code.patch
Description: Text Data

Attachment: 04-purge-pal-code-after-firmware-call.patch
Description: Text Data
