On Tue, Jul 01, 2008 at 09:20:27PM +1000, Simon Horman wrote:
> On Tue, Jul 01, 2008 at 08:04:16PM +0900, Isaku Yamahata wrote:
> > On Tue, Jul 01, 2008 at 05:34:42PM +1000, Simon Horman wrote:
> > > On Tue, Jul 01, 2008 at 04:07:53PM +0900, Isaku Yamahata wrote:
> > > > On Tue, Jul 01, 2008 at 11:03:28AM +1000, Simon Horman wrote:
> > > > > Hi,
> > > > >
> > > > > I'm a bit hesitant to jump the gun, but I think that I might have
> > > > > isolated the cause of win2k3-sp2 crashing during install when my EFI
> > > > > Mapping patches are applied. Well, perhaps not the cause, but I think I
> > > > > know where it is dying.
> > > > >
> > > > > Quickly, as background: the EFI Mapping patches change the mapping
> > > > > that EFI is taught at boot time so that memory is mapped where Linux
> > > > > places it (basically pa + (0xe << 60)) instead of where Xen usually
> > > > > places it (basically pa + (0xf << 60)). In order to protect this
> > > > > mapping from HVM domains a special region id is used. The
> > > > > hypervisor switches to that region id just before making any
> > > > > PAL, SAL or EFI calls, and switches back to the previous region
> > > > > id once the call completes. As region 7 has to be changed,
> > > > > entries that are pinned into the TLB have to be repinned. And
> > > > > that is roughly where the fun begins.
> > > > >
> > > > > As for the problem? It seems to be caused by ia64_mca_cpe_int_caller()
> > > > > calling ia64_log_queue() which calls ia64_sal_get_state_info(). I
> > > > > believe that the hypervisor dies in ia64_log_queue() somewhere after
> > > > > ia64_sal_get_state_info() completes. That is, I am suspecting that the
> > > > > call to ia64_sal_get_state_info() is returning bogus data.
> > > >
> > > > Is ia64_mca_cpe_int_caller() called in interrupt context?
> > > > If so, ia64_log_queue() calls xmalloc() which can't be called
> > > > from interrupt context. Then xen VMM crashes at ASSERT(!in_irq())
> > > > in _xmalloc().
> > >
> > > That is a good point. Although xmalloc() is only called if
> > > ia64_sal_get_state_info() returns a non-zero value. Which
> > > according to tracing that I have done this afternoon, does
> > > not seem to be the case (when ia64_log_queue() is called
> > > from other places in mca.c).
> > >
> > > How can I check if the call is being made in interrupt context?
> >
> > in_irq()?
> > Anyway, I noticed ia64_mca_cpe_int_caller() is an irq handler, so it is
> > always called from interrupt context. So ia64_log_queue() has to be
> > fixed in case ia64_sal_get_state_info() returns > 0.
>
> I'm actually not sure that code path ever gets exercised,
> because as you say, if it did then the ASSERT(!in_irq()) in
> _xmalloc() would be triggered.
>
> This seems to imply that ia64_sal_get_state_info() always returns 0
> if called from an interrupt context - my debugging backs this up.
I suppose fault injection or something like that might be needed to
test the execution path.
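
To illustrate what such a fault-injection test could look like, here is a
minimal user-space sketch. It is only a model of the code path discussed
above, under the assumption that the allocation happens only when the
state-info call returns a size greater than 0. Every name in it (fake_in_irq,
fake_get_state_info, fake_alloc, fake_log_queue, run_in_irq) is a
hypothetical stand-in, not a real Xen symbol.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the real Xen symbols. */
static bool fake_in_irq;         /* models in_irq() */
static size_t injected_size;     /* fault injection: forced return value */
static int alloc_in_irq_count;   /* how often the forbidden path was hit */

/* Models ia64_sal_get_state_info(): returns the injected record size. */
static size_t fake_get_state_info(void)
{
    return injected_size;
}

/* Models _xmalloc(): counts a violation instead of hitting an ASSERT. */
static void fake_alloc(size_t n)
{
    (void)n;
    if (fake_in_irq)
        alloc_in_irq_count++;
}

/* Models the shape of ia64_log_queue(): allocate only if a record exists. */
static void fake_log_queue(void)
{
    size_t total = fake_get_state_info();

    if (total > 0)
        fake_alloc(total);
}

/* Runs the handler path in simulated interrupt context with a forced size. */
static int run_in_irq(size_t size)
{
    assert(!fake_in_irq);        /* the sketch is not reentrant */
    alloc_in_irq_count = 0;
    injected_size = size;
    fake_in_irq = true;
    fake_log_queue();
    fake_in_irq = false;
    return alloc_in_irq_count;
}
```

With a size of 0 injected, the allocation is never reached, which would match
what has been observed so far. With any size greater than 0, the model reaches
the allocator while in interrupt context, which is the path that would trip
ASSERT(!in_irq()) in the real code.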
>
>
> As for the EFI RID related problem that I am seeing, I am getting some
> good results by translating the log_buffer argument to
> ia64_sal_get_state_info() to an EFI virtual address (basically 0xe...
> instead of 0xf...). I am sure that I tried this before and it failed.
> But this time it seems to be working, so perhaps it is a combination of
> this change and other changes.
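
For reference, the translation described above can be sketched roughly as
follows. This is only an illustration assuming the two bases mentioned
earlier in the thread (pa + (0xe << 60) for the EFI mapping and
pa + (0xf << 60) for the usual Xen mapping). The macro and function names are
hypothetical and are not taken from the patches.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bases, taken from the description earlier in this thread. */
#define EFI_VIRT_BASE (UINT64_C(0xe) << 60)
#define XEN_VIRT_BASE (UINT64_C(0xf) << 60)

/* Physical address -> address in the EFI (Linux-style) mapping. */
static uint64_t pa_to_efi_va(uint64_t pa)
{
    assert((pa >> 60) == 0);     /* pa must not already carry a base */
    return pa + EFI_VIRT_BASE;
}

/* Physical address -> address in the usual Xen mapping. */
static uint64_t pa_to_xen_va(uint64_t pa)
{
    assert((pa >> 60) == 0);
    return pa + XEN_VIRT_BASE;
}

/* Xen virtual address (0xf...) -> EFI virtual address (0xe...). */
static uint64_t xen_va_to_efi_va(uint64_t va)
{
    assert((va >> 60) == 0xf);   /* only valid for the Xen mapping */
    return (va - XEN_VIRT_BASE) + EFI_VIRT_BASE;
}
```

In this model, a buffer pointer such as log_buffer would go through something
like xen_va_to_efi_va() before being handed to the firmware call, so that the
firmware sees an address that is valid under the EFI region id.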
While reviewing the patches, I noticed that only xen/arch/ia64/xen/ivt.S
is patched, but xen/arch/ia64/xen/vmx_ivt.S isn't.
Isn't it necessary to make a similar change to vmx_ivt.S?
--
yamahata