xen-ia64-devel

Re: [Xen-ia64-devel] EFI Mapping Windows Install Crash Bug

On Wed, Jul 02, 2008 at 07:53:27PM +0900, Isaku Yamahata wrote:
> On Wed, Jul 02, 2008 at 04:20:33PM +1000, Simon Horman wrote:
> 
> > I have done some more investigations and it does really
> > seem that calling ia64_sal_get_state_info() via ia64_log_queue()
> > in ia64_mca_cpe_int_caller() causes the hypervisor to lock
> > up when my EFI RR patches are applied.
> > 
> > As you point out, if xmalloc() was ever called by ia64_log_queue()
> > in this context then a BUG would be triggered. As we are not
> > seeing that in the wild, then that case must not occur (or occur
> > so rarely that no one has seen and reported it yet). This means
> > that ia64_sal_get_state_info() must be returning zero.
> > 
> > If I understand correctly, ia64_log_queue() does more or less nothing
> > if ia64_sal_get_state_info() returns zero. Or in other words, if
> > ia64_sal_get_state_info() returns zero then for one reason or another there is no
> > information available at that time - we know that because if
> > there was information available then xmalloc() would be called and
> > a BUG would be triggered.
> > 
> > 
> > Given that without the EFI RR patches the call to ia64_log_queue()
> > in ia64_mca_cpe_int_caller() seems to do nothing and with them
> > a crash occurs, I wonder if the best way forward is to simply
> > remove the call.
> > 
> > The section on SAL_GET_STATE (==ia64_sal_get_state_info()) in the System
> > Abstraction Layer Specification (Dec 2003) does state "In response to
> > the MCA, Processor CMC, or Corrected Platform event, The operating
> > system must call the procedure to obtain all the pending processor and
> > platform error information that triggered the event."
> > 
> > Does that apply to situations when ia64_mca_cpe_int_caller() is called?
> > And if so, can calling ia64_log_queue() be deferred?
> 
> ia64_mca_cpe_int_caller() is triggered by the polling timer,
> cpe_poll_timer, which sends IA64_CPEP_VECTOR. So I think
> ia64_log_queue() can be deferred by using softirq or tasklet.
> 
> To be honest, taking a rough look at the SAL specification I don't
> understand why the VMM locks up when ia64_sal_get_state_info() is called.
> You stated that when ia64_log_queue() is called, the RID is already
> EFI's. Have you tracked down the reason, and which firmware
> call (PAL/SAL/EFI) is involved?

I think that it varies, but I will check my logs.

> And have you tracked down where the hypervisor locks up?
> i.e. does the hypervisor lock up in ia64_sal_get_state_info() around
> the SAL call or right in the SAL call?

It appears to lock up right in the SAL call.
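
One way to tell "around the SAL call" apart from "right in the SAL call" is to bracket the call with stage markers and inspect the last marker after the hang. The following is only a minimal sketch of that idea in plain C: every name in it (traced_call, last_stage, stub_firmware) is hypothetical and none of it is Xen or SAL code. In a real lock-up the marker would have to be read from somewhere that survives the hang, which this sketch does not attempt.

```c
#include <stddef.h>

/* Hypothetical stage markers; not part of Xen or of the SAL interface. */
enum call_stage {
    STAGE_IDLE,
    STAGE_BEFORE_CALL,  /* argument setup, before entering firmware */
    STAGE_IN_CALL,      /* control has been handed to the firmware */
    STAGE_AFTER_CALL    /* firmware returned, caller is cleaning up */
};

/* volatile so that each store is actually performed in program order */
static volatile enum call_stage last_stage = STAGE_IDLE;

typedef long (*firmware_fn)(void);

/* Wrap a firmware-style call so the last marker shows how far we got. */
static long traced_call(firmware_fn fn)
{
    long ret;

    last_stage = STAGE_BEFORE_CALL;
    /* (real code would set up arguments and any mappings here) */
    last_stage = STAGE_IN_CALL;
    ret = fn();
    last_stage = STAGE_AFTER_CALL;
    return ret;
}

/* Stub standing in for the firmware: records the marker it observes. */
static enum call_stage seen_inside = STAGE_IDLE;

static long stub_firmware(void)
{
    seen_inside = last_stage;
    return 0;
}
```

If the last marker is STAGE_IN_CALL after a hang, the call never returned, which matches the "right in the SAL call" case; STAGE_BEFORE_CALL or STAGE_AFTER_CALL would point at the caller instead.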

> If the lock up happens in the SAL call, what we can do is to take
> a closer look at the SAL spec and make sure of the calling conditions.
> If the lock up happens before or after the SAL call, presumably
> it would indicate a xen/ia64 vmm bug.

Ok, I will look through the specification (the new one at the link
you sent in your next email) and see if I can find anything.
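
For reference, the deferral discussed above (do nothing but flag the work in the interrupt path, then fetch and log the record later, outside interrupt context) can be modelled roughly as below. This is a sketch under stated assumptions, not the Xen implementation: all names are hypothetical stand-ins, malloc() stands in for xmalloc(), and the deferred function plays the role that a softirq or tasklet would play.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* All names below are hypothetical; none are Xen or SAL APIs. */
static const char *fake_record = NULL; /* pending "error record", NULL if none */
static bool in_interrupt = false;      /* models being in interrupt context */
static bool log_pending = false;       /* work queued for the deferred path */
static int records_logged = 0;

/* Stand-in for the firmware query: copies the pending record into buf
 * and returns its size in bytes, or 0 when nothing is pending. */
static size_t model_get_state_info(char *buf, size_t cap)
{
    size_t n;

    if (fake_record == NULL)
        return 0;
    n = strlen(fake_record) + 1;
    if (n > cap)
        n = cap;
    memcpy(buf, fake_record, n);
    buf[n - 1] = '\0';
    fake_record = NULL;                /* the record has been consumed */
    return n;
}

/* Stand-in for the logging step. Returns 0 if nothing was pending,
 * 1 if a record was logged, -1 if it could not be logged. */
static int model_log_queue(void)
{
    char buf[64];
    char *copy;
    size_t n = model_get_state_info(buf, sizeof(buf));

    if (n == 0)
        return 0;                      /* no record: no allocation at all */
    if (in_interrupt)
        return -1;                     /* allocating here is what would BUG */
    copy = malloc(n);
    if (copy == NULL)
        return -1;
    memcpy(copy, buf, n);
    records_logged++;
    free(copy);
    return 1;
}

/* Interrupt side: only note that work is needed, then return. */
static void model_cpe_interrupt(void)
{
    in_interrupt = true;
    log_pending = true;
    in_interrupt = false;
}

/* Deferred side, run later outside interrupt context. */
static int model_run_deferred(void)
{
    if (!log_pending)
        return 0;
    log_pending = false;
    return model_log_queue();
}
```

In this model the firmware query and the allocation both happen only on the deferred side, so the interrupt path neither enters firmware nor allocates.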

