 Hi, 
  
On Monday, 28 October 2019, 18:30:12 CET, Stonehouse, Robert wrote:
> This is a heads-up as I have observed that the following commit (backported onto an Amazon 4.11 tree) causes kexec (and hence kdump) to fail.  
> ======== 
> commit c719519a4183d0630121f6abeba420f49dbc3229 
> Author: Jan Beulich <jbeulich@xxxxxxxx> 
> AuthorDate: Fri Jul 5 10:32:41 2019 +0200 
> Commit: Jan Beulich <jbeulich@xxxxxxxx> 
> CommitDate: Fri Jul 5 10:32:41 2019 +0200 
>  
>     x86/SMP: don't try to stop already stopped CPUs
>      
>     In particular with an enabled IOMMU (but not really limited to this 
>     case), trying to invoke fixup_irqs() after having already done 
>     disable_IO_APIC() -> clear_IO_APIC() is a rather bad idea: 
> ======== 
>  
> The test was performing "echo c > /proc/sysrq-trigger" in dom0 and the loaded crash kernel fails to show any signs of starting. This is the end of the Xen console ... 
> ======== 
> (XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds. 
> (XEN) Resetting with ACPI MEMORY or I/O RESET_REG. 
> <machine hangs here then reboots via the BIOS after 5 seconds> 
> ======== 
> Expected behaviour is that the kdump kernel immediately loads and then performs the crash dump 
  
I can confirm this behavior, but with the Xen version (4.11.0_08-1) from
SuSE SLES12 SP4, which does not contain commit
c719519a4183d0630121f6abeba420f49dbc3229. However, I only see it on
systems with newer Intel CPUs such as the
"Intel(R) Xeon(R) Gold 6242 CPU".
 Dietmar. 
  
>  
> I'm sorry that I have not yet had time to check if this affects vanilla stable-4.11 or master. I just wanted to be certain that you don't have the same issue. 
>  
>  
> Reverting one hunk via the following commit fixes things for me (this is an experiment and not at all a proposed fix) 
> ======== 
> --- a/xen/arch/x86/smp.c 
> +++ b/xen/arch/x86/smp.c 
> @@ -303,15 +303,15 @@ static void stop_this_cpu(void *dummy) 
>  void smp_send_stop(void) 
>  { 
>      unsigned int cpu = smp_processor_id(); 
> +     
> +    local_irq_disable(); 
> +    fixup_irqs(cpumask_of(cpu), 0); 
> +    local_irq_enable(); 
>   
>      if ( num_online_cpus() > 1 ) 
>      { 
>          int timeout = 10; 
>   
> -        local_irq_disable(); 
> -        fixup_irqs(cpumask_of(cpu), 0); 
> -        local_irq_enable(); 
> - 
>          smp_call_function(stop_this_cpu, NULL, 0); 
>   
>          /* Wait 10ms for all other CPUs to go offline. */ 
> ======== 
>  
> Regards 
> Rob 
>  
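
To make the effect of the quoted hunk easier to see, here is a toy model of the two orderings in smp_send_stop(). It is a sketch, not Xen code: struct stop_trace, model_stop_committed(), model_stop_reverted() and model_self_check() are hypothetical names, and the model covers only what is visible in the hunk (whether fixup_irqs() and the stop IPI are reached for a given number of online CPUs). It makes no claim about why kexec hangs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; every name here is hypothetical and nothing calls into Xen. */
struct stop_trace {
    bool fixup_irqs_called; /* stands in for fixup_irqs(cpumask_of(cpu), 0) */
    bool stop_ipi_sent;     /* stands in for smp_call_function(stop_this_cpu, NULL, 0) */
};

/* Ordering with commit c719519a applied: fixup only if other CPUs are online. */
struct stop_trace model_stop_committed(unsigned int online_cpus)
{
    struct stop_trace t = { false, false };

    if ( online_cpus > 1 )
    {
        t.fixup_irqs_called = true;
        t.stop_ipi_sent = true;
    }
    return t;
}

/* Ordering with the experimental revert: fixup always runs, before the check. */
struct stop_trace model_stop_reverted(unsigned int online_cpus)
{
    struct stop_trace t = { false, false };

    t.fixup_irqs_called = true;
    if ( online_cpus > 1 )
        t.stop_ipi_sent = true;
    return t;
}

/* The two orderings differ only when a single CPU is online. */
void model_self_check(void)
{
    assert(!model_stop_committed(1).fixup_irqs_called);
    assert(model_stop_reverted(1).fixup_irqs_called);
    assert(model_stop_committed(4).fixup_irqs_called ==
           model_stop_reverted(4).fixup_irqs_called);
}
```

In this model the variants behave the same whenever more than one CPU is online. They diverge only if smp_send_stop() is entered with a single online CPU, in which case the committed ordering skips fixup_irqs() entirely. Whether that is the situation on the kexec path here is an assumption that would need checking against the real tree.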