
Re: [Xen-devel] [PATCH] x86/kexec: harden kexec path by entering with NMIs latched



>>> On 18.07.18 at 23:38, <igor.druzhinin@xxxxxxxxxx> wrote:
> On certain occasions the platform might generate NMIs during the kexec
> transition: cascades of NMIs following the first one, escalated Master
> Aborts following IOMMU shutdown on the transition itself, etc.
> Purgatory code is currently not prepared for any sort of exception or
> interrupt handling, including NMI handling. This results in intermittent
> failures to enter the kdump kernel on certain events or platforms,
> caused by a Triple Fault in purgatory.
> 
> It's possible to start loading the kdump kernel in NMI context, with
> further NMIs latched, which postpones NMI handling until the kernel
> itself enables regular interrupts; NMIs should then be unlatched as
> soon as the first interrupt arrives.

However, the kernel may not expect NMIs to be disabled (I think btw
that "latched" is an imprecise term here, but then again I'm not a
native speaker). In particular the (re-)enabling of NMIs as a side
effect of the first regular interrupt (or exception) may come too late.

Furthermore the disabled state will last only until the first IRET, and
there's nothing precluding the purgatory from having one for whatever
purpose. I think this at least needs spelling out (as it places an
additional requirement on the purgatory).
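
For reference, the flow being discussed is roughly the following (the
callback name is made up and its signature is only assumed to match
Xen's nmi_callback_t): the crashing CPU installs an NMI callback,
raises an NMI on itself via crash_self_nmi(), and the remainder of the
transition then runs entirely in NMI context.

    static int nmi_kexec_callback(const struct cpu_user_regs *regs, int cpu)
    {
        /*
         * Further NMIs stay latched only for as long as no IRET gets
         * executed - hence the requirement that neither this path nor
         * the purgatory contain one before the new kernel takes over.
         */
        do_kexec_crash();    /* hands off to kexec_reloc(); never returns */

        return 1;            /* unreachable */
    }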

> --- a/xen/arch/x86/crash.c
> +++ b/xen/arch/x86/crash.c
> @@ -35,6 +35,41 @@ static cpumask_t waiting_to_crash;
>  static unsigned int crashing_cpu;
>  static DEFINE_PER_CPU_READ_MOSTLY(bool, crash_save_done);
>  
> +void crash_self_nmi(void)
> +{
> +    /* Poor mans self_nmi().  __stop_this_cpu() has reverted the LAPIC

Comment style (more of the same further down).

> +     * back to its boot state, so we are unable to rely on the regular
> +     * apic_* functions, due to 'x2apic_enabled' being possibly wrong.
> +     * (The likely scenario is that we have reverted from x2apic mode to
> +     * xapic, at which point #GPFs will occur if we use the apic_*
> +     * functions)
> +     */
> +    switch ( current_local_apic_mode() )
> +    {
> +        u32 apic_id;
> +
> +    case APIC_MODE_X2APIC:
> +        apic_id = apic_rdmsr(APIC_ID);
> +
> +        apic_wrmsr(APIC_ICR, APIC_DM_NMI | APIC_DEST_PHYSICAL
> +                   | ((u64)apic_id << 32));
> +        break;
> +
> +    case APIC_MODE_XAPIC:
> +        apic_id = GET_xAPIC_ID(apic_mem_read(APIC_ID));
> +
> +        while ( apic_mem_read(APIC_ICR) & APIC_ICR_BUSY )
> +            cpu_relax();
> +
> +        apic_mem_write(APIC_ICR2, apic_id << 24);
> +        apic_mem_write(APIC_ICR, APIC_DM_NMI | APIC_DEST_PHYSICAL);
> +        break;
> +
> +    default:
> +        break;

Doing nothing here is bogus: The caller relies on the NMI to have been
raised. Otoh I realize you only move this code to a function.
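
Purely as an illustration of what a non-silent default case could look
like (not something this patch needs to address, since the code is only
being moved):

    default:
        /*
         * Neither xAPIC nor x2APIC mode: the NMI the caller relies on
         * cannot be raised, so don't fall through silently.
         */
        BUG();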

> +    }
> +}

I don't think you can reasonably return from here. As I've observed
following Andrew's comment in 5191c1ef51, the synchronous behavior
does not appear to apply to older CPUs: I've seen such a self-NMI get
delivered only several instructions later, on a Westmere system iirc.
Hence I think you want to move the loop-over-hlt() into here (and add
"noreturn" to the function declaration).

> @@ -145,10 +146,25 @@ void machine_reboot_kexec(struct kexec_image *image)
>      BUG();
>  }
>  
> +static struct kexec_image *kexec_image;
> +static void do_kexec_crash(void)

Please have a blank line between these two.

> +{
> +    unsigned long reloc_flags = 0;

unsigned int, as you touch (move) this anyway?

> +    if ( kexec_image->arch == EM_386 )
> +        reloc_flags |= KEXEC_RELOC_FLAG_COMPAT;
> +
> +    kexec_reloc(page_to_maddr(kexec_image->control_code_page),
> +                page_to_maddr(kexec_image->aux_page),
> +                kexec_image->head, kexec_image->entry_maddr, reloc_flags);
> +}
> +
>  void machine_kexec(struct kexec_image *image)
>  {
>      int i;
> -    unsigned long reloc_flags = 0;
> +    unsigned int cpu = smp_processor_id();

Take the opportunity and make i unsigned int as well?
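
Taken together, the two nits above would result in something like
(sketch, otherwise as in the quoted hunk):

    static struct kexec_image *kexec_image;

    static void do_kexec_crash(void)
    {
        unsigned int reloc_flags = 0;

        if ( kexec_image->arch == EM_386 )
            reloc_flags |= KEXEC_RELOC_FLAG_COMPAT;

        kexec_reloc(page_to_maddr(kexec_image->control_code_page),
                    page_to_maddr(kexec_image->aux_page),
                    kexec_image->head, kexec_image->entry_maddr,
                    reloc_flags);
    }

    void machine_kexec(struct kexec_image *image)
    {
        unsigned int i;
        unsigned int cpu = smp_processor_id();

        /* ... rest of the function unchanged ... */
    }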

Jan

