Re: [Xen-devel] Dom0 crash with old style AMD NUMA detection



On Tue, Sep 18, 2012 at 11:57:33AM +0200, Andre Przywara wrote:
> On 09/17/2012 09:14 PM, Konrad Rzeszutek Wilk wrote:
> >On Mon, Sep 17, 2012 at 09:29:22AM +0200, Andre Przywara wrote:
> >>On 09/14/2012 08:58 PM, Konrad Rzeszutek Wilk wrote:
> >>>>>>[    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
> >>>>>>(XEN) Domain 0 crashed: 'noreboot' set - not rebooting.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>The obvious solution would be to explicitly deny northbridge scanning
> >>>>>>when running as Dom0, though I am not sure how to implement this without
> >>>>>>upsetting the other kernel folks about "that crappy Xen thing" again ;-)
> >>>>>
> >>>>>Heh.
> >>>>>Is there a numa=0 option that could be used to override it to turn it
> >>>>>off?
> >>>>
> >>>>Not compile tested.. but was thinking something like this:
> >>>
> >>>ping?
> >>
> >>That looks good to me - at least for the time being.
> >
> >OK, can I have your Tested-by/Acked-by on it, please?
> >
> >>I just want to check how this interacts with upcoming Dom0 NUMA
> >>support. It wouldn't be too clever if we deliberately disable NUMA
> >
> >We can always revert this patch in future versions of Linux.
> 
> I don't like this idea. Then we would have Linux kernels up to 3.5
> working and, say, 3.8 onward working again, but 3.6 and 3.7 unable to
> use NUMA. That would be pretty unfortunate.

Huh? v3.5 working? But it never worked? I would say turn off the NUMA
detection (keep in mind it will still set up the dummy NUMA stuff)
until there is some PV NUMA capability, and then we can revert it.

> 
> I haven't checked back with Dario, but I'd suspect that we use ACPI
> for injecting NUMA topology into Dom0. Even if not, a general
> "numa=off" for Dom0 is too much of a sledgehammer for me.

How would you inject it in Dom0? It's a PV guest, so the hypervisor would
have to tweak the SRAT/SLIT tables. That is not going to happen
in the very short term. And I don't recall seeing any patches, so
the dom0 NUMA support is right now non-existent?

> 
> >>and future Xen versions will allow us to use it. So let me check if I
> >>can confine this turn-off to the fallback K8 northbridge reading.
> >
> >This potentially could work, but I would prefer to not do it for 3.6.
> 
> Mmh, I don't get the idea of your patch below. One can always read
> the NUMA topology from the AMD northbridge, but this is deprecated
> in favor of ACPI. The amdtopology.c stuff was only there to enable
> NUMA for very early Opterons, where BIOSes didn't provide (sane)
> SRAT tables.
> Though we disallow ACPI for NUMA on Dom0, this northbridge scanning
> unfortunately "shines through" the virtualization, actually
> revealing the system's NUMA topology, which usually differs
> considerably from Dom0's.

Right, but isn't that what you found broke? It wasn't ACPI NUMA
but the old-style K8 northbridge information? That is what we are
trying to fix.

> 
> So instead I want to do something more like this:
> 
> diff --git a/arch/x86/include/asm/numa.h b/arch/x86/include/asm/numa.h
> index bfacd2c..7811c0d 100644
> --- a/arch/x86/include/asm/numa.h
> +++ b/arch/x86/include/asm/numa.h
> @@ -20,6 +20,8 @@
> 
>  extern int numa_off;
> 
> +extern bool deny_amd_nb_numa_scan;
> +
>  /*
>   * __apicid_to_node[] stores the raw mapping between physical apicid and
>   * node and is used to initialize cpu_to_node mapping.
> diff --git a/arch/x86/mm/amdtopology.c b/arch/x86/mm/amdtopology.c
> index 5247d01..f223a67 100644
> --- a/arch/x86/mm/amdtopology.c
> +++ b/arch/x86/mm/amdtopology.c
> @@ -29,6 +29,8 @@
> 
>  static unsigned char __initdata nodeids[8];
> 
> +bool deny_amd_nb_numa_scan = 0;
> +
>  static __init int find_northbridge(void)
>  {
>       int num;
> @@ -78,6 +80,9 @@ int __init amd_numa_init(void)
>       u32 nodeid, reg;
>       unsigned int bits, cores, apicid_base;
> 
> +     if (deny_amd_nb_numa_scan)
> +             return -ENOENT;
> +
>       if (!early_pci_allowed())
>               return -EINVAL;
> 
> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> index d11ca11..6db63c0 100644
> --- a/arch/x86/xen/setup.c
> +++ b/arch/x86/xen/setup.c
> @@ -532,6 +532,8 @@ void __init xen_arch_setup(void)
>       }
>  #endif
> 
> +     deny_amd_nb_numa_scan = 1;
> +
>       memcpy(boot_command_line, xen_start_info->cmd_line,
>              MAX_GUEST_CMDLINE > COMMAND_LINE_SIZE ?
>              COMMAND_LINE_SIZE : MAX_GUEST_CMDLINE);
> 
> This would just turn off this one kind of NUMA discovery for Dom0.
> The patch is admittedly a bit rough (not sure about the proper
> placement into #ifdef's, for instance) and not well tested yet.
> Also one could think about using a more general variable name to
> cover other hardware things in the future that Dom0 shouldn't use.
> So this isn't something for 3.6 anymore, probably not even for 3.7.
> 
> What if we drop the patch for this problem altogether for 3.6 and
> recommend "numa=off" as a workaround? This is much less sticky than
> a kernel patch and could appear in the Xen wiki, for instance.

I hate workarounds. People end up using them forever and they get
codified.

> After all this isn't a strict regression (appears with every 3.x
> kernel, AFAICT).
> Most of the time the northbridge scanning will yield bogus results,
> so the kernel eventually discards it, but sometimes it seems to slip
> through and causes trouble.
> Also it does not trigger on newer (Bulldozer) class CPUs, since we
> deliberately avoided adding the new northbridge PCI-ID for this
> routine.

Right, on those you end up using ACPI NUMA instead.
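
To make the fallback order under discussion concrete, here is a rough
user-space model of it. This is only a sketch under stated assumptions,
not kernel code: the struct boot_env fields and the model_* function
names are made up for illustration, and only the nb_scan_denied flag
corresponds to the deny_amd_nb_numa_scan variable from the proposed
patch above. The idea it shows is that ACPI is tried first, then the
old northbridge scan, and the dummy single-node setup is what remains
when both fail or are denied.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified model of the NUMA detection fallback; NOT the kernel's code. */

enum numa_source { NUMA_SRC_ACPI, NUMA_SRC_AMD_NB, NUMA_SRC_DUMMY };

/* Hypothetical description of what the booting kernel can see. */
struct boot_env {
	bool acpi_numa_usable;    /* assumption: SRAT present and allowed */
	bool nb_scan_denied;      /* models deny_amd_nb_numa_scan */
	bool nb_scan_finds_nodes; /* assumption: northbridge data looks sane */
};

/* Stand-in for ACPI-based detection (disallowed for Dom0). */
static int model_acpi_numa_init(const struct boot_env *env)
{
	return env->acpi_numa_usable ? 0 : -EINVAL;
}

/* Stand-in for the northbridge scan, with the proposed early bail-out. */
static int model_amd_numa_init(const struct boot_env *env)
{
	if (env->nb_scan_denied)
		return -ENOENT;
	return env->nb_scan_finds_nodes ? 0 : -ENOENT;
}

/* Try each detection method in order; fall back to the dummy node. */
enum numa_source model_numa_init(const struct boot_env *env)
{
	assert(env != NULL);

	if (model_acpi_numa_init(env) == 0)
		return NUMA_SRC_ACPI;
	if (model_amd_numa_init(env) == 0)
		return NUMA_SRC_AMD_NB;
	return NUMA_SRC_DUMMY;
}
```

With acpi_numa_usable = false and nb_scan_denied = true (the Dom0 case
with the proposed flag set), model_numa_init() returns NUMA_SRC_DUMMY
even if the northbridge registers would have yielded a topology.
Without the flag, the same environment returns NUMA_SRC_AMD_NB, which
is the "shines through" case described above.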

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel