[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [RFC PATCH v3 15/24] ARM: NUMA: DT: Add CPU NUMA support



On Wed, 26 Jul 2017, Julien Grall wrote:
> Hi Stefano,
> 
> On 25/07/17 20:06, Stefano Stabellini wrote:
> > On Tue, 25 Jul 2017, Julien Grall wrote:
> > > On 25/07/17 19:48, Stefano Stabellini wrote:
> > > > On Tue, 25 Jul 2017, Julien Grall wrote:
> > > > > On 25/07/17 07:47, Vijay Kilari wrote:
> > > > > > > >  void numa_failed(void)
> > > > > > > >  {
> > > > > > > >      numa_off = true;
> > > > > > > >      init_dt_numa_distance();
> > > > > > > >      node_distance_fn = NULL;
> > > > > > > > +    init_cpu_to_node();
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +void __init numa_set_cpu_node(int cpu, unsigned int nid)
> > > > > > > > +{
> > > > > > > > +    if ( !node_isset(nid, processor_nodes_parsed) || nid >=
> > > > > > > > MAX_NUMNODES
> > > > > > > > )
> > > > > > > > +        nid = 0;
> > > > > > > 
> > > > > > > 
> > > > > > > This looks wrong to me. If the node-id is invalid, why would you
> > > > > > > blindly
> > > > > > > set
> > > > > > > to 0?
> > > > > > 
> > > > > > Generally this check will not pass. I will make this function return
> > > > > > error code in case
> > > > > > of wrong nid.
> > > > > 
> > > > > I don't really want to see error code and error handling everywhere in
> > > > > the
> > > > > initialization code. I would assume that if the NUMA bindings are
> > > > > wrong we
> > > > > should just crash Xen rather continuing with NUMA disabled.
> > > > > 
> > > > > Stefano do you have any opinion here?
> > > > 
> > > > Yes, I noticed that there is an overabundance of error checks in the
> > > > patches. I have pointed out in other cases that some of these checks are
> > > > duplicates.
> > > > 
> > > > I am OK with some checks but we should not do the same check over and
> > > > over.
> > > > 
> > > > To answer the question: do we need any checks at all?
> > > > 
> > > > I am fine with no checks on the device tree or ACPI bindings themselves.
> > > > I am also OK with some checks in few places to check that the
> > > > information passed by the firmware is in the right shape (for example we
> > > > check for the ACPI header length before accessing any ACPI tables). That
> > > > is good. But I am not OK with repeating the same check multiple times
> > > > uselessly or checking for conditions that cannot happen (like a NULL
> > > > pointer in the ACPI header case again).
> > > 
> > > I would prefer to keep the check on the DT bindings and ACPI bindings. I
> > > hit
> > > some problem in the past that were quite annoying to debug without them.
> > > 
> > > But I was wondering if we should just panic/BUG_ON directly. Rather than
> > > returning an error.
> > 
> > I think BUG_ON is fine, but it would be best if we also printed a
> > useful message before crashing Xen. At least the user would know that
> > the problem is a broken device_tree/ACPI.
> 
> I was suggesting to use panic because you can get a nice message :).

panic is great, somehow BUG_ON grabbed all my attention when I read your
email :-)

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.