
Re: [Xen-devel] [RFC v2 2/6] x86/init: use linker tables to simplify x86 init and annotate dependencies



On Fri, Feb 19, 2016 at 6:15 AM, Luis R. Rodriguez <mcgrof@xxxxxxxxxx> wrote:
> Any failure on the x86 init path can be catastrophic.
> A simple shift of a call from one place to another can
> easily break things. Likewise adding a new call to
> one path without considering all x86 requirements
> can make certain x86 run time environments crash.
> We currently account for these requirements through
> peer code review and run time testing. We could do
> much better if we had a clean and simple way to annotate
> strong semantics for run time requirements, init sequence
> dependencies, and detection mechanisms for additions of
> new x86 init sequences.

Please document this in a way that will be useful to people trying to
understand what the code does, not just to people trying to understand
why you wrote it.  More below.



> +/**
> + * struct x86_init_fn - x86 generic kernel init call
> + *
> + * Linux x86 features vary in complexity, features may require work done at
> + * different levels of the full x86 init sequence. Today there are also two
> + * different possible entry points for Linux on x86, one for bare metal, KVM
> + * and Xen HVM, and another for Xen PV guests / dom0.  Assuming a bootloader
> + * has set up 64-bit mode, roughly the x86 init sequence follows this path:
> + *
> + * Bare metal, KVM, Xen HVM                      Xen PV / dom0
> + *       startup_64()                             startup_xen()
> + *              \                                     /
> + *      x86_64_start_kernel()                 xen_start_kernel()
> + *                           \               /
> + *                      x86_64_start_reservations()
> + *                                   |
> + *                              start_kernel()
> + *                              [   ...        ]
> + *                              [ setup_arch() ]
> + *                              [   ...        ]
> + *                                  init
> + *


I don't think this paragraph below is necessary.  I also think it's
very confusing.  Keep in mind that the reader has no idea what a
"level" is at this point and that the reader also doesn't need to
think about terms like "paravirtualization yielding".

> + * x86_64_start_kernel() and xen_start_kernel() are the respective first C
> + * code entry starting points. The different entry points exist to enable
> + * Xen to skip a lot of hardware setup already done and managed on behalf of
> + * the hypervisor, we refer to this as "paravirtualization yielding". The
> + * different levels of init calls on the x86 init sequence exist to account
> + * for these slight differences and requirements. These different entry
> + * points also share a common entry x86 specific path,
> + * x86_64_start_reservations().
> + *

And here, I don't even know what a "feature" is.

> + * A generic x86 feature can have different initialization calls, one on each
> + * of the different main x86 init sequences, but must also address both entry
> + * points in order to work properly across the board on all supported x86
> + * subarchitectures. Since x86 features can also have dependencies on other
> + * setup code or features, x86 features can at times be subordinate to other
> + * x86 features, or conditions. struct x86_init_fn enables feature developers
> + * to annotate dependency relationships to ensure subsequent init calls only
> + * run once a subordinate's dependencies have run. When needed custom
> + * dependency requirements can also be spelled out through a custom
> + * dependency checker. In order to account for the dual entry point nature
> + * of x86-64 Linux for "paravirtualization yielding" and to make annotations
> + * for support for
> + * these explicit each struct x86_init_fn must specify supported
> + * subarchitectures. The earliest x86-64 code can read the subarchitecture
> + * though is after load_idt(), as such the earliest we can currently rely on
> + * subarchitecture for semantics and a common init sequences is on the shared
> + * common x86_64_start_reservations().  Each struct x86_init_fn must also
> + * declare a two-digit decimal number to impose an ordering relative to other
> + * features when required.

I'm totally lost in the paragraph above.


> + *
> + * x86_init_fn enables strong semantics and dependencies to be defined and
> + * implemented on the full x86 initialization sequence.

Please try explaining what this is, instead.  For example:

An x86_init_fn represents a function to be called at a certain point
during initialization on a specific set of subarchitectures.

> + *
> + * @order_level: must be set, linker order level, this corresponds to the
> + *     table section sub-table index, we record this only for semantic
> + *     validation purposes.

I read this as "this is purely a debugging feature".  Is it?  I think it's not.

>  Order-level is always required, however you typically would
> + *     only use X86_INIT_NORMAL*() and leave ordering to be done by placement
> + *     of code in a C file and the order of objects through a Makefile.
> + *     Custom order-levels can be used when order on C file and order of
> + *     objects on Makefiles does not suffice or much further refinements are
> + *     needed.


Assuming I understand this correctly, here's how I'd write it:

@order_level: The temporal order of this x86_init_fn.  Lower
order_level numbers are called first.  Ties are broken by order found
in the Makefile and then by order in the C file.

Note, however, that my proposed explanation can't be right because it
appears to conflict with "depend".  Please adjust accordingly.

> + * @supp_hardware_subarch: must be set, it represents the bitmask of
> + *     supported subarchitectures.  We require each struct x86_init_fn to
> + *     have this set to require developer considerations for each supported
> + *     x86 subarchitecture and to build strong annotations of different
> + *     possible run time states particularly in consideration for the two
> + *     main different entry points for x86 Linux, to account for
> + *     paravirtualization yielding.

Too much motivation, too little documentation.

@supp_hardware_subarch: A bitmask of subarchitectures on which to call
this init function.

--- start big deletion ---

> + *
> + *     The subarchitecture is read by the kernel at early boot from the
> + *     struct boot_params hardware_subarch. Support for the subarchitecture
> + *     exists as of x86 boot protocol 2.07. The bootloader would have set up
> + *     the respective hardware_subarch on the boot sector as per
> + *     Documentation/x86/boot.txt.
> + *
> + *     What x86 entry point is used is determined at run time by the
> + *     bootloader. Linux pv_ops was designed to help enable building one
> + *     Linux binary to support bare metal and different hypervisors.
> + *     pv_ops setup
> + *     code however is limited in that all pv_ops setup code is run late in
> + *     the x86 init sequence, during setup_arch(). In fact cpu_has_hypervisor
> + *     only works after early_cpu_init() during setup_arch(). If an x86
> + *     feature requires an earlier determination of what hypervisor was used,
> + *     or if it needs to annotate only support for certain hypervisors, the
> + *     x86 hardware_subarch should be set by the bootloader and
> + *     @supp_hardware_subarch set by the x86 feature. Using hardware_subarch
> + *     enables x86 features to fill the semantic gap between the Linux x86
> + *     entry point used and what pv_ops has to offer through a hypervisor
> + *     agnostic mechanism.

--- end big deletion ---

> + *
> + *     Each supported subarchitecture is set using the respective
> + *     X86_SUBARCH_* as a bit in the bitmask. For instance if a feature
> + *     is supported on PC and Xen subarchitectures only you would set this
> + *     bitmask to:
> + *
> + *             BIT(X86_SUBARCH_PC) |
> + *             BIT(X86_SUBARCH_XEN);

I like this part, but how about "For instance, if an init function
should be called on PC and Xen subarchitectures only, you would set
the bitmask to..."?
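In an initializer that would read (field and macro names as in the
patch):

	.supp_hardware_subarch = BIT(X86_SUBARCH_PC) |
				 BIT(X86_SUBARCH_XEN),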

> + *
> + * @detect: optional, if set returns true if the feature has been detected
> + *     to be required, and false if the feature has been detected to not
> + *     be required.

I have absolutely no idea what this means.

> + * @depend: optional, if set this set of init routines must be called prior
> + *     to the init routine whose respective detect routine we have set this
> + *     depends callback to. This is only used for sorting purposes given
> + *     all current init callbacks have a void return type. Sorting is
> + *     implemented via x86_init_fn_sort(), it must be called only once,
> + *     however you can delay sorting until you need it if you can ensure
> + *     only @order_level and @supp_hardware_subarch can account for proper
> + *     ordering and dependency requirements for all init sequences prior.
> + *     If you do not have a depend callback set it is assumed the order
> + *     level (__x86_init_fn(level)) set by the init routine suffices to set
> + *     the order for when the feature's respective callbacks are called
> + *     with respect to other calls. Sorting of init calls with the same
> + *     order level is determined by linker order, determined by order
> + *     placement on C code and order listed on a Makefile. A routine that
> + *     depends on another is known as being subordinate to the init routine
> + *     it depends on. Routines that are subordinate must have an
> + *     order-level of lower or equal priority than the order-level of the
> + *     init sequence it depends on.

I don't understand this at all.  I assume you're saying that some kind
of topological sorting happens.  This leads to a question: why is
"depend" a function pointer?  Shouldn't it be x86_init_fn*?  Also,
what happens if you depend on something that is disabled on the
running subarch?  And what happens if the depend-implied order is
inconsistent with order_level.

I would hope that order_level breaks ties in the topological sort and
that there's a big fat warning if the order_level ordering is
inconsistent with the topological ordering.  Preferably the warning
would be at build time, in which case it could be an error.
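Roughly, I'd expect something along these lines (x86_init_fn_find_dep()
is from your patch; everything else here is a sketch with made-up
names):

static void __init x86_init_fn_check_levels(struct x86_init_fn *start,
					    struct x86_init_fn *finish)
{
	struct x86_init_fn *q, *dep;

	for (q = start; q < finish; q++) {
		/* Warn if a dependency would sort after its dependent. */
		dep = x86_init_fn_find_dep(start, finish, q);
		if (dep && dep->order_level > q->order_level)
			pr_warn("x86/init: %ps depends on %ps but has a lower order_level\n",
				q->early_init, dep->early_init);
	}
}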

> + * @early_init: required, routine which will run in
> + *     x86_64_start_reservations() after we ensure
> + *     boot_params.hdr.hardware_subarch is accessible and properly set.
> + *     Memory is not yet available. This is the earliest we can currently
> + *     define a common shared callback since all callbacks need to check
> + *     for boot_params.hdr.hardware_subarch and this only becomes
> + *     accessible on x86-64 after load_idt().

What's this for?  Under what conditions is it called?  What order?
Why is it part of x86_init_fn at all?

> + * @flags: optional, bitmask of enum x86_init_fn_flags

What are these flags?  What do they do?

> + */
> +struct x86_init_fn {
> +       __u32 order_level;
> +       __u32 supp_hardware_subarch;
> +       bool (*detect)(void);
> +       bool (*depend)(void);
> +       void (*early_init)(void);
> +       __u32 flags;
> +};
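
A concrete example here would help much more than the motivation does.
Something like this, with made-up names and values (I'm also guessing
at the registration step, which presumably goes through one of your
linker table macros):

static bool mydev_detect(void)
{
	return true;	/* made up: whether this init is needed */
}

static void mydev_early_init(void)
{
	/* made up: early setup work */
}

static struct x86_init_fn mydev_init_fn = {
	.order_level		= 30,	/* made-up level */
	.supp_hardware_subarch	= BIT(X86_SUBARCH_PC) |
				  BIT(X86_SUBARCH_XEN),
	.detect			= mydev_detect,
	.early_init		= mydev_early_init,
};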
> +
> +/**
> + * enum x86_init_fn_flags: flags for init sequences
> + *
> + * X86_INIT_FINISH_IF_DETECTED: tells the core that once this init sequence
> + *     has completed it can break out of the loop for init sequences on
> + *     its own level.

What does this mean?

> + * X86_INIT_DETECTED: private flag. Used by the x86 core to annotate that
> + *     this init sequence has been detected and all of its callbacks
> + *     must be run during initialization.

Please make this an entry in a new field scratch_space or similar and
just document the entire field as "private to the init core code".
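I.e. something like:

	__u32 scratch;	/* private to the init core code */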

> +static struct x86_init_fn *x86_init_fn_find_dep(struct x86_init_fn *start,
> +                                               struct x86_init_fn *finish,
> +                                               struct x86_init_fn *q)
> +{
> +       struct x86_init_fn *p;
> +
> +       if (!q)
> +               return NULL;
> +
> +       for (p = start; p < finish; p++)
> +               if (p->detect == q->depend)
> +                       return p;

That's very strange indeed, and it doesn't seem consistent with my
explanation.  Please fix up the docs to explain what's going on.
Again, as a reviewer and eventual user of this code, I do not need to
know what historical problem it solves.  I need to know what the code
does and how to use it.
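
For the record, here's my best guess at how this pointer matching is
meant to be used -- this is the sort of thing the documentation should
spell out (names made up):

static bool feature_a_detect(void)
{
	return true;
}

static struct x86_init_fn feature_a = {
	.detect	= feature_a_detect,
	/* ... */
};

/*
 * feature_b declares that it depends on feature_a by pointing its
 * .depend at feature_a's .detect; x86_init_fn_find_dep() then matches
 * the two function pointers up when sorting.
 */
static struct x86_init_fn feature_b = {
	.depend	= feature_a_detect,
	/* ... */
};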

> +
> +void __ref x86_init_fn_init_tables(void)

Why is this __ref and not __init?

--Andy

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
