[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH RFC 6/9] x86/boot: Install trap handlers much earlier on boot



On 15/05/14 11:53, Jan Beulich wrote:
>>>> On 15.05.14 at 11:48, <andrew.cooper3@xxxxxxxxxx> wrote:
>> +void __cpuinit load_system_tables(void)
>> +{
>> +    unsigned int cpu = smp_processor_id();
>> +    unsigned long stack_bottom = get_stack_bottom(),
>> +            stack_top = stack_bottom & ~(STACK_SIZE - 1);
>> +
>> +    struct tss_struct *tss = &this_cpu(init_tss);
>> +    struct desc_struct *gdt =
>> +            this_cpu(gdt_table) - FIRST_RESERVED_GDT_ENTRY;
>> +    struct desc_struct *compat_gdt =
>> +            this_cpu(compat_gdt_table)- FIRST_RESERVED_GDT_ENTRY;
>> +
>> +    struct desc_ptr gdtr = {
>> +            .base = (unsigned long)gdt,
>> +            .limit = LAST_RESERVED_GDT_BYTE,
>> +    };
>> +    struct desc_ptr idtr = {
>> +            .base = (unsigned long)idt_tables[cpu],
>> +            .limit = (IDT_ENTRIES * sizeof(idt_entry_t)) - 1,
>> +    };
>> +
>> +    /* Main stack for interrupts/exceptions */
>> +    tss->rsp0 = stack_bottom;
>> +    tss->bitmap = IOBMP_INVALID_OFFSET;
>> +
>> +    /* MCE, NMI and Double Fault handlers get their own stacks */
>> +    tss->ist[IST_MCE - 1] = stack_top + IST_MCE * PAGE_SIZE;
>> +    tss->ist[IST_DF  - 1] = stack_top + IST_DF  * PAGE_SIZE;
> Hard tab.

These should all be hard tabs, following the file's style.

>
>> +    tss->ist[IST_NMI - 1] = stack_top + IST_NMI * PAGE_SIZE;
>> +
>> +    _set_tssldt_desc(
>> +            gdt + TSS_ENTRY,
>> +            (unsigned long)tss,
>> +            offsetof(struct tss_struct, __cacheline_filler) - 1,
>> +            DESC_TYPE_tss_avail);
>> +    _set_tssldt_desc(
>> +            compat_gdt + TSS_ENTRY,
>> +            (unsigned long)tss,
>> +            offsetof(struct tss_struct, __cacheline_filler) - 1,
>> +            DESC_TYPE_tss_busy);
>> +
>> +    asm volatile ( "lgdt %0"   : : "m"  (gdtr) );
>> +    asm volatile ( "lldt %w0"  : : "rm" (0) );
>> +    asm volatile ( "ltr  %w0"  : : "rm" (TSS_ENTRY << 3) );
>> +    asm volatile ( "lidt %0"   : : "m"  (idtr) );
> I think you ought to load the IDT right after the GDT, so that eventual
> faults on the LLDT and LTR have less chance of ending in a triple fault.

That particular lldt can't possibly fault.  I wanted to avoid having IST
information in the IDT without a loaded TSS, but given the proximity of
the loads, catching an invalid_tss from the 'ltr' might be useful.

>
>> --- a/xen/arch/x86/setup.c
>> +++ b/xen/arch/x86/setup.c
>> @@ -558,15 +558,20 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>>          .stop_bits = 1
>>      };
>>  
>> +    /* Critical region without IDT or TSS.  Any fault is deadly! */
> Which raises the question whether the next patch shouldn't be
> dropped.

With this patch, the BSP code looks like:

* patch ingnore_int everywhere into idt_table
* call into C
* patch real trap handlers into idt_table
* load system tables such that faults all work.

Therefore, the ignore_int() patching is almost completely pointless,
which is why it is completely removed in the subsequent patch.

>
>>      set_processor_id(0);
>>      set_current((struct vcpu *)0xfffff000); /* debug sanity */
>>      this_cpu(curr_vcpu) = idle_vcpu[0] = current;
>>  
>> +    init_idt_traps();
>> +    load_system_tables();
>> +
>>      sort_exception_tables();
>>  
>> -    percpu_init_areas();
>> +    /* Full trap support from here on in */
> Stop missing (also elsewhere).
>
>> @@ -345,6 +338,10 @@ void start_secondary(void *unused)
>>       */
>>      spin_debug_disable();
>>  
>> +    load_system_tables();
>> +
>> +    /* Full trap support from here on in */
>> +
>>      percpu_traps_init();
> I guess this ends up being confusing, even if it's just because the
> name of the function perhaps no longer reflects its purpose.

Yes - there is a little more poor resultant naming than I would like.  I
am also not completely happy with the names "load_system_tables()" and
"init_idt_traps()", but can't think of more appropriate names offhand.

>
>> @@ -3515,23 +3515,40 @@ void __init trap_init(void)
>>      set_intr_gate(TRAP_bounds,&bounds);
>>      set_intr_gate(TRAP_invalid_op,&invalid_op);
>>      set_intr_gate(TRAP_no_device,&device_not_available);
>> +    set_intr_gate(TRAP_double_fault, &double_fault);
>>      set_intr_gate(TRAP_copro_seg,&coprocessor_segment_overrun);
>>      set_intr_gate(TRAP_invalid_tss,&invalid_TSS);
>>      set_intr_gate(TRAP_no_segment,&segment_not_present);
>>      set_intr_gate(TRAP_stack_error,&stack_segment);
>>      set_intr_gate(TRAP_gp_fault,&general_protection);
>> -    set_intr_gate(TRAP_page_fault,&page_fault);
>> +    set_intr_gate(TRAP_page_fault,&early_page_fault);
>>      set_intr_gate(TRAP_spurious_int,&spurious_interrupt_bug);
>>      set_intr_gate(TRAP_copro_error,&coprocessor_error);
>>      set_intr_gate(TRAP_alignment_check,&alignment_check);
>>      set_intr_gate(TRAP_machine_check,&machine_check);
>>      set_intr_gate(TRAP_simd_error,&simd_coprocessor_error);
>>  
>> +    /* Specify dedicated interrupt stacks for NMI, #DF, and #MC. */
>> +    set_ist(&idt_table[TRAP_double_fault],  IST_DF);
>> +    set_ist(&idt_table[TRAP_nmi],           IST_NMI);
>> +    set_ist(&idt_table[TRAP_machine_check], IST_MCE);
> Mind putting them right aside the respective set_intr_gate() above?

Sorted in a later patch.  For now I think it is clearer as obvious code
motion out of cpu_init().

>
>> +void __init trap_init(void)
>> +{
>> +    set_intr_gate(TRAP_page_fault, &page_fault);
>> +
>> +    /* The 32-on-64 hypercall entry vector is only accessible from ring 1. 
>> */
>> +    _set_gate(idt_table+HYPERCALL_VECTOR, DESC_TYPE_trap_gate, 1, 
>> &compat_hypercall);
>> +
>> +    /* Fast trap for int80 (faster than taking the #GP-fixup path). */
>> +    _set_gate(idt_table+0x80, DESC_TYPE_trap_gate, 3, &int80_direct_trap);
> Long lines?
>
> Jan

Again, code motion, although this long line was because of
"DESC_TYPE_trap_gate".  "SYS_DESC_trap" in the first patch would fix the
issue.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.