[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 2/2] x86: allow Meltdown band-aid to be disabled



>>> On 15.01.18 at 19:26, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 15/01/18 11:07, Jan Beulich wrote:
>> --- a/docs/misc/xen-command-line.markdown
>> +++ b/docs/misc/xen-command-line.markdown
>> @@ -1849,6 +1849,15 @@ In the case that x2apic is in use, this
>>  clustered mode.  The default, given no hint from the **FADT**, is cluster
>>  mode.
>>  
>> +### xpti
>> +> `= <boolean>`
>> +
>> +> Default: `false` on AMD hardware
>> +> Default: `true` everywhere else
> 
> This is fine for now, but any further isolation is going to have to be
> unconditional, or possibly compile-time choice, but maintaining that
> going forwards is going to be truly horrible.
> 
>> +
>> +Override default selection of whether to isolate 64-bit PV guest page
>> +tables.
> 
> This wants a
> 
> ** WARNING: Not yet a complete isolation implementation, but better than
> nothing. **
> 
> or similar, just to avoid giving the people the impression that it is
> complete.  Perhaps also a similar warning in the patch 1 commit message?

I've added the warning text here as suggested, and a respective
remark to patch 1's description, albeit under the current guidelines
of when we would consider an information leak a security issue, I
think isolation is sufficiently complete: No parts of the direct map
not pertaining to the current guest are being exposed, which
implies that no parts of the Xen heap not pertaining to the current
guest are. Leaking control page contents of other guests as well
as leaking the bits and pieces of the Xen image is not an issue by
itself.

IOW I'm not convinced such a warning is - strictly from a "would
this need an XSA on its own" pov - entirely appropriate.

Furthermore the command line option really exists to _disable_
the isolation (it also being capable of enabling is more a side
effect, but perhaps a useful one in the niche of running Xen
itself virtualized with being given the impression of running on
AMD hardware, but actually running on Intel), at which point the
warning is completely irrelevant.

One thing we may want/need to consider (but which is orthogonal
to the changes here) is to overwrite the part of the hypervisor
stack which isn't already being overwritten during context switch.

>> --- a/xen/arch/x86/domain.c
>> +++ b/xen/arch/x86/domain.c
>> @@ -1511,8 +1511,10 @@ void paravirt_ctxt_switch_to(struct vcpu
>>  {
>>      unsigned long cr4;
>>  
>> -    this_cpu(root_pgt)[root_table_offset(PERDOMAIN_VIRT_START)] =
>> -        l4e_from_page(v->domain->arch.perdomain_l3_pg, 
>> __PAGE_HYPERVISOR_RW);
>> +    if ( this_cpu(root_pgt) )
>> +        this_cpu(root_pgt)[root_table_offset(PERDOMAIN_VIRT_START)] =
>> +            l4e_from_page(v->domain->arch.perdomain_l3_pg,
>> +                          __PAGE_HYPERVISOR_RW);
> 
> This would be cleaner and have better generated code by pulling
> this_cpu(root_pgt) into a local variable.

Done.

>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -3685,7 +3685,7 @@ long do_mmu_update(
>>                      rc = mod_l4_entry(va, l4e_from_intpte(req.val), mfn,
>>                                        cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
>>                      if ( !rc )
>> -                        sync_guest = true;
>> +                        sync_guest = this_cpu(root_pgt);
> 
> This is quite deceptive, as it is actually sync_guest =
> this_cpu(root_pgt) != NULL;

We omit such "!= NULL" / "!= 0" elsewhere too.

> It would probably be cleaner to export opt_xpti and use that, rather
> than imply things based on root_pgt.  It would certainly be more
> efficient code.

Again I had considered this and decided against: Other reasons to
have NULL in here may arise, so it felt better to key the decision
off of the actual fact that it depends on, rather than that one's
pre-condition.

>> @@ -992,10 +998,13 @@ void __init smp_prepare_cpus(unsigned in
>>  
>>      stack_base[0] = stack_start;
>>  
>> +    if ( opt_xpti < 0 )
>> +        opt_xpti = boot_cpu_data.x86_vendor != X86_VENDOR_AMD;
> 
> Newline?

I've added one, but imo it hides that the assignment is a pre-condition
for ...

>>      rc = setup_cpu_root_pgt(0);

... this call (i.e. makes less obvious that re-ordering here is not an
option).

>> --- a/xen/arch/x86/x86_64/entry.S
>> +++ b/xen/arch/x86/x86_64/entry.S
>> @@ -46,6 +46,7 @@ restore_all_guest:
>>          movabs $DIRECTMAP_VIRT_START, %rcx
>>          mov   %rdi, %rax
>>          and   %rsi, %rdi
>> +        jz    .Lrag_keep_cr3
> 
> What is rag?  What about .L_skip_cr3_reload ?

Look at the function name: It's the acronym of "restore_all_guest".
I specifically wanted to avoid any unspecific names like the one
you suggest, and short of a better idea I've used this one (to
parallel infixes like "cstar" used elsewhere).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.