
Re: [Xen-devel] [PATCH] x86/ctxt-switch: Document and improve GDT handling


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Jan Beulich <JBeulich@xxxxxxxx>
  • Date: Fri, 5 Jul 2019 13:56:49 +0000
  • Accept-language: en-US
  • Cc: Juergen Gross <JGross@xxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Fri, 05 Jul 2019 13:57:10 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: [PATCH] x86/ctxt-switch: Document and improve GDT handling

On 05.07.2019 15:36, Andrew Cooper wrote:
> On 05/07/2019 11:00, Jan Beulich wrote:
>> On 04.07.2019 19:57, Andrew Cooper wrote:
>>> Also, it should now be very obvious to people that Xen's current GDT
>>> handling for non-PV vcpus is a recipe for subtle bugs, if we ever
>>> manage to execute a stray mov/pop %sreg instruction.  We really ought
>>> to have Xen's regular GDT in an area where slots 0-13 are either
>>> mapped to the zero page, or not present, so we don't risk loading a
>>> non-faulting garbage selector.
>> Well, there's certainly room for improvement, but loading a stray
>> selector seems a pretty unlikely event, as does such code slipping in
>> without anyone noticing.  Other than in context switching code I
>> don't think there are many places at all where we write to the
>> selector registers.
> 
> There are however many places where we write some bytes into a stub and
> then execute them.
> 
> This isn't a security issue.  There aren't any legitimate codepaths for
> which this is a problem, but there are plenty of cascade failures where
> this is liable to make a bad situation worse in weird, hard-to-debug ways.
> 
> Not to mention that for security hardening purposes, we should be using
> a RO mapping to combat sgdt or fixed-ABI knowledge from an attacker.

Okay, I can see how SGDT can reveal undue knowledge to an attacker,
even if they can't use the address directly, but only to infer further
things (hence UMIP's existence). But I'm having trouble seeing how a
r/o mapped GDT adds much security. Could you expand on this?

> And on that note... nothing really updates the full GDT via the
> perdomain mappings, so I think that can already move to being RO.  This
> does depend on the fact that no one has used segmented virtual memory
> since long before Xen was a thing.  We can trap and emulate the setting
> of A bits, and I bet that path will never get hit even with old PV guests.

Well, you could go the first step here and map the Xen page r/o right
away. There we don't even need to fear A bits needing to get set, since
we control the contents.

>>> @@ -1718,15 +1737,12 @@ static void __context_switch(void)
>>>    
>>>        psr_ctxt_switch_to(nd);
>>>    
>>> -    gdt = !is_pv_32bit_domain(nd) ? per_cpu(gdt_table, cpu) :
>>> -                                    per_cpu(compat_gdt_table, cpu);
>>> -
>>>        if ( need_full_gdt(nd) )
>>> -        write_full_gdt_ptes(gdt, n);
>>> +        update_xen_slot_in_full_gdt(n, cpu);
>>>    
>>>        if ( need_full_gdt(pd) &&
>>>             ((p->vcpu_id != n->vcpu_id) || !need_full_gdt(nd)) )
>>> -        load_default_gdt(gdt, cpu);
>>> +        load_default_gdt(cpu);
>> From looking at this transformation I cannot see how, as said in
>> the description and as expressed by removing the gdt parameter
>> from load_default_gdt(), the gdt having got passed in here would
>> always have been per_cpu(gdt_table, cpu). It pretty clearly was
>> the compat one for nd being 32-bit PV. What am I missing?
> 
> To be perfectly honest, I wrote "how it {does,should} logically work",
> then adjusted the code.
> 
>> Or is the description perhaps instead meaning to say that it doesn't
>> _need_ to be the compat one that we load here, as in case it is
>> the subsequent load_full_gdt() will replace it again anyway?
> 
> lgdt is an expensive operation.  I hadn't even spotted that we are doing
> it twice on that path.  There is surely some room for improvement here
> as well.

Well, the double load had to be added for a very simple reason:
While we're switching page tables, the GDT mapping has to remain
intact. Hence the dependency upon the two vCPU IDs (not) matching.
IOW I don't currently see room for improvement here.

> I wonder if caching the last gdt base address per cpu would be a better
> option, and only doing a "lazy" lgdt.  It would certainly simplify the
> "when should I lgdt?" logic.

At the first glance I'm not convinced this would simplify things, but
I'd have to see actual code.

Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

