
Re: [PATCH] x86/boot: Restrict directmap permissions for .text/.rodata



On 06/12/2021 13:58, Jan Beulich wrote:
> On 06.12.2021 14:08, Andrew Cooper wrote:
>> While we've been diligent to ensure that the main text/data/rodata mappings
>> have suitable restrictions, their aliases via the directmap were left fully
>> RW.  Worse, we even had pieces of code making use of this as a feature.
>>
>> Restrict the permissions, as we have no legitimate need for writeability of
>> these areas via the directmap alias.
> Where do we end up reading .text and/or .rodata through the directmap? Can't
> we zap the mappings altogether?

I felt it was safer to keep readability via the directmap.

I'm not aware of any logic we have which reads the directmap in order,
but it ought to be possible.
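
For concreteness, the restriction amounts to something like the sketch
below, with Jan's alternative being destroy_xen_mappings() over the same
range.  (Illustrative only - the exact symbols and bounds here are
assumptions, not the actual hunk.)

/* Reduce the directmap alias of .text/.rodata to read-only, non-executable. */
modify_xen_mappings((unsigned long)__va(__pa(_stext)),
                    (unsigned long)__va(__pa(__2M_rodata_end)),
                    PAGE_HYPERVISOR_RO);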

> As to superpage shattering - I understand this is not deemed to be an issue
> in the common case since, with Xen moved as high up below 4G as possible,
> it wouldn't normally live inside a 1G mapping anyway? This may want calling
> out here. Plus, in non-EFI, non-XEN_ALIGN_2M builds isn't this going to
> shatter a 2M page at the tail of .rodata?

cpu0_stack has already been shattered down to 4k, and in a non-2M build
it is likely in the same superpage as .rodata anyway.

But at the end of the day, it is a security/performance tradeoff.

memcpy(__va(__pa(divide_error)), "\x0f\x0b", 2); /* overwrite the #DE entry point (ud2) via its RW directmap alias */
asm ( "div %%ecx" :: "c" (0) );                  /* then divide by zero to vector into the patched handler */

is an especially low barrier for an attacker who has a partial write gadget.

The security benefits are substantial, while the perf downsides amount
to a handful of extra pagetables and a handful of pagewalks taking extra
steps, all in non-fast paths (i.e. distinctly marginal).

It occurs to me while writing this that the same applies to livepatches.

>
>> Note that the pagetables and cpu0_stack do get written through their
>> directmap alias, so we can't just read-only the whole image.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> ---
>> CC: Jan Beulich <JBeulich@xxxxxxxx>
>> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>> CC: Wei Liu <wl@xxxxxxx>
>>
>> Ever so slightly RFC, as it has only had light testing.
>>
>> Notes:
>>  * The stubs still have RX via one alias, RW via another, and these need
>>    to stay.  Hardening options include splitting the stubs so the SYSCALL
>>    ones can be read-only after setup, and/or expanding the stub size to 4k
>>    per CPU so we really can keep the writeable alias as not present when
>>    the stub isn't in active use.
>>  * Future CPUs with Protection Key Supervisor (Sapphire Rapids and later)
>>    would be able to inhibit writeability outside of a permitted region, and
>>    because the protection key is per logical thread, we wouldn't need to
>>    expand the stubs to 4k per CPU.
> I'm afraid I don't follow: The keys still apply to entire pages, don't they?
> This would still allow write access by all 16 CPUs sharing a page for their
> stubs.

It would be all stubs, because there are only 16 protection keys and we
wouldn't want to interleave adjacent stub mappings.

The logic would now be:

pks_allow_write_access();
write new stub;
pks_revoke_write_access();

so as to limit writeability of any stub to just the critical region
intending to modify it.

This way, an unrelated buggy hypercall couldn't write into the stub.
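
To sketch what those helpers might look like (the names, the choice of
key, and the MSR accessors via Xen's rdmsrl()/wrmsrl() are assumptions;
Xen has no PKS support today): PKS keeps an AD/WD bit pair per key in
the per-thread IA32_PKRS MSR, so flipping the write-disable bit for the
stubs' key opens write access only on the CPU doing the patching.

#define MSR_PKRS     0x000006e1   /* IA32_PKRS: per-thread, 2 bits per key */
#define STUB_PKEY    1            /* hypothetical key assigned to the stub mappings */
#define PKR_WD(key)  (1ul << ((key) * 2 + 1))

static void pks_allow_write_access(void)
{
    uint64_t pkrs;

    rdmsrl(MSR_PKRS, pkrs);
    wrmsrl(MSR_PKRS, pkrs & ~PKR_WD(STUB_PKEY));
}

static void pks_revoke_write_access(void)
{
    uint64_t pkrs;

    rdmsrl(MSR_PKRS, pkrs);
    wrmsrl(MSR_PKRS, pkrs | PKR_WD(STUB_PKEY));
}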

>>  * At the time of writing, PV Shim still makes use of .rodata's read/write
>>    alias in the directmap to patch the hypercall table, but that runs earlier
>>    on boot.  Also, there are patches out to address this.
> I did consider committing that change, but it wasn't clear to me whether
> there were dependencies on earlier parts of the series that it's part of.

I've got a disentangled version in my CET series:

https://github.com/andyhhp/xen/commit/8d55e1c8ff1d979c985b3fb75c23627348c15209

which needed some header file adjustments to build.

But yes - I too was thinking that it ought to be committed.

~Andrew
