[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...




On 15.12.2021 10:35, Jan Beulich wrote:
> On 15.12.2021 10:27, Michal Orzel wrote:
>> Replying to both Julien and Jan,
>>
>> On 14.12.2021 12:30, Julien Grall wrote:
>>>
>>>
>>> On 14/12/2021 11:01, Julien Grall wrote:
>>>> Hi,
>>>>
>>>> Replying in one e-mail the comments from Jan and Michal.
>>>>
>>>> On 14/12/2021 10:01, Jan Beulich wrote:
>>>>> On 14.12.2021 10:51, Michal Orzel wrote:
>>>>>> Hi Julien,
>>>>>>
>>>>>> On 14.12.2021 10:33, Julien Grall wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 14/12/2021 09:17, Michal Orzel wrote:
>>>>>>>> Hi Julien, Jan
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>>> On 08.12.2021 10:55, Julien Grall wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> On 08/12/2021 07:20, Jan Beulich wrote:
>>>>>>>>>> On 07.12.2021 20:11, Julien Grall wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 07/12/2021 08:37, Michal Orzel wrote:
>>>>>>>>>>>> Hi Julien,
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>>> On 06.12.2021 16:29, Julien Grall wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 06/12/2021 14:20, Michal Orzel wrote:
>>>>>>>>>>>>>> to hypervisor when switching to AArch32 state.
>>>>>>>>>>>>>>
>>>>>>>>>>>> I will change to "from AArch32 state".
>>>>>>>>>>>>>> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
>>>>>>>>>>>>>> "If the general-purpose register was accessible from AArch32 
>>>>>>>>>>>>>> state the
>>>>>>>>>>>>>> upper 32 bits either become zero, or hold the value that the same
>>>>>>>>>>>>>> architectural register held before any AArch32 execution.
>>>>>>>>>>>>>> The choice between these two options is IMPLEMENTATIONDEFINED"
>>>>>>>>>>>>>
>>>>>>>>>>>>> Typo: Missing space between IMPLEMENTATION and DEFINED.
>>>>>>>>>>>>>
>>>>>>>>>>>> Ok.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Currently Xen does not ensure that the top 32 bits are zeroed 
>>>>>>>>>>>>>> and this
>>>>>>>>>>>>>> needs to be fixed.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can you outline why this is a problem and why we need to protect? 
>>>>>>>>>>>>> IIRC, the main concern is Xen may misinterpret what the guest 
>>>>>>>>>>>>> requested but we are not concerned about Xen using wrong value.
>>>>>>>>>>>>>
>>>>>>>>>>>> I would say:
>>>>>>>>>>>> "
>>>>>>>>>>>> The reason why this is a problem is that there are places in Xen 
>>>>>>>>>>>> where we assume that top 32bits are zero for AArch32 guests.
>>>>>>>>>>>> If they are not, this can lead to misinterpretation of Xen 
>>>>>>>>>>>> regarding what the guest requested.
>>>>>>>>>>>> For example hypercalls returning an error encoded in a signed long 
>>>>>>>>>>>> like do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
>>>>>>>>>>>> if the command passed as the first argument was clobbered,
>>>>>>>>>>>> "
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Fix this bug by zeroing the upper 32 bits of these registers on 
>>>>>>>>>>>>>> an
>>>>>>>>>>>>>> entry to hypervisor when switching to AArch32 state.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Set default value of parameter compat of macro entry to 0 
>>>>>>>>>>>>>> (AArch64 mode
>>>>>>>>>>>>>> as we are on 64-bit hypervisor) to avoid checking if parameter 
>>>>>>>>>>>>>> is blank
>>>>>>>>>>>>>> when not passed.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Which error do you see otherwise? Is it a compilation error?
>>>>>>>>>>>>>
>>>>>>>>>>>> Yes, this is a compilation error. The errors appear at each line 
>>>>>>>>>>>> when "entry" is called without passing value for "compat".
>>>>>>>>>>>> So basically in all the places where entry is called with hyp=1.
>>>>>>>>>>>> When taking the current patch and removing default value for 
>>>>>>>>>>>> compat you will get:
>>>>>>>>>>>> ```
>>>>>>>>>>>> entry.S:254: Error: ".endif" without ".if"
>>>>>>>>>>>> entry.S:258: Error: symbol `.if' is already defined
>>>>>>>>>>>> entry.S:258: Error: ".endif" without ".if"
>>>>>>>>>>>> entry.S:262: Error: symbol `.if' is already defined
>>>>>>>>>>>> entry.S:262: Error: ".endif" without ".if"
>>>>>>>>>>>> entry.S:266: Error: symbol `.if' is already defined
>>>>>>>>>>>> entry.S:266: Error: ".endif" without ".if"
>>>>>>>>>>>> entry.S:278: Error: symbol `.if' is already defined
>>>>>>>>>>>> entry.S:278: Error: ".endif" without ".if"
>>>>>>>>>>>> entry.S:292: Error: symbol `.if' is already defined
>>>>>>>>>>>> entry.S:292: Error: ".endif" without ".if"
>>>>>>>>>>>> entry.S:317: Error: symbol `.if' is already defined
>>>>>>>>>>>> entry.S:317: Error: ".endif" without ".if"
>>>>>>>>>>>> ```
>>>>>>>>>>>
>>>>>>>>>>> Thanks for input. I am concerned with your suggested approach (or 
>>>>>>>>>>> using
>>>>>>>>>>> .if 0\compat as suggested by Jan) because they allow the caller to 
>>>>>>>>>>> not
>>>>>>>>>>> properly specify compat when hyp=0. The risk here is we may 
>>>>>>>>>>> generate the
>>>>>>>>>>> wrong entry.
>>>>>>>>>>>
>>>>>>>>>>> compat should need to be specified when hyp=1 as we will always run 
>>>>>>>>>>> in
>>>>>>>>>>> aarch64 mode. So could we protect this code with hyp=0?
>>>>>>>>>>
>>>>>>>>>> Since my suggestion was only to avoid the need for specifying a 
>>>>>>>>>> default
>>>>>>>>>> for the parameter (which you didn't seem to be happy about), it would
>>>>>>>>>> then merely extend to
>>>>>>>>>>
>>>>>>>>>> .if !0\hyp && 0\compat
>>>>>>>>> Isn't it effectively the same as setting a default value?
>>>>>>>>>
>>>>>>>>> The reason we seem to get away is because other part of the macro 
>>>>>>>>> (e.g. entry_guest) will need compat to be valid.
>>>>>>>>>
>>>>>>>>> But that seems pretty fragile to me. So I would prefer if the new 
>>>>>>>>> code it added within a macro that is only called when hyp==0.
>>>>>>>>>
>>>>>>>> So you would like to have a macro that is called if hyp=0 (which means 
>>>>>>>> compat had to be passed) and inside this macro additional check if 
>>>>>>>> compat is 1?
>>>>>>>
>>>>>>> Yes. This is the only way I could think to avoid making 
>>>>>>> 'compat'optional.
>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> or something along those lines.
>>>>>>>>>>
>>>>>>>>>> Jan
>>>>>>>>>>
>>>>>>>>>
>>>>>>>> So when it comes to zeroing the top 32bits by pushing zero to higher 
>>>>>>>> halves of stack slots I would do in a loop:
>>>>>>>> stp wzr, wzr, [sp #8 * 0]
>>>>>>>> stp wzr, wzr, [sp #8 * 1]
>>>>>>>> ...
>>>>>>>
>>>>>>> I don't think you can use stp here because this would store two 32-bit 
>>>>>>> values consecutively. Instead, you would need to use ldr to store one 
>>>>>>> 32-bit value at the time.
>>>>>>>
>>>>>> I hope you meant str and not ldr.
>>>>
>>>> Yes. I am not sure why I wrote ldr.
>>>>
>>>>>> So a loop would look like that:
>>>>>> str wzr, [sp, #8 * 1]
>>>>>> str wzr, [sp, #8 * 3]
>>>>>> ...
>>>>>
>>>>> Why "a loop" and why #8 (I'd have expected #4)?
>>>>>
>>>>> There's another oddity which I'm noticing only now, but which also
>>>>> may look odd to me only because I lack sufficient Arm details: On
>>>>> x86, it would not be advisable to store anything below the stack
>>>>> pointer (like is done here when storing x0 and x1 early), unless
>>>>> it's absolutely certain that no further interruptions could clobber
>>>>> that part of the stack.
>>>>
>>>> We are entering the hypervisor with both Interrupts and SErrors masked. 
>>>> They will only be unmasked after the guest registers have been saved on 
>>>> the stack.
>>>>
>>>> You may still receive a Data Abort before the macro 'entry' has completed. 
>>>> But this is going to result to an hypervisor crash because they are not 
>>>> meant to happen in those paths.
>>>>
>>>> So I believe, we are safe to modify sp before.
>>>
>>> Hmmm... I meant to write on the stack before sp is modified.
>>>
>>> Cheers,
>>>
>>
>> I would like to summarize what we discussed before pushing v2.
>> Changes since v1:
>> -update commit message adding information why do we need to zero top 32bits
>> -zero corresponding stack slots instead of zeroing directly gp registers
>> -create a macro called by entry, protected by if hyp=0. In macro add if 
>> compat=1
>>
>> Now when it comes to implementation.
>>
>> 1. Regarding save_x0_x1, it is 0 only for guest_sync_slowpath, which is 
>> called by guest_sync.
>> So as we are dealing only with compat=1, save_x0_x1 cannot be 0.
>> The conclusion is that we do not need to worry about it.
> 
> Oh, good point. I guess you may want to add a build time check to
> avoid silently introducing a user of the macro violating that
> assumption.
> 
>> 2. Regarding clearing high halves of stack slots.
> 
> I don't think I understood earlier responses that way. I think
> fiddling with the stack was meant solely for x0 and x1 when they
> were saved earlier on (i.e. instead of re-loading, zero-extending,
> and then storing them back). That's also why ...
> 
This patch and the problem it solves is about clearing top 32bits of all gp 
registers so not only x0,x1.

>> The new macro (called zero_stack_top_halves) will be called in entry before 
>> the first instruction sub sp,sp.
>> To avoid saving sp position/moving it, the simplest would be to execute 30 
>> times:
>> str wzr, [sp, #-(UREGS_kernel_sizeof - 4)]
>> str wzr, [sp, #-(UREGS_kernel_sizeof - 12)]
>> ...
>> I could also use .irp loop like (.irp n,1,3,5,7,...) and then:
>> str wzr, [sp, #-(UREGS_kernel_sizeof - (4 * n))]
>> but FWIK Jan does not like loops :)
> 
> ... in an earlier reply I expressed my surprise of you mentioning
> loops - I simply didn't see how a loop would come into play when
> dealing with just x0 and x1.
> 
> Jan
> 

Michal



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.