
Re: [Xen-devel] [RFD] OP-TEE (and probably other TEEs) support



On 29 November 2016 at 18:02, Julien Grall <julien.grall@xxxxxxx> wrote:
> Hello Volodymyr,
>
> On 29/11/16 15:27, Volodymyr Babchuk wrote:
>>
>> On 28 November 2016 at 22:10, Julien Grall <julien.grall@xxxxxxx> wrote:
>>>
>>> On 28/11/16 18:09, Volodymyr Babchuk wrote:
>>>>
>>>> On 28 November 2016 at 18:14, Julien Grall <julien.grall@xxxxxxx> wrote:
>>>>>
>>>>> On 24/11/16 21:10, Volodymyr Babchuk wrote:
>>>
>>> I mean, is there any command that will affect the trusted OS as a whole
>>> (e.g. reset it or something else) and not only the session for a given guest?
>>
>> Yes, there are such commands. For example, there is a command that
>> enables/disables caching for shared memory.
>> We should disable this caching, by the way.
>> The SMC handler should manage commands like this.
>
>
> So you have to implement a white-list, right?
Yes. Actually, I imagine this as a huge switch(operation_id) where the
default action is to return an error to the caller.
Only in this way can I be sure that I'm properly handling calls to the
TEE. Yes, with this design the maintainer will need to keep the
virtualization code in sync with the TEE's internal APIs. But only this
approach will ensure security and stability.
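
Just to illustrate what I mean, here is a minimal sketch. All function IDs,
error codes and helpers below are placeholders I made up for the example,
not actual Xen or OP-TEE definitions:

#include <stdint.h>
#include <stdbool.h>

#define TEE_FN_OPEN_SESSION      0x32000001u  /* placeholder function IDs */
#define TEE_FN_INVOKE_COMMAND    0x32000002u
#define TEE_FN_CLOSE_SESSION     0x32000003u
#define TEE_FN_ENABLE_SHM_CACHE  0x32000004u  /* has a global effect on the TEE */

#define TEE_ERR_NOT_SUPPORTED    0xFFFF000Au  /* placeholder error code */

struct smc_regs { uint64_t x[8]; };

/* Assumed helper: rewrites IPAs to PAs in the arguments and issues the
 * real SMC. Returns false if the arguments could not be translated. */
static bool translate_and_forward(struct smc_regs *regs)
{
    (void)regs;
    return false;   /* stub for the sketch */
}

static void handle_tee_smc(struct smc_regs *regs)
{
    switch ( (uint32_t)regs->x[0] )     /* operation_id is passed in x0 */
    {
    case TEE_FN_OPEN_SESSION:
    case TEE_FN_INVOKE_COMMAND:
    case TEE_FN_CLOSE_SESSION:
        /* Known per-guest operations: translate the arguments and forward. */
        if ( !translate_and_forward(regs) )
            regs->x[0] = TEE_ERR_NOT_SUPPORTED;
        break;

    case TEE_FN_ENABLE_SHM_CACHE:
        /* Affects the whole TEE, not one guest, so it is never forwarded. */
        regs->x[0] = TEE_ERR_NOT_SUPPORTED;
        break;

    default:
        /* Anything the mediator does not know about is rejected. */
        regs->x[0] = TEE_ERR_NOT_SUPPORTED;
        break;
    }
}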


>>>> No, they are not standardized and they can change in the future.
>>>> OP-TEE tries to be backward-compatible, though. So the hypervisor can
>>>> drop unknown capability flags in the GET_CAPABILITIES SMC call. In this
>>>> way it can ensure that the guest will use only APIs that are known to
>>>> the hypervisor.
>>>>
>>>>> How about other TEE?
>>>>
>>>>
>>>> I can't say for sure. But I think the situation is the same as with OP-TEE.
>>>
>>>
>>>
>>> By any chance, is there a TEE specification out somewhere?
>>
>> Yes. There are GlobalPlatform API specs. You can find them at [3]
>> Probably you will be interested in "TEE System Architecture v1.0".
>
>
> Thank you, I will take a look.
This is a rather high-level design, because GP leaves many details
implementation-specific. They focus more on the client side.

>
>>
>>>
>>>>
>>>>> If not, then it might be worth to consider a 3rd solution where the TEE
>>>>> SMC
>>>>> calls are forwarded to a specific domain handling the SMC on behalf of
>>>>> the
>>>>> guests. This would allow to upgrade the TEE layer without having to
>>>>> upgrade
>>>>> the hypervisor.
>>>>
>>>>
>>>> Yes, this is a good idea. How could this look? I imagine the following
>>>> flow: the hypervisor traps the SMC and uses an event channel to pass the
>>>> request to Dom0. Some userspace daemon receives it, maps the pages with
>>>> the request data, alters it (e.g. by replacing IPAs with PAs), asks the
>>>> hypervisor to issue the real SMC, then alters the response and only then
>>>> returns data back to the guest.
>>>
>>>
>>>
>>> The event channel is only a way to notify (similar to an interrupt), you
>>> would need a shared memory page between the hypervisor and the client to
>>> communicate the SMC information.
>>>
>>> I was thinking of taking advantage of the VM event API for trapping the
>>> SMC. But I am not sure if it is the best solution here. Stefano, do you
>>> have any opinions here?
>>>
>>>>
>>>> Is this even possible with current APIs available to dom0?
>>>
>>>
>>>
>>> It is always possible to extend the API if something is missing :).
>>
>> Yes. On the other hand, I don't like the idea that some domain can map any
>> memory page of another domain to play with SMC calls. We can't use grefs
>> there. So, the service domain would have to be able to map any memory page
>> it wants. This is insecure.
>
>
> I don't follow your point here. Why would the SMC handler need to map the
> guest memory?
Because this is how parameters are passed. We can pass some parameters
in registers, but in OP-TEE, for example, the registers hold only the
address of a command buffer. The actual parameters live in that command
buffer, and some of them can be references to other memory objects.
So, to translate IPAs to PAs we need to map this command buffer,
analyze it and so on.
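
Roughly what that analysis would look like (a simplified sketch; the layout
only loosely mimics an OP-TEE-style command buffer, and map_guest_buf() /
ipa_to_pa() are hypothetical helpers assumed to exist in the hypervisor):

#include <stdint.h>
#include <stddef.h>

#define PARAM_ATTR_MEMREF 0x1u

/* Simplified command buffer: a header plus parameters, some of which
 * reference other guest buffers by address. Field names are illustrative. */
struct cmd_param {
    uint64_t attr;   /* e.g. "this parameter is a memory reference" */
    uint64_t addr;   /* guest IPA on entry, must become a PA for the TEE */
    uint64_t size;
};

struct cmd_buf {
    uint32_t cmd;
    uint32_t num_params;
    struct cmd_param params[];
};

/* Hypothetical helpers: map a guest page into the hypervisor, and walk the
 * guest's stage-2 tables to translate an IPA to a PA. */
struct cmd_buf *map_guest_buf(uint16_t vmid, uint64_t ipa, size_t size);
uint64_t ipa_to_pa(uint16_t vmid, uint64_t ipa);

/* Rewrite every memory reference from IPA to PA, because the TEE knows
 * nothing about the guest's stage-2 translation. */
static int translate_cmd_buf(uint16_t vmid, uint64_t cmd_ipa)
{
    struct cmd_buf *buf = map_guest_buf(vmid, cmd_ipa, 4096);
    uint32_t i;

    if ( !buf )
        return -1;

    for ( i = 0; i < buf->num_params; i++ )
    {
        struct cmd_param *p = &buf->params[i];

        if ( p->attr & PARAM_ATTR_MEMREF )
        {
            uint64_t pa = ipa_to_pa(vmid, p->addr);

            if ( !pa )
                return -1;   /* the guest passed a bogus address */
            p->addr = pa;
        }
    }
    return 0;
}
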
>
>>>>
>>>> I can see only one benefit there - this code will not be in the
>>>> hypervisor. And there are a number of drawbacks:
>>>>
>>>> Stability: if this userspace daemon crashes or gets killed by, say,
>>>> OOM, we will lose information about all opened sessions, mapped shared
>>>> buffers, etc. That would be a complete disaster.
>>>
>>>
>>>
>>> I disagree with your statement: you would gain in isolation. If your
>>> userspace crashes (because of an emulation bug), you will only lose access
>>> to the TEE for a bit. If the hypervisor crashes (because of an emulation
>>> bug), then you take down the platform. I agree that you lose information
>>> when the userspace app is crashing, but your platform is still up. Isn't
>>> that the most important thing?
>>
>> This is arguable and depends on what you consider more valuable:
>> system security or system stability.
>> I stand on the security side.
>
>
> How would handling SMCs in the hypervisor be more secure? The OP-TEE support
> will introduce code that will need to:
>         - Whitelist SMC calls
>         - Alter SMC calls to translate IPAs to PAs
>         - Keep track of sessions
>         - ....
Actually only the first two items, plus some approach to synchronization. I
think that the TEE should track sessions on its side. It knows the VM ID of
the caller anyway. The hypervisor should tell the TEE when a guest dies, so
it can purge the sessions opened by that guest.
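
For example, the hypervisor could hook its domain-destruction path and issue
one extra SMC. The function ID and helper below are hypothetical, just to
illustrate the idea:

#include <stdint.h>

/* Hypothetical SMC function ID: "all sessions of VM <a1> are gone". */
#define TEE_SMC_VM_DESTROYED 0x32000010u

/* Assumed helper that issues a real SMC and returns x0. */
uint64_t issue_smc(uint64_t fn, uint64_t a1, uint64_t a2, uint64_t a3);

/* Called from the hypervisor when a guest is destroyed; the TEE already
 * knows which sessions belong to this VM ID and can purge them itself. */
static void tee_notify_guest_destroyed(uint16_t vmid)
{
    issue_smc(TEE_SMC_VM_DESTROYED, vmid, 0, 0);
}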

> In general, I am quite concerned every time someone asks to add emulation in
> the hypervisor. This increases the possibility of bugs, and this is even more
> true with emulation.
It is not emulation. Actually, it is virtualization. It is like the
hypervisor providing a virtual CPU or a virtual GIC. There can be a virtual
TEE as well.

> [...]
>
>>>> Performance: how big will the latency introduced by switching between
>>>> hypervisor, dom0 SVC and USR modes be? I have seen a use case where the
>>>> TEE is part of a video playback pipeline (it decodes DRM media).
>>>> There can also be questions about security, but Dom0 can in any case
>>>> access any memory of any guest.
>>>
>>>
>>>
>>> But those concerns would be the same in the hypervisor, right? If your
>>> emulation is buggy then a guest would get access to all the memory.
>>
>> Yes, but I hope that it is harder to compromise the hypervisor than to
>> compromise a guest domain.
>
>
> I am afraid this would need more justification. If you use disaggregation
> and are careful enough to isolate your service, then it would be hard to
> compromise a separate VM that only handles SMCs on behalf of the guests.
Probably yes, a separate hardened VM is much better than a general
Linux-based VM. But virtualization inside the hypervisor is even better :)
See below.

>>
>>>> But I really like the idea, because I don't want to mess with the
>>>> hypervisor when I don't need to. So, what do you think: how will it
>>>> affect performance?
>>>
>>>
>>>
>>> I can't tell here. I would recommend you to try a quick prototype (e.g.
>>> receiving and sending SMCs) and see what the overhead would be.
>>>
>>> When I wrote my previous e-mail, I mentioned a "specific domain", because I
>>> don't think it is strictly necessary to forward the SMC to DOM0. If you are
>>> concerned about overloading DOM0, you could have a separate service domain
>>> that would handle the TEE for you. You could have your "custom OS" handling
>>> TEE requests directly in kernel space (i.e. SVC).
>>
>> Hmmm. I heard something about unikernel domains. Is this what you want
>> to propose?
>
>
> Yes, that would be the idea. And also what has been suggested on IRC
> yesterday.
Sorry, it looks like I missed this.

> I am not saying this is the best way, but I think we should explore more
> before saying: "Let's put more emulation in the hypervisor". Because here we
> are not talking about one TEE, but potentially multiple ones.
Yep. I'm not yet convinced to use a separate VM. But let's try to imagine
how it would look.

Someone (can we trust dom0?) should identify which TEE is running on the
system and create a service domain with the appropriate TEE handler.
There will be a problem if we are using Secure Boot. The bootloader (like
ARM Trusted Firmware) can verify the Xen and Dom0 kernel images, but it
can't verify which TEE handler will be loaded into the service domain. This
verification can only be done by dom0, so dom0 userspace should be part of
the chain of trust. This imposes restrictions on the dom0 structure.

Then, when it comes to an SMC call from a guest, there should be a special
subsystem in the hypervisor. It will trap the SMC, put all the necessary
data into a ring buffer and issue an event to the service domain. Probably
we will need some hypercall to register the service domain as the SMC
handler. But again, how can we trust that domain? Probably dom0 will say
"use domain N as the trusted SMC handler".
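
Roughly, the record that would travel over that ring buffer, plus the
registration hypercall, could look like this (nothing below is an existing
Xen interface, it is purely illustrative):

#include <stdint.h>

/* One trapped SMC, as placed on the shared ring by the hypervisor and
 * consumed by the service domain. */
struct smc_request {
    uint16_t vmid;       /* which guest issued the SMC */
    uint16_t flags;
    uint64_t regs[8];    /* x0..x7 at the time of the trap */
};

struct smc_response {
    uint16_t vmid;
    uint16_t flags;
    uint64_t regs[4];    /* x0..x3 returned to the guest */
};

/* Hypothetical hypercall argument: dom0 nominates 'domid' as the trusted
 * SMC handler; 'ring_gfn' is the shared ring page and 'evtchn' the event
 * channel used for notifications. */
struct xen_smc_handler_register {
    uint16_t domid;
    uint64_t ring_gfn;
    uint32_t evtchn;
};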

Anyway, the service domain handles the SMC (probably by doing the real SMC
to the TEE) and uses the same ring buffer/event channel mechanism to return
data to the calling guest. During SMC handling it will map guest memory
pages by IPA, so we will need a hypercall like "map arbitrary guest memory
by guest IPA".

If the service domain needs to wake up a guest that is sleeping in TEE
client code, it will ask the hypervisor to fire an interrupt to that guest.

Then, I took a look at Mini-OS. It looks like it does not support aarch64,
so it would need to be ported.

On the other hand, TEE virtualization right in the hypervisor would ease
things significantly: no problems with secure boot, trusted service
domains, memory mapping, etc.

Also, I hate to ask again, but can we ask some TrustZone folks how they see
the interaction between the Normal and Secure worlds in the presence of a
hypervisor?

-- 
WBR Volodymyr Babchuk aka lorc [+380976646013]
mailto: vlad.babchuk@xxxxxxxxx
