
Re: [Xen-devel] [RFD] OP-TEE (and probably other TEEs) support





On 28/11/16 18:09, Volodymyr Babchuk wrote:
Hello,

Hello Volodymyr,

On 28 November 2016 at 18:14, Julien Grall <julien.grall@xxxxxxx> wrote:
On 24/11/16 21:10, Volodymyr Babchuk wrote:
My name is Volodymyr Babchuk, I'm working at EPAM Systems with a bunch
of other people like Artem Mygaiev and Andrii Anisov. My responsibility
there is security in embedded systems.

I would like to discuss approaches to OP-TEE support in XEN.


Thank you for sharing this, I am CC-ing some people who showed interest in
accessing trusted firmware from the guest.

In the future, please try to CC the relevant people (in this case the ARM
maintainers) to avoid any delay in the answer.
Thanks. I have never worked with the Xen community before, so I don't know who is who :)

You can have a look at the MAINTAINERS file at the root of xen.git.

[...]

You can find patches at [1] if you are interested.
While working on this PoC I have identified the main questions that
should be answered:

On the XEN side:
1. SMC handling in XEN. There are many different SMCs and only a portion
of them belongs to the TEE. We need an SMC dispatcher that will route
calls to different subsystems: PSCI calls to the PSCI subsystem, TEE
calls to the TEE subsystem.
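For illustration, such a dispatcher could key off the service-owner
field of the SMCCC function ID. This is only a sketch: the handler
names are assumptions, not existing Xen code.

    /* Route a trapped SMC by SMCCC service owner (bits [29:24] of
     * the function ID). Handler names are hypothetical. */
    #define SMCCC_OWNER(fid)           (((fid) >> 24) & 0x3f)
    #define SMCCC_OWNER_STANDARD       4    /* PSCI lives here */
    #define SMCCC_OWNER_TRUSTED_OS_MIN 50
    #define SMCCC_OWNER_TRUSTED_OS_MAX 63
    #define SMCCC_NOT_SUPPORTED        (-1)

    static void handle_smc(struct cpu_user_regs *regs)
    {
        uint32_t owner = SMCCC_OWNER(regs->r0); /* r0 = function ID */

        if ( owner == SMCCC_OWNER_STANDARD )
            do_psci_call(regs);                 /* PSCI subsystem */
        else if ( owner >= SMCCC_OWNER_TRUSTED_OS_MIN &&
                  owner <= SMCCC_OWNER_TRUSTED_OS_MAX )
            do_tee_call(regs);                  /* TEE subsystem */
        else
            regs->r0 = SMCCC_NOT_SUPPORTED;
    }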


So from my understanding of this paragraph, all SMC TEE calls should have a
guest ID in the command. We don't expect a command affecting the whole TEE. Correct?
Yes. The idea is to trap the SMC, alter it by adding a guest ID (into r7, as
the SMCCC says), and then do the real SMC to pass it to the TEE.
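Roughly like this (forward_smc_to_tee() is a hypothetical helper for
issuing the real SMC):

    /* Sketch: tag a trapped TEE call with the calling guest's ID.
     * Per the SMCCC, r7/w7 carries the client ID. */
    static void do_tee_call(struct cpu_user_regs *regs)
    {
        regs->r7 = current->domain->domain_id; /* overwrite client ID */
        forward_smc_to_tee(regs);              /* issue the real SMC */
    }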

But I don't get this: "We don't expect a command affecting the whole TEE".
What did you mean?

I mean, is there any command that will affect the trusted OS as a whole (e.g. reset it, or else) and not only the session for a given guest?




2. Support for different TEEs. There are OP-TEE, Google Trusty, TI
M-Shield... They all work through SMCs, but have different protocols.
Currently we are aiming only at OP-TEE, but we need some generic API
in XEN, so that support for a new TEE can be added easily.
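I imagine this as a small table of operations that each TEE backend
implements; a sketch with purely hypothetical names:

    /* Generic TEE mediator interface; one instance per supported TEE
     * (OP-TEE, Trusty, ...). All names are hypothetical. */
    struct tee_ops {
        const char *name;
        bool (*probe)(void);                      /* match Trusted OS UID */
        int  (*domain_create)(struct domain *d);  /* new guest created */
        void (*domain_destroy)(struct domain *d); /* guest destroyed */
        bool (*handle_smc)(struct cpu_user_regs *regs); /* mangle & forward */
    };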

For instance you
Hm?
Is there any generic way to detect which TEE is being used, and its
version?
Yes, according to the SMCCC, there is call number 0xBF00FF01 that should
return the Trusted OS UID.
OP-TEE supports this call. I hope other TEEs also support it. In this
way we can detect which Trusted OS is running on the host.
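So a backend's probe could look like this (issue_smc() and the
OPTEE_UID_* words are placeholders; the real UID values live in
OP-TEE's headers):

    /* Sketch: identify the Trusted OS via the SMCCC "Trusted OS Call
     * UID" function. The UID is returned in r0..r3. */
    #define TRUSTED_OS_CALL_UID 0xBF00FF01

    static bool optee_probe(void)
    {
        uint32_t uid[4];

        issue_smc(TRUSTED_OS_CALL_UID, uid);   /* hypothetical helper */
        return uid[0] == OPTEE_UID_0 && uid[1] == OPTEE_UID_1 &&
               uid[2] == OPTEE_UID_2 && uid[3] == OPTEE_UID_3;
    }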

Looking at the SMCCC, this SMC call seems to be mandatory.



3. TEE services. The hypervisor should inform the TEE when a new guest is
created or destroyed, and it should tag SMCs to the TEE with a guest ID, so
the TEE can isolate guest data on its side.

4. SMC mangling. The RichOS communicates with the TEE using shared buffers,
by providing physical memory addresses. The hypervisor should convert IPAs
to PAs.
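The mangling step for a single buffer address might look roughly like
this (p2m_lookup() stands in for whatever translation primitive is
available; a real version would also have to walk multi-page buffers,
which may not be physically contiguous):

    /* Sketch: rewrite a guest-provided buffer address from IPA to PA
     * before the SMC reaches the TEE. */
    static int mangle_shm_addr(struct domain *d, uint64_t *addr)
    {
        paddr_t pa = p2m_lookup(d, *addr);  /* IPA -> PA, hypothetical */

        if ( pa == INVALID_PADDR )
            return -EINVAL;                 /* guest passed a bogus IPA */

        *addr = pa;
        return 0;
    }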


I am actually concerned about this bit. From my understanding, the
hypervisor would need some knowledge of the SMC.
Yes, that was my first idea: a separate subsystem in the hypervisor that
handles SMC calls for different TEEs. This subsystem has a number of
backends, one for each TEE.

So are the OP-TEE SMC calls fully standardized? By that I mean, will they
not change across versions?
No, they are not standardized and they can change in the future.
OP-TEE tries to be backward-compatible, though. So the hypervisor can drop
unknown capability flags in the GET_CAPABILITIES SMC call. In this way it
can ensure that a guest will use only the APIs that are known to the
hypervisor.
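Something like this, assuming the capability bits come back in r1 (the
flag names are made up for the example):

    /* Sketch: mask the TEE's capability reply down to the flags the
     * hypervisor knows how to mangle, so a guest never negotiates a
     * feature the mediator cannot handle. */
    #define KNOWN_CAPS (CAP_RESERVED_SHM | CAP_DYNAMIC_SHM)

    static void filter_get_capabilities(struct cpu_user_regs *regs)
    {
        forward_smc_to_tee(regs);   /* the real GET_CAPABILITIES call */
        regs->r1 &= KNOWN_CAPS;     /* drop flags unknown to Xen */
    }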

How about other TEEs?
I can't say for sure, but I think the situation is the same as with OP-TEE.

By any chance, is there a TEE specification out there somewhere?


If not, then it might be worth considering a third solution where the TEE SMC
calls are forwarded to a specific domain handling the SMC on behalf of the
guests. This would allow upgrading the TEE layer without having to upgrade
the hypervisor.
Yes, this is a good idea. How could this look? I imagine the following flow:
the hypervisor traps the SMC and uses an event channel to pass the request
to Dom0. Some userspace daemon receives it, maps the pages with the request
data, alters them (e.g. by replacing IPAs with PAs), asks the hypervisor to
issue the real SMC, then alters the response, and only then returns data
back to the guest.

The event channel is only a way to notify (similar to an interrupt), you would need a shared memory page between the hypervisor and the client to communicate the SMC information.
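The layout of such a shared page is of course up for discussion; a
purely hypothetical sketch:

    /* Sketch: one request slot in a page shared between the hypervisor
     * and the TEE-handling domain. The event channel only signals
     * "there is work"; the SMC state travels through this structure. */
    struct tee_request {
        uint32_t state;     /* FREE / PENDING / DONE */
        uint16_t domid;     /* guest that issued the SMC */
        uint64_t regs[8];   /* r0-r7 as trapped from the guest */
        uint64_t ret[4];    /* r0-r3 to hand back to the guest */
    };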

I was thinking of taking advantage of the VM event API for trapping the SMC. But I am not sure if it is the best solution here. Stefano, do you have any opinions here?


Is this even possible with current APIs available to dom0?

It is always possible to extend the API if something is missing :).


I can see only one benefit there: this code will not be in the
hypervisor. And there are a number of drawbacks:

Stability: if this userspace daemon crashes or gets killed by, say, the
OOM killer, we will lose information about all opened sessions, mapped
shared buffers, etc. That would be a complete disaster.

I disagree with your statement; you would gain in isolation. If your userspace crashes (because of an emulation bug), you will only lose access to the TEE for a bit. If the hypervisor crashes (because of an emulation bug), then you take down the platform. I agree that you lose information when the userspace app crashes, but your platform is still up. Isn't that the most important thing?

Note that I think it would be "fairly easy" to implement code to reset everything or to keep a backup on the side.

Performance: how big will the latency introduced by switching between the
hypervisor, Dom0 SVC, and USR modes be? I have seen a use case where the TEE
is part of a video playback pipeline (it decodes DRM media).
There can also be questions about security, but Dom0 can in any case
access any memory of any guest.

But those concerns would be the same in the hypervisor, right? If your emulation is buggy then a guest would get access to all the memory.

But I really like the idea, because I don't want to mess with the
hypervisor when I don't need to. So, what do you think: how will it
affect performance?

I can't tell here. I would recommend you to try a quick prototype (e.g. receiving and sending SMCs) and see what the overhead would be.

When I wrote my previous e-mail, I mentioned a "specific domain", because I don't think it is strictly necessary to forward the SMC to DOM0. If you are concerned about overloading DOM0, you could have a separate service domain that would handle TEE for you. You could have your "custom OS" handling TEE requests directly in kernel space (i.e. SVC).

This would be up to the developer of this TEE-layer to decide what to do.



Currently I'm rewriting parts of OP-TEE to make it support arbitrary
buffers originating from the RichOS.

5. Events from TEE. This is a hard topic. Sometimes OP-TEE needs some
services from the RichOS. For example, it wants Linux to service a pending
IRQ request, or allocate a portion of shared memory, or block the calling
thread, etc. This is called an "RPC request". To do an RPC request, OP-TEE
initiates a return to the Normal World, but it sets a special return code to
indicate that Linux should do some job for OP-TEE. When Linux finishes the
work, it initiates another SMC with a code like "I have finished the RPC
request", and OP-TEE resumes its execution.
OP-TEE mutexes create a problem there. We don't want to sleep in the secure
state, so when an OP-TEE thread gets blocked on a mutex, it issues an RPC
request that asks the calling thread to wait on a wait queue. When the mutex
owner unlocks it, that other thread also issues an RPC to wake up the first
thread.

This works perfectly when there is one OS (or one guest). But when
there are many, it is possible that a request from one guest blocks
another guest. That other guest will wait on the wait queue, but there
will be no one who can wake it up. So we need another mechanism to
wake up sleeping threads. The obvious candidate is an IPI. There are 8
non-secure IPIs, which are all used by the Linux kernel. But there are also
8 secure IPIs. I think OP-TEE can use one of those to deliver events to the
Normal World. But this will require changes to OP-TEE, XEN and the Linux
kernel.


Before giving a suggestion here, I would like to confirm that by IPI you
mean SGI, right?
Yes. The kernel calls them IPIs, so did I.
If I remember the ARM GIC TRM well, it recommends giving 8 SGIs to the
Normal World and 8 to the Secure World.

You said "ARM GIC TRM", so I guess you are speaking about a specific GIC implementation. I looked at the GIC specification (ARM IHI 0048B.b) and can't find such a suggestion.

The hypervisor can use one of the secure SGIs to deliver events to guests.
But, actually, this feature will be needed only in a virtualized
environment, so I think we can use XEN events there.
Jens, can you please comment on this? I can't imagine a use case where
OP-TEE needs to send an SGI to the Normal World when there is no
virtualization. But maybe I'm missing something?



6. Some mechanism to control which guests can work with the TEE. At this
time I have no idea how this should work.


Probably a toolstack option "TEE enabled".
So there will be some additional flag in the guest descriptor structure?

That's the idea.

Regards,

--
Julien Grall


 

