Xen project Mailing List

Re: [Xen-devel] [RFC 0/4] TEE mediator framework + OP-TEE mediator

To: Julien Grall <julien.grall@xxxxxxxxxx>

From: Volodymyr Babchuk <volodymyr_babchuk@xxxxxxxx>

Date: Thu, 2 Nov 2017 22:07:05 +0200

Cc: Julien Grall <julien.grall@xxxxxxx>, nd@xxxxxxx, Stefano Stabellini <sstabellini@xxxxxxxxxx>, jens.wiklander@xxxxxxxxxx, xen-devel@xxxxxxxxxxxxx

Delivery-date: Thu, 02 Nov 2017 20:07:32 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On Thu, Nov 02, 2017 at 05:49:12PM +0000, Julien Grall wrote:

Hi Julien,

> On 02/11/17 16:53, Volodymyr Babchuk wrote:
> >On Thu, Nov 02, 2017 at 01:17:26PM +0000, Julien Grall wrote:
> >>On 24/10/17 20:02, Volodymyr Babchuk wrote:
> >>>>>>>>If it is not safe, this means you have a whitelist solution and therefore
> >>>>>>>>tie Xen to a specific OP-TEE version. So if you need to use a new function
> >>>>>>>>you would need to upgrade Xen making the cost of using new version
> >>>>>>>>potentially high.
> >>>>>>>Yes, any ABI change between OP-TEE and its clients will require mediator
> >>>>>>>upgrade. Luckily, OP-TEE maintains ABI backward-compatible, so if you'll
> >>>>>>>install old XEN and new OP-TEE, OP-TEE will use only that subset of ABI,
> >>>>>>>which is known to XEN.
> >>>>>>>
> >>>>>>>>Also, correct me if I am wrong, OP-TEE is a BSD 2-Clause. This means you
> >>>>>>>>impose anyone wanted to modify OP-TEE for their own purpose can make a
> >>>>>>>>closed version of the TEE. But if you need to introspect/whitelist call, you
> >>>>>>>>impose the vendor to expose their API.
> >>>>>>>Basically yes. Is this bad? OP-TEE driver in Linux is licensed under GPL v2.
> >>>>>>>If vendor modifies interface between OP-TEE and Linux, they are anyway
> >>>>>>>obliged to expose API.
> >>>>>>
> >>>>>>Pardon me for potential stupid questions, my knowledge of OP-TEE is limited.
> >>>>>>
> >>>>>>My understanding is the OP-TEE will provide a generic way to access
> >>>>>>different Trusted Application. While OP-TEE API may be generic, the TA API
> >>>>>>is custom. AFAICT the latter is not part of Linux driver.
> >>>>>Yes, you are perfectly right there.
> >>>>>
> >>>>>>So here my questions:
> >>>>>> 1) Are you planning allow all the guests to access every Trusted
> >>>>>>Applications?
> >>>>>This is a good question.
> >>>>>There are two types of TAs supported in
> >>>>>OP-TEE: real TAs (as they are described in GlobalPlatform specs) and
> >>>>>PseudoTAs. The latter ones are statically linked right into OP-TEE
> >>>>>kernel and execute at S-EL1 level.
> >>>>>Real TAs are provided by client. That means that NW userspace
> >>>>>supplicant loads TA into OP-TEE. OP-TEE checks signature for the TA
> >>>>>and then runs it in S-EL0.
> >>>>>So, I'm planning to allow client to work with any real TA. I can't see
> >>>>>real problem there.
> >>>>
> >>>>Are the real TAs going to be shared between guests? Or will each guest have
> >>>>their own one?
> >>>No, we don't plan this. At least at this moment. Every guest will have
> >>>own instance of TA.
> >>>
> >>>>Will you allow every guests loading real TAs?
> >>>Yes, if guest has access to TEE, it can load TA. Otherwise there is no
> >>>sense to use TEE. OP-TEE core itself does not provide useful services
> >>>to clients.
> >>
> >>In a previous e-mail you mentioned OP-TEE has limited memory. How will you
> >>ensure that guest A will not use all the memory of OP-TEE and prevent guest
> >>B to load TAs?
> >There is no way to do this right now. Even on bare-metal system, one client
> >can load huge TA or eat up memory in another way to prevent other clients
> >to use OP-TEE. This is known limitation. It can be mitigated by enforcing
> >quotas.
>
> Yes, but those clients only serve one OS. Here you would serve multiple
> OSes, clients from OS A could eat up the memory and prevent a client from OS
> B to run.
>
> This could be a serious issue depending on how important the clients for OS
> are.
>
> So likely enforcing quotas will be needed.

Yes. I agree there. I think it is possible to implement them in the mediator,
so we can use XSM to define quotas.

> >
> >>[...]
> >>
> >>>>Not really, the domain could block when issuing an SMC until the
> >>>>mediator is up and running.
> >>>Do you mean, that if domain tries to execute SMC, and mediator is not
> >>>ready, then hypervisor should pause all domain's vCPUs? That can be
> >>>destructive for hw domain.
> >>
> >>Xen is free to unschedule any vCPU at any time. So why would it be
> >>destructive?
> >Suppose that mediator domain needs 0.5s to boot up and be ready to
> >serve calls. For half of a second HW domain will be blocked. I don't
> >like the idea, that it will not be able to serve IRQs and other
> >requests. IMHO, it is okay for DomU, but not for Dom0.
> >
> >
> >>>>>>>>>And yes, it seems obvious, but I want to say this explicitly: generic
> >>>>>>>>>TEE mediator framework should and will use XSM to control which domain
> >>>>>>>>>can work with TEE. So, if you don't trust your guest - don't let it
> >>>>>>>>>to call TEE at all.
> >>>>>>>>
> >>>>>>>>Correct me if I am wrong. TEE could be used by Android guest which likely
> >>>>>>>>run the user apps... right? So are you saying you fully trust that guest and
> >>>>>>>>obviously the user installing rogue app?
> >>>>>>>I don't think that app downloaded from Play Market can access OP-TEE directly.
> >>>>>>>OP-TEE can be used by Android itself as a key storage or to access to a SE,
> >>>>>>>for example. But 3rd app that issues TEE calls... I don't think so.
> >>>>>>
> >>>>>>You didn't get my point here. That rogue app may be able to break into
> >>>>>>kernel via an exploit or have enough privilege to break the guest. Who knows
> >>>>>>what it will be able to do after...
> >>>>>Only what hypervisor and TEE will allow it to do. Look, OP-TEE was not designed
> >>>>>to rule the machine. There is ARM TF for that :) OP-TEE's task is to provide
> >>>>>some safer environment for sensitive data and code. This environment has
> >>>>>well-defined interfaces and is designed to be as safe as possible.
> >>>>>
> >>>>>If rogue app breaks into kernel, then it can issue any SMC which it wants.
> >>>>>But OP-TEE does not trust to NW. Hypervisor does not trust to guests.
> >>>>>Mediator should be written in the same way.
> >>>>>
> >>>>>So, what can do rogue kernel? As I know - it can cause DoS in OP-TEE. This is
> >>>>>known issue. If there is a security bug in OP-TEE, it probably can overcome
> >>>>>whole system. But this is true for any system running OP-TEE.
> >>>>
> >>>>I agree that if you take over OP-TEE, you will take over any system. This is
> >>>>not specific to hypervisor.
> >>>Yes. But it just occurred to me that mediator+OP-TEE *can* be more
> >>>secure than just OP-TEE. You see, mediator should perform own security
> >>>checks before forwarding call to OP-TEE. So if OP-TEE misses
> >>>something, mediator can back it up. I wouldn't rely on this. It is just
> >>>an interesting thought :-)
> >>>
> >>>>Baremetal OS taking down the platform will only harm itself. A guest OS
> >>>>could harm the whole platform.
> >>>Can't argue with that. I think that this feature (shared TEE) is
> >>>not suitable for, say, VPSes. But it can work just fine on smartphones
> >>>or on other embedded devices, where vendor defines whole system.
> >>
> >>I guess your use case is "vendor defines whole system". But I am struggling
> >>understand how this would more suitable there.
> >Excuse me... "There" - it is where exactly?
>
> "vendor defines whole system".
>
> >
> >>That guest OS may be "controlled" by the user. So how is that safer?
> >Can you please define what is "safe" and "unsafe" in this context?
> >
> >Let's take a look at whole picture. I can see the following attacks:
> >
> >1) DoS attack. One domain spends all OP-TEE resources, other domains
> >   can't work with it. As I said earlier, this is known limitation.
> >
> >2) Mediator crash. Sort of DoS, if mediator can't restart properly.
> >
> >3) OP-TEE crash. This crashes whole system.
> >
> >4) Virtualization breach. Attacker gains control over mediator ->
> >   control over all TEE-enabled guests.
> >
> >5) Virtualization breach. Attacker gains control over hypervisor ->
> >   control over all guests.
> >
> >6) Virtualization breach. Attacker gains control over OP-TEE ->
> >   control over whole system, including firmware.
> >
> >Now it would be great to give you likelihood for every attack type. But,
> >obviously I have no such numbers. I can only speculate about this.
> >
> >Returning to your question... To what extent guest OS can be controlled
> >by user? Can user execute arbitrary code at EL1 for example? Or it can
> >install only apps prebuilt by system vendor?
> >
> >What bad things will happen if user will compromise the whole system?
> >
> >Which guests will also run on the same system? Which subset of them
> >will access OP-TEE?
> >
> >If you can answer these questions, I can tell you, if it is safe
> >to use OP-TEE + virtualization on your system.
>
> I don't make any end product and I have no idea what kind of guests would be
> run on top.

I know. I just wanted to emphasize that one selects a stack (OSes, hypervisor,
TEE, other parts) based on requirements. There is no silver bullet. For one
project you can consider it safe to use OP-TEE in any guest, while another
project would require strict security policies and OP-TEE access only from
one privileged domain.

> So if I had to answers to those questions, I would consider all the guests
> potentially nasty and therefore making sure the attack surface is limited
> and understood.
>
> You really have to ask yourself what kinds of guest you will run on that
> platform and assess the risk.

Yes, exactly.

> If you tell me you are going to run safety critical in one VM and another
> with Android. Then I would be looking at limiting the attack surface of the
> Android guest.

True. For example, on Android I can employ SELinux to make sure that only the
framework (or even one separate service) can use TEE.
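
As a rough illustration of the "OP-TEE access only from one privileged
domain" kind of policy, combined with the per-guest quotas discussed earlier
in this thread, here is a minimal Python sketch. It is not taken from Xen, XSM
or the mediator code: DomainPolicy, Mediator, may_call_tee and register_shm
are hypothetical names, and the quota value is made up. It only shows that
both the access decision and a shared-memory quota can be checked before a
request is forwarded to the TEE.

```python
from dataclasses import dataclass


@dataclass
class DomainPolicy:
    """Hypothetical per-domain policy; not an XSM or Xen structure."""
    tee_allowed: bool      # may this domain call the TEE at all?
    shm_quota: int = 0     # max bytes of shared memory it may register
    shm_used: int = 0      # bytes currently registered


class Mediator:
    """Toy model of checks a mediator could run before forwarding a call."""

    def __init__(self, policies):
        self.policies = policies  # domid -> DomainPolicy

    def may_call_tee(self, domid):
        # Domains without an explicit policy are denied by default.
        policy = self.policies.get(domid)
        return policy is not None and policy.tee_allowed

    def register_shm(self, domid, size):
        # Refuse if the domain has no TEE access or would exceed its quota.
        if not self.may_call_tee(domid):
            return False
        policy = self.policies[domid]
        if policy.shm_used + size > policy.shm_quota:
            return False
        policy.shm_used += size
        return True


mediator = Mediator({
    1: DomainPolicy(tee_allowed=True, shm_quota=4096),  # TEE-enabled guest
    2: DomainPolicy(tee_allowed=False),                 # guest without TEE access
})
```

With this toy policy, domain 1 can register shared memory until its 4096-byte
quota is used up, while domain 2 and any unknown domain are refused outright.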
> If you tell me that the user will only be able to install pre-built apps by
> system vendor. Then I will have some trouble to believe it is secure given
> how complex is an operating system.

Actually, I don't see anything OP-TEE (or just TEE) specific there. I can have
exactly the same concerns about PV driver frontends/backends, or generic XEN
interfaces.

> And I am not even mentioning that allowing the user to install pre-built
> apps by system vendor likely means having network/bluetooth access. I am
> sure you have seen recently vulnerability...

Yes. Actually, any decent SoC has lots of IP blocks that can execute arbitrary
code. No one knows what is inside their firmware and what security
vulnerabilities it has. You know this better than me.

We can discuss possible attacks indefinitely. OP-TEE is just another component
of the system, and it, at least, is open source and is written with regard to
security considerations.

> >
> >For some "generic" system I can say that it is pretty "safe" (except
> >that problem with OP-TEE resources).
> >
> >>>
> >>>>What I am not sure yet, maybe because of my lack of knowledge around OP-TEE,
> >>>>who is going to protect a TA to access all the NS memory?
> >>>TA is running in S-EL0. It can't control MMU. Before every TA
> >>>invocation, OP-TEE sets up MMU in such way, so TA sees only shared
> >>>memory arguments passed by client for this particular invocation.
> >>
> >>Can you give a bit more details here? Particularly what is the life of that
> >>mapped region? Is it just for a command? If not, who is going to unmap it
> >>and when?
> >Yes, this map is created for every call. TA code and data are mapped always,
> >obviously.
>
> Where does the TA code and data live? Is it in secure or non-secure memory?

It is in secure memory. OP-TEE supports paging, BTW. So it can encrypt memory
pages and store them in non-secure memory, if secure one is limited. But no
sensitive data leaves secure memory unencrypted.
TA binaries can be stored in the normal world, but OP-TEE checks signatures
during the TA load process.

> >But parameters are mapped every call and only needed ones.
> >Example: I have shared buffers A, B, C, D.
> >
> >1) I call OpenSession(TA_UUID, A, B).
> >   TA sees only buffers A, B (okay, actually it sees whole page, because
> >   buffer is mapped from userspace).
> >
> >2) I call InvokeCommand(Session, CMD_ID, B, C).
> >   TA sees only buffers B & C.
> >
> >3) I call InvokeCommand(Session, CMD_ID, A, D).
> >   TA sees only buffers A & D.
> >
> >Note, that such buffers are not mapped at OP-TEE address space at all.
> >They will be mapped only to TA address space.
>
> To confirm, what you are saying is as soon as any call is returned by TA,
> the region will be unmapped from the TA address space?

Yes. Also, just to clarify: a TA executes only on request from a client. It
can't have external events. So the TA address space is a somewhat ephemeral
entity. It exists only between TA entry and TA exit. At all other times, the
TA has no address space, no thread context, nothing. Just code and data
somewhere in memory.

> >
> >[...]
> >>>>>>>>>>To be clear, this series don't look controversial at least for OP-TEE. What
> >>>>>>>>>>I am more concerned is about DomU supports.
> >>>>>>>>>Your concern is that rogue DomU can compromise whole system, right?
> >>>>>>>>
> >>>>>>>>Yes. You seem to assume that DomU using TEE will always be trusted, I think
> >>>>>>>>this is the wrong approach if the user is able to interact directly with
> >>>>>>>>those guests. See above.
> >>>>>>>No, I am not assuming that DomU that calls TEE should be trusted. Why do you
> >>>>>>>think so? It should be able to use TEE services, but this does not mean that
> >>>>>>>XEN should trust it.
> >>>>>>
> >>>>>>In a previous answer you said: "So, if you don't trust your guest - don't
> >>>>>>let it".
> >>>>>>For me, this clearly means you consider that DomU using TEE are
> >>>>>>trusted.
> >>>>>>
> >>>>>>So can you clarify by what you mean by trust then?
> >>>>>Well... In real world "trust" isn't binary option. You don't want to
> >>>>>allow all domains to access TEE. Breached TEE user domain doesn't
> >>>>>automatically mean that your whole system is compromised. But this
> >>>>>certainly increases attack surface. So it is safer to give TEE access
> >>>>>only to those domains, which really require it. You can call them
> >>>>>slightly more trusted than others.
> >>>>
> >>>>Do you have an example of guest you would slightly trust more?
> >>>I have an example of guest I would trust less: if I'm running server,
> >>>and I'm selling virtual machines on that server, I don't want them
> >>>to access TEE.
> >>
> >>Make sense.
> >>
> >>>
> >>>I will trust slightly more to my own guest.
> >>
> >>I kind of agree if there are either no interaction with the user or the user
> >>is not able to gain privilege permissions.
> >Okay, if user can execute arbitrary code at EL1... Even then nothing bad
> >will happen. They must be able to hack mediator/hypervisor/OP-TEE to really
> >gain privileges in system.
>
> My worry here is you base the trust on OP-TEE and not only the hypervisor.
> At the moment we had to trust the hardware to do the right thing and the
> software is owned by Xen.

How about firmware? E.g. ARM TF?

> Now you are telling me, we have this TEE running in EL3 and have to trust
> him to do the isolation between guests. Until the last 2 e-mails, it was not
> clear for me how OP-TEE could ensure this isolation.

Actually, OP-TEE is running at S-EL1 :-) Only ARM TF (or whatever firmware is
used) has ultimate control over the system, if we are talking about modern
ARMv8 platforms.

> I would advise to explain a bit more in your cover letter of your next
> version the design of OP-TEE.
> This would help people to see how this can
> work with the hypervisor and also understanding the consequence...

I see. I'll do this, certainly. I just didn't expect that someone would be
interested in OP-TEE internals at such a level. But I think the cover letter
for next OP-TEE will be done much later. Now I'm busy with the OP-TEE part,
then there will be changes to support multi-domain boot, and only then the
OP-TEE specific patches...

BTW, if anyone is interested in the current state of the OP-TEE mediator, you
can find it at [1]. I was able to pass OP-TEE tests from DomU in the last
version. I use it for OP-TEE development, so it is not production-ready.

Julien, I want to ask about the VM monitor feature in XEN: the monitor_smc()
function and the whole xen/arch/arm/monitor.c... Looks like it was introduced
for some sort of debugging. Do you know any users of this?

[1] https://github.com/lorc/xen/tree/optee

WBR,
--
Volodymyr Babchuk

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
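
The per-call buffer mapping described in this message (shared buffers A, B, C
and D) can be modelled roughly as below. This is only a toy Python sketch of
the visibility rule, not OP-TEE code: TrustedApp and invoke are hypothetical
names, and real mappings are page-granular MMU entries rather than a set of
labels.

```python
class TrustedApp:
    """Toy TA: tracks which shared buffers are currently mapped for it."""

    def __init__(self):
        self.mapped = set()


def invoke(ta, *buffers):
    """Model one TA entry/exit: map only this call's buffers, unmap on return."""
    ta.mapped = set(buffers)          # mapping is set up before TA entry
    try:
        return sorted(ta.mapped)      # what the TA can see during this call
    finally:
        ta.mapped = set()             # nothing stays mapped after TA exit


ta = TrustedApp()
seen_open = invoke(ta, "A", "B")      # OpenSession(TA_UUID, A, B)
seen_cmd1 = invoke(ta, "B", "C")      # InvokeCommand(Session, CMD_ID, B, C)
seen_cmd2 = invoke(ta, "A", "D")      # InvokeCommand(Session, CMD_ID, A, D)
```

In this model each call sees exactly the buffers passed to it, and ta.mapped
is empty again between calls, which matches the statement that the TA address
space exists only between TA entry and TA exit.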
