Re: [Xen-devel] [PATCH][ACM] kernel enforcement of vbd policies

On Thu, 2006-07-27 at 11:37 -0400, Reiner Sailer wrote:
> 
> 
> > > Getting back to Reiner's point about block AC checks in the backend
> > > drivers:  I think that if you trust the backend code sufficiently to
> > > _have_ the AC check in the first place, then you trust it implicitly
> > > to make correct use of page sharing etc.  So why not implement the
> > > tests for (a) permission to talk to the specified frontend, and (b)
> > > permission for that frontend to talk to the specified disk at the
> > > store level (which is where the two drivers are negotiating things
> > > anyway), and just use existing in-hypervisor AC mechanisms to control
> > > whether the backend is allowed to map the comms page and connect event
> > > channels.
> > 
> > I might be missing the point in the above paragraph.
> > 
> > I'm not sure that we have to trust the BE at all.  It's possible to
> > insert a trusted intermediate encrypt/decrypt/versioning/digital
> > signature layer so we don't have to trust the BE with resource isolation
> > or returning the right data and the FE can use mirroring for redundancy
> > against data lost by the BE.
> > 
> > So I think it's better for both a and b to be done by a trusted third
> > party which is smaller, easier to verify and subject to less frequent
> > change than a whole kernel.
> > 
> > Harry.
> >  
> 
> Regarding the suggestion not to trust BE/device domain (this seems to
> be a very interesting discussion point): 

I wasn't suggesting not to have any trusted BE domains; I was suggesting
that it might be useful to have some untrusted BE domains.

> 
> I encourage to build BE/device domains so that they are trusted. To
> start the discussion, I state some of MY PERSONAL thoughts regarding
> the attacker model/trust model for sHype/ACM that support trusted
> BE/device domains. 
> 
> Simplified Commercial-grade Guarantees: 
> i.  Confine workloads and resources so that viruses or other integrity
> problems don't swap from one workload type into another 
> ii. Confine workloads and resources so that data does not leak from
> one workload to another 
> iii. Confinement will be no better or worse than the core hypervisor
> isolation (depends on the hypervisor/hardware sHype operates on, here
> Xen) 
> 
> Simplified Attacker model / trust model for above guarantees: 
> i. Do not rely on cooperation of any user domain (ensure confinement
> even if user domains go rogue) 
> ii. Rely/trust on device domains and other domains that host multiple
> workload types to keep them separate 
> 
> Risk management: 
> i. if a trusted domain becomes compromised, this affects only the
> workload types that it handles 
> ii. if a trusted domain becomes compromised, the workload types it
> handles can no longer be guaranteed to be confined against each other 
> 
> So I am actually encouraging to trust minimal device domains that are
> carefully engineered if they serve different workloads. 

I agree it is necessary to trust some minimal BE domains.  The question
is what kind of BE domains we want to trust.

I'd argue that domains doing complex device driver stuff and talking to
hardware devices tend to be full of code which is particularly difficult
to verify.  Say, for example, the domain decides to program the hardware
to DMA over the code which is performing the MAC checks.

I'd much prefer to only have to trust domains which are doing pure
software-only tasks that are simple and provably correct.  In the
example I gave, that was an intermediate domain doing some encryption
and decryption, versioning and digital signature work.  It's much
easier to prove that a domain like this is correct than a domain which
contains a hairy device driver.

> 
> Why? Here are my reasons (for discussion): 
> 
> a) guest domains shall not be trusted (this is the whole point of
> having hypervisor level coarse grained security; it does not assume
> security in guest OS) 
> b) device domains can be generic and used by many guest domains, they
> run a very limited number of processes 
>    --> evaluation in the long-term seems most feasible for small
> domains with limited functionality that doesn't change often 
> c) IF you don't trust the device domain, then it can see only
> encrypted/signed data and you don't get availability (assuming you
> don't trust any device domain, then replication does not help
> availability because all can deny access at will in coordinated
> attacks) 
>    --> you need for each workload type another (trusted) domain that
> encrypts/signs 
>    --> you inherit performance overhead and key/other management
> overhead 
>    --> you introduce multiple trusted domains instead of a single one 
> 
> My feeling is: 
> a) trusting a small number of specialized domains (device domains,
> security domains) scales because such domains should remain pretty
> stable and can run minimally configured kernels etc. 
> b.1) when people write backend drivers, they manage to handle much
> more complex things than a function that resolves access control 

No.  Generally they _almost_ manage to handle much more complex things.

> b.2) starting to get people to handle security the same way they
> handle memory management in their code seems a good step towards
> consolidating security (this is probably a quite controversial
> statement and valid mainly in the context of commercial COTS systems) 

Not a good idea to assume you are going to educate the world.  Better to
work with a solution that reduces the complexity of the work performed
by the majority of developers and then get someone who knows what they
are doing to do the hard bit.  Memory management is an interesting
choice of example on your part.  All of the Xen memory management is
broken as far as I'm concerned because there isn't an up-front resource
reservation strategy.  Without one, some operations can, for example,
fail part-way through if memory is allocated by something else while the
operation is in progress.  So the developers we are working with don't have a rigorous
approach for memory management and aren't likely to do security
rigorously either.  (No offence intended!)
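
To make the reservation point concrete, here is a minimal Python sketch of
what an up-front reservation strategy could look like.  It is not Xen code:
MemoryPool, Reservation and their methods are hypothetical names invented
for illustration.  The idea is only that an operation declares everything it
needs before it starts, so it either fails immediately or cannot be starved
part-way through by another allocator.

```python
class Reservation:
    """Pages set aside for one operation (hypothetical illustration)."""

    def __init__(self, pool, pages):
        self.pool = pool
        self.remaining = pages

    def take(self, pages=1):
        # Drawing on a reservation cannot fail for lack of memory; it can
        # only fail if the operation under-declared its needs.
        if pages > self.remaining:
            raise RuntimeError("operation exceeded its declared reservation")
        self.remaining -= pages

    def release(self):
        # Hand back whatever the operation did not use.
        self.pool.free_pages += self.remaining
        self.remaining = 0


class MemoryPool:
    """A pool that only hands out memory through up-front reservations."""

    def __init__(self, total_pages):
        self.free_pages = total_pages

    def reserve(self, pages):
        # The only point at which an operation can fail for lack of memory
        # is here, before it has done any work.
        if pages > self.free_pages:
            raise MemoryError("not enough free pages to start the operation")
        self.free_pages -= pages
        return Reservation(self, pages)
```

An operation would call reserve() once with its worst-case need, take() pages
as it goes, and release() the remainder when it finishes.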

> Concluding/summarizing: the device domain (BE) IS a trusted third
> party hosting shared hardware. I encourage discussions about why or
> under which circumstances moving the trust into yet another third
> party helps. 

So, I think we mostly agree, but I'd move the security checks into a
pure-software third-party domain, have other pure-software domains
doing things like encryption etc., and try to minimise the amount of
trust given to any domain containing a real device driver.

I.e. try to only place trust in code which is actually easy to audit.
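
As a rough idea of how small such a check can be, here is a Python sketch of
checks (a) and (b) from earlier in the thread, done by a separate access
control party.  The class, its method names and the tuple-based policy are
all assumptions made for illustration; this is not how sHype/ACM represents
its policy.

```python
class AccessControlDomain:
    """Hypothetical trusted third party that holds the whole policy."""

    def __init__(self, allowed_links, allowed_disks):
        # (backend, frontend) pairs that are allowed to talk to each other.
        self.allowed_links = frozenset(allowed_links)
        # (frontend, disk) pairs that are allowed to be connected.
        self.allowed_disks = frozenset(allowed_disks)

    def may_connect(self, backend, frontend):
        # Check (a): may this backend talk to this frontend?
        return (backend, frontend) in self.allowed_links

    def may_use_disk(self, frontend, disk):
        # Check (b): may this frontend talk to this disk?
        return (frontend, disk) in self.allowed_disks

    def authorize(self, backend, frontend, disk):
        # A connection is allowed only if both checks pass.
        return (self.may_connect(backend, frontend)
                and self.may_use_disk(frontend, disk))
```

Everything that has to be audited for the decision itself is the two set
lookups above; none of it depends on driver code.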


Also, if you look at how my solution scales...

untrusted driver domain <-> trusted encryption domain <-> FE-domain
                           hypervisor
                   trusted access control domain

...you'll see that when you create a new BE device driver you don't have
to audit any more code at all.  Which is optimal.
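
A minimal Python sketch of the middle box in that diagram, under some loud
assumptions: the class names are hypothetical, only the integrity and
versioning part is shown (an HMAC over block number, version and data), and
encryption is left out because the Python standard library has no cipher.
It is meant to show why a driver domain that returns wrong or stale data gets
caught without anyone auditing the driver, not to be a real design.

```python
import hashlib
import hmac


class UntrustedBackend:
    """Stands in for a driver domain: it stores whatever it is given."""

    def __init__(self):
        self.blocks = {}

    def write(self, block_no, payload):
        self.blocks[block_no] = payload

    def read(self, block_no):
        return self.blocks[block_no]


class TrustedIntegrityLayer:
    """Hypothetical pure-software layer between the FE and the backend."""

    TAG_LEN = hashlib.sha256().digest_size

    def __init__(self, key, backend):
        self.key = key
        self.backend = backend
        # Latest version of each block, kept in trusted memory.
        self.versions = {}

    def _tag(self, block_no, version, data):
        # Bind the data to its block number and version so the backend
        # cannot swap blocks around or replay an old copy undetected.
        msg = block_no.to_bytes(8, "big") + version.to_bytes(8, "big") + data
        return hmac.new(self.key, msg, hashlib.sha256).digest()

    def write(self, block_no, data):
        version = self.versions.get(block_no, 0) + 1
        self.versions[block_no] = version
        self.backend.write(block_no, self._tag(block_no, version, data) + data)

    def read(self, block_no):
        payload = self.backend.read(block_no)
        tag, data = payload[:self.TAG_LEN], payload[self.TAG_LEN:]
        expected = self._tag(block_no, self.versions[block_no], data)
        if not hmac.compare_digest(tag, expected):
            raise ValueError("backend returned wrong or stale data")
        return data
```

Swapping UntrustedBackend for a different driver changes nothing in the
trusted layer, which is the scaling property claimed above.  Note that this
only detects bad data; it does nothing for availability if the backend
simply refuses to answer.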

I'd still want to do all this analysis in the context of an FT cluster
architecture because the conclusions may be slightly different.

Harry.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel