
Re: [Xen-devel] Should PV frontend drivers trust the backends?


  • To: 'Marek Marczykowski-Górecki' <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
  • From: Paul Durrant <Paul.Durrant@xxxxxxxxxx>
  • Date: Tue, 1 May 2018 15:32:55 +0000
  • Accept-language: en-GB, en-US
  • Cc: Oleksandr Andrushchenko <andr2000@xxxxxxxxx>, 'Juergen Gross' <jgross@xxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 01 May 2018 15:33:31 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHT3JMUNelLBKu/xEO++KCtlgXwiqQRfQ5AgADvzgCAAERE4P//6quAgAbb/wCAAQ7cAIAAWNMAgAAoDmA=
  • Thread-topic: [Xen-devel] Should PV frontend drivers trust the backends?

> -----Original Message-----
[snip]
> > So what happens if the backend servicing the VM's boot disk fails? Is it
> > better to:
> >
> > a) BUG()/BSOD with some meaningful stack and code such that it's obvious
> > that happened, or
> > b) cover up and wait until something further up the storage stack crashes
> > the VM, probably with some error that's just a generic timeout
> >
> > I'm clearly advocating a) but it's possible b) may be more desirable in some
> > scenarios. I think the choice is up to whoever is writing the frontend and
> > no-one else should decide their policy for them.
> 
> But you know, BUG() isn't the only method for getting error message.
> I see in this thread proper logging is used as an excuse for crashing
> things - really, this is very poor excuse. You can use printk, or even
> WARN() or such.

On Windows? I think not.

> And if there are cases where the only way to get
> meaningful messages is crashing the whole thing, something is _really_
> wrong.

Forcing a BSOD really is sometimes the best option on Windows.

> In many cases crashing the thing will actually make retrieving messages
> harder, not easier (remote systems, console not working etc).
> 

Again, forcing a BSOD in Windows can result in a meaningful crashdump that
can take you straight to a diagnosis of the problem. Fixing things up and
getting some form of arbitrary 'page in timeout' BSOD a couple of minutes later
can make post mortem diagnosis a lot harder.

> > > > > > And, if my assumption is correct, we still do trust the contents of
> > > > > > the requests and responses, e.g. the payload is still trusted.
> > > > > Why should the payload be any more trusted than the content of the
> > > > > shared ring? They are both shared with the backend and therefore can
> > > > > be corrupted to the same extent.
> > > > This is exactly my point: if we only try to protect from inconsistent
> > > > prod/cons then
> > > > this protection is still incomplete as the payload may be the source of
> > > > failure.
> > >
> > > Well, you can take extra measures, external to the driver, to
> > > protect against malicious payload (like encryption mentioned by Andrew,
> > > or dm-verity for block devices). But you can't do the same about the
> > > driver itself (ring handling etc).
> > >
> >
> > As I said, verification should be down to the layer that has the relevant
> > information.
> >
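
For illustration only, the kind of prod/cons consistency check being discussed
could look like the minimal sketch below. It is not taken from any existing
Xen frontend: the struct, the field names, RING_SIZE and rsp_prod_is_sane()
are all hypothetical, and it assumes the frontend has already read the
backend's response producer index from the shared page exactly once.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 32u /* hypothetical number of ring slots */

/* Hypothetical frontend-private state; the backend cannot write to it. */
struct front_ring {
    uint32_t req_prod_pvt; /* requests the frontend has produced so far */
    uint32_t rsp_cons;     /* responses the frontend has consumed so far */
};

/*
 * Validate a response producer index that was read once from the shared page.
 * A well-behaved backend never claims more responses than there are
 * outstanding requests and never moves the index backwards. Unsigned
 * subtraction keeps the comparison correct across index wrap-around.
 */
bool rsp_prod_is_sane(const struct front_ring *r, uint32_t rsp_prod)
{
    uint32_t claimed = rsp_prod - r->rsp_cons;            /* new responses */
    uint32_t outstanding = r->req_prod_pvt - r->rsp_cons; /* awaiting reply */

    return outstanding <= RING_SIZE && claimed <= outstanding;
}
```

What the caller does when the check fails (BUG()/bugcheck, or mark only that
device as failed and carry on) is exactly the policy choice argued over in
this thread; the sketch deliberately leaves it open.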
> > > Of course backend will be able to perform a DoS to some extent in all
> > > the cases, at least by stopping responding to requests. But keep in mind
> > > that root fs is not the only device out there. There are also other
> > > block devices, network interfaces etc. And misbehaving backend should
> > > _not_ be able to take over frontend domain in those cases. And ideally
> > > also shouldn't be able to crash it (if device isn't critical for
> > > domU).
> > >
> >
> > I still think that is the choice of the frontend. Yes, they can be
> > programmed defensively but for some usecases it may just not be that
> > important.
> >
> > > If you want some real world use cases for this, here are two from Qubes
> > > OS:
> > >
> > > 1. Block devices - base system devices (/, /home equivalent etc) have
> > > backends in dom0 (*), but there is also an option to use block devices
> > > exported by other domains. For example the one handling USB
> > > controllers.
> > > So, when you plug USB stick, one domain handles all the USB nasty stuff,
> > > and exports it as a plain device to another domain where user can mount
> > > LUKS container stored there. Whatever happens there, nothing from that
> > > USB stick touches dom0 at any time.
> > >
> > > 2. Network devices - there are no network backends in dom0 at all. There
> > > is one (or more) dedicated domain for handling NICs, then there is
> > > (possibly a tree of) domain(s) routing the traffic. In some cases a VM
> > > facing actual network (where the backend runs) is considered less
> > > trusted than a VM using that network (where the frontend runs).
> >
> > But, without revocable grants that backend could still DoS the frontend,
> > right?
> 
> Yes, but in that case it should be enough to kill the backend (domain)
> and frontend domain should be good, right?
> What I mean, malicious/buggy backend should be able to do harm only to
> devices it controls. Not crashing the whole driver (affecting all
> devices of that kind), or the whole system.
> 
> And definitely arbitrary code execution or info leak also should not be
> possible. I hope we agree at least to this point, right?

It's a good idea to defend against it...

> 
> Of course this all is about the driver itself. If upper layer is about
> to execute any payload it gets, then PV driver can do nothing about it.

...but as you point out here, it will likely always be possible at some level.

  Paul

> But as you've said, it should be up to the frontend [domain configuration].
> 
> > > BTW Since XSA-155 we do have some additional patches for block and
> > > network frontend, making similar changes as done to backends at that
> > > time. I'll resend them in a moment.
> > >
> > > (*) we still have plans to support also untrusted backends for base
> > > system, with domU verifying all the data it gets (dm-verity, dm-crypt).
> > > But it isn't there yet.
> >
> > Maybe the frontend should be advised on the trust level of a backend so
> > that it can apply auditing should it wish to. If the backend were running
> > in dom0 then there would be little point, but a frontend may wish to be
> > more careful when e.g. the domain is a trusted driver domain (but with no
> > dm priv). There have also been discussions about skipping the use of
> > grants when the backend has mapping privilege, for performance reasons, so
> > maybe that could be worked in too.
> 
> Generally I'd avoid multiple modes (either dom0/non-dom0 or
> trusted/untrusted). This almost always leads to some bugs in one of
> those branches sooner or later.
> 
> --
> Best Regards,
> Marek Marczykowski-Górecki
> Invisible Things Lab
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?

 

