
Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands



On Mon, Feb 29, 2016 at 09:12:30AM +0100, Juergen Gross wrote:
> On 29/02/16 04:37, Bob Liu wrote:
> > 1) What is this patch about?
> > This patch introduces a new block operation (BLKIF_OP_EXTRA_FLAG).
> > A request with BLKIF_OP_EXTRA_FLAG set means the following request is an
> > extra request which is used to pass through SCSI commands.
> > This is like a simplified version of XEN_NETIF_EXTRA_* in netif.h.
> > It can be extended easily to transmit other per-request/bio data from
> > frontend to backend, e.g. Data Integrity Field per bio.
> > 
> > 2) Why do we need this?
> > Currently only raw data segments are transmitted from blkfront to
> > blkback, which means some advanced features are lost.
> >  * Guest knows nothing about features of the real backend storage.
> >     For example, in a bare-metal environment the INQUIRY SCSI command can
> >     be used to query storage device information. If it's an SSD or flash
> >     device we can have the option to use the device as a fast cache.
> >     But this can't happen in current domU guests, because blkfront only
> >     knows it's just a normal virtual disk.
> > 
> >  * Failover Clusters in Windows
> >     Failover clusters require SCSI-3 persistent reservation target disks,
> >     but now this can't work in domU.
> > 
> > 3) Known issues:
> >  * Security issues, how to 'validate' this extra request payload.
> >    E.g. SCSI operates on a LUN basis (the whole disk) while we really
> >    just want to operate on partitions.
> 
> It's not only validation: some operations just affect the whole LUN
> (e.g. Reserve/Release). And what about "multi-LUN" commands like
> "report LUNs"?

Don't expose them. Bob and I want to get an idea of what would be a good
compromise to allow some SCSI specific (or perhaps ATA specific or DIF/DIX?)
types of commands to go through the PV driver.

Would it be better if it was through XenBus? But that may not work for some
that are tied closely to requests, such as DIF/DIX.

However, 'DISCARD' for example worked out: it is an umbrella for both the
SCSI UNMAP and ATA TRIM operations and hides the complexity of the low level
protocol. Could there be an 'INQ'? Since the SCSI VPD is the most exhaustive
in terms of details it may make sense to base it on that..?
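
To make the 'INQ' idea a bit more concrete, here is a minimal sketch of
how one protocol-neutral answer ("is this an SSD?") could be derived from
either protocol, the same way DISCARD hides UNMAP vs. the ATA command.
All sketch_* names are made up for illustration and are not part of
blkif.h. The field locations are the medium rotation rate in SCSI VPD
page 0xB1 (bytes 4-5) and in ATA IDENTIFY DEVICE word 217.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical protocol-neutral value a frontend could be told about. */
enum sketch_medium {
    SKETCH_MEDIUM_UNKNOWN,
    SKETCH_MEDIUM_ROTATING,
    SKETCH_MEDIUM_SOLID_STATE
};

/* Both SCSI and ATA use the same encoding for the rotation rate:
 * 0x0001 means non-rotating, 0x0401..0xFFFE is a nominal rpm value,
 * everything else is "not reported" or reserved. */
static enum sketch_medium sketch_rate_to_medium(uint16_t rate)
{
    if (rate == 0x0001)
        return SKETCH_MEDIUM_SOLID_STATE;
    if (rate >= 0x0401 && rate <= 0xFFFE)
        return SKETCH_MEDIUM_ROTATING;
    return SKETCH_MEDIUM_UNKNOWN;
}

/* SCSI: Block Device Characteristics VPD page (0xB1), bytes 4..5,
 * big-endian. 'page' is the raw INQUIRY response buffer. */
enum sketch_medium sketch_medium_from_scsi_vpd_b1(const uint8_t *page,
                                                  size_t len)
{
    if (len < 6 || page[1] != 0xB1)
        return SKETCH_MEDIUM_UNKNOWN;
    return sketch_rate_to_medium((uint16_t)((page[4] << 8) | page[5]));
}

/* ATA: IDENTIFY DEVICE word 217. 'words' is assumed to be already
 * converted to host byte order. */
enum sketch_medium sketch_medium_from_ata_identify(const uint16_t *words,
                                                   size_t nwords)
{
    if (nwords <= 217)
        return SKETCH_MEDIUM_UNKNOWN;
    return sketch_rate_to_medium(words[217]);
}
```

The backend would do the protocol-specific query and only the neutral
value would cross the ring (or XenBus), so the frontend never sees a CDB.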

> 
> >  * Can't pass SCSI commands through if the backend storage driver is
> >    bio-based instead of request-based.
> > 
> > 4) Alternative approach: Using PVSCSI instead:
> >  * Doubt PVSCSI can support as many types of backend storage devices as
> >    Xen-block.
> 
> pvSCSI won't need to support all types of backends. It's enough to
> support those where passing through SCSI commands makes sense.
> 
> Seems to be a similar issue as the above mentioned problem with
> bio-based backend storage drivers.

In particular the ones we care about are:
 - Multipath over FibreChannel devices.
 - Linear mapping (LVM) over the multipath.
 - And then potentially a filesystem on top of that
 - .. and a raw file on the filesystem.

Having the SCSI VPD 0x83 page sent to the frontend for that would be good.
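
As a rough illustration of what the backend would have to pull out of
that page before handing anything to the frontend, below is a minimal
sketch that walks the designation descriptors of a raw VPD 0x83 (Device
Identification) buffer and copies out the first NAA designator that is
associated with the logical unit. The function name is hypothetical, and
how the result would reach the frontend (ring or XenBus) is left open.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Find the first logical-unit NAA designator in a VPD 0x83 page.
 * Returns the designator length in bytes, or -1 if none is found,
 * the page is malformed, or 'out' is too small. */
int sketch_vpd83_find_lu_naa(const uint8_t *page, size_t len,
                             uint8_t *out, size_t out_len)
{
    size_t end, off = 4;

    if (len < 4 || page[1] != 0x83)
        return -1;

    /* Bytes 2..3: page length (big-endian), not counting the header. */
    end = 4 + (((size_t)page[2] << 8) | page[3]);
    if (end > len)
        end = len;

    /* Each descriptor: 4-byte header followed by the designator. */
    while (off + 4 <= end) {
        const uint8_t *d = page + off;
        uint8_t assoc = (d[1] >> 4) & 0x3; /* 0 = logical unit */
        uint8_t type = d[1] & 0xf;         /* 3 = NAA */
        size_t dlen = d[3];

        if (off + 4 + dlen > end)
            return -1; /* descriptor runs past the page */

        if (assoc == 0 && type == 3) {
            if (dlen > out_len)
                return -1;
            memcpy(out, d + 4, dlen);
            return (int)dlen;
        }
        off += 4 + dlen;
    }
    return -1;
}
```

With multipath in the stack the same designator should come back on
every path, which is what makes it useful as the identity to report.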

Not sure about SCSI reservations. Perhaps those are more of .. unique
in that the implementation would have to make sure that the guest
owns the whole LUN. But that is an implementation question.

This is about the design - how would you envision cramming SCSI commands
or DIF/DIX commands or ATA commands in via the PV block layer?
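
For reference, the extra-request scheme from 1) could be consumed on the
backend side roughly like the sketch below: a request whose operation
has the flag set is followed by one more slot carrying the extra data,
similar in spirit to the extra-info slots in netif.h. The slot layout,
the flag value 0x80 and all sketch_* names are assumptions made for
illustration; they are not taken from the patch or from blkif.h.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_OP_READ       0x00
#define SKETCH_OP_WRITE      0x01
#define SKETCH_OP_EXTRA_FLAG 0x80 /* hypothetical value */

/* Hypothetical kinds of extra data that could follow a request. */
enum sketch_extra_type {
    SKETCH_EXTRA_SCSI_CMD = 1,
    SKETCH_EXTRA_DIF = 2
};

/* Simplified stand-in for a ring slot. In a normal request only
 * 'operation' matters here; in an extra slot 'extra_type' says how
 * to interpret 'payload'. */
struct sketch_slot {
    uint8_t operation;
    uint8_t extra_type;
    uint8_t payload[16];
};

/* Look at the next 'avail' published slots and split off one request.
 * Returns the number of slots consumed (1 or 2), or 0 if nothing can
 * be consumed yet (empty ring, or flag set but extra slot missing).
 * '*extra' is NULL when the request has no extra slot. */
size_t sketch_consume(const struct sketch_slot *ring, size_t avail,
                      uint8_t *op, const struct sketch_slot **extra)
{
    if (avail == 0)
        return 0;

    *op = (uint8_t)(ring[0].operation & ~SKETCH_OP_EXTRA_FLAG);
    *extra = NULL;

    if (!(ring[0].operation & SKETCH_OP_EXTRA_FLAG))
        return 1;

    if (avail < 2)
        return 0; /* wait until the extra slot is published too */

    *extra = &ring[1];
    return 2;
}
```

Whatever ends up in the payload, the backend still has to decide per
extra_type whether to honour it, which is where the validation question
from 3) comes back in.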

> 
> >  * Much longer path:
> >    ioctl() -> SCSI upper layer -> Middle layer -> PVSCSI-frontend ->
> >    PVSCSI-backend -> Target framework(LIO?) ->
> > 
> >    With xen-block we only need:
> >    ioctl() -> blkfront -> blkback ->
> 
> I'd like to see performance numbers before making a decision.

For SCSI INQ? <laughs>

Or are you talking about raw READ/WRITE?
> 
> >  * xen-block has existed for many years, is widely used and more stable.
> 
> Adding another SCSI passthrough capability wasn't accepted for pvSCSI
> (that's the reason I used the Target Framework). Why do you think it
> will be accepted for pvblk?
> 
> This is not my personal opinion, just a heads up from someone who had a
> try already. ;-)

Right. So SCSI passthrough is out. What about mediated access for
specific SCSI, or specific ATA, or DIF/DIX ones? And how would you do it,
knowing the SCSI maintainers much better than we do?



 

