WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

Re: [Xen-devel] [PATCH 3/3] xen/blk[front|back]: Enhance discard support

To: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH 3/3] xen/blk[front|back]: Enhance discard support with secure erasing support.
From: Ian Campbell <Ian.Campbell@xxxxxxxxxx>
Date: Wed, 12 Oct 2011 17:14:57 +0100
Cc: "hch@xxxxxxxxxxxxx" <hch@xxxxxxxxxxxxx>, Dong Yang Li <lidongyang@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>
Delivery-date: Wed, 12 Oct 2011 09:17:36 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20111012154549.GB1732@xxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Organization: Citrix Systems, Inc.
References: <1318260494-27985-1-git-send-email-konrad.wilk@xxxxxxxxxx> <1318260494-27985-4-git-send-email-konrad.wilk@xxxxxxxxxx> <1318263187.21903.464.camel@xxxxxxxxxxxxxxxxxxxxxx> <20111010164250.GG28646@xxxxxxxxxxxxxxxxx> <1318274402.27397.13.camel@xxxxxxxxxxxxxxxxxxxx> <20111010195749.GA5755@xxxxxxxxxxxxxxxxx> <4E940E21020000780005AA29@xxxxxxxxxxxxxxxxxxxx> <20111011155133.GC29349@xxxxxxxxxxxxxxxxx> <20111011205729.GB22668@xxxxxxxxxxxxxxxxx> <1318416842.21903.674.camel@xxxxxxxxxxxxxxxxxxxxxx> <20111012154549.GB1732@xxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
On Wed, 2011-10-12 at 16:45 +0100, Konrad Rzeszutek Wilk wrote:
> On Wed, Oct 12, 2011 at 11:54:02AM +0100, Ian Campbell wrote:
> > On Tue, 2011-10-11 at 21:57 +0100, Konrad Rzeszutek Wilk wrote:
> > > > My later response to it should include it:
> > > > http://lists.xensource.com/archives/html/xen-devel/2011-10/msg00652.html
> > > >
> > > > >
> > > > > Further I wonder why you don't use the "reserved" field instead of
> > > > > extending the structure at the end.
> > > >
> > > > <blinks> I completly missed it. That would definitly work as well.
> > > >
> > > > Let me redo it with that in mind.
> > > 
> > > I've posted the Xen hypervisor ABI one that thread above. The 
> > > implementation
> > > of that looks as follow:
> > 
> > Ian.
> > 
> > > 
> > > commit ae33f998d66c5982af533bda25c2b6c4f863789f
> > > Author: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> > > Date:   Mon Oct 10 10:58:40 2011 -0400
> > > 
> > >     xen/blk[front|back]: Enhance discard support with secure erasing 
> > > support.
> > > 
> > >     Part of the blkdev_issue_discard(xx) operation is that it can also
> > >     issue a secure discard operation that will permanantly remove the
> > >     sectors in question. We advertise that we can support that via the
> > >     'discard-secure' attribute and on the request, if the 'secure' bit
> > >     is set, we will attempt to pass in REQ_DISCARD | REQ_SECURE.
> > > 
> > >     CC: Li Dongyang <lidongyang@xxxxxxxxxx>
> > >     [v1: Used 'flag' instead of 'secure:1' bit]
> > >     [v2: Use 'reserved 'uint8_t' as a flag]
> > >     Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> > > 
> > > diff --git a/drivers/block/xen-blkback/blkback.c 
> > > b/drivers/block/xen-blkback/blkback.c
> > > index 94e659d..4f33c13 100644
> > > --- a/drivers/block/xen-blkback/blkback.c
> > > +++ b/drivers/block/xen-blkback/blkback.c
> > > @@ -362,7 +362,7 @@ static int xen_blkbk_map(struct blkif_request *req,
> > >  {
> > >         struct gnttab_map_grant_ref map[BLKIF_MAX_SEGMENTS_PER_REQUEST];
> > >         int i;
> > > -       int nseg = req->nr_segments;
> > > +       int nseg = req->u1.nr_segments;
> > >         int ret = 0;
> > > 
> > >         /*
> > > @@ -422,13 +422,16 @@ static void xen_blk_discard(struct xen_blkif 
> > > *blkif, struct blkif_request *req)
> > >         int status = BLKIF_RSP_OKAY;
> > >         struct block_device *bdev = blkif->vbd.bdev;
> > > 
> > > -       if (blkif->blk_backend_type == BLKIF_BACKEND_PHY)
> > > +       if (blkif->blk_backend_type == BLKIF_BACKEND_PHY) {
> > > +               unsigned long secure = (blkif->vbd.discard_secure &&
> > > +                       (req->u1.flag & BLKIF_OP_DISCARD_FLAG_SECURE)) ?
> > > +                       BLKDEV_DISCARD_SECURE : 0;
> > >                 /* just forward the discard request */
> > >                 err = blkdev_issue_discard(bdev,
> > >                                 req->u.discard.sector_number,
> > >                                 req->u.discard.nr_sectors,
> > > -                               GFP_KERNEL, 0);
> > > -       else if (blkif->blk_backend_type == BLKIF_BACKEND_FILE) {
> > > +                               GFP_KERNEL, secure);
> > > +       } else if (blkif->blk_backend_type == BLKIF_BACKEND_FILE) {
> > >                 /* punch a hole in the backing file */
> > >                 struct loop_device *lo = bdev->bd_disk->private_data;
> > >                 struct file *file = lo->lo_backing_file;
> > > @@ -618,6 +621,9 @@ static int dispatch_rw_block_io(struct xen_blkif 
> > > *blkif,
> > >         struct blk_plug plug;
> > >         bool drain = false;
> > > 
> > > +       /* Check that the number of segments is sane. */
> > > +       nseg = req->u1.nr_segments;
> > 
> > This field is invalid (at least with these semantics) if req->operation
> > == BLKIF_OP_DISCARD so reading it here and clearing it later when you
> > decide it is invalid is just confusing. Why not read it inside the
> > switch iff it is valid?
> 
> The problem was that 'nseg' would be read after the switch, so it would
> contain the flag value. Which would throw off a lot of the loops which
> would try to enumerate "(for (i = 0; i < nseg;...)".
> 
> 
> Hence moving it to the top would make it valid for all the operations
> except the BLKIF_OP_DISCARD. And BLKIF_OP_DISCARD would sensibly set it
> nseg to zero so that we would not trip on those 'for (i = 0').

I think this is the crux of the problem: setting nsegs to an invalid
value just so you can clear it again later when you check the op is
completely backwards and confusing to any reader of the code.

> But I think you idea of making it an if statement would do, like:
> 
> 
> > > @@ -643,8 +650,6 @@ static int dispatch_rw_block_io(struct xen_blkif 
> > > *blkif,
> > >                 break;
> > >         }
> > > 
>             if (operation != REQ_DISCARD)
>               /* Check that the number of segments is sane. */
>               nseg = req->nr_segments;
>           else
>               nseg = 0;

Right above this hunk is a switch statement over the req->operation. The
value of req->operation precisely defines the semantics/validity or
otherwise of the req->nr_segments field and whether or not it contains
the nr of segments or (due to the aliasing) something else. Why not set
nsegs inside that switch statement (and explicitly zero it in the other
cases) so that this obvious connection is retained?

> > >         if (unlikely(nseg == 0 && operation != WRITE_FLUSH &&
> > >                                 operation != REQ_DISCARD) ||
> 
> And I guess we can also skip the REQ_DISCARD test here.

I don't think so, if nseg == 0 and operation == REQ_DISCARD that is
fine, right? The fact that there is all this "operation != xx &&
operation != yy" conditional stuff suggests this would also be cleaner
if it was also pulled up into the existing switch over req->operation.
IOW overall you end up with:

        switch (req->operation) {
        case BLKIF_OP_READ:
                blkif->st_rd_req++;
                operation = READ;
                nseg = req->nr_segments;
                if (unlikely(nseg == 0 || nseg > 
BLKIF_MAX_SEGMENTS_PER_REQUEST)) {
                        pr_debug(DRV_PFX "Bad number of segments in read 
request (%d)\n",
                                 nseg);
                        /* Haven't submitted any bio's yet. */
                        goto fail_response;
                }
                break;
        case BLKIF_OP_WRITE:
                blkif->st_wr_req++;
                operation = WRITE_ODIRECT;
                nseg = req->nr_segments;
                if (unlikely(nseg == 0 || nseg > 
BLKIF_MAX_SEGMENTS_PER_REQUEST)) {
                        pr_debug(DRV_PFX "Bad number of segments in write 
request (%d)\n",
                                 nseg);
                        /* Haven't submitted any bio's yet. */
                        goto fail_response;
                }
                break;
        case BLKIF_OP_FLUSH_DISKCACHE:
                blkif->st_f_req++;
                operation = WRITE_FLUSH;
                nseg = req->nr_segments;
                /* nseg == 0 is ok here */
                if (unlikely(nseg > BLKIF_MAX_SEGMENTS_PER_REQUEST)) {
                        pr_debug(DRV_PFX "Bad number of segments in 
write/flush-cache request (%d)\n",
                                 nseg);
                        /* Haven't submitted any bio's yet. */
                        goto fail_response;
                }
                break;
        case BLKIF_OP_DISCARD:
                blkif->st_some_stat++;
                operation = REQ_DISCARD;
                nseg = 0; /* No data associated with a discard */
                break;
        case BLKIF_OP_WRITE_BARRIER:
        default:
                operation = 0; /* make gcc happy */
                goto fail_response;
                break;
        }

(I think I'm right that BLKIF_OP_FLUSH_DISKCACHE can have associated
data or not)

However do discard and r/w really have so much in common that handling
them all in dispatch_rw_block_io() and relying on nsegs == 0 when the
operation is discard makes sense?

Would it be clearer if the caller (__do_block_io_op) had this switch
over req->operation and called dispatch_rw_block_io(req, WRITE_FLUSH,
nsegs), dispatch_discard(req) etc as appropriate?

Ian.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

<Prev in Thread] Current Thread [Next in Thread>