[Xen-devel] Re: [PATCH RFC 3/3] Virtio draft III: example block driver

To: Rusty Russell <rusty@xxxxxxxxxxxxxxx>
Subject: [Xen-devel] Re: [PATCH RFC 3/3] Virtio draft III: example block driver
From: Avi Kivity <avi@xxxxxxxxxxxx>
Date: Tue, 19 Jun 2007 11:34:46 +0300
Cc: Stephen Rothwell <sfr@xxxxxxxxxxxxxxxx>, Xen Mailing List <xen-devel@xxxxxxxxxxxxxxxxxxx>, "jmk@xxxxxxxxxxxxxxxxxxx" <jmk@xxxxxxxxxxxxxxxxxxx>, kvm-devel <kvm-devel@xxxxxxxxxxxxxxxxxxxxx>, virtualization <virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx>, Christian Borntraeger <cborntra@xxxxxxxxxx>, Latchesar Ionkov <lionkov@xxxxxxxx>, Suzanne McIntosh <skranjac@xxxxxxxxxx>, Jens Axboe <jens.axboe@xxxxxxxxxx>, Martin Schwidefsky <schwidefsky@xxxxxxxxxx>
Delivery-date: Tue, 19 Jun 2007 01:33:05 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <1182234466.19064.51.camel@xxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <1181217762.14054.192.camel@xxxxxxxxxxxxxxxxxxxxx> <1181999552.6237.255.camel@xxxxxxxxxxxxxxxxxxxxx> <1181999669.6237.257.camel@xxxxxxxxxxxxxxxxxxxxx> <1181999825.6237.260.camel@xxxxxxxxxxxxxxxxxxxxx> <1181999920.6237.263.camel@xxxxxxxxxxxxxxxxxxxxx> <46754451.2010305@xxxxxxxxxxxx> <1182154095.19064.24.camel@xxxxxxxxxxxxxxxxxxxxx> <46764BB5.6070704@xxxxxxxxxxxx> <1182234466.19064.51.camel@xxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird 2.0.0.0 (X11/20070419)
Rusty Russell wrote:
> On Mon, 2007-06-18 at 12:09 +0300, Avi Kivity wrote:
>> Rusty Russell wrote:
>>> On Sun, 2007-06-17 at 17:25 +0300, Avi Kivity wrote:
>>>> Rusty Russell wrote:
>>>>> +       /* Set up for reply. */
>>>>> +       vblk->sg[0].page = virt_to_page(&vbr->in_hdr);
>>>>> +       vblk->sg[0].offset = offset_in_page(&vbr->in_hdr);
>>>>> +       vblk->sg[0].length = sizeof(vbr->in_hdr);
>>>>> +       num = blk_rq_map_sg(q, vbr->req, vblk->sg+1);
>>>>> +       vbr->out_hdr.id = vblk->vdev->ops->add_inbuf(vblk->vdev, vblk->sg,
>>>>> +                                                    1+num, vbr);
>>>>> +       if (IS_ERR_VALUE(vbr->out_hdr.id))
>>>>> +               goto full;
>>>>> +
>>>>> +       vblk->sg[0].page = virt_to_page(&vbr->out_hdr);
>>>>> +       vblk->sg[0].offset = offset_in_page(&vbr->out_hdr);
>>>>> +       vblk->sg[0].length = sizeof(vbr->out_hdr);
>>>>> +
>>>>> +       vbr->out_id = vblk->vdev->ops->add_outbuf(vblk->vdev, vblk->sg, 1,
>>>>> +                                                 vbr);
>>>> This strikes me as wasteful. Why not set up a single descriptor which contains both placement and the data itself?
>>> We could actually do this for write, but not for read (where the length
>>> & sector need to be sent to the other end, and the data comes from the
>>> other end).
>> So you map the first sg entry for output, and the rest for input.
>>
>> Less pretty, but more efficient.
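
(For illustration only: a minimal, self-contained user-space sketch of the layout proposed here, i.e. one scatterlist per request with the first segment going driver-to-device and the remaining segments device-to-driver, each segment carrying a direction flag. The struct and field names are hypothetical stand-ins, not anything from the virtio draft.)

#include <stdio.h>
#include <stddef.h>

enum seg_dir { SEG_OUT, SEG_IN };       /* driver->device, device->driver */

struct seg {
        void *addr;
        size_t len;
        enum seg_dir dir;
};

struct blk_read_req {
        unsigned long sector;           /* sent to the device */
        char data[4096];                /* filled in by the device */
        unsigned char status;           /* filled in by the device */
};

int main(void)
{
        struct blk_read_req req = { .sector = 1234 };

        /* One list describes the whole read: header out, data and status in. */
        struct seg sg[3] = {
                { &req.sector, sizeof(req.sector), SEG_OUT },
                { req.data,    sizeof(req.data),   SEG_IN  },
                { &req.status, sizeof(req.status), SEG_IN  },
        };

        for (int i = 0; i < 3; i++)
                printf("seg %d: %zu bytes, %s\n", i, sg[i].len,
                       sg[i].dir == SEG_OUT ? "out" : "in");
        return 0;
}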

> At the moment there are two kinds of sgs: input and output.  You're
> proposing instead that we have a single kind, and a flag on each segment
> which indicates direction.  That may make block a little more efficient,
> depending on the particular implementation of virtio.
>
> Yet network really wants in and out separate.  I'm not even sure how
> we'd implement a mixed-sg scheme efficiently for that case.  At the
> moment it's very straightforward.


I think the differences are deeper than that. Network wants two queues, disk wants one queue. The directionality of the dma transfer is orthogonal to that.

For a disk, the initiator is always the guest; you issue a request and expect a completion within a few milliseconds. Read and write requests are similar in nature; you want them in a single queue because you want the elevator to run on both read and write requests. When idling, the queue is empty.

For a network interface, the guest initiates transmits, but the host initiates receives. Yes, the guest preloads the queue with rx descriptors, but it can't expect a response within any given time, hence the need for two queues. When idling, the tx queue is empty and the rx queue is full.

So I'd suggest having a struct virtio_queue to represent a queue (so a network interface would have two of them), with the DMA direction specified separately:

+       unsigned long (*add_buf)(struct virtio_queue *q,
+                                const struct scatterlist out_sg[], int out_nr,
+                                const struct scatterlist in_sg[], int in_nr,
+                                void *data);


I think this fits better with the Xen model (which has one ring for block, two each for network and console, if memory serves).
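
(For illustration only: a self-contained sketch of how such an add_buf() might be called. A block read posts one "out" segment, the request header, plus two "in" segments, data and status, on its single queue; a network driver posts empty "in" buffers on its rx queue, and would typically preload many of them. The struct scatterlist and struct virtio_queue definitions and the stub body below are placeholders so the example stands alone; they are not the real kernel types or any actual virtio implementation.)

#include <stdio.h>
#include <stddef.h>

struct scatterlist { void *addr; size_t len; };   /* placeholder, not the kernel's */
struct virtio_queue { const char *name; };        /* placeholder */

/* Stub with the proposed shape: out_sg[] is driver->device,
 * in_sg[] is device->driver, both attached to one request. */
static unsigned long add_buf(struct virtio_queue *q,
                             const struct scatterlist out_sg[], int out_nr,
                             const struct scatterlist in_sg[], int in_nr,
                             void *data)
{
        (void)out_sg; (void)in_sg; (void)data;
        printf("%s: %d out segment(s), %d in segment(s)\n", q->name, out_nr, in_nr);
        return 0;     /* a real implementation would return a request id */
}

int main(void)
{
        struct virtio_queue blk_q = { "blk" }, rx_q = { "net-rx" };

        /* Block read: header goes out, data and status come back in. */
        struct { unsigned long sector; } hdr = { 1234 };
        char buf[4096];
        unsigned char status;
        struct scatterlist out[] = { { &hdr,    sizeof(hdr)    } };
        struct scatterlist in[]  = { { buf,     sizeof(buf)    },
                                     { &status, sizeof(status) } };
        add_buf(&blk_q, out, 1, in, 2, NULL);

        /* Net receive: nothing to send; just post an empty buffer the
         * host can fill whenever a packet arrives. */
        char frame[1514];
        struct scatterlist rx[] = { { frame, sizeof(frame) } };
        add_buf(&rx_q, NULL, 0, rx, 1, NULL);
        return 0;
}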

> Of course, I'm carrying two implementations and two drivers, so it's
> natural for me to try to resist changes 8)

Life is rough on the frontier.  But all the glory will be yours!

--
error compiling committee.c: too many arguments to function


