xen-devel

[Xen-devel] Re: [kvm-devel] [PATCH RFC 3/3] virtio infrastructure: example block driver

To: carsteno@xxxxxxxxxx
Subject: [Xen-devel] Re: [kvm-devel] [PATCH RFC 3/3] virtio infrastructure: example block driver
From: Jens Axboe <jens.axboe@xxxxxxxxxx>
Date: Fri, 1 Jun 2007 15:13:16 +0200
Cc: Jimi Xenidis <jimix@xxxxxxxxxxxxxx>, Stephen Rothwell <sfr@xxxxxxxxxxxxxxxx>, Xen Mailing List <xen-devel@xxxxxxxxxxxxxxxxxxx>, "jmk@xxxxxxxxxxxxxxxxxxx" <jmk@xxxxxxxxxxxxxxxxxxx>, Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>, kvm-devel <kvm-devel@xxxxxxxxxxxxxxxxxxxxx>, Rusty Russell <rusty@xxxxxxxxxxxxxxx>, mschwid2@xxxxxxxxxxxxxxxxxx, virtualization <virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx>, Christian Borntraeger <cborntra@xxxxxxxxxx>, Suzanne McIntosh <skranjac@xxxxxxxxxx>
Delivery-date: Fri, 01 Jun 2007 10:39:29 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <465FC65C.6020905@xxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <1180613947.11133.58.camel@xxxxxxxxxxxxxxxxxxxxx> <1180614044.11133.61.camel@xxxxxxxxxxxxxxxxxxxxx> <1180614091.11133.63.camel@xxxxxxxxxxxxxxxxxxxxx> <465EC637.7020504@xxxxxxxxxx> <1180654765.10999.6.camel@xxxxxxxxxxxxxxxxxxxxx> <465FC65C.6020905@xxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
On Fri, Jun 01 2007, Carsten Otte wrote:
> Rusty Russell wrote:
> >Now my lack of block-layer knowledge is showing.  I would have thought
> >that if we want things like ionice(1) to work, we have to do some
> >guest scheduling or pass that information down to the host.
> Yeah, that would only work on the host: one can use ionice to set the
> io niceness of the entire guest. Individual processes inside the guest
> are opaque to the host, and thus to its io scheduler.
> 
> >>It seems preferable to do that in the host, especially when requests 
> >>of multiple guests end up on the same physical media (shared access, 
> >>or partitioned).
> >
> >What's the overhead in doing both?
> With regard to compute power needed, almost none. The penalty is
> latency, not overhead: a small request may sit on the request queue,
> waiting for other work to arrive, until the queue gets unplugged. That
> penalty is compensated by a good chance that more requests will be
> merged during this time period.
> If we use this method in both host and guest, we pay the penalty twice
> with no added benefit.

I don't buy that argument. We can easily expose the unplug delay, so you
can kill it at whatever level you want. Or you could just do it in the
driver right now, but that is a bit hackish.
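
A minimal sketch of the "do it in the driver" option, assuming the
2.6-era request_queue fields unplug_thresh and unplug_delay; the setup
helper below is hypothetical, not actual virtio code:

#include <linux/blkdev.h>

/*
 * Hypothetical queue setup that shrinks the plug window so small guest
 * requests are not held back waiting for merges.
 */
static struct request_queue *vblk_init_queue(request_fn_proc *fn,
                                             spinlock_t *lock)
{
        struct request_queue *q = blk_init_queue(fn, lock);

        if (!q)
                return NULL;

        q->unplug_thresh = 1;   /* unplug once a single request is queued */
        q->unplug_delay = 1;    /* ...or after one jiffy at the latest */

        return q;
}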

> On the other hand, if we choose to hook into q->make_request_fn, we
> end up doing far more hypercalls: bios do not get merged on the guest
> side. We must do a hypercall per bio in this scenario, or we'll end up
> adding latency again. In contrast, we can submit the entire content of
> the queue with a single hypercall when submitting from do_request().
> 
> A third way out of that situation is to do queueing between guest and
> host: on the first bio, the guest does a hypercall. When the next bio
> arrives, the guest sees that the host has not finished processing the
> queue yet and pushes another buffer without issuing a notification.
> We've also implemented this, with the result that our host stack was
> quick enough to practically always process the bio before the guest
> had the chance to submit another one. Performance was a nightmare, so
> we stopped pursuing that idea.
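
A rough sketch of that third scheme, purely to show the notification
suppression; the ring structure and helpers here are hypothetical, and
as noted above the approach performed poorly in practice:

#include <linux/bio.h>

struct vblk_ring;                       /* hypothetical guest<->host ring */
extern void ring_push(struct vblk_ring *r, struct bio *bio);
extern int ring_host_busy(struct vblk_ring *r);
extern void hypercall_notify(struct vblk_ring *r);

static void vblk_submit_bio(struct vblk_ring *ring, struct bio *bio)
{
        ring_push(ring, bio);           /* make the buffer visible to the host */

        if (!ring_host_busy(ring))      /* host idle: first bio, wake it up */
                hypercall_notify(ring);
        /* host still draining the ring: it picks this bio up, no notify */
}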

I'd greatly prefer maintaining a request_fn interface for this. The
make_request_fn/request_fn call ratio is at least 1, and typically a lot
larger (with 4kb bios and a not-uncommon 128kb request, it is 32).
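
For comparison, a sketch of what that request_fn path can look like,
using the 2.6.21-era elv_next_request()/blkdev_dequeue_request() calls;
the two host-side helpers are hypothetical hypercall wrappers:

#include <linux/blkdev.h>

extern void vblk_send_request(struct request *req);    /* hypothetical */
extern void vblk_notify_host(void);                     /* hypothetical */

static void vblk_request_fn(struct request_queue *q)
{
        struct request *req;
        int sent = 0;

        /* the elevator has already merged 4kb bios into larger requests */
        while ((req = elv_next_request(q)) != NULL) {
                blkdev_dequeue_request(req);
                vblk_send_request(req);         /* queue one merged request */
                sent++;
        }

        if (sent)
                vblk_notify_host();             /* one hypercall for the batch */
}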

-- 
Jens Axboe


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel