
Re: [Xen-devel] Block ring protocol (segment expansion, multi-page, etc).



On Wed, Sep 05, 2012 at 09:29:21AM -0400, Konrad Rzeszutek Wilk wrote:
> Please correct me if I got something wrong.

CC-ing here a person from Citrix who has also expressed interest in
implementing persistent grants in the block backend.
> 
> About two or three years ago Citrix (and Red Hat, I think?) posted a
> multi-page ring extension protocol (negotiated via the max-ring-page-order,
> max-ring-pages, ring-page-order and ring-pages xenstore keys), which never
> went upstream (I think it only needed to be rebased on the driver that
> went into the kernel?).
> 
> Then about a year ago SpectraLogic started enhancing the FreeBSD variant
> of blkback - and realized what Ronghui also did - that the just doing a
> multi-page extension is not enough. The issue was that if one just
> expanded to a ring composed of two pages, 1/4 of the page was wasted b/c
> of the segment is constrained to 11.
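
(A rough back-of-the-envelope illustration of that wastage, assuming the
canonical 112-byte blkif_request with 11 inline segments and the
round-down-to-a-power-of-two entry count used by the standard ring macros;
the numbers are indicative only:)

#include <stdio.h>

#define PAGE_SIZE    4096u
#define RING_HDR     64u    /* rough size of the shared ring header  */
#define REQ_SIZE     112u   /* blkif_request with 11 segments inline */

/* Mimic the ring macros: entry count is rounded down to a power of two. */
static unsigned round_down_pow2(unsigned n)
{
    unsigned p = 1;
    while (p * 2 <= n)
        p *= 2;
    return p;
}

int main(void)
{
    for (unsigned pages = 1; pages <= 2; pages++) {
        unsigned space   = pages * PAGE_SIZE - RING_HDR;
        unsigned entries = round_down_pow2(space / REQ_SIZE);
        unsigned wasted  = space - entries * REQ_SIZE;
        printf("%u page(s): %3u entries, ~%u bytes unused\n",
               pages, entries, wasted);
    }
    return 0;
}

With two 4KB pages this works out to 64 entries and roughly 1KB left over,
i.e. about a quarter of the second page, which is the wastage being
complained about above.
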
> 
> Justin (SpectraLogic) came up with a protocol enhancement where the
> existing blkif request layout stays the same, but
> BLKIF_MAX_SEGMENTS_PER_REQUEST is negotiated via max-request-segments.
> On top of that there is max-request-size, which combines the segment
> count and the size of the ring to tell you the biggest I/O you can fit
> on the ring in a single transaction. This solves the wastage problem and
> expands the ring.
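
(One plausible reading of how those two negotiated values bound a single
I/O - a sketch only, with a made-up helper name, assuming 4KB segments and
the max-request-segments / max-request-size keys mentioned above:)

#include <stdio.h>

#define SEGMENT_SIZE 4096u   /* one granted page per segment */

/* Hypothetical: the largest single I/O a request can describe is the
 * negotiated segment count times the segment size, further capped by the
 * advertised max-request-size (whichever is smaller wins). */
static unsigned max_io_bytes(unsigned max_request_segments,
                             unsigned max_request_size)
{
    unsigned by_segments = max_request_segments * SEGMENT_SIZE;
    return by_segments < max_request_size ? by_segments : max_request_size;
}

int main(void)
{
    /* 256 segments of 4KB would cover the 1MB single I/Os mentioned below. */
    printf("max I/O = %u bytes\n", max_io_bytes(256, 1024 * 1024));
    return 0;
}
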
> 
> Ronghui did something similar, but instead of re-using the existing
> blkif structure he split it in two. One ring carries
> blkif_request_header (the request with the segments ripped out), and the
> other carries just blkif_request_segments. That also solves the wastage
> and allows the ring to be expanded.
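
(To illustrate the split, something along these lines - the field names
mirror the existing blkif_request and blkif_request_segment from the public
headers, but the exact layout in Ronghui's patch may well differ:)

#include <stdint.h>

typedef uint32_t grant_ref_t;     /* as in the Xen public headers */
typedef uint16_t blkif_vdev_t;
typedef uint64_t blkif_sector_t;

/* Ring 1: the request minus its inline segment array. */
struct blkif_request_header {
    uint8_t        operation;     /* BLKIF_OP_READ / BLKIF_OP_WRITE / ...   */
    uint8_t        nr_segments;   /* how many entries sit on the seg ring   */
    blkif_vdev_t   handle;
    uint64_t       id;            /* echoed back in the response            */
    blkif_sector_t sector_number; /* start sector on the virtual device     */
};

/* Ring 2: just the segment descriptors, nr_segments of them per request. */
struct blkif_request_segment {
    grant_ref_t gref;             /* grant reference for the data page      */
    uint8_t     first_sect;       /* first sector used within the page      */
    uint8_t     last_sect;        /* last sector used within the page       */
};

Because the header slots no longer carry the 11-segment array, the header
ring and the segment ring can be sized independently, which is where the
wastage goes away.
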
> 
> The three major outstanding issues that exist with the current protocol,
> as far as I know, are:
>  - We split up large I/O requests, which ends up eating a lot of CPU
>    cycles.
>  - We might have huge I/O requests. Justin mentioned 1MB single I/Os -
>    to fit one of those in a single request the ring has to be able to
>    carry 256 segments (see the arithmetic below). Jan mentioned 256kB
>    for SCSI, since the protocol extensions here could very well be
>    carried over.
>  - Concurrent usage. With more than 4 VBDs blkback suffers when it tries
>    to get a page, because there is a "global" pool shared across all
>    guests instead of something 'per guest' or 'per VBD'.
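
(The arithmetic behind the 256-segment figure, assuming 4KB pages: 1 MiB /
4 KiB = 256 segments per request. With today's limit of 11 segments the
same 1MB I/O has to be split into ceil(256 / 11) = 24 requests, which is
exactly the splitting the first bullet complains about. Likewise the 256kB
SCSI case Jan mentioned works out to 256 KiB / 4 KiB = 64 segments.)
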
> 
> So.. Ronghui - I am curious why you chose the path of making two
> separate rings? Was the mechanism that Justin came up with not really
> that good, or was this just easier to implement?
> 
> Thanks.
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
