[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] libbxl: add support for pvscsi, iteration 1



On Fri, May 02, Ian Campbell wrote:

> > The backend driver uses a single_host:many_devices notation to manage
> > domU devices. The xenstore layout looks like this:
> 
> Please can you add ~/device/vscsi/$DEVID/* and
> ~/backend/vif/$DOMID/$DEVID/* to docs/misc/xenstore-paths.markdown.
> 
> If you were inclined to improve xen/include/public/io/vscsiif.h by using
> some of the content from here then that would be awesome too.

I think its a good idea to document what the current backend/frontend
code does, as this will also aid with review of the libxl changes.

> > This is a challenge for libxl because libxl currently handles only
> > single_host:single_device in its device helper functions.
> 
> This is a bit like pci I suppose?

Maybe. But today Ondrej found a way to attach additional devices,
detaching them will most likely work as well. Basically one creates the
new vscsi-devs/dev-N/ subdir with state==1, then write state==7 to the
backend. Similar with detach, state==5 to the dev-N, then state==7 to
the backend.

> > Due to this
> > limitation in libxl, scsi-detach will remove the whole virtual scsi host
> > with all its virtual devices, instead of just the requested single
> > virtual device.
> 
> That's rather unfortunate!
> 
> Are multiple <vhost> allowed/supported? e.g. could I (or better, libxl
> by default) setup a device-per-host situation to alleviate this?
> 
> For pcifront/back we support bus reconfiguration (via
> XenbusStateReconfigur{ing,ed}) which is how this works there. Does vscsi
> not support anything like that?

Many vhosts are supported, I just did not realize that reconfigure was
used here as well. So maybe libxl does not need further changes. I will
hack some code to show how attach/detach works in theory.

> > To support the pdev notation '/dev/$device' it is required to parse
> > sysfs to translate the path to HST:CHN:TGT:LUN notation. In its current
> > variant /sys/dev/ is used, which is available since at least Linux 3.0.
> > Support for dom0 kernels which lack this interface will be added later.
> 
> Are there any such kernels which anyone cares about any more?

I have to check when /sys/dev was introduced. In the end it would be
nice to support also older dom0 kernels.

> >  - xl scsi-detach will remove entire scsi hosts, unlike its xm counter part.
> 
> What did xm do instead? Fail?

No, it will use the xenstore reconfigure state to indicate a change.

> >  Note that the latter format is unreliable because
> > +the HOST value can change across dom0 reboots.
> 
> /dev/sgN would be unreliable too I think? As might block devices not
> using the /dev/disk/by-* links you give in the example. You might be
> best saying that only /dev/disk/by-* block device references are
> reliable because all of the others might change over a reboot.

I will reword this part.

One customer reported that his SCSI device was not working with xend
because this device was accessible only via a char device. And I think
udev was able to create some by-id path for the sgN node. Have to dig
into that old report. 

> > +
> > +=item C<vdev>
> > +
> > +Specifies how the SCSI device is mapped into the guest. The notation is in
> > +SCSI notation HOST:CHANNEL:TARGET:LUN. HOST in this case means a virthal
> 
> Typo "virtual"
> 
> > +SCSI host within the guest.
> 
> All of HOST:CHANNEL:TARGET:LUN are numbers I think, are they
> hex/dec/oct? Are there limits?

I have to find out, its most likely all decimal.

> > +Right now only one option is recognized: feature-host.
> What does it do?

I have to find out ;-) 
Last time I checked the command was passed as is from domU to dom0, but
I dont really know as it is not documented.

> > +=item B<scsi-attach> I<domain-id> I<pdev> I<vdev> I<[,feature-host]>
> 
> Is the comma in I<[,feature-host]> literally required, as in
>     xl scsi-attach mydom 0:0:0:0 0:0:0:0 ,feature-host
> ?

This is a typo I noticed after submission.
Not sure how to specify the option. Either a '0:0:0:0,feature-host' or
as two strings.

> > +Removes the vscsi device from domain specified by I<domain-id>.
> > +Note that the whole virtual SCSI host with all its devices is removed.
> > +This is a BUG!
> 
> I have a feeling it would be preferable in the first instance to either
> just not provide the detach portion of this interface, or to require
> that vdev is only a host and not a specific device (... unless the host
> only has a single device, but maybe that's getting too far)

Attach/detach can probably be implemented without much work.

> > +++ b/tools/libxl/libxl.c
> > +
> > +void libxl__device_vscsi_add(libxl__egc *egc, uint32_t domid,
> > +                           libxl_device_vscsi *vscsi,
> > +                           libxl__ao_device *aodev)
> > +{
> > +    STATE_AO_GC(aodev->ao);
> > +    flexarray_t *front;
> > +    flexarray_t *back;
> > +    libxl__device *device;
> > +    unsigned int rc, i;
> > +
> > +    i = 2 * (4 + (3 * vscsi->num_vscsi_devs));
> 
> That's quite exciting ;-)
> 
> A flexarray will grow as neccessary so some sort of static best guest of
> a starting point would be fine here I think (or a comment about 2 4 and
> 3)!

Yes, that was just an attempt to avoid realloc. As usual, optimization
of non-critical code paths...

> I suppose the num_vscsi_devs here is an artefact of the short coming you
> mentioned before. The way PCI handles this is to create the "bus" (aka
> vhost here) either when the first device is attached or in the higher
> level code before it calls the individual device adds (i.e. this
> function), or something (I must confess my memory is a bit fuzzy,
> libxl__create_pci_backend might be a place to look). Having done that
> then libxl_device_pci is just a single device.
> 
> I think you should look to do something similar here, so that each
> libxl_device_vscsi is actually a single device. The alternative is some
> sort of libxl_bus_vscsi thing, which would be unconventional compared
> with everything else

The thing with host:dev[,dev[,dev]] is that within the guest something
may rely on the way SCSI devices are presented (ordering, naming, no
idea). Surely the vscsi= and scsi-attach/detach parsing code could do
some sort of translation from a group of devices-per-vhost into a single
vhost:dev. But now that we have found that reconfiguration is used in
xenstore to attach/detach devices it may be possible to reuse existing
libxl helpers while preserving existing domU.cfg files and backend
drivers.


> > +libxl_device_vscsi *libxl_device_vscsi_list(libxl_ctx *ctx, uint32_t 
> > domid, int *num)
> > +{
> > +    GC_INIT(ctx);
> > +
> > +    libxl_device_vscsi *vscsi_hosts = NULL;
> > +    char *fe_path;
> > +    char **dir;
> > +    unsigned int ndirs = 0;
> > +
> > +    fe_path = libxl__sprintf(gc, "%s/device/vscsi", 
> > libxl__xs_get_dompath(gc, domid));
> 
> Need to take care trusting this too much...

Not sure what you mean with this sentence?

> > @@ -3520,6 +3713,10 @@ DEFINE_DEVICE_REMOVE(vfb, destroy, 1)
> >  DEFINE_DEVICE_REMOVE(vtpm, remove, 0)
> >  DEFINE_DEVICE_REMOVE(vtpm, destroy, 1)
> >  
> > +/* vscsi */
> > +DEFINE_DEVICE_REMOVE(vscsi, remove, 0)
> > +DEFINE_DEVICE_REMOVE(vscsi, destroy, 0)
> 
> Should the second one not be (vscsi, destroy, 1) ?

Right. 

> > +libxl_device_vscsi = Struct("device_vscsi", [
> > +    ("backend_domid",    libxl_domid),
> > +    ("devid",            libxl_devid),
> > +    ("v_hst",            uint32),
> > +    ("vscsi_devs",       Array(libxl_vscsi_dev, "num_vscsi_devs")),
> > +    ("feature_host",     bool),
> 
> Not defbool (I don't know what this does...)

This is just flag for "feature-host" yes/no. Have to find out what this
backend option actually does.

> > @@ -508,6 +528,28 @@ libxl_vtpminfo = Struct("vtpminfo", [
> >      ("uuid", libxl_uuid),
> >      ], dir=DIR_OUT)
> >  
> > +libxl_vscsiinfo = Struct("vscsiinfo", [
> > +    ("backend", string),
> > +    ("backend_id", uint32),
> > +    ("frontend", string),
> > +    ("frontend_id", uint32),
> > +    ("devid", libxl_devid),
> > +    ("p_hst", uint16),
> > +    ("p_chn", uint16),
> > +    ("p_tgt", uint16),
> > +    ("p_lun", uint16),
> 
> Type, also the widths here differ from the device...
> 
> > +    ("vscsi_dev_id", libxl_devid),
> > +    ("v_hst", uint16),
> > +    ("v_chn", uint16),
> > +    ("v_tgt", uint16),
> > +    ("v_lun", uint16),
> 
> and again

Some format code was unhappy with u16, cant remember. I was just trying
to save space. Again, optimization of non-critical code paths.

> > +static char *vscsi_trim_string(char *s)
> > +{
> > +    unsigned int len;
> > +
> > +    while (isspace(*s))
> > +        s++;
> > +    len = strlen(s);
> > +    while (len-- > 1 && isspace(s[len]))
> > +        s[len] = '\0';
> > +    return s;
> 
> Doesn't seem to be anything vscsi specific here (and it seems like a
> generally useful thing. I expect at least thevif parsing would benefit
> from deploying this too (in so much as all this wouldn't be better with
> a proper parser using flex).

I figured that could be a generic helper. Have to check existing code if
it can make use of it.

> > +}
> > +
> > +static void parse_vscsi_config(libxl_device_vscsi *vscsi_host,
> > +                              libxl_vscsi_dev *vscsi_dev,
> > +                              char *buf)
> > +{
> > +    char *pdev, *vdev, *fhost;
> > +    unsigned int hst, chn, tgt, lun;
> > +
> > +    libxl_device_vscsi_init(vscsi_host);
> > +    pdev = strtok(buf, ",");
> > +    vdev = strtok(NULL, ",");
> > +    fhost = strtok(NULL, ",");
> 
> I guess we are single threaded here so this is ok.
> 
> I wonder if this might be useful to put in libxlu alongside the disk
> parser? The question is whether anything else wants to parse xm style
> vscsi strings, like libvirt perhaps. (arguably a bunch of these parsers
> here have that property and belong in libxlu).

I was wondering how libvirt support for vscsi would look like, looks
like it has to reimplement all that parsing to fill struct
libxl_domain_config. Maybe that parser could be generic, depends how
external callers deal with such parsing today.

> > +    pdev = vscsi_trim_string(pdev);
> > +    vdev = vscsi_trim_string(vdev);
> > +
> > +    if (strncmp(pdev, "/dev/", 5) == 0) {
> > +#ifdef __linux__
> 
> Urk. Normally we do this by splitting by files, e.g. into a linux
> version and a nop version and compiling the appropriate one.
> 
> This seems like a candidate for libxlu too. We probably need to have a
> think about how much of this stuff is done in libxl too -- e.g. for
> disks we pass the pdev in as a string (path) and interpret it there.
> That allows us to choose between alternative backend providers etc. Also
> libxl is a slightly more palatable place to deal with Linux vs non-Linux
> bits. I think that might be the right answer here too.

There is a libxl_linux.c file, but thats for libxl not xl. So we went
with the ifdef for the time being.

> > @@ -1242,6 +1360,56 @@ static void parse_config_data(const char 
> > *config_source,
> >          }
> >      }
> >  
> > +    if (!xlu_cfg_get_list(config, "vscsi", &vscsis, 0, 0)) {
> > +        int cnt_vscsi_devs = 0;
> > +        d_config->num_vscsis = 0;
> > +        d_config->vscsis = NULL;
> > +        while ((buf = xlu_cfg_get_listitem (vscsis, cnt_vscsi_devs)) != 
> > NULL) {
> > +            libxl_vscsi_dev vscsi_dev = { };
> > +            libxl_device_vscsi vscsi_host = { };
> > +            libxl_device_vscsi *host;
> > +            char *tmp_buf;
> > +            int num_vscsis, host_found = 0;
> > +
> > +            /*
> > +             * #1: parse the devspec and place it in temporary host+dev 
> > part
> > +             * #2: find existing vscsi_host with number v_hst
> > +             *     if found, append the vscsi_dev to this vscsi_host
> > +             * #3: otherwise, create new vscsi_host and append vscsi_dev
> > +             * Note: v_hst does not represent the index named "num_vscsis",
> > +             *       it is a private index used just in the config file
> 
> Would all be a lot simpler if a libxl_device_vscsi was a single device
> and all the worrying about which bus was which was in libxl

As said above, domU.cfg code like this may serve a purpose. Devices are
grouped per vhost (0: and 1: in this example):

### vscsi = [
###     '/dev/sdm, 0:0:0:0',
###     '/dev/sdn, 0:0:0:1',
###     '/dev/sdo, 0:0:1:0',
###     '/dev/sdp, 0:2:0:1 ',
###     '/dev/sdq, 1:0:0:0, feature-host  ',
###     '/dev/sdr, 1:1:1:0, feature-host',
### ]


Thanks for the quick review!

Olaf

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.