Re: [Xen-devel] libxl: cannot start guest

On Tue, 2012-05-22 at 13:35 +0100, Christoph Egger wrote:
> On 05/21/12 17:57, Ian Campbell wrote:
> >> libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
> >> vdev=hda spec.backend=unknown
> >> libxl: debug: libxl_device.c:219:libxl__device_disk_set_backend: Disk
> >> vdev=hda, using backend phy
> >> xc: detail: elf_parse_binary: phdr: paddr=0x100000 memsz=0x9bd04
> >> xc: detail: elf_parse_binary: memory: 0x100000 -> 0x19bd04
> >>   Loader:        0000000000100000->000000000019bd04
> >>   TOTAL:         0000000000000000->00000000ff800000
> >>   ENTRY ADDRESS: 0000000000100000
> >>   4KB PAGES: 0x0000000000000200
> >>   2MB PAGES: 0x00000000000003fb
> >>   1GB PAGES: 0x0000000000000002
> >> xc: detail: elf_load_binary: phdr 0 at 0x0x7f7ff7f42000 -> 0x0x7f7ff7fd4b74
> >> libxl: error: libxl.c:3213:libxl_sched_credit_domain_set: Cpu weight out
> >> of range, valid values are within range from 1 to 65535
> >> libxl: error: libxl_dom.c:74:libxl__sched_set_params:
> >> libxl_sched_credit_domain_set failed -6
> >> libxl: debug: libxl_device.c:183:libxl__device_disk_set_backend: Disk
> >> vdev=hda spec.backend=phy
> >> libxl: error: libxl_xshelp.c:102:libxl__xs_get_dompath: failed to get
> >> dompath for 7: Bad file descriptor
> > 
> > This is back to the original issue, I think the last couple of mails
> > have been something of a tangent since you weren't getting as far as
> > this failure.
> > 
> > I'm not really sure what to suggest here -- something is either closing
> > the fd or scribbling over the memory which contains it.
> > 
> > I suppose you could sprinkle calls to libxl__xs_get_dompath() around
> > between libxl__sched_set_params and libxl__device_disk_set_backend and
> > see where it starts failing -- that's going to be pretty tedious though.
> It starts failing in libxl__build_post() right after
> xs_introduce_domain().

What method did you use to determine that?

So at the xs_transaction_end right before that, ctx->xsh is valid, but
right after...
        xs_introduce_domain(ctx->xsh, domid, state->store_mfn, 
...it is invalid? That is, it is already corrupt before the free(vmpath)?

(Aside: why isn't vmpath in the gc, instead of being freed manually?)

Does the xs_introduce_domain call itself succeed? Or do you mean that the
next use of xsh after this fails (where is that, somewhere back up the
call chain? store_libxl_entry perhaps?)

xs_introduce_domain doesn't seem to do much which is untoward with the

The only thing which springs to mind is that it may generate an
@IntroduceDomain watch event. However, xl is single-threaded, so we won't
process that event until we unwind to whichever point we do an event
loop iteration, in which case the corruption would have to happen later
than right after xs_introduce_domain().

Did you manage to determine if "Bad file descriptor" was due to it being
closed vs. the value being corrupted?
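
One way to tell the two cases apart, as a rough sketch rather than a
tested patch: save the fd number while the handle is known to be good
(e.g. from xs_fileno(ctx->xsh) early in domain creation), then compare
and probe it at the suspect points. The helper names below
(fd_is_open, diagnose_fd, enum fd_state) are made up for illustration
and are not part of libxl or libxenstore. If ctx->xsh itself has been
scribbled on, xs_fileno() may well crash, so it would be worth logging
the pointer value too.

```c
#include <errno.h>
#include <fcntl.h>

/* Hypothetical result codes, for illustration only. */
enum fd_state {
    FD_OK,        /* same number as before and still open */
    FD_CLOSED,    /* same number as before, but the kernel reports EBADF */
    FD_CORRUPTED  /* the stored number no longer matches the saved copy */
};

/* Returns 1 if fd refers to an open descriptor, 0 if it does not. */
static int fd_is_open(int fd)
{
    /* F_GETFD has no side effects; it fails with EBADF on a closed fd. */
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/*
 * saved_fd:   copy taken while the handle was known to be good
 * current_fd: value read from the handle at the suspect point
 */
static enum fd_state diagnose_fd(int saved_fd, int current_fd)
{
    if (current_fd != saved_fd)
        return FD_CORRUPTED;
    if (!fd_is_open(current_fd))
        return FD_CLOSED;
    return FD_OK;
}
```

Calling diagnose_fd() immediately before and after xs_introduce_domain()
would show whether the number changed (memory corruption) or stayed the
same but stopped being valid (something closed it). Note that a
corrupted value can by chance still name some other open descriptor,
which is why the comparison against the saved copy comes first.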


