WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-devel

Re: [Xen-devel] xend segfaults when starting

On Wednesday 18 August 2010 14:14:19 Ian Campbell wrote:
> Thanks for the analysis. I'm a bit confused though.
>
> On Wed, 2010-08-18 at 11:44 +0100, Christoph Egger wrote:
> > I tracked down where the error happens. In safe_munlock(),
> > the munlock() fails.
> >
> > The trace is:
> >
> > xc_interface_close -> _xc_clean_hcall_buf -> unlock_pages -> safe_munlock
> > -> munlock
> >
> > hcall_buf->buf has the address 0x7f7ffdfe7040
>
> Mustn't this be page aligned, due to
>         hcall_buf->buf = xc_memalign(PAGE_SIZE, PAGE_SIZE);
> ?
>
> This appears to turn into valloc on NetBSD which (at least according to
> the Linux manpages) returns a page aligned result.

Yes, correct.

> > In unlock_pages, the address and length passed to munlock() are:
> >
> >  laddr 0x7f7ffdfe7000, llen 0x2000
> >
> > The reason why munlock() fails is that mlock() hasn't been called before.
> > The hcall_buf_prep() is not called at all before the first call to
> > _xc_clean_hcall_buf().
>
> If hcall_buf_prep() has never been called then
> "pthread_getspecific(hcall_buf_pkey)" should return NULL and
> _xc_clean_hcall_buf will never be called from xc_clean_hcall_buf.
> _xc_clean_hcall_buf also ignores NULL values itself.

Who calls hcall_buf_prep() in your case ?

Only hypercalls call hcall_buf_prep().
What if no hypercalls are called during xend startup ?

If you call xc_clean_hcall_buf() from xc_interface_close()
then you should also call hcall_buf_prep() from xc_interface_open().

> However you say that hcall_buf_pkey is not NULL, but rather contains a
> valid hcall_buf containing 0x7f7ffdfe7040.

hcall_buf itself has the address 0x7f7ffdfe7000.

hcall_buf->buf has the address 0x7f7ffdfe7040.

> The only call to "pthread_setspecific(hcall_buf_pkey, ...)" with a non-NULL
> value is in hcall_buf_prep(), so it must have been called at some point.

In that case, I am puzzled why I don't get the trace.
Something really fishy is going on.

> Please can you confirm if _xc_init_hcall_buf() is ever called and what
> the behaviour of "pthread_getspecific(hcall_buf_pkey)" is if
> _xc_init_hcall_buf() has never been called. I think it is supposed to
> return NULL in this case and we certainly rely on that.

_xc_init_hcall_buf() is not called.  pthread_getspecific() should return NULL
but doesn't.

I am starting to ask myself "How did libxc ever work?". It feels like we are
hunting down a long-term hidden bug.

> pthread_getspecific(hcall_buf_pkey) is supposed to return NULL on error,
> however hcall_buf_pkey is uninitialised until _xc_init_hcall_buf,
> perhaps on NetBSD the uninitialised value somehow looks valid? It's not
> clear what value a pthread_key_t should be initialised to so that it
> appears invalid until it is properly set up, but I suppose we
> should be initialising it before use. Please can you try this patch:

I tried the replacement patch from the other mail.
With it, xend does not crash, hcall_buf is NULL, and
pthread_getspecific() returns NULL,
but I am not able to start a guest with 'xm':

  Xend has probably crashed!  Invalid or missing HTTP status code.

>
> If that doesn't work perhaps you can reduce the issue to a simple test
> case like the attached? (which doesn't reproduce the issue for me on
> Linux) If you can do that then please run it with the attached libxc
> patch and post the output.

xc_interface is 0x7f7ffdb03800
before prep buf is 0x7f7ffdb0b000 / 0x7f7ffdb0b040
after prep buf is 0x7f7ffdb0b000 / 0x7f7ffdb20000
after release buf is 0x7f7ffdb0b000 / 0x7f7ffdb0b040
xc interface close returned 0

No crash. Is this the expected output ?

Christoph

-- 
---to satisfy European Law for business letters:
Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach b. Muenchen
Geschaeftsfuehrer: Alberto Bozzo, Andrew Bowd
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel