
RE: [Xen-devel] RE: Coredumps and 'crash' utility...



Ian, 

Thanks for your response.

I was able to build the Domain 0 kernel from your DDK and ended up with
a vmlinux that contained all of the symbols.

Furthermore, the file type reported for /proc/vmcore seems to have
changed with my new builds, from:

# file /dom0/proc/vmcore

/dom0/proc/vmcore: ELF 64-bit LSB core file AMD x86-64, version 1 (SYSV), SVR4-style

to:

# file /proc/vmcore

/proc/vmcore: ELF 64-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style

I'm not sure why, but that didn't make a difference when trying to
read the file with readelf.  It continued to complain about large
values being out of range. I'm assuming it is related to the 64-bit
addresses, as you surmised. Next, I'll track down the readelf from
elfutils as you suggested.


I was able to make good progress once I had the rebuilt vmlinux.  With
the 'crash' utility running in our DDK, I was able to remotely (ssh
mounted disk) access the /proc/vmcore of the crashed hypervisor.
However, it was not as useful as I had expected.  I was hoping I could
attach to the context and move up and down the stack, examining stack
variables and source code.  GDB offers this on other architectures and
OSes, so I thought it would be available under 'crash'.  I have
confirmed that it is not.

If I try to use GDB itself to read the /proc/vmcore file, it complains
that it is not a validly formatted file.  Just to be sure, has anyone
used GDB to attach to a vmcore file and move up and down the
hypervisor's stack, all the while being able to look at the source
code files?





-----Original Message-----
From: Ian Campbell [mailto:Ian.Campbell@xxxxxxxxxxxxx] 
Sent: Friday, October 12, 2007 5:56 AM
To: Roger Cruz
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-devel] RE: Coredumps and 'crash' utility...

[I'll continue this on-list rather than privately like I normally would
for XE related questions since it seems the issue is more general than
the fact that you happen to be using XenEnterprise]

On Tue, 2007-10-09 at 20:02 -0400, Roger Cruz wrote: 
> >First, you won't get anywhere with the vmlinuz file; you must 
> >use the kernel's vmlinux file, which must have been built with -g.
> 
> I have confirmed that the domain 0 vmlinux file is compiled with the -g
> switch so the symbol info should be there.  It's also not obvious from
> my post because I didn't rename the file, but the vmlinuz is the
> uncompressed version of the XenSource-provided Domain 0 kernel.

The installed image will have been stripped so you might need to get the
unstripped vmlinux file from our DDK.

> #readelf  -a /proc/vmcore
> readelf: Error: Could not locate '/proc/vmcore'.  System error message: Value too large for defined data type

If I remember correctly /proc/vmcore is always an ELFCLASS64 file since
it must contain physical addresses which can be >4G even on 32 bit
(PAE). The e_machine field in the ELF header will be EM_386 or EM_X86_64
depending on the hypervisor (not the kernel or userspace).
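To check the two fields mentioned above without relying on readelf, the start of the ELF header can be decoded by hand. The following is only a minimal sketch: the function names are hypothetical, the numeric constants are the standard ELF values (ELFCLASS32 = 1, ELFCLASS64 = 2, EM_386 = 3, EM_X86_64 = 62), and it assumes that reading the first 20 bytes of /proc/vmcore is permitted on the system in question.

```python
import struct

# Standard ELF constants for e_ident[EI_CLASS] and e_machine.
ELF_CLASSES = {1: "ELFCLASS32", 2: "ELFCLASS64"}
ELF_MACHINES = {3: "EM_386", 62: "EM_X86_64"}


def describe_elf_header(header: bytes) -> dict:
    """Decode class, byte order, e_type and e_machine from the first 20 bytes."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file (bad magic or header too short)")
    ei_class = header[4]  # e_ident[EI_CLASS]
    ei_data = header[5]   # e_ident[EI_DATA]: 1 = little endian, 2 = big endian
    if ei_data not in (1, 2):
        raise ValueError(f"unknown ELF data encoding: {ei_data}")
    byte_order = "<" if ei_data == 1 else ">"
    # e_type and e_machine are 16-bit fields that follow the 16-byte e_ident,
    # at the same offsets in 32-bit and 64-bit ELF headers.
    e_type, e_machine = struct.unpack_from(byte_order + "HH", header, 16)
    return {
        "class": ELF_CLASSES.get(ei_class, f"unknown ({ei_class})"),
        "byte_order": "LSB" if ei_data == 1 else "MSB",
        "type": e_type,
        "machine": ELF_MACHINES.get(e_machine, f"unknown ({e_machine})"),
    }


def describe_elf_file(path: str) -> dict:
    """Read just enough of the file at `path` to describe its ELF header."""
    with open(path, "rb") as f:
        return describe_elf_header(f.read(20))
```

Calling describe_elf_file("/proc/vmcore") on a 32-bit PAE hypervisor would be expected to report ELFCLASS64 together with EM_386, if the description above holds for that build.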

I have generally found the binutils readelf to not be that great,
especially when faced with ELFCLASS* files which don't match your
userspace. The readelf from elfutils is better in this regard.

In any case I don't think your problems stem from the format of vmcore
-- I am pretty sure it is correct. More likely the version of the tools
you are trying to use cannot cope with the 64 bit-ness, probably because
they were compiled/are running in a 32 bit userspace environment or they
are otherwise confused due to the 32on64 configuration of XE.
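One possible reading of the quoted readelf error, offered as an assumption rather than a diagnosis: "Value too large for defined data type" is the usual text for EOVERFLOW, which stat() can return in a 32-bit program built without large file support when the file size does not fit in a signed 32-bit off_t. The sketch below uses hypothetical helper names and only reports whether a file's size crosses that limit; it does not reproduce readelf's behavior.

```python
import errno
import os

# Largest file size representable in a signed 32-bit off_t.
MAX_32BIT_OFF_T = 2**31 - 1


def overflows_32bit_off_t(size_in_bytes: int) -> bool:
    """Return True if a file of this size cannot be described by a 32-bit off_t."""
    return size_in_bytes > MAX_32BIT_OFF_T


def explain_stat_failure(path: str) -> str:
    """Stat `path` and describe whether its size could trip a 32-bit tool."""
    try:
        size = os.stat(path).st_size
    except OSError as exc:
        if exc.errno == errno.EOVERFLOW:
            return f"{path}: stat failed with EOVERFLOW (size does not fit the caller's off_t)"
        return f"{path}: stat failed: {exc.strerror}"
    if overflows_32bit_off_t(size):
        return f"{path}: {size} bytes, too large for a 32-bit off_t"
    return f"{path}: {size} bytes, fits in a 32-bit off_t"
```

Running explain_stat_failure("/proc/vmcore") in the same environment as the failing tool would show whether the reported size is past the 2 GiB boundary.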

> The host machine is an x86_64.  I've been told that the hypervisor
> supports 64-bits and that domain 0 is 32-bits but I'm not 100%.

If you were using pristine XenEnterprise v4 then this would be correct,
however the log you provided shows that your modified hypervisor is
actually 32 bit: 
        (XEN) *** LOADING DOMAIN 0 ***
        (XEN)  Xen  kernel: 32-bit, PAE, lsb
        (XEN)  Dom0 kernel: 32-bit, PAE, lsb, paddr 0xc0100000 -> 0xc0440000

> I don't know with 100% certainty.  It is created with kdump but I don't
> know if they've modified the file format.

I don't think we did, certainly not on purpose ;-). We made a change to
get the e_machine field in the ELF header to be correct (i.e. match the
hypervisor, not the kernel) in a 32on64 world. That shouldn't have
broken anything on 32on32, though, and the patch is upstream.

Ian.



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

