
Re: [Xen-devel] Hidden symbol when debugging hypervisor



On 30/04/14 15:16, Jan Beulich wrote:
>>>> On 30.04.14 at 16:08, <andrew.cooper3@xxxxxxxxxx> wrote:
>> On 30/04/14 14:16, Jan Beulich wrote:
>>>>>> On 30.04.14 at 13:28, <andrew.cooper3@xxxxxxxxxx> wrote:
>>>> On 30/04/14 11:21, Jan Beulich wrote:
>>>> I have encountered similar problems generating stack traces with the Xen
>>>> Crashdump Analyser, which only has System.map available.
>>>>
>>>> xen.git/xen$ cat System.map | cut -d ' ' -f 3 | sort | uniq -d | wc -l
>>>> 78
>>>>
>>>> Having duplicate symbol names for different symbols is confusing at the
>>>> very least, and trivial to avoid.  I reckon that most if not all of
>>>> those 78 duplicate symbols can, and should be, deduplicated.  Renaming
>>>> credit -> credit2 will amend about 1/4 of that list.
>>> For the crash dump analyzer, I can't see why it shouldn't be able to
>>> consume the symbol table from elf-syms or elf.efi instead of the
>>> (reduced) System.map.
>> Because it mostly runs on systems without the debuginfo rpms installed.
> But crash dump analysis wouldn't normally be done on the crashing
> system, would it?

Large numbers of XenServer customers have servers with more RAM than
local hard drive space, so storing the crash RAM image is not possible.

Analysis gets done on /proc/vmcore in the crash environment, with logs
written into the dom0 root filesystem.

In some copious free time I am looking to extend this to be able to
specify network locations to put the logs, and pack enough into the
initrd so the root filesystem doesn't need mounting.  This will help
with issues caused by the root filesystem driver locking up (although in
general the crash environment 'reset_devices' kernel parameter is
usually good enough to get enough of a filesystem working).

>
>> Furthermore, it needs to fit in a 64MB crash region with the crash
>> kernel and initrd as well (although this is more flexible).
> Why would the symbol table need to be in the crash region?

It wouldn't (necessarily), but it is certainly less overhead to have just
the text symbol table in memory than all of the debugging symbols.
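
As a rough illustration of why a text-only symbol table is cheap to work
with, here is a minimal sketch of address-to-symbol resolution.  It assumes
the usual "address type name" layout of System.map; the function names and
the sample addresses are made up for the example and are not taken from the
Xen Crashdump Analyser.

```python
import bisect

def parse_system_map(text):
    """Parse 'address type name' lines into a sorted list of (address, name)."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        addr, _type, name = parts
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols

def resolve(symbols, addr):
    """Return (name, offset) for the last symbol at or below addr, else None."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    base, name = symbols[i]
    return name, addr - base

# Hypothetical sample: two static functions sharing one name.
SAMPLE = """\
ffff82d080100000 T _start
ffff82d080120000 t csched_schedule
ffff82d080128000 t csched_schedule
"""

symbols = parse_system_map(SAMPLE)
print(resolve(symbols, 0xffff82d080120010))
print(resolve(symbols, 0xffff82d080128004))
```

Both lookups come back with the same name, which is exactly the ambiguity
under discussion: the name alone does not say which function was hit.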

>
>>> And for Xen generated stack traces I think I already said that this has
>>> been on my todo list for quite some time, pending no more important
>>> things to deal with, yet not to follow what you suggest, but to make
>>> Xen consume its own ELF/COFF symbol table instead of the (again
>>> reduced) one generated by tools/symbols.
>>>
>>> My main rationale here is that within a source file having prefix-less
>>> names is not only fine, but preferable (less typing, less needless line
>>> wrapping), and hence only global symbols need to be fully
>>> disambiguated.
>> From a coding point of view, certainly.
>>
>> From a debugging point of view, I completely disagree.  From a stack
>> trace, you want to be able to identify the function absolutely. 
>> Currently, finding "csched_schedule()" in a stack trace still means that
>> I have to work out which scheduler is actually in use.  With a cpupool
>> using credit1 and a cpupool using credit2, this can be very difficult
>> after-the-fact.
> How would "common/sched_credit.c:schedule" (with the pointless
> prefix already dropped) be ambiguous?

That wouldn't, but would we really want full paths in stack traces?
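
For anyone wanting to see which names collide rather than just how many, a
sketch along the following lines should do.  It again assumes the
"address type name" layout of System.map; the helper name and the sample
data are hypothetical.

```python
from collections import defaultdict

def duplicate_symbols(text):
    """Map each symbol name that occurs more than once to its addresses."""
    by_name = defaultdict(list)
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        addr, _type, name = parts
        by_name[name].append(int(addr, 16))
    return {name: sorted(addrs)
            for name, addrs in by_name.items() if len(addrs) > 1}

# Hypothetical sample, not real Xen addresses.
SAMPLE = """\
ffff82d080100000 T _start
ffff82d080120000 t csched_schedule
ffff82d080128000 t csched_schedule
"""

for name, addrs in duplicate_symbols(SAMPLE).items():
    print(name, [hex(a) for a in addrs])
```

The number of entries returned corresponds to the count that the
`sort | uniq -d | wc -l` pipeline reports.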

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel