
Re: [Xen-devel] how to generate a smaller core with xm dump-core



On 01/16/15 03:39, Zhenzhong Duan wrote:
> On 2015/1/16 0:16, Don Slutz wrote:
>> On 01/15/15 05:20, Ian Campbell wrote:
>>> On Thu, 2015-01-15 at 11:31 +0800, Zhenzhong Duan wrote:
>>>> Hi Maintainers,
...
>> If these are Linux guests then the patch:
>>
>> http://lists.xenproject.org/archives/html/xen-devel/2013-11/msg02351.html
>>
>> Can be used to enable crash to access the crashed guest and collect
>> some basic info. I would also include the output of xen-hvmctx and/or
>> xenctx.
>> -- This is the quick way I would do Ian's minicore.
>>
>> Note: dump-core currently does not include xen-hvmctx output (nor does
>> it include xenctx -a output).
>>
>> Using the results from this may allow you to not need a copy of every
>> dump (not to imply that 2 similar minicores would ensure that a copy of
>> each core was not needed).
>>
>> I also think that makedumpfile can process a xen dump-core file and
>> make it smaller.
> Looks similar to gdbsx.

Sigh, I somehow missed this email until now.

It is like gdbsx, but for crash not gdb.

> This patch will help if I can reproduce the issue locally.
> But in most situations I can't access the customer's environment, and
> they will not wait for me to debug online remotely.

Clearly I did not get my message across. I was not trying to say
"reproduce the issue locally".  What I was trying to say was that
this patch (to generate code) can be used in a script to generate a
"minicore".  Also, a script could use this "minicore" to decide whether
a full core is needed.

crash is good at providing a summary of a core and can give high-level
info about the state of Linux.  I do not know of scripts that use just
gdb that can get this data.

This would be a way to save time and storage.
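
As a rough illustration of a script deciding from a "minicore" whether a
full core is needed, here is a minimal sketch.  It assumes the minicore
is just the saved text output of crash (including the guest's dmesg), and
that a site keeps some list of crash signatures it has already analyzed.
The names KNOWN_SIGNATURES, signature() and need_full_core() are
hypothetical, not part of the patch or of crash.

```python
import re

# Hypothetical set of (faulting function, panic message) pairs that have
# already been analyzed; a real script would load this from wherever known
# issues are tracked.
KNOWN_SIGNATURES = {("sysrq_handle_crash", "Fatal exception")}

# Matches the summary line "RIP  [<addr>] symbol+0xoff/0xlen" in an oops.
RIP_RE = re.compile(r"^RIP\s+\[<[0-9a-f]+>\]\s+([\w.]+)\+0x", re.MULTILINE)
# Matches "Kernel panic - not syncing: <message>".
PANIC_RE = re.compile(r"^Kernel panic - not syncing: (.+)$", re.MULTILINE)


def signature(minicore_text):
    """Return (faulting function, panic message); a part is None if absent."""
    rip = RIP_RE.search(minicore_text)
    panic = PANIC_RE.search(minicore_text)
    return (
        rip.group(1) if rip else None,
        panic.group(1).strip() if panic else None,
    )


def need_full_core(minicore_text, known=KNOWN_SIGNATURES):
    """Ask for a full core unless the crash matches a known signature."""
    sig = signature(minicore_text)
    if None in sig:
        # Could not classify the crash from the minicore: be conservative.
        return True
    return sig not in known
```

Anything the script cannot classify is treated as needing a full core,
so the time and storage saving only applies to crashes that have been
seen before.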

(Info on crash at http://people.redhat.com/anderson/ )

For example:

[root@hyper-0-21-51 C63-min-tools]# xl list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0  2048     4     r-----   11563.1
C63-min-tools                                4  2080     1     ---sc-      32.3
[root@hyper-0-21-51 C63-min-tools]# /usr/lib/xen/bin/xen-crashd 4 4444&
[1] 30786
[root@hyper-0-21-51 C63-min-tools]# 30 Jan 15 22:37:53.363 socket ready on port 4444 after 1 bind call
crash localhost:4444 /home/don/C63-min-tools/kernel-debuginfo-2.6.32-279/usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/vmlinux

crash 7.0.9
Copyright (C) 2002-2014  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.

30 Jan 15 22:38:03.373 Accepted a connection.
WARNING: daemon cannot access /proc/version

GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

      KERNEL: /home/don/C63-min-tools/kernel-debuginfo-2.6.32-279/usr/lib/debug/lib/modules/2.6.32-279.el6.x86_64/vmlinux
    DUMPFILE: /dev/mem@localhost  (remote live system)
        CPUS: 1
        DATE: Fri Jan 30 22:16:13 2015
      UPTIME: 00:21:06
LOAD AVERAGE: 0.07, 0.05, 0.02
       TASKS: 96
    NODENAME: C63-min-tools.tc5.don.cloudswitch.com
     RELEASE: 2.6.32-279.el6.x86_64
     VERSION: #1 SMP Fri Jun 22 12:19:21 UTC 2012
     MACHINE: x86_64  (2000 Mhz)
      MEMORY: 2 GB
         PID: 0
     COMMAND: "swapper"
        TASK: ffffffff81a8d020  [THREAD_INFO: ffffffff81a00000]
         CPU: 0
       STATE: TASK_RUNNING

crash> dmesg | tail -100
udev: renamed network interface rename17 to eth17
udev: renamed network interface eth5 to rename7
udev: renamed network interface eth9 to rename11
udev: renamed network interface rename14 to eth14
udev: renamed network interface eth4 to rename6
udev: renamed network interface eth8 to rename10
udev: renamed network interface eth2 to rename4
udev: renamed network interface eth6 to rename8
udev: renamed network interface eth1 to rename3
udev: renamed network interface eth0 to eth18
sd 0:0:0:0: Attached scsi generic sg0 type 0
sd 0:0:1:0: Attached scsi generic sg1 type 0
sr 1:0:0:0: Attached scsi generic sg2 type 5
sr 1:0:1:0: Attached scsi generic sg3 type 5
piix4_smbus 0000:00:07.3: SMBus Host Controller at 0xb100, revision 0
shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
EXT4-fs (xvda1): mounted filesystem with ordered data mode. Opts:
SELinux: initialized (dev xvda1, type ext4), uses xattr
Adding 2064376k swap on /dev/mapper/vg_c63mintools-lv_swap.  Priority:-1 extents:1 across:2064376k SS
SELinux: initialized (dev binfmt_misc, type binfmt_misc), uses genfs_contexts
vmci module is older than RHEL 6.2 ... applying fixups
VMCI: Major device number is: 249
vsock module is older than RHEL 6.2 ... applying fixups
ip6_tables: (C) 2000-2006 Netfilter Core Team
nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
ip_tables: (C) 2000-2006 Netfilter Core Team
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
SELinux: initialized (dev rpc_pipefs, type rpc_pipefs), uses genfs_contexts
802.1Q VLAN Support v1.8 Ben Greear <greearb@xxxxxxxxxxxxxxx>
All bugs added by David S. Miller <davem@xxxxxxxxxx>
bnx2fc: Broadcom NetXtreme II FCoE Driver bnx2fc v1.0.11 (Apr 24, 2012)
eth0: no IPv6 routers present
eth1: no IPv6 routers present
SysRq : Trigger a crash
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff8132e6b6>] sysrq_handle_crash+0x16/0x20
PGD 7928a067 PUD 79b04067 PMD 0
Oops: 0002 [#1] SMP
last sysfs file: /sys/devices/vbd-51712/block/xvda/dev
CPU 0
Modules linked in: bnx2fc fcoe libfcoe libfc scsi_transport_fc 8021q
scsi_tgt garp stp llc sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4
iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
xt_state nf_conntrack ip6table_filter ip6_tables vsock(U) vmci(U) shpchp
i2c_piix4 i2c_core sg ext4 mbcache jbd2 vmw_pvscsi mptspi mptscsih
mptbase scsi_transport_spi sd_mod crc_t10dif sr_mod cdrom xen_netfront
xen_blkfront pata_acpi ata_generic ata_piix e1000 vmxnet3 dm_mirror
dm_region_hash dm_log dm_mod be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4
cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs
libiscsi scsi_transport_iscsi [last unloaded: scsi_wait_scan]

Pid: 1803, comm: bash Not tainted 2.6.32-279.el6.x86_64 #1 Xen HVM domU VMwareless
RIP: 0010:[<ffffffff8132e6b6>]  [<ffffffff8132e6b6>] sysrq_handle_crash+0x16/0x20
RSP: 0018:ffff8800370f7e18  EFLAGS: 00010096
RAX: 0000000000000010 RBX: 0000000000000063 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000063
RBP: ffff8800370f7e18 R08: 0000000000000000 R09: ffffffff8163ab80
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
R13: ffffffff81afb760 R14: 0000000000000286 R15: 0000000000000007
FS:  00007fb4234a4700(0000) GS:ffff88000c400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000037075000 CR4: 00000000000406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process bash (pid: 1803, threadinfo ffff8800370f6000, task ffff88007c62a040)
Stack:
 ffff8800370f7e68 ffffffff8132e972 ffff88007c62a040 ffff880000000000
<d> 0000000d80df4018 0000000000000002 ffff88007d14cd40 00007fb4234a2000
<d> 0000000000000002 fffffffffffffffb ffff8800370f7e98 ffffffff8132ea2e
Call Trace:
 [<ffffffff8132e972>] __handle_sysrq+0x132/0x1a0
 [<ffffffff8132ea2e>] write_sysrq_trigger+0x4e/0x50
 [<ffffffff811e0abe>] proc_reg_write+0x7e/0xc0
 [<ffffffff8117b068>] vfs_write+0xb8/0x1a0
 [<ffffffff810d69e2>] ? audit_syscall_entry+0x272/0x2a0
 [<ffffffff8117ba81>] sys_write+0x51/0x90
 [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Code: d0 88 81 63 44 fd 81 c9 c3 66 66 66 2e 0f 1f 84 00 00 00 00 00 55
48 89 e5 0f 1f 44 00 00 c7 05 1d 8d 76 00 01 00 00 00 0f ae f8 <c6> 04
25 00 00 00 00 01 c9 c3 55 48 89 e5 0f 1f 44 00 00 8d 47
RIP  [<ffffffff8132e6b6>] sysrq_handle_crash+0x16/0x20
 RSP <ffff8800370f7e18>
CR2: 0000000000000000
---[ end trace 0593342d2727c9fd ]---
Kernel panic - not syncing: Fatal exception
Pid: 1803, comm: bash Tainted: G      D    --------------- 2.6.32-279.el6.x86_64 #1
Call Trace:
 [<ffffffff814fd11a>] ? panic+0xa0/0x168
 [<ffffffff815012b4>] ? oops_end+0xe4/0x100
 [<ffffffff81043bab>] ? no_context+0xfb/0x260
 [<ffffffff81043e35>] ? __bad_area_nosemaphore+0x125/0x1e0
 [<ffffffff815032e6>] ? notifier_call_chain+0x16/0x80
 [<ffffffff81043f5e>] ? bad_area+0x4e/0x60
 [<ffffffff81044710>] ? __do_page_fault+0x3d0/0x480
 [<ffffffff8106b8e5>] ? __call_console_drivers+0x75/0x90
 [<ffffffff81097e2f>] ? up+0x2f/0x50
 [<ffffffff8106b94a>] ? _call_console_drivers+0x4a/0x80
 [<ffffffff8106bfdf>] ? release_console_sem+0x1cf/0x220
 [<ffffffff8150326e>] ? do_page_fault+0x3e/0xa0
 [<ffffffff81500625>] ? page_fault+0x25/0x30
 [<ffffffff8132e6b6>] ? sysrq_handle_crash+0x16/0x20
 [<ffffffff8132e972>] ? __handle_sysrq+0x132/0x1a0
 [<ffffffff8132ea2e>] ? write_sysrq_trigger+0x4e/0x50
 [<ffffffff811e0abe>] ? proc_reg_write+0x7e/0xc0
 [<ffffffff8117b068>] ? vfs_write+0xb8/0x1a0
 [<ffffffff810d69e2>] ? audit_syscall_entry+0x272/0x2a0
 [<ffffffff8117ba81>] ? sys_write+0x51/0x90
 [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
crash> runq
CPU 0 RUNQUEUE: ffff88000c416680
  CURRENT: PID: 1803   TASK: ffff88007c62a040  COMMAND: "bash"
  RT PRIO_ARRAY: ffff88000c416808
     [no tasks queued]
  CFS RB_ROOT: ffff88000c416718
     [no tasks queued]
crash> net -a
NEIGHBOUR        IP ADDRESS      HW TYPE    HW ADDRESS         DEVICE  STATE
ffff8800374f6d80 172.16.51.1     ETHER      00:22:99:c8:81:67  eth0    REACHABLE
ffff8800374f6e80 0.0.0.0         UNKNOWN    00 00 00 00 00 00  lo      NOARP
crash>

etc.
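
The manual session above could be scripted roughly as follows.  This is
only a sketch under several assumptions: xen-crashd is installed from the
patch at the path used above, the installed crash accepts "-s" (silent)
and "-i <file>" (read commands from a file) when talking to a remote
daemon, and a fixed one-second wait is enough for the daemon to bind its
port.  The function names and the command list are hypothetical; none of
this has been tested against xen-crashd.

```python
import subprocess
import tempfile
import time

# crash commands taken from the session above; "exit" ends the session.
CRASH_COMMANDS = ["dmesg | tail -100", "runq", "net -a", "exit"]


def crash_argv(port, vmlinux, command_file):
    """Build the crash command line for a xen-crashd on localhost:port."""
    return ["crash", "-s", "-i", command_file, f"localhost:{port}", vmlinux]


def collect_minicore(domid, port, vmlinux,
                     xen_crashd="/usr/lib/xen/bin/xen-crashd"):
    """Run crash against a crashed domain and return its text output."""
    daemon = subprocess.Popen([xen_crashd, str(domid), str(port)])
    try:
        # Assumption: a short fixed wait stands in for watching for the
        # daemon's "socket ready" message.
        time.sleep(1)
        with tempfile.NamedTemporaryFile("w", suffix=".crash") as cmds:
            cmds.write("\n".join(CRASH_COMMANDS) + "\n")
            cmds.flush()
            result = subprocess.run(
                crash_argv(port, vmlinux, cmds.name),
                capture_output=True, text=True, check=True,
            )
        return result.stdout
    finally:
        # Stop the daemon if it is still running after crash disconnects.
        if daemon.poll() is None:
            daemon.terminate()
        daemon.wait()
```

For the domain shown above this would be called as
collect_minicore(4, 4444, "<path to vmlinux>"), and the returned text
saved next to the xen-hvmctx and/or xenctx output.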

   -Don Slutz


> 
> Yes, makedumpfile may help with size rather than time here.
> I am wondering whether porting some code from makedumpfile is possible
> and acceptable.
> 
> zduan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

