Kexec/Kdump setup in CentOS


kdump is the Linux kernel’s built-in crash dump mechanism. In the event of a kernel crash, kdump creates a memory image (also known as vmcore) that can be analyzed to debug the crash and determine its cause. The dumped image of main memory, exported as an Executable and Linkable Format (ELF) object, can be accessed directly while the crash is being handled, or it can be saved automatically to a local file system, to a raw device, or to a remote system reachable over the network.

Here we are going to see how to install and configure kdump on CentOS 6.

Step 1: Install the required packages

[root@client ~]# yum install kexec-tools crash kernel-debug kernel-debuginfo-`uname -r`

Step 2: Kdump configuration

To configure the amount of memory reserved for the kdump kernel, edit the /boot/grub/grub.conf file (on CentOS 6, /etc/grub.conf is a symbolic link to it) and add either crashkernel=<size>M or crashkernel=auto to the kernel line.

Note that the crashkernel=auto option only reserves memory if the system has 2 GB or more of physical memory on 32-bit and 64-bit x86 architectures.

[root@client ~]# vi /etc/grub.conf

— multiple lines are removed —

title CentOS (2.6.32-358.el6.x86_64.debug)
root (hd0,0)
kernel /vmlinuz-2.6.32-358.el6.x86_64.debug ro root=/dev/mapper/VolGroup-lv_root rd_NO_LUKS LANG=en_US.UTF-8 rd_NO_MD rd_LVM_LV=VolGroup/lv_swap SYSFONT=latarcyrheb-sun16 rd_LVM_LV=VolGroup/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet crashkernel=128M
initrd /initramfs-2.6.32-358.el6.x86_64.debug.img
title CentOS (2.6.32-358.el6.x86_64)
root (hd0,0)
kernel /vmlinuz-2.6.32-358.el6.x86_64 ro root=/dev/mapper/VolGroup-lv_root rd_NO_LUKS LANG=en_US.UTF-8 rd_NO_MD rd_LVM_LV=VolGroup/lv_swap SYSFONT=latarcyrheb-sun16 rd_LVM_LV=VolGroup/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet crashkernel=128M
initrd /initramfs-2.6.32-358.el6.x86_64.img
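
To confirm which reservation a kernel line actually carries, you can pull the crashkernel= parameters out of it. The sketch below is only an illustration: crashkernel_args and crashkernel_count are hypothetical helper names, not part of kexec-tools, and they operate on whatever command-line string is passed in.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with kexec-tools).

# Print every crashkernel= parameter found in a kernel command line.
# $1: a kernel command line, e.g. the contents of /proc/cmdline
crashkernel_args() {
    printf '%s\n' "$1" | tr ' ' '\n' | grep '^crashkernel='
}

# Count the crashkernel= parameters in a kernel command line.
crashkernel_count() {
    crashkernel_args "$1" | wc -l | tr -d ' '
}
```

After a reboot, passing the contents of /proc/cmdline to crashkernel_count should print 1. A result of 0 means no reservation was requested, and a result above 1 means the line carries competing values and should be cleaned up.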

Step 3: Kdump target configuration

When a kernel crash is captured, the core dump can be stored as a file in a local file system, written directly to a device, or sent over the network using the NFS (Network File System) or SSH (Secure Shell) protocol. Only one of these targets can be set at a time, and the default is to store the vmcore file in the /var/crash/ directory of the local file system.

[root@client ~]# vi /etc/kdump.conf

Uncomment the following two lines:

path /var/crash
core_collector makedumpfile -c --message-level 1 -d 31
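
In the core_collector line, -c asks makedumpfile to compress the dump data, --message-level 1 limits its output to the progress indicator, and -d 31 sets the dump level. As described in the makedumpfile man page, the dump level is a bit mask of page types to leave out of the vmcore. The following sketch decodes such a value; decode_dump_level is a hypothetical helper written for this article, and the page-type labels are paraphrased from the man page.

```shell
#!/bin/sh
# Hypothetical helper: list the page types that a makedumpfile
# dump level (the argument to -d) excludes from the vmcore.
decode_dump_level() {
    level=$1
    for entry in "1:zero pages" "2:cache pages" "4:cache private pages" \
                 "8:user data pages" "16:free pages"; do
        bit=${entry%%:*}     # numeric bit value before the colon
        name=${entry#*:}     # page type label after the colon
        if [ $((level & bit)) -ne 0 ]; then
            echo "$name"
        fi
    done
}
```

Calling decode_dump_level 31 lists all five page types, since 31 = 1 + 2 + 4 + 8 + 16. That is why a dump taken with -d 31 is much smaller than physical memory and why crash later labels it a partial dump.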

Step 4: Start the kdump daemon

[root@client ~]# /etc/init.d/kdump start
[root@client ~]# chkconfig --level 35 kdump on

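Reboot the server to make these changes take effect. The crashkernel memory is only reserved at boot time, so the kdump service cannot load the capture kernel until the system has been restarted with the new kernel line.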

[root@client ~]# init 6

After the reboot, check the kdump status:

[root@client ~]# /etc/init.d/kdump status
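
Besides the init script, the kernel itself reports whether a capture kernel is loaded: /sys/kernel/kexec_crash_loaded reads 1 when one is in place. The sketch below wraps that check. The function names are hypothetical, and the optional file argument exists only so the check can be tried against a copy of the file.

```shell
#!/bin/sh
# Hypothetical helpers built around /sys/kernel/kexec_crash_loaded.

# Succeed if the flag file exists and contains 1.
# $1 (optional): path to read instead of the sysfs file
capture_kernel_loaded() {
    flag_file=${1:-/sys/kernel/kexec_crash_loaded}
    [ -r "$flag_file" ] && [ "$(cat "$flag_file")" = "1" ]
}

# Print a one-line, human-readable summary of the state.
report_kdump_state() {
    if capture_kernel_loaded "$@"; then
        echo "capture kernel loaded"
    else
        echo "capture kernel NOT loaded"
    fi
}
```

Running report_kdump_state with no argument before triggering a test crash is a cheap safeguard: if the capture kernel is not loaded, the crash produces a reboot but no vmcore.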

Step 5: Test kdump

Do not run the commands below on a production system. They crash the running kernel and reboot the server.

[root@client ~]# echo 1 > /proc/sys/kernel/sysrq
[root@client ~]# echo c > /proc/sysrq-trigger

The capture kernel, which kexec preloaded into the reserved memory, now boots, gathers the crash data, and stores the dump under the /var/crash directory. After that, the machine reboots into the default kernel.
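
Each crash is saved in its own subdirectory of /var/crash, so after several tests it helps to pick out the newest dump. The sketch below does this by modification time. latest_vmcore is a hypothetical helper, and it assumes the one-subdirectory-per-crash layout with a file named vmcore inside, as seen in the path used in Step 6.

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified vmcore
# found one level below a crash directory.
# $1 (optional): crash directory, defaults to /var/crash
latest_vmcore() {
    crash_dir=${1:-/var/crash}
    newest=""
    for core in "$crash_dir"/*/vmcore; do
        # Skip the unexpanded pattern when nothing matches.
        [ -f "$core" ] || continue
        if [ -z "$newest" ] || [ "$core" -nt "$newest" ]; then
            newest=$core
        fi
    done
    # Print nothing and return non-zero if no vmcore exists.
    [ -n "$newest" ] && printf '%s\n' "$newest"
}
```

If latest_vmcore prints nothing after a test crash, the dump was not written, and the kdump service status and the crashkernel reservation are the first things to recheck.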

Step 6: Analyzing the core dump

The crash command is used to analyze the vmcore dump file. Run the command below to investigate the cause of the server reboot.

[root@client ~]# crash /usr/lib/debug/lib/modules/2.6.32-358.el6.x86_64/vmlinux /var/crash/127.0.0.1-2015-05-28-07\:52\:52/vmcore

crash 6.1.0-1.el6
Copyright (C) 2002-2012 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb (GDB) 7.3.1
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

KERNEL: /usr/lib/debug/lib/modules/2.6.32-358.el6.x86_64/vmlinux
DUMPFILE: /var/crash/127.0.0.1-2015-05-28-07:52:52/vmcore [PARTIAL DUMP]
CPUS: 1
DATE: Thu May 28 07:52:47 2015
UPTIME: 01:07:33
LOAD AVERAGE: 0.00, 0.15, 0.12
TASKS: 73
NODENAME: client
RELEASE: 2.6.32-358.el6.x86_64
VERSION: #1 SMP Fri Feb 22 00:31:26 UTC 2013
MACHINE: x86_64 (2871 Mhz)
MEMORY: 2 GB
PANIC: "Oops: 0002 [#1] SMP " (check log for details)
PID: 5597
COMMAND: "bash"
TASK: ffff880037fe6ae0 [THREAD_INFO: ffff88007d7cc000]
CPU: 0
STATE: TASK_RUNNING (PANIC)

crash>

To see more details about the system crash, run the following command at the crash prompt:

crash> log

Type "help log" at the crash prompt for more information on the command usage.
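
For a first look at a dump, the same few commands are usually run every time, so they can be kept in a file and passed to crash with its -i option, which executes the commands in the file before accepting user input. The sketch below writes such a file. write_crash_commands is a hypothetical helper, and the selection of commands (sys, bt, ps, log) is just one reasonable starting set.

```shell
#!/bin/sh
# Hypothetical helper: write a crash command file that prints a
# basic report and then leaves the crash session.
# $1: path of the command file to create
write_crash_commands() {
    out_file=$1
    cat > "$out_file" <<'EOF'
sys
bt
ps
log
exit
EOF
}

# Example use (paths are placeholders for your own vmlinux and vmcore):
#   write_crash_commands /tmp/crash.cmds
#   crash -i /tmp/crash.cmds <vmlinux> <vmcore> > /tmp/crash-report.txt
```

Here sys repeats the summary shown at startup, bt prints the backtrace of the panicking task, ps lists the processes at the time of the crash, and log dumps the kernel message buffer. The final exit ends the session so the output can be redirected to a file.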