
Analyzing a Linux Server Kernel Crash

moxiaoran5753 2024-06-17 10:14:33

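An Alibaba Cloud server could no longer be reached over SSH, and the website it hosted was also misbehaving. After I logged in to the Alibaba Cloud console, the system reported that a kernel panic, an OOM exception, an internal outage, or performance jitter had occurred. Alibaba Cloud support said the kdump service had to be installed and enabled, so I started learning about kdump.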

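The kdump concept:
When the system crashes, kdump uses kexec to boot into a second kernel. This second kernel, usually called the capture kernel, boots with a very small amount of memory and captures the dump image. The first kernel reserves part of its memory for the second kernel to boot from. Because kdump boots the capture kernel through kexec and bypasses the BIOS, the memory of the first kernel is preserved. This is the essence of a kernel crash dump.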

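Conditions for kdump to work:
1. The kdump service is enabled on the system
2. The boot configuration reserves a reasonable amount of memory for the crash kernel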

CentOS 7: check the kdump status on the system:
systemctl status kdump.service

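CentOS 7 normally ships with kdump (the kexec-tools package) already installed. If it is missing, install it:

yum install kexec-tools

To analyze a dump later, the crash utility and the kernel-debuginfo package matching the kernel are needed as well (installing kernel-debuginfo may require the debuginfo repository to be enabled):

yum install kernel-debuginfo kexec-tools crash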

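Set the amount of memory reserved by crashkernel by editing the /etc/default/grub file.

Find the GRUB_CMDLINE_LINUX entry and change the crashkernel value. The default is auto, but the value should be chosen according to the server's memory size. With crashkernel=auto, if system memory is <= 8 GB, nothing is reserved for the kdump kernel (equivalent to disabling kdump); if system memory is > 8 GB but <= 16 GB, 256M is reserved, equivalent to crashkernel=256M; if system memory is > 16 GB, 512M is reserved, equivalent to crashkernel=512M. The exact thresholds used by auto differ between distribution releases, so check them against the documentation for the release in use.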
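The thresholds described above can be written down as a small shell function. This is only a sketch of the rule as stated in this article, not of the kernel's actual logic; the function name crashkernel_for_mem_gb is hypothetical, and the input is assumed to be total memory in whole GB.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map total RAM (whole GB) to the reservation that
# crashkernel=auto is described as choosing in this article.
# Assumption: the thresholds are the ones quoted in the text (8 GB, 16 GB).
crashkernel_for_mem_gb() {
    local mem_gb=$1
    if [ "$mem_gb" -le 8 ]; then
        echo "none"      # nothing reserved, kdump effectively disabled
    elif [ "$mem_gb" -le 16 ]; then
        echo "256M"
    else
        echo "512M"
    fi
}
```

For example, crashkernel_for_mem_gb 12 prints 256M. When the result is none, an explicit crashkernel=<size> value has to be set by hand instead of auto.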

3. Regenerate the GRUB configuration file and reboot the system for the change to take effect:

grub2-mkconfig -o /boot/grub2/grub.cfg
reboot
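After the reboot, it can be worth confirming that the new parameter actually reached the running kernel. The sketch below extracts the crashkernel value from a kernel command line string; the function name crashkernel_from_cmdline is hypothetical, and on a live system the string would typically come from /proc/cmdline.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the crashkernel= value found in a kernel
# command line string, or return non-zero if the parameter is absent.
crashkernel_from_cmdline() {
    local cmdline=$1 word
    # Unquoted on purpose: split the command line into words on whitespace.
    for word in $cmdline; do
        case $word in
            crashkernel=*)
                echo "${word#crashkernel=}"
                return 0
                ;;
        esac
    done
    return 1
}
```

A possible use is crashkernel_from_cmdline "$(cat /proc/cmdline)", which prints the value in effect or fails if no memory was reserved through the command line.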

4. Start the kdump service:

systemctl start kdump.service    # start kdump
systemctl enable kdump.service   # start kdump at boot

5. Run systemctl status kdump.service to check whether the kdump service is running, or query the state directly:

systemctl is-active kdump.service

If the command prints active, kdump started successfully.
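As an additional check, the kernel exposes a flag in sysfs that shows whether a capture kernel is currently loaded. The sketch below reads that flag; the function name capture_kernel_loaded is hypothetical, and the path argument exists only so the check can be pointed at another file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a capture kernel is loaded, based on a
# sysfs-style flag file whose content is 1 (loaded) or 0 (not loaded).
# The path defaults to /sys/kernel/kexec_crash_loaded and can be overridden.
capture_kernel_loaded() {
    local flag_file=${1:-/sys/kernel/kexec_crash_loaded}
    [ -r "$flag_file" ] && [ "$(cat "$flag_file")" = "1" ]
}
```

It can be used as capture_kernel_loaded && echo "capture kernel loaded". If the service is reported as active but this check fails, the crashkernel reservation is a reasonable first thing to look at.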

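6. Manually trigger a crash dump. This crashes the running system immediately, so do it only on a machine where an abrupt reboot is acceptable: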

echo 1 >/proc/sys/kernel/sysrq; echo c > /proc/sysrq-trigger

If everything works, the system reboots automatically, and after the reboot a dump file (vmcore) can be found in a subdirectory under /var/crash/.
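When several dumps have accumulated, the newest one is usually the interesting one. The following sketch prints the most recently modified vmcore under a crash directory; the function name latest_vmcore is hypothetical, and it assumes GNU find (for -printf) and dump files named vmcore.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the most recently modified vmcore file under
# a crash directory (default /var/crash). Prints nothing if none is found.
# Assumption: GNU find is available, since -printf is a GNU extension.
latest_vmcore() {
    local crash_dir=${1:-/var/crash}
    # Emit "<mtime> <path>" per file, sort by mtime, keep the newest path.
    find "$crash_dir" -type f -name vmcore -printf '%T@ %p\n' 2>/dev/null \
        | sort -n | tail -n 1 | cut -d' ' -f2-
}
```

Running latest_vmcore with no argument searches /var/crash; a different directory can be passed as the first argument.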

Open the dump with crash to analyze it:

# crash vmcore /usr/lib/debug/lib/modules/3.10.0-957.1.3.el7.x86_64/vmlinux
 
 
crash 7.2.3-8.el7
Copyright (C) 2002-2017  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...
 
 
WARNING: kernel relocated [126MB]: patching 85619 gdb minimal_symbol values
 
 
      KERNEL: /usr/lib/debug/lib/modules/3.10.0-957.1.3.el7.x86_64/vmlinux
    DUMPFILE: vmcore  [PARTIAL DUMP]
        CPUS: 4
        DATE: Fri Jun 18 05:32:32 2021
      UPTIME: 00:47:57
LOAD AVERAGE: 0.00, 0.01, 0.05
       TASKS: 413
    NODENAME: localhost.localdomain
     RELEASE: 3.10.0-957.1.3.el7.x86_64
     VERSION: #1 SMP Thu Nov 29 14:49:43 UTC 2018
     MACHINE: x86_64  (3799 Mhz)
      MEMORY: 2 GB
       PANIC: "SysRq : Trigger a crash"
         PID: 12653
     COMMAND: "bash"
        TASK: ffffa1071aca8000  [THREAD_INFO: ffffa1074b32c000]
         CPU: 3
       STATE: TASK_RUNNING (SYSRQ)
 
 
crash>
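The vmlinux passed to crash has to match the kernel that produced the dump (the RELEASE field in the output above), which is not necessarily the kernel running now. The sketch below builds the command line from a vmcore path and a kernel release, following the debuginfo path layout used in the session above; the function name crash_command_for is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the crash command line for a given vmcore,
# using the /usr/lib/debug/lib/modules/<release>/vmlinux layout shown above.
# The release defaults to the running kernel (uname -r), which is only
# correct if the kernel has not been updated since the crash.
crash_command_for() {
    local vmcore=$1
    local release=${2:-$(uname -r)}
    echo "crash $vmcore /usr/lib/debug/lib/modules/$release/vmlinux"
}
```

For instance, crash_command_for vmcore 3.10.0-957.1.3.el7.x86_64 prints the same command used in the session above. Once at the crash> prompt, commands such as bt (backtrace of the panicking task) and log (kernel message buffer) are common starting points.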
