kdump FAQ

1. kdump Service Startup Failure

Symptom

The systemctl status kdump command reports the service status as "failed."

Possible Causes and Solutions

  1. The crashkernel parameter fails to reserve memory.

    The error log from systemctl status kdump includes:

    bash
    Aug 10 15:26:20 localhost.localdomain kdumpctl[772]: No memory reserved for crash kernel
    Aug 10 15:26:20 localhost.localdomain kdumpctl[772]: Starting kdump: [FAILED]

    The crashkernel parameter typically reserves memory in low memory (below 4 GB). Under heavy memory usage, this reservation may fail, preventing kdump from starting.

    Solution: Modify the boot parameter to crashkernel=size,high, enabling memory reservation from high memory.
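
    The current reservation can be checked before changing the boot parameter. The following is a minimal sketch, not an official procedure: the helper names are hypothetical, 512M is a placeholder size rather than a recommended value, and the commented command assumes the grubby tool is available for editing boot entries.

    ```shell
    # Print any crashkernel= settings found in a kernel command line.
    # With no argument, the running kernel's command line is read.
    crashkernel_args() {
        cmdline="${1:-$(cat /proc/cmdline)}"
        printf '%s\n' "$cmdline" | tr ' ' '\n' | grep '^crashkernel='
    }

    # Print the reserved size in bytes; 0 means nothing was reserved.
    crash_reserved_bytes() {
        cat "${1:-/sys/kernel/kexec_crash_size}"
    }

    # Example (run as root, then reboot); 512M is only a placeholder:
    #   grubby --update-kernel=ALL --args="crashkernel=512M,high"
    ```

    After the reboot, a non-zero value from crash_reserved_bytes suggests the reservation succeeded.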

  2. Kernel configuration mismatch prevents dracut from creating kdump.img.

    systemctl status kdump shows errors similar to the following:

    bash
    Aug 10 16:25:52 localhost.localdomain kdumpctl[3972]: dracut-install: ERROR: installing 'loop'
    Aug 10 16:25:52 localhost.localdomain kdumpctl[2225]: dracut: FAILED:  /usr/lib/dracut/dracut-install -D /var/tmp/dracut.a9swIC/initramfs -N mdio_gpi|usb_8d>
    Aug 10 16:25:52 localhost.localdomain dracut[2271]: FAILED:  /usr/lib/dracut/dracut-install -D /var/tmp/dracut.a9swIC/initramfs -N mdio_gpi|usb_8dev|et1011c>
    Aug 10 16:25:52 localhost.localdomain kdumpctl[2225]: dracut: installkernel failed in module squash
    Aug 10 16:25:52 localhost.localdomain dracut[2271]: installkernel failed in module squash
    Aug 10 16:25:53 localhost.localdomain kdumpctl[1541]: mkdumprd: failed to make kdump initrd
    Aug 10 16:25:53 localhost.localdomain kdumpctl[1541]: Starting kdump: [FAILED]

    This error occurs because dracut requires squashfs.ko, loop.ko, and overlay.ko. If any of these modules is missing, dracut fails.

    This issue is unlikely in official openEuler LTS versions, as they include these .ko files. If you compiled the kernel, verify these configuration options:

    bash
    CONFIG_SQUASHFS=m
    CONFIG_BLK_DEV_LOOP=m
    CONFIG_OVERLAY_FS=m

    These options must be set to m, not y, so that the .ko files are built.
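
    A quick way to verify the three options is to scan the kernel configuration file. This is a sketch under assumptions: the function name is hypothetical, and the default path /boot/config-$(uname -r) is a common location that may differ for a self-compiled kernel (pass the build tree's .config instead).

    ```shell
    # Report whether each option needed for the kdump initrd is built as a module.
    # Returns 1 if any option is not set to m.
    check_kdump_modules() {
        config="${1:-/boot/config-$(uname -r)}"
        rc=0
        for opt in CONFIG_SQUASHFS CONFIG_BLK_DEV_LOOP CONFIG_OVERLAY_FS; do
            if grep -q "^${opt}=m\$" "$config"; then
                echo "$opt: built as a module"
            else
                echo "$opt: not built as a module"
                rc=1
            fi
        done
        return $rc
    }
    ```

    For example, check_kdump_modules /path/to/kernel/.config checks a build tree before the kernel is installed.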

  3. KASLR is enabled and /proc/sys/kernel/kptr_restrict is set to 2.

    systemctl status kdump returns these errors:

    bash
    Aug 10 14:55:04 localhost.localdomain kdumpctl[637422]: Can't find kernel text map area from kcore
    Aug 10 14:55:04 localhost.localdomain kdumpctl[637422]: Cannot load /boot/vmlinuz-4.18.0-147.5.2.1.h579.hugetlb.eulerosv2r10.x86_64+
    Aug 10 14:55:04 localhost.localdomain kdumpctl[637001]: kexec: failed to load kdump kernel
    Aug 10 14:55:04 localhost.localdomain kdumpctl[637001]: Starting kdump: [FAILED]

    This typically occurs on x86 systems, as Kernel Address Space Layout Randomization (KASLR) is not yet enabled on AArch64.

    With KASLR enabled, kdump cannot retrieve kernel layout information from /proc/kcore. Additionally, if /proc/sys/kernel/kptr_restrict is set to 2, information in /proc/kallsyms is hidden. These combined factors prevent kdump from starting.

    Solution: Set /proc/sys/kernel/kptr_restrict to 1, which allows only the root user to view /proc/kallsyms. Then, start kdump as root.
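
    The check and the change can be sketched as follows. The function names are hypothetical, and the commands must be run as root.

    ```shell
    # Succeeds when kptr_restrict is 2, the value that hides /proc/kallsyms.
    kptr_restrict_is_2() {
        [ "$(cat "${1:-/proc/sys/kernel/kptr_restrict}")" = "2" ]
    }

    # Lower the setting to 1 if needed, then restart kdump; run as root.
    relax_kptr_restrict() {
        if kptr_restrict_is_2; then
            sysctl -w kernel.kptr_restrict=1
        fi
        systemctl restart kdump
    }
    ```

    A value written with sysctl -w does not survive a reboot, so the setting also needs to be made persistent if the system's defaults put it back to 2.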

2. kdump Service Active, But vmcore Generation Fails

Symptom

systemctl status kdump indicates the service is active, but no vmcore file is generated after a system crash and reboot.

Possible Causes and Solutions

  1. Insufficient memory is reserved for crashkernel, resulting in out-of-memory (OOM).

    The crash kernel requires sufficient memory to launch. An OOM error likely occurs because a kernel object consumes excessive memory. Official openEuler versions generally avoid this issue, but self-compiled kernels require careful attention. Check serial port logs to confirm if an OOM error occurred.

    Solution: Increase the value of the crashkernel boot parameter. If memory reservation fails after the increase, use crashkernel=size,high to reserve memory.
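
    If the serial port output was captured to a file, the OOM check can be sketched like this. The function name is hypothetical, and the search patterns are an assumption based on common kernel OOM messages; the exact wording can vary between kernel versions.

    ```shell
    # Print lines (with line numbers) that suggest the crash kernel ran out of memory.
    find_oom_lines() {
        grep -n -i -E 'out of memory|oom-killer' "$1"
    }
    ```

    No output means none of these patterns appeared in the captured log, which does not by itself rule out a memory shortage.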

  2. SECTION_SIZE_BITS is incompatible.

    The kdump service invokes the makedumpfile tool to write the vmcore. The SECTION_SIZE_BITS value that makedumpfile uses must match the kernel's. The kernel defines SECTION_SIZE_BITS in arch/arm64/include/asm/sparsemem.h. Official openEuler AArch64 versions define it as 27, and the value in kdump is modified to 27 to match. The community source code, however, sets SECTION_SIZE_BITS to 30, which is incompatible with kdump and causes makedumpfile to fail to generate the vmcore.

    Solution: Set SECTION_SIZE_BITS to 27 in arch/arm64/include/asm/sparsemem.h in the kernel source code, then rebuild the kernel.
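
    The value in a kernel source tree can be read before building. This is a minimal sketch with a hypothetical function name; note that the kernel header spells the macro SECTION_SIZE_BITS, and some kernel versions define it more than once under different configuration conditions, in which case every value is printed.

    ```shell
    # Print every value assigned to SECTION_SIZE_BITS in the given header.
    section_size_bits() {
        awk '$1 == "#define" && $2 == "SECTION_SIZE_BITS" { print $3 }' "$1"
    }
    ```

    Usage, from the top of the kernel source tree: section_size_bits arch/arm64/include/asm/sparsemem.h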

  3. Out-of-band hardware watchdog resets, interrupting the vmcore dump.

    kdump vmcore dumps can be time-consuming, depending on system memory usage and drive write speeds. An out-of-band hardware watchdog might interrupt the vmcore dump process.

    Solution: Disable the out-of-band hardware watchdog or reset its timeout value in kdump.

  4. Drive reporting is abnormal.

    If the drive is not reported correctly, the vmcore cannot be saved. Check serial port logs to confirm whether this occurred.

3. crash Tool Fails to Parse the Generated vmcore

Symptom

Parsing the generated vmcore with crash vmcore vmlinux fails with an error.

Possible Causes and Solutions

  1. vmcore and vmlinux versions do not match.

    crash requires a vmlinux file compiled from the kernel source code to parse a vmcore. The vmlinux version must match the kernel version of the system that dumped the vmcore.

    Solution: Use a vmlinux file with the same version as the vmcore.
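
    One way to compare the two versions before starting a full crash session is sketched below. The function names are hypothetical. The sketch assumes the vmlinux contains the usual "Linux version ..." banner string and that the installed crash supports the --osrelease option for printing the release recorded in a dump file.

    ```shell
    # Print the kernel release embedded in a vmlinux, taken from its
    # "Linux version <release> ..." banner string.
    vmlinux_release() {
        grep -a -o 'Linux version [^ ]*' "$1" | head -n 1 | cut -d' ' -f3
    }

    # Compare the release recorded in the vmcore with the one in vmlinux.
    # Arguments: <vmcore> <vmlinux>
    check_vmcore_match() {
        core_rel=$(crash --osrelease "$1")
        vml_rel=$(vmlinux_release "$2")
        if [ "$core_rel" = "$vml_rel" ]; then
            echo "match: $core_rel"
        else
            echo "mismatch: vmcore=$core_rel vmlinux=$vml_rel"
            return 1
        fi
    }
    ```

    Matching release strings are a necessary check rather than a guarantee, since two different builds can carry the same release string.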

  2. The strings command is missing.

    crash relies on the strings command for vmcore parsing. Its absence causes parsing failures.

    Solution: The binutils package provides the strings command. Install binutils or manually copy the strings command and its dependencies.
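
    The presence of strings can be checked as follows. The function names are hypothetical, and the suggested dnf command assumes dnf is the package manager on the system.

    ```shell
    # Succeeds if the named command is found in PATH.
    have_cmd() {
        command -v "$1" >/dev/null 2>&1
    }

    # Report whether strings is available and suggest a fix if it is not.
    check_strings() {
        if have_cmd strings; then
            echo "strings found: $(command -v strings)"
        else
            echo "strings not found; install binutils, for example: dnf install binutils"
            return 1
        fi
    }
    ```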

  3. vmcore is corrupted.

    Check kdump_status.log to determine if the kdump vmcore dump process completed successfully.

    Solution: Trigger a system panic again to generate a new vmcore.