What is a hung task, and how to panic a RHEL/CentOS system when a task remains hung for a specific period of time?
The Linux kernel (through its khungtaskd thread) detects a hung task by scanning for processes that have stayed in the uninterruptible sleep state (D state) for a long time. Such processes are waiting for some event or resource and are usually not going to move forward.
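To get a feel for what the kernel is looking for, the sketch below lists the processes that are currently in D state. It is a minimal illustration, not what the kernel itself runs: `filter_d_state` is a hypothetical helper name, and it assumes `ps` output where the STAT field is the second column.

```shell
# filter_d_state: keep the header line plus every row whose STAT column
# (column 2) starts with "D", i.e. uninterruptible sleep.
# Reads "ps -eo pid,stat,wchan:32,comm" style output on stdin.
filter_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Typical use on a live system:
#   ps -eo pid,stat,wchan:32,comm | filter_d_state
```

A process showing up here for a moment is normal (for example during disk I/O); only tasks that stay in D state longer than the configured timeout are reported as hung.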
In this article, we will see how to make the system panic when such a hung task is detected, so that a kernel crash dump (vmcore) is collected for later troubleshooting.
Here we assume that kernel crash dump (kdump) is already enabled on the system and that dump collection is working fine.
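Before relying on a panic to produce a dump, it may be worth confirming that a crash kernel is actually loaded. The following is a minimal sketch of such a check: `crash_kernel_loaded` is a hypothetical helper, and its optional file argument exists only so the check can be tried against a copy of the file.

```shell
# crash_kernel_loaded [FILE]
# Succeeds if a crash (kdump) kernel is loaded.
# By default reads /sys/kernel/kexec_crash_loaded (1 = loaded, 0 = not loaded).
crash_kernel_loaded() {
    state_file=${1:-/sys/kernel/kexec_crash_loaded}
    [ -r "$state_file" ] && [ "$(cat "$state_file")" = "1" ]
}

# Example:
#   if crash_kernel_loaded; then echo "kdump ready"; else echo "kdump NOT ready"; fi
```

If the check fails, a panic would most likely just reboot or hang the machine without leaving a vmcore behind, so fix kdump first.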
1. Enable Task Panic on Hung Task (Temporarily)
# echo '1' > /proc/sys/kernel/hung_task_panic
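The same change can be wrapped in a small function that validates the value and reads it back afterwards. This is only a sketch: `set_hung_task_panic` is a hypothetical helper, and the optional second argument is an assumption added so the function can be dry-run against an ordinary file without root.

```shell
# set_hung_task_panic VALUE [FILE]
# Writes 0 or 1 to the hung_task_panic control file and verifies the result.
# FILE defaults to the real /proc path, which requires root to write.
set_hung_task_panic() {
    value=$1
    ctl=${2:-/proc/sys/kernel/hung_task_panic}
    case $value in
        0|1) ;;                                   # only these two values are accepted here
        *) echo "value must be 0 or 1" >&2; return 2 ;;
    esac
    echo "$value" > "$ctl" || return 1
    [ "$(cat "$ctl")" = "$value" ]                # read back to confirm the write took effect
}

# Example (as root):
#   set_hung_task_panic 1
```

Running `sysctl -w kernel.hung_task_panic=1` has the same effect. Either way the setting is lost at the next reboot.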
2. Enable Task Panic on Hung Task (Permanently)
#### a. Open the file to add a new line
# vi /etc/sysctl.conf
#### b. Add the line below to enable panic on hung task
kernel.hung_task_panic = 1
#### c. Now load the new setting
# sysctl -p
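If you would rather script the edit than open an editor, the steps above can be sketched as a function that updates the line when it already exists and appends it otherwise, so running it twice does not create duplicates. `persist_sysctl` is a hypothetical helper, not a standard tool, and it assumes GNU sed (the one shipped with RHEL/CentOS) for the `-i` option.

```shell
# persist_sysctl KEY VALUE FILE
# Replaces an existing "KEY = ..." line in FILE, or appends one if absent.
persist_sysctl() {
    key=$1 value=$2 conf=$3
    touch "$conf" || return 1
    # Escape dots so they match literally in the regular expressions below.
    key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -Eq "^[[:space:]]*${key_re}[[:space:]]*=" "$conf"; then
        sed -i -E "s|^[[:space:]]*${key_re}[[:space:]]*=.*|${key} = ${value}|" "$conf"
    else
        printf '%s = %s\n' "$key" "$value" >> "$conf"
    fi
}

# Example (as root):
#   persist_sysctl kernel.hung_task_panic 1 /etc/sysctl.conf && sysctl -p
```

Commented-out lines (starting with `#`) are left untouched, so a disabled entry in the file will not be mistaken for an active one.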
3. Let's Check All Four Hung Task Parameters
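There are two ways to check them.
The first, grepping the files under /proc/sys/kernel, is preferred because it returns output even when some parameters are not set. With -H, grep prefixes each value with its file name, and -v "zz" matches every line that does not contain "zz", so in practice every value is printed.
The sysctl command, by contrast, may show no output at all if no parameter is set.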
# grep -Hv "zz" /proc/sys/kernel/hung*
/proc/sys/kernel/hung_task_check_count:4194304
/proc/sys/kernel/hung_task_panic:1
/proc/sys/kernel/hung_task_timeout_secs:240
/proc/sys/kernel/hung_task_warnings:15
# sysctl -q kernel | grep hung | sort
kernel.hung_task_check_count = 32768
kernel.hung_task_panic = 1
kernel.hung_task_timeout_secs = 240
kernel.hung_task_warnings = 15
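As a third option, the /proc files can be read in a small loop that prints them in the same `key = value` form that sysctl uses. This is a sketch only: `show_hung_task_params` is a hypothetical helper, and the optional directory argument is an assumption added so it can be pointed at a test directory.

```shell
# show_hung_task_params [DIR]
# Prints every readable hung_task_* file under DIR as "kernel.<name> = <value>".
# DIR defaults to /proc/sys/kernel.
show_hung_task_params() {
    dir=${1:-/proc/sys/kernel}
    for f in "$dir"/hung_task_*; do
        [ -r "$f" ] || continue      # also skips the unexpanded glob when nothing matches
        printf 'kernel.%s = %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

# Example:
#   show_hung_task_params
```

Depending on the kernel version, more than four `hung_task_*` files may exist; the loop simply prints whatever it finds.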
4. Understanding the Hung Task Parameters
Parameter | Usage |
---|---|
kernel.hung_task_check_count = 32768 | Upper bound on the number of tasks checked for hangs (32768 here) |
kernel.hung_task_panic = 1 | Tells the system to panic when a task has been blocked for more than hung_task_timeout_secs (0 means continue running and only warn) |
kernel.hung_task_timeout_secs = 240 | A task is considered hung when it stays in D state without being scheduled for more than this many seconds (240 here), and a warning is issued. A value of 0 disables the check. |
kernel.hung_task_warnings = 15 | Maximum number of hung task warnings to report. Each detected hung task decreases the value by 1; when it reaches 0, no more warnings are reported. A value of -1 means unlimited warnings. |