This article describes how to configure a device and collect information when it is unexpectedly rebooting, crashing, or hanging.
The console kernel dump is crucial for understanding the event and tracking it down to its root cause.
Table of Contents
- Scope
- Solution
- Step 1 – Setting up the console connection.
- Step 2 – Configure the terminal application to save those logs on the disk and avoid losing them.
- Step 3 – Prevent the console access from timing out.
- Step 4 – Enable additional debug logs.
- Step 5 – Monitor and wait for the problem to happen again.
- Step 6 – BootLoop.
Scope
FortiGate.
Solution
A System Crash refers to a condition where the device unexpectedly stops responding. In this case, administration access may be lost and the unit may become inaccessible and unresponsive.
FortiOS will reboot automatically when a kernel panic happens, but will remain hung when a kernel oops occurs.
A System Hang refers to a condition where the device unexpectedly stops responding but doesn’t restart automatically and will remain in a ‘frozen’ state.
A BootLoop is a condition where the system is unable to boot up, either because of a problem with the internal filesystem or because of a hardware issue. In this case, follow Step 6 from this guide.
In these cases, the FortiOS kernel may lose access to its peripherals, such as disk I/O, so logs cannot be saved. However, like any other Linux system, it will ‘dump’ whatever information was loaded into memory, along with whatever the CPU was processing last, to COM_0 (console zero).
With SSH being a remote connection, the system may also lose access to its NIC and network interfaces, so it may not have the ability to dump information over remote console connections. Consequently, a console capture is required.
The steps below ensure that this information is captured and not lost.
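The distinction between a kernel panic and a kernel oops can be checked in a saved console capture. The following is a minimal sketch, not a Fortinet tool: it assumes the capture contains the marker strings that Linux kernels commonly print (`Kernel panic` and `Oops`), and FortiOS output may differ. The names `MARKERS`, `classify_capture`, and `marker_lines` are hypothetical.

```python
import re

# Marker strings commonly printed by Linux kernels. Assumption: the FortiOS
# console dump uses similar wording; extend the patterns to match the capture.
MARKERS = {
    "panic": re.compile(r"kernel panic", re.IGNORECASE),
    "oops": re.compile(r"\bOops\b"),
}


def classify_capture(text):
    """Return the sorted names of the kernel event markers found in the text."""
    return sorted(name for name, pattern in MARKERS.items() if pattern.search(text))


def marker_lines(text):
    """Return (line_number, line) pairs for every line that contains a marker."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in MARKERS.values()):
            hits.append((number, line.strip()))
    return hits
```

An empty result from `classify_capture` does not rule out a crash; it only means none of the assumed markers appeared in the capture.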
Step 1 – Setting up the console connection.
An easy way to proceed is to connect a laptop directly to one or more console ports. A single laptop can use multiple console cables to capture output from an HA cluster if necessary, for example by opening two terminals, one for each firewall.
Make sure the laptop will not hibernate, sleep, or close the terminal connection; otherwise, the logs may not be saved.
The KB below contains more information about physical console settings when using PuTTY.
How to connect to the FortiGate console port
Virtual devices also have a COM_0 interface, but the way to access it varies between virtualization platforms.
Technical Tip: How to get console logs from AWS console contains more information about collecting the console output on AWS, for example.
Important: Make sure console access has not been disabled. Console login is enabled by default; if it has been disabled, re-enable it:
config system console
    set login enable
end
Step 2 – Configure the terminal application to save those logs on the disk and avoid losing them.
Technical Tip: How to create a log file of a session using PuTTY contains more information on how to ensure all logs will be saved.
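Where PuTTY is not available, the same idea (write every console line to disk as soon as it arrives) can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not a replacement for the terminal application: the function name `capture` is hypothetical, and it assumes the console is already exposed as a readable binary stream with the serial parameters configured elsewhere.

```python
import os
from datetime import datetime


def capture(stream, log_path):
    """Append every line read from `stream` to `log_path` with a local timestamp.

    `stream` is any binary file-like object. Each line is flushed and synced
    to disk immediately, so a later failure on the capturing machine loses as
    little as possible. Returns the number of lines written.
    """
    count = 0
    with open(log_path, "ab") as log:
        # readline() returns b"" at end of stream, which stops the loop.
        for raw in iter(stream.readline, b""):
            stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S ").encode("ascii")
            log.write(stamp + raw.rstrip(b"\r\n") + b"\n")
            log.flush()
            os.fsync(log.fileno())
            count += 1
    return count
```

The file is opened in append mode so that restarting the capture never overwrites an earlier dump. Syncing after every line trades throughput for durability, which is acceptable at console speeds.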
Step 3 – Prevent the console access from timing out.
config system global
    set admin-console-timeout 0
end
The kernel dump is expected to be written to COM_0 even if the admin user has been logged out. Still, an interactive command can be left running to keep the console session from logging out.
For example, the command below produces no output. Its only goal is to prevent the console from timing out.
diag sniffer packet any "host 127.0.0.10" 4
Step 4 – Enable additional debug logs.
Additional commands are not strictly required because, upon a kernel crash, the information is dumped to the console regardless. However, it may be desirable to enable additional logging to gain more visibility.
It is important to run the ‘diag debug reset’ command to ensure no applications are left in debug mode, so that unrelated debug output does not obscure the kernel output if and when the crash happens.
Only enable ‘diag debug kernel level xx’ under the guidance of Fortinet support, as it may lead to unexpected results if not used properly.
diag debug console timestamp enable
diag debug reset
diag debug duration 0
diag debug kernel level 7
diag wad debug crash enable
diag debug enable
Step 5 – Monitor and wait for the problem to happen again.
Once a system crash happens again, compress the putty.log file, as this may decrease the upload times, and attach it to the ticket so TAC can review it further.
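Any archive tool can do the compression. As one possible approach, the sketch below uses Python's standard `gzip` module; the function name `compress_log` is hypothetical, and the `.gz` suffix is an arbitrary choice.

```python
import gzip
import shutil


def compress_log(log_path):
    """Write a gzip copy of `log_path` next to it and return the new path.

    The original file is left untouched, so the terminal application can
    keep appending to it while the copy is uploaded.
    """
    gz_path = log_path + ".gz"
    with open(log_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return gz_path
```

For example, `compress_log("putty.log")` would produce `putty.log.gz` in the same directory.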
Collect the debug logs from the device and attach them to the ticket. A copy of the configuration backup may help speed up the process as well. See Technical Tip: Download Debug Logs and ‘execute tac report’.
Keep in mind that hardware failure is a common cause of unexpected reboots, system hangs, and crashes.
It may be wise to run a hardware test to ensure all components are performing well.
Step 6 – BootLoop.
In these cases, it is advisable to re-format the unit, making sure to ‘Format boot device’, and then re-upload the firmware. The information in Technical Tip: Formatting and loading FortiGate firmware image using TFTP will help cover this task.
If the problem persists, run a hardware test and contact Fortinet Support for further assistance.