How to fix high CPU/fail-open mode due to GRE traffic increasing IPSEngine load

This article describes the behavior seen when FortiGate IPSEngine enters fail open mode due to GRE traffic, causing high CPU and an increased load on the FortiGate.

Scope

FortiGate with pass-through GRE traffic matched by a policy that has IPS inspection/UTM enabled.

Solution

FortiGate can perform two types of acceleration (offloading):

  • Network Processor level (NP6/NP7): https://docs.fortinet.com/document/fortigate/6.4.6/hardware-acceleration/575471
  • NTurbo for inspected traffic: Offloads firewall and NAT sessions from the FortiGate CPU to NP7 or NP6 network processors and distributes these sessions to different IPS engine processes spread across multiple CPU cores, ensuring a load-balanced approach for handling IPS signature/pattern matching tasks.

Behavior and symptoms (v7.0/v7.2/v7.4/v7.6)

If IPS is enabled on a policy and that policy is handling a high load of pass-through GRE traffic, the FortiGate may show high CPU usage, or the IPSEngine may even enter fail-open mode.

This can be observed by running ‘diagnose debug crashlog read’ and looking for occurrences of ‘IPS enter fail open mode’ in the crash log.
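When the crash log is long, the search for fail-open entries can be done offline on a saved copy of the output. The following is a minimal Python sketch, assuming the output of ‘diagnose debug crashlog read’ has been saved as plain text. The function name find_fail_open_entries is hypothetical, and the sketch assumes nothing about the line layout beyond the presence of the phrase quoted above.

```python
from __future__ import annotations

# Phrase to look for, as quoted in this article (compared case-insensitively).
FAIL_OPEN_MARKER = "ips enter fail open mode"


def find_fail_open_entries(crashlog_text: str) -> list[str]:
    """Return every crash log line that mentions IPS fail-open mode.

    crashlog_text is assumed to be the saved text output of
    'diagnose debug crashlog read'. No particular line format is assumed.
    """
    return [
        line.strip()
        for line in crashlog_text.splitlines()
        if FAIL_OPEN_MARKER in line.lower()
    ]
```

Passing the saved text to find_fail_open_entries returns the matching lines in their original order, so the number and timing of fail-open events can be read off directly.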

This can be further confirmed by running ‘diagnose sys top’ and checking for IPSEngine processes consuming high CPU.

This is mostly due to the GRE traffic not being offloaded by NTurbo. Verify by listing the GRE sessions:

diagnose sys session filter clear <----- Clear previous filters.
diagnose sys session filter proto 47 <----- Add new filter for GRE traffic (IP protocol type 47).
diagnose sys session list <----- List the GRE session.

If a GRE session (‘proto=47’) shows ‘no_ofld_reason: redir-to-ips’, the traffic was not offloaded. This is equivalent to ‘denied-by-turbo’ in the no_ofld_reason field (i.e. a session being processed by the IPS that could normally be offloaded is not supported by NTurbo).

The reason is that NTurbo offloading only supports TCP/UDP traffic. GRE is neither TCP nor UDP, so NTurbo cannot offload it, and all pass-through GRE traffic goes to the IPS through the kernel/raw socket instead.

To overcome this, add a new policy specifically for GRE traffic with no UTM enabled, so that no IPS is involved. Place this policy above the IPS-inspected policy so that the GRE traffic matches it first.

NP hardware can support GRE offloading when UTM is disabled. However, as mentioned, NTurbo does not support GRE traffic, which is why this traffic is not offloaded when UTM is enabled.

Note: It is possible to correlate the high CPU/IPSEngine fail-open events with an increase in GRE traffic bandwidth over the same time period.