PFC Deadlock Detection and Recovery

Overview

Priority-based Flow Control (PFC) helps manage traffic in networks by pausing specific flows during congestion. However, under certain conditions, a deadlock can occur when cyclical dependencies between flows create a loop of PFC pause events that prevents traffic from making forward progress indefinitely.

Priority Flow Control Pause Frames

PFC uses the standard pause frame mechanism, identified by MAC Control opcode 0x0101, and extends it with a 2-byte class-enable vector followed by a 2-byte pause time for each of the eight priority classes. Each pause time is expressed in quanta, where one quantum is the time needed to transmit 512 bits at the link speed.

Example of a priority pause frame:

Pause Frame:


  Control Opcode: 0x0101 (Priority Pause)
  Pause Time (8 priority classes): 
    Class 0: 0x0000 (no pause)
    Class 1: 0x1234 (pause time in quanta)
    Class 2: 0x5678 (pause time in quanta)
    ...
    Class 7: 0x9ABC (pause time in quanta)

In the example above, the pause time for each class determines whether the frame signals XON or XOFF for that class:

XON (X-On): A control signal sent from the receiver to the transmitter to indicate readiness to accept data. For example, Class 0 represents a no-pause condition.
XOFF (X-Off): A control signal sent from the receiver to the transmitter indicating that it cannot accept additional data due to congestion. For example, Classes 1, 2, and 7 above specify a non-zero pause time (in quanta), signaling the transmitter to temporarily halt transmission.
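The frame layout and the XON/XOFF interpretation above can be sketched in Python. This is a minimal illustration, not OcNOS code: the function names are hypothetical, and the layout (opcode, 2-byte class-enable vector, eight 2-byte pause times, 512-bit quantum) follows the IEEE 802.1Qbb frame format rather than anything specific to this platform.

```python
import struct

PFC_OPCODE = 0x0101      # MAC Control opcode for a priority pause frame
QUANTUM_BITS = 512       # one pause quantum = time to transmit 512 bits


def build_pfc_payload(pause_quanta):
    """Pack the opcode, class-enable vector, and eight 2-byte pause times.

    pause_quanta maps a priority class (0-7) to a pause time in quanta.
    Classes absent from the mapping are left disabled in the vector.
    """
    enable_vector = 0
    times = [0] * 8
    for cls, quanta in pause_quanta.items():
        if not 0 <= cls <= 7 or not 0 <= quanta <= 0xFFFF:
            raise ValueError("class must be 0-7 and quanta must fit in 16 bits")
        enable_vector |= 1 << cls
        times[cls] = quanta
    # Network byte order: opcode, enable vector, then the times for class 0..7.
    return struct.pack("!HH8H", PFC_OPCODE, enable_vector, *times)


def parse_pfc_payload(payload):
    """Return {class: "XON" or "XOFF"} for every class enabled in the frame."""
    opcode, enable_vector, *times = struct.unpack("!HH8H", payload[:20])
    if opcode != PFC_OPCODE:
        raise ValueError("not a priority pause frame")
    return {
        cls: ("XOFF" if times[cls] else "XON")
        for cls in range(8)
        if enable_vector & (1 << cls)
    }


def pause_seconds(quanta, link_bps):
    """Convert a pause time in quanta to seconds for a given link speed."""
    return quanta * QUANTUM_BITS / link_bps


# The values from the example frame above.
payload = build_pfc_payload({0: 0x0000, 1: 0x1234, 2: 0x5678, 7: 0x9ABC})
states = parse_pfc_payload(payload)
```

With these inputs, class 0 decodes as XON and classes 1, 2, and 7 decode as XOFF, matching the description above.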

Workflow of PFC Frames

Transmission: The transmitter sends data to the receiver.
XOFF: The receiver sends an XOFF signal to the transmitter, indicating that it is congested and cannot process more data.
Pause: The transmitter pauses sending data to the receiver.
XON: The receiver sends an XON signal to the transmitter, indicating that it is ready to receive data again.
Resume: The transmitter resumes sending data to the receiver.
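The workflow above can be modeled as a small per-class state holder on the transmitter side. This is a simplified sketch with hypothetical names; it ignores pause-timer expiry and treats a zero pause time as XON and any non-zero value as XOFF.

```python
class Transmitter:
    """Tracks which priority classes are currently paused by the receiver."""

    def __init__(self):
        self.paused = set()   # classes currently in the XOFF state
        self.sent = []        # (class, packet) pairs that were transmitted

    def on_pfc(self, cls, quanta):
        """Apply a received PFC pause time for one class."""
        if quanta > 0:
            self.paused.add(cls)        # XOFF: stop sending this class
        else:
            self.paused.discard(cls)    # XON: resume sending this class

    def send(self, cls, packet):
        """Transmit the packet unless its class is paused; return success."""
        if cls in self.paused:
            return False
        self.sent.append((cls, packet))
        return True


tx = Transmitter()
tx.send(1, "a")          # Transmission
tx.on_pfc(1, 0xFFFF)     # XOFF for class 1
blocked = tx.send(1, "b")    # Pause: class 1 is held back
other = tx.send(0, "c")      # other classes are unaffected
tx.on_pfc(1, 0)          # XON for class 1
resumed = tx.send(1, "b")    # Resume
```

A deadlock corresponds to the case where the XON step never arrives for a class, so `send` keeps returning False for it.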

PFC Deadlock

A deadlock may occur when the receiver continuously sends XOFF signals for one or more classes, preventing the transmitter from sending any traffic. This feature is designed to detect such deadlocks and initiate recovery mechanisms.

To handle such critical situations, OcNOS provides the PFC Deadlock Detection and Recovery capability. This chapter describes how to:

Enable PFC deadlock detection and recovery on a specific interface, using either Timer mode or PFC state XON mode.
Configure the global PFC deadlock detection and recovery action to drop (Global Mode).

Feature Characteristics

Deadlock Detection

The system monitors PFC queues for extended periods in the XOFF state.

If a queue remains paused beyond a configurable threshold, a deadlock event is declared.

An interrupt is raised to inform software of the detected deadlock.
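The detection logic can be sketched as follows. This is an illustrative model, not the hardware implementation: the class name is hypothetical, the threshold is taken as detection-multiplier times time-granularity as described for timer mode in this chapter, and the millisecond unit is an assumption.

```python
class DeadlockDetector:
    """Declares a deadlock when a queue stays in XOFF past the threshold."""

    def __init__(self, detection_multiplier, time_granularity_ms):
        # Assumed unit: milliseconds.
        self.threshold_ms = detection_multiplier * time_granularity_ms
        self.xoff_since = None      # time at which the current XOFF began
        self.deadlocked = False
        self.detection_count = 0

    def update(self, now_ms, in_xoff):
        """Feed the current queue state; return True when a deadlock is declared."""
        if not in_xoff:
            # The queue left XOFF, so the detection window restarts.
            self.xoff_since = None
            self.deadlocked = False
            return False
        if self.xoff_since is None:
            self.xoff_since = now_ms
        if not self.deadlocked and now_ms - self.xoff_since >= self.threshold_ms:
            self.deadlocked = True
            self.detection_count += 1   # stands in for raising the interrupt
            return True
        return False


# detection-multiplier 10 and time-granularity 10 give a 100 ms threshold.
detector = DeadlockDetector(detection_multiplier=10, time_granularity_ms=10)
events = [detector.update(t, True) for t in (0, 50, 100, 150)]
```

Only the sample at 100 ms reports a new deadlock; the later sample at 150 ms is part of the same event and does not increment the count again.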

Deadlock Recovery

Once a deadlock is detected, software moves the affected queue into an ignore PFC XOFF state, allowing traffic scheduling to resume.

Recovery can be configured on a per-interface basis and supports three modes:

Timer Mode: An automatic recovery method. Recovery starts once the queue has remained paused for a period of detection-multiplier times time-granularity, both of which are configurable. During recovery, traffic is forwarded by default, but is dropped if priority-flow-control deadlock recovery-action drop is configured. Recovery ends automatically after the optionally configurable recovery-time period; the system then clears the interrupt and restarts the detection timer.

Traffic will gradually decrease to zero if the recovery-mode timer is not configured; otherwise, it will continue indefinitely.

PFC-State-XON Mode: Recovery ends when the interface receives a PFC XON frame, signaling that the pause condition is lifted.
Manual Mode: Recovery requires explicit user action with CLI commands. This option is only valid if no automatic recovery mode is configured.
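The exit condition for each of the three recovery modes can be summarized in a short sketch. The enum values and function names are illustrative labels rather than CLI keywords, the millisecond unit is an assumption, and the sketch requires an explicit recovery time in timer mode instead of modeling any platform default.

```python
from enum import Enum


class RecoveryMode(Enum):
    TIMER = "timer"
    PFC_STATE_XON = "pfc-state-xon"
    MANUAL = "manual"


def recovery_finished(mode, elapsed_ms=0, recovery_time_ms=None,
                      xon_received=False, user_cleared=False):
    """Return True when recovery should end for the given mode."""
    if mode is RecoveryMode.TIMER:
        if recovery_time_ms is None:
            raise ValueError("this sketch needs an explicit recovery_time_ms")
        return elapsed_ms >= recovery_time_ms
    if mode is RecoveryMode.PFC_STATE_XON:
        return xon_received       # ends when a PFC XON frame arrives
    if mode is RecoveryMode.MANUAL:
        return user_cleared       # ends only on explicit user action
    raise ValueError("unknown recovery mode")


def action_during_recovery(drop_configured):
    """Traffic is forwarded during recovery unless the drop action is set."""
    return "drop" if drop_configured else "forward"
```

For example, `recovery_finished(RecoveryMode.TIMER, elapsed_ms=1000, recovery_time_ms=1000)` evaluates to True, while the XON and manual modes ignore elapsed time entirely.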

Limitations:

Manual recovery mode is not supported on Trident3 (TR3) or Tomahawk3 (TH3) platforms.

Trident3 (TR3) platforms support deadlock recovery only in timer mode.

Trident3 (TR3) platforms do not support 1ms time granularity.

Tomahawk 2 (TH2) series platforms are not supported.

Benefits

Prevents indefinite traffic stalls due to PFC loops.

Provides flexible recovery options (automatic or manual).

Improves network reliability in environments that rely on PFC.

Prerequisites

PFC must be enabled on the device.

Configuration

The PFC feature supports deadlock detection and recovery. This section shows how to:

1. Enable PFC deadlock detection and recovery on an interface
2. Set global PFC deadlock detection and recovery action to drop

Topology

This topology illustrates a spine-leaf router architecture where Priority Flow Control (PFC) manages traffic at the queue level, optimizing the flow from spine to leaf while minimizing packet loss and enhancing overall quality of service.

Figure 163. PFC Enabled Bridge

PFC deadlock detection and recovery can be configured on an interface in Timer mode or XON mode.

Refer to the "PFC Deadlock Detection and Recovery" section in the Layer 2 Configuration guide for the EVPN-VxLAN topology configuration.

Timer mode

Execute the following steps to configure PFC on both interfaces on the leaf router.

1. Set the IP address.
(config-if)#ip address 1.1.1.1/24
2. Enable PFC. Configure the advertise flag and start sending DCBX TLVs in LLDP messages.
(config-if)#priority-flow-control mode on
3. Enable PFC on priorities 0 and 1.
(config-if)#priority-flow-control enable priority 0 1
4. Enable automatic PFC deadlock recovery in timer mode with custom detection and recovery-time parameters.
(config-if)#priority-flow-control deadlock recovery-mode timer detection-multiplier 10 time-granularity 10 recovery-time 1000

PFC state XON mode

1. Set the IP address.
(config-if)#ip address 2.2.2.1/24
2. Enable the PFC. Configure the advertise flag and start sending DCBX TLVs in LLDP messages.
(config-if)#priority-flow-control mode on
3. Enable PFC on priorities 0 and 1.
(config-if)#priority-flow-control enable priority 0 1
4. Enable automatic priority flow control deadlock recovery mode timer with custom detection and recovery time parameters.
(config-if)#priority-flow-control deadlock recovery-mode timer detection-multiplier 10 time-granularity 10 recovery-time 1000

Global Mode

When this command is set, traffic is dropped whenever an interface enters deadlock recovery mode, instead of the deadlocked traffic being allowed to pass.

(config)#priority-flow-control deadlock recovery-action drop

Validation

1. Verifying deadlock configuration and status for all interfaces.
#show priority-flow-control deadlock-status

Deadlock Detection and Recovery Configuration
----------------------------------------------------------------------
interface            recovery     detection     detection     recovery
                     mode         multiplier    granularity   time
======================================================================
eth1                 Timer        10            10            1500

---------------------------------------------------------------------------------------------------

Deadlock Detection and Recovery Status
---------------------------------------------------------------------------------------------------
interface           pri    state          detection    last detection             last recovery
                                          count        timestamp                  timestamp
===================================================================================================
eth1                  1       deadlock           39    2025-05-29 19:03:49.481    -

2. Verifying deadlock configuration and status for a specific interface.
#show priority-flow-control deadlock-status interface eth1

Deadlock Detection and Recovery Configuration
----------------------------------------------------------------------
interface            recovery     detection     detection     recovery
                     mode         multiplier    granularity   time
======================================================================
eth1                 Timer        10            10            1500

-----------------------------------------------------------------------------------------------

Deadlock Detection and Recovery Status
-----------------------------------------------------------------------------------------------
interface           pri    state          detection    last detection             last recovery
                                          count        timestamp                  timestamp
===============================================================================================
eth1                  0    no deadlock            0    -                          -
eth1                  1       deadlock           35    2025-05-29 19:03:34.611    -
eth1                  2    no deadlock            0    -                          -
eth1                  3    no deadlock            0    -                          -
eth1                  4    no deadlock            0    -                          -
eth1                  5    no deadlock            0    -                          -
eth1                  6    no deadlock            0    -                          -
eth1                  7    no deadlock            0    -                          -

3. Clearing deadlock status for a specific interface.
#clear priority-flow-control deadlock-status eth1
4. Clearing deadlock status for all interfaces.
#clear priority-flow-control deadlock-status
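
For monitoring scripts, the status table shown in the validation output can be parsed to list the priorities currently in deadlock. The sketch below is a rough illustration with hypothetical function names; the row pattern is inferred only from the sample output in this chapter and may need adjusting for other releases or interface naming schemes.

```python
import re

# One status row: interface, priority, state, count, and two timestamps.
# A timestamp is either "-" or "<date> <time>".
ROW = re.compile(
    r"^(?P<interface>\S+)\s+(?P<pri>[0-7])\s+"
    r"(?P<state>no deadlock|deadlock)\s+(?P<count>\d+)\s+"
    r"(?P<last_detection>-|\S+ \S+)\s+(?P<last_recovery>-|\S+ \S+)\s*$"
)


def parse_status(output):
    """Return one dict per status row; headers and config rows are skipped."""
    rows = []
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if match:
            row = match.groupdict()
            row["pri"] = int(row["pri"])
            row["count"] = int(row["count"])
            rows.append(row)
    return rows


def deadlocked_priorities(rows):
    """Return (interface, priority) pairs whose state is 'deadlock'."""
    return [(r["interface"], r["pri"]) for r in rows if r["state"] == "deadlock"]


sample = """\
eth1                 Timer        10            10            1500
interface           pri    state          detection    last detection             last recovery
eth1                  0    no deadlock            0    -                          -
eth1                  1       deadlock           35    2025-05-29 19:03:34.611    -
"""
rows = parse_status(sample)
stuck = deadlocked_priorities(rows)
```

Here `stuck` contains only priority 1 on eth1; the configuration row and the header line do not match the pattern and are ignored.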