Debugging Network Performance Issues: SSH Disconnections and Packet Loss⚑
A comprehensive guide to diagnosing and solving CPU-bound network packet processing bottlenecks using RPS (Receive Packet Steering), with complete monitoring setup.
The Problem⚑
Intermittent network issues manifesting as:
- SSH connections dropping with "Broken pipe" errors
- NFS file streaming experiencing stutters and interruptions
- Problems occurring during network traffic bursts
- No obvious hardware failures or errors
These symptoms suggest packet loss at the network processing layer rather than physical network problems.
Understanding the Linux Network Stack⚑
Before diving into debugging, it's crucial to understand how Linux processes network packets:
Physical Network
↓
Network Interface Card (NIC)
↓
Ring Buffer (hardware queue)
↓
Hardware Interrupt → CPU Assignment
↓
Softnet Processing (per-CPU queues)
↓
Application
Each stage has limited capacity and can become a bottleneck.
Initial Diagnostic Steps⚑
1. Check for Hardware Issues⚑
# Verify no hardware errors
ethtool -S <interface> | grep -E "(drop|error|fifo|overrun|collision)"
# Check ring buffer size
ethtool -g <interface>
What to look for:
- All error counters should be 0
- Ring buffer should not be at minimum size
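If you run this check often, it can be scripted. Below is a minimal sketch; the interface argument, the assumption that ethtool is installed, and the keyword list used to spot suspicious counters are mine (counter names vary by driver), so treat it as a starting point rather than a definitive check:
#!/usr/bin/env python3
"""Flag non-zero drop/error counters reported by `ethtool -S <interface>`."""
import subprocess
import sys

# Heuristic keywords; driver-specific counter names vary
KEYWORDS = ("drop", "error", "fifo", "overrun", "collision")

def nic_error_counters(interface):
    """Return (name, value) pairs for suspicious counters that are non-zero."""
    output = subprocess.run(
        ["ethtool", "-S", interface],
        capture_output=True, text=True, check=True,
    ).stdout
    problems = []
    for line in output.splitlines():
        if ":" not in line:
            continue
        name, _, value = line.partition(":")
        name, value = name.strip(), value.strip()
        if any(k in name.lower() for k in KEYWORDS) and value.isdigit() and int(value) > 0:
            problems.append((name, int(value)))
    return problems

if __name__ == "__main__":
    for name, value in nic_error_counters(sys.argv[1]):
        print(f"non-zero counter: {name} = {value}")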
2. Examine Softnet Statistics⚑
The /proc/net/softnet_stat file contains critical information about packet processing per CPU:
cat /proc/net/softnet_stat
Output format (hexadecimal):
0006a7ac 00000000 00000000 00000000 00000000 ...
│        │        │        └─────── Reserved
│        │        └──────────────── Time squeeze count
│        └───────────────────────── Dropped packet count
└────────────────────────────────── Total packets processed
- Each line is one CPU core
- Column 1: Total packets processed by this CPU
- Column 2: Packets dropped (netdev_max_backlog exceeded)
- Column 3: Time squeeze (the softirq budget ran out before the queue was drained)
- Columns 4-9: Various other counters (usually zero)
- Last column: CPU number (on recent kernels)
Example output:
0006a7ac 00000000 00000000 ... (CPU 0: 436,140 packets)
0005f333 00000000 00000000 ... (CPU 1: 389,939 packets)
000cd69c 00000000 00000000 ... (CPU 10: 841,372 packets)
3. Decode Hexadecimal Values⚑
# Convert hex to decimal using Python
cat /proc/net/softnet_stat | python3 -c "
import sys
for i, line in enumerate(sys.stdin):
    cols = line.split()
    packets = int(cols[0], 16)
    drops = int(cols[1], 16) if len(cols) > 1 else 0
    print(f'CPU {i:2d}: {packets:10,d} packets, {drops:3d} drops')
"
The Root Cause: Uneven Packet Distribution⚑
Incoming Traffic Burst (10,000 packets/sec)
│
▼
┌───────────────┐
│ NIC Queue │ ← Ring buffer: 4096 packets
│ (RX-0) │
└───────┬───────┘
│ Hardware IRQ
▼
┌───────────────┐
│ CPU 0 │ ← Handles ALL packets
│ ┌─────────┐ │
│ │ Softnet │ │ ← Backlog: 1000 packets
│ │ Queue │ │
│ └─────────┘ │
│ [FULL] │ ← Queue overflows!
└───────┬───────┘
│
▼
Packets DROPPED
SSH keepalive lost → "Broken pipe"
NFS data delayed → Stuttering
Meanwhile:
┌─────────┐ ┌─────────┐ ┌─────────┐
│ CPU 1 │ │ CPU 2 │ │ CPU 3 │
│ [IDLE] │ │ [IDLE] │ │ [IDLE] │
│ 5% │ │ 3% │ │ 4% │
└─────────┘ └─────────┘ └─────────┘
Identifying the Problem⚑
Calculate the packet distribution ratio:
counts = [int(line.split()[0], 16) for line in open('/proc/net/softnet_stat')]
ratio = max(counts) / max(min(counts), 1)  # guard against idle CPUs with 0 packets
print(f"Distribution ratio: {ratio:.2f}x")
Interpretation:
- Ratio < 3: Good distribution, packets spread across CPUs
- Ratio 3-5: Moderate imbalance, potential issues under load
- Ratio > 5: Severe imbalance, CPU bottleneck likely
Network Packet Processing per CPU:
CPU 0: ████████████████████████ 100% ← OVERLOADED
CPU 1: ██████████████████████ 95% ← OVERLOADED
CPU 2: ██ 8%
CPU 3: █ 5%
CPU 4: █ 4%
CPU 5: █ 3%
CPU 6: █ 6%
CPU 7: █ 4%
CPU 8: ████████████████████ 88% ← OVERLOADED
CPU 9: ████ 15%
CPU 10: ██ 9%
CPU 11: █ 6%
CPU 12: █ 7%
CPU 13: ██ 8%
CPU 14: █ 6%
CPU 15: █ 5%
Ratio: 100% / 3% = 33x ❌ CRITICAL
Problem: 2-3 CPUs doing all the work
Why This Matters⚑
Without proper packet distribution:
- Hardware interrupts go to only 2-3 CPUs (based on NIC queue count)
- Those CPUs become bottlenecked processing all network traffic
- Their softnet backlog queues fill up (default: 1000 packets)
- New packets get dropped even though the ring buffer has space
- Critical packets (SSH keepalives, NFS data) get lost
- Connections fail or stutter
The key insight: The problem isn't network bandwidth—it's CPU processing capacity for network packets.
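To confirm this on a live system, compare the per-CPU drop counters against the configured backlog limit. A minimal sketch, assuming only the standard /proc paths used throughout this guide:
#!/usr/bin/env python3
"""Show the softnet backlog limit and any per-CPU backlog drops."""

# The per-CPU input queue drained by softirq processing is capped by this sysctl
with open("/proc/sys/net/core/netdev_max_backlog") as f:
    backlog = int(f.read())
print(f"netdev_max_backlog: {backlog} packets per CPU queue")

with open("/proc/net/softnet_stat") as f:
    for cpu, line in enumerate(f):
        dropped = int(line.split()[1], 16)  # column 2: drops caused by a full backlog
        if dropped:
            print(f"CPU {cpu}: {dropped} packets dropped (backlog overflow)")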
The Solution: Receive Packet Steering (RPS)⚑
Incoming Traffic Burst (10,000 packets/sec)
│
▼
┌───────────────┐
│ NIC Queue │
│ (RX-0) │
└───────┬───────┘
│ Hardware IRQ
▼
┌───────────────┐
│ CPU 0 │ ← Initial handler
└───────┬───────┘
│
│ RPS distributes packets
│ based on packet hash
│
┌───────────┼───────────┬───────────┐
▼ ▼ ▼ ▼
┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐
│ CPU 0 │ │ CPU 1 │ │ CPU 2 │ │ CPU 3 │
│┌──────┐│ │┌──────┐│ │┌──────┐│ │┌──────┐│
││Queue ││ ││Queue ││ ││Queue ││ ││Queue ││
││ 250 ││ ││ 250 ││ ││ 250 ││ ││ 250 ││
│└──────┘│ │└──────┘│ │└──────┘│ │└──────┘│
│ [OK] │ │ [OK] │ │ [OK] │ │ [OK] │
└────────┘ └────────┘ └────────┘ └────────┘
│ │ │ │
└───────────┴───────────┴───────────┘
│
▼
All packets processed
No drops, no delays
SSH stable, NFS smooth
What is RPS?⚑
RPS (Receive Packet Steering) is a software mechanism that distributes packet processing across all CPUs after hardware interrupt handling. It's essentially software-based load balancing for network packet processing. The steering decision is based on a hash of the packet's flow fields (source/destination addresses and ports), so all packets belonging to the same connection land on the same CPU and per-flow ordering is preserved.
How RPS Works⚑
Without RPS:
NIC → Hardware IRQ → CPU 0, 1 → All packets processed by 2 CPUs
With RPS:
NIC → Hardware IRQ → CPU 0, 1 → Distribute to CPUs 0-15 via RPS
┌─────────────────────────────────────────────────────────────┐
│ Physical Network │
│ (Ethernet cables) │
└────────────────────────┬────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Network Interface Card (NIC) │
│ Hardware Processing │
└────────────────────────┬────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Ring Buffer │
│ (Hardware queue: 256-4096 packets) │
│ ┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┐ │
│ │Pkt│Pkt│Pkt│Pkt│Pkt│ │ │ │ │ │ │ │
│ └───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┘ │
└────────────────────────┬────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Hardware Interrupt (IRQ) │
│ Assigned to specific CPU(s) │
└────────────────────────┬────────────────────────────────────┘
│
┌────────┴────────┐
│ Without RPS │
└────────┬────────┘
│
┌────────────────┼────────────────┐
▼ ▼ ▼
┌───────┐ ┌───────┐ ┌───────┐
│ CPU 0 │ │ CPU 1 │ │ CPU 2 │
│ 100% │ │ 100% │ │ 5% │
│ !!! │ │ !!! │ │ │
└───────┘ └───────┘ └───────┘
⋮
Problem: CPUs 0-1 overwhelmed, others idle
│
┌────────┴────────┐
│ With RPS │
└────────┬────────┘
│
┌────┬────┬────┬─┴─┬────┬────┬────┐
▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼
┌────┐┌────┐┌────┐┌────┐┌────┐┌────┐┌────┐
│CPU0││CPU1││CPU2││CPU3││CPU4││CPU5││...│
│ 45%││ 42%││ 48%││ 43%││ 46%││ 44%││ │
│ ✓ ││ ✓ ││ ✓ ││ ✓ ││ ✓ ││ ✓ ││ │
└────┘└────┘└────┘└────┘└────┘└────┘└────┘
Solution: Load distributed evenly
Implementing RPS⚑
1. Determine CPU Count⚑
nproc
# Output: 16 (for example)
2. Calculate CPU Mask⚑
Create a hexadecimal bitmask representing which CPUs should process packets:
- 8 CPUs: ff (binary: 11111111)
- 16 CPUs: ffff (binary: 1111111111111111)
- 32 CPUs: ffffffff (binary: 32 ones)
/sys/class/net/eth0/queues/
├── rx-0/
│   └── rps_cpus ──→ 0000ffff   (binary: 1111111111111111, one bit per CPU 0-15, all CPUs enabled)
└── rx-1/
    └── rps_cpus ──→ 0000ffff
CPU Mask Calculation:
8 CPUs: 11111111 = 0xFF = ff
16 CPUs: 1111111111111111 = 0xFFFF = ffff
32 CPUs: (32 ones) = 0xFFFFFFFF = ffffffff
Formula: 2^(num_cpus) - 1 in hexadecimal
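The mask can also be computed rather than typed by hand; a quick sketch applying the formula above (it assumes every online CPU should be included):
import os

num_cpus = os.cpu_count()                 # equivalent to `nproc`
mask = format((1 << num_cpus) - 1, 'x')   # 2^N - 1, lowercase hex without the 0x prefix
print(f"{num_cpus} CPUs -> rps_cpus mask: {mask}")   # e.g. 16 CPUs -> ffff
On hosts with more than 32 CPUs the kernel represents the mask as comma-separated 32-bit groups, so check the existing contents of rps_cpus for the exact format before writing.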
3. Apply RPS Configuration⚑
# For each RX queue, set the CPU mask
for rx in /sys/class/net/<interface>/queues/rx-*/rps_cpus; do
echo "ffff" > "$rx" # For 16 CPUs
done
4. Verify Configuration⚑
cat /sys/class/net/<interface>/queues/rx-*/rps_cpus
# Should show: 0000ffff (not 00000000)
Making RPS Persistent⚑
Create a systemd service to apply on boot:
cat > /etc/systemd/system/network-rps.service << 'EOF'
[Unit]
Description=Configure RPS for network interface
After=network.target
[Service]
Type=oneshot
ExecStart=/bin/bash -c 'for rx in /sys/class/net/<interface>/queues/rx-*/rps_cpus; do echo "ffff" > "$rx"; done'
ExecStart=/sbin/sysctl -w net.core.netdev_max_backlog=5000
RemainAfterExit=yes
[Install]
WantedBy=multi-user.target
EOF
systemctl enable network-rps.service
systemctl start network-rps.service
Additional Tuning⚑
Increase Network Backlog⚑
The default netdev_max_backlog (1000 packets) may be too small:
# Increase to 5000 packets
sysctl -w net.core.netdev_max_backlog=5000
# Make permanent in /etc/sysctl.conf
echo "net.core.netdev_max_backlog=5000" >> /etc/sysctl.conf
Optimize Ring Buffers⚑
If hardware supports larger buffers:
# Check maximum supported size
ethtool -g <interface>
# Increase to maximum
ethtool -G <interface> rx 4096 tx 4096
Monitoring the Solution⚑
Key Metrics to Track⚑
- Packet distribution ratio: Should remain < 3
- Softnet drops: Should stay at 0
- RPS configuration: Should remain enabled
- Network errors: Should stay at 0
Building a Monitoring Script⚑
Create a metrics collector for Prometheus Node Exporter:
#!/usr/bin/env python3
"""
Network statistics collector for Prometheus
Exposes metrics via textfile collector
"""
import os
from pathlib import Path

OUTPUT_DIR = "/var/lib/node_exporter/textfile_collector"
OUTPUT_FILE = f"{OUTPUT_DIR}/network_stats.prom"
INTERFACE = "eth0"  # Adjust to your interface

def read_softnet_stat():
    """Parse /proc/net/softnet_stat"""
    metrics = []
    with open('/proc/net/softnet_stat', 'r') as f:
        for cpu, line in enumerate(f):
            cols = line.split()
            metrics.append({
                'cpu': cpu,
                'processed': int(cols[0], 16),
                'dropped': int(cols[1], 16) if len(cols) > 1 else 0,
            })
    return metrics

def calculate_distribution_ratio(metrics):
    """Calculate max/min ratio across CPUs"""
    counts = [m['processed'] for m in metrics]
    min_count = min(counts)
    max_count = max(counts)
    return max_count / min_count if min_count > 0 else 0
def check_rps_configured(interface):
    """Check if RPS is enabled on at least one RX queue"""
    queue_path = f"/sys/class/net/{interface}/queues"
    for rx_queue in Path(queue_path).glob("rx-*/rps_cpus"):
        # On large machines the mask is comma-separated 32-bit groups, e.g. "00000000,0000ffff"
        rps_value = rx_queue.read_text().strip().replace(",", "")
        if rps_value and int(rps_value, 16) != 0:
            return 1
    return 0
def generate_metrics():
    """Generate Prometheus metrics"""
    lines = []
    # Softnet stats
    softnet_metrics = read_softnet_stat()
    for m in softnet_metrics:
        lines.append(f'node_network_softnet_processed_total{{cpu="{m["cpu"]}"}} {m["processed"]}')
        lines.append(f'node_network_softnet_dropped_total{{cpu="{m["cpu"]}"}} {m["dropped"]}')
    # Distribution ratio
    ratio = calculate_distribution_ratio(softnet_metrics)
    lines.append(f'node_network_packet_distribution_ratio {ratio:.2f}')
    # RPS status
    rps_configured = check_rps_configured(INTERFACE)
    lines.append(f'node_network_rps_configured{{device="{INTERFACE}"}} {rps_configured}')
    return '\n'.join(lines) + '\n'

def main():
    os.makedirs(OUTPUT_DIR, exist_ok=True)
    metrics = generate_metrics()
    # Write atomically
    temp_file = f"{OUTPUT_FILE}.{os.getpid()}"
    with open(temp_file, 'w') as f:
        f.write(metrics)
    os.rename(temp_file, OUTPUT_FILE)

if __name__ == '__main__':
    main()
Systemd Timer Setup⚑
Service file (/etc/systemd/system/network-stats.service):
[Unit]
Description=Network Statistics Collector
After=network.target
[Service]
Type=oneshot
ExecStart=/usr/local/bin/network_stats.py
User=root
StandardOutput=journal
StandardError=journal
Timer file (/etc/systemd/system/network-stats.timer):
[Unit]
Description=Network Statistics Collector Timer
Requires=network-stats.service
[Timer]
OnBootSec=1min
OnUnitActiveSec=1min
[Install]
WantedBy=timers.target
Enable and start:
systemctl daemon-reload
systemctl enable network-stats.timer
systemctl start network-stats.timer
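After the timer has fired at least once, it is worth checking that node_exporter actually exposes the new series. A rough sketch, assuming node_exporter listens on localhost:9100 and has --collector.textfile.directory pointed at the output directory used above:
#!/usr/bin/env python3
"""Verify that the custom metrics appear in node_exporter's /metrics output."""
import urllib.request

EXPECTED = (
    "node_network_softnet_processed_total",
    "node_network_softnet_dropped_total",
    "node_network_packet_distribution_ratio",
    "node_network_rps_configured",
)

with urllib.request.urlopen("http://localhost:9100/metrics", timeout=5) as resp:
    body = resp.read().decode()

for metric in EXPECTED:
    print(f"{metric}: {'OK' if metric in body else 'MISSING'}")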
Prometheus Alert Rules⚑
groups:
  - name: network_health
    rules:
      # Alert when packet distribution is uneven
      - alert: NetworkPacketDistributionUneven
        expr: node_network_packet_distribution_ratio > 3
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Uneven packet distribution detected"
          description: "Ratio: {{ $value }}x. RPS may not be working properly."

      # Alert when RPS is disabled
      - alert: NetworkRPSNotConfigured
        expr: node_network_rps_configured == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "RPS is not configured"
          description: "Network packet steering is disabled."

      # Alert on packet drops
      - alert: NetworkSoftnetPacketDrops
        expr: rate(node_network_softnet_dropped_total[5m]) > 10
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Network packets being dropped"
          description: "CPU {{ $labels.cpu }} dropping {{ $value }} packets/sec"
Verification and Testing⚑
Test Packet Distribution⚑
# Take snapshot before generating traffic
cat /proc/net/softnet_stat > /tmp/before.txt
# Generate network load (e.g., large file transfer)
# ... traffic generation ...
# Take snapshot after
cat /proc/net/softnet_stat > /tmp/after.txt
# Calculate difference per CPU (paste joins the two snapshots with a tab,
# so take the first column of each side rather than relying on a fixed column count)
paste /tmp/before.txt /tmp/after.txt | python3 -c "
import sys
for i, line in enumerate(sys.stdin):
    before_cols, after_cols = line.split('\t')
    before = int(before_cols.split()[0], 16)
    after = int(after_cols.split()[0], 16)
    print(f'CPU {i:2d}: {after - before:8,d} packets processed')
"
Expected Results⚑
Without RPS (bad):
CPU 0: 5,234,123 packets ← Overloaded
CPU 1: 4,891,234 packets ← Overloaded
CPU 2: 123,456 packets ← Idle
CPU 3: 98,234 packets ← Idle
...
Ratio: 53.4x (BAD)
With RPS (good):
CPU 0: 456,789 packets ← Balanced
CPU 1: 423,456 packets ← Balanced
CPU 2: 445,123 packets ← Balanced
CPU 3: 467,890 packets ← Balanced
...
Ratio: 1.8x (GOOD)
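The same two snapshots can be reduced to a single pass/fail number. A small sketch that computes the distribution ratio of the per-CPU deltas, skipping CPUs that processed nothing during the test window:
#!/usr/bin/env python3
"""Compute the packet distribution ratio from the before/after snapshots."""

def read_counts(path):
    # First column of each softnet_stat line is packets processed, in hex
    return [int(line.split()[0], 16) for line in open(path)]

before = read_counts("/tmp/before.txt")
after = read_counts("/tmp/after.txt")

deltas = [a - b for a, b in zip(after, before) if a - b > 0]
ratio = max(deltas) / min(deltas)

print(f"Distribution ratio during test: {ratio:.1f}x")
print("OK (< 3x)" if ratio < 3 else "Still imbalanced (>= 3x)")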
Common Pitfalls⚑
1. RPS Not Persisting After Reboot⚑
Problem: RPS settings are lost on reboot.
Solution: Use systemd service (shown above) or add to network interface configuration.
2. Wrong CPU Mask⚑
Problem: Using CPU mask that doesn't match actual CPU count.
Solution: Always calculate based on nproc output:
- For N CPUs: mask = 2^N - 1 (in hex)
3. Monitoring Only Hardware Metrics⚑
Problem: Looking only at ethtool -S which shows hardware-level stats.
Solution: Also monitor /proc/net/softnet_stat for CPU-level packet processing.
4. Ignoring Softnet Backlog⚑
Problem: Ring buffer is large but netdev_max_backlog is default (1000).
Solution: Increase net.core.netdev_max_backlog to 5000+.