avoid timeouts to improve cpu performance #213

Merged
domcyrus merged 5 commits into domcyrus:main from deepakpjose:dev/cpu_perf
Mar 31, 2026

@deepakpjose deepakpjose commented Mar 29, 2026

Contributor

Packet processor fix

Previously, each pcap_rx thread called recv_timeout(1ms) in a loop, timing out and re-entering up to ~1000
times/sec when idle — each timeout translating to a futex syscall. With 4 processor threads this produced ~4000
unnecessary syscalls/sec, visible as ~5% CPU usage per pcap_rx thread even under no traffic.
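
To make the idle cost concrete, here is a minimal sketch of the polling pattern described above. It uses `std::sync::mpsc` as a stand-in for whatever channel type the project actually uses, and the function name `count_idle_wakeups` is hypothetical. It only counts how often a 1 ms `recv_timeout` loop wakes up on a channel that never receives anything.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Polls an idle channel with a 1 ms timeout for `window` and returns how
/// many times the loop woke up without receiving anything.
/// (Hypothetical helper, not code from the PR.)
fn count_idle_wakeups(window: Duration) -> u64 {
    // The sender is kept alive but never used, so the channel stays idle.
    let (_tx, rx) = mpsc::channel::<Vec<u8>>();
    let deadline = Instant::now() + window;
    let mut wakeups = 0;
    while Instant::now() < deadline {
        match rx.recv_timeout(Duration::from_millis(1)) {
            Ok(_packet) => {} // would process the packet here
            // Each timeout is a wakeup that did no useful work.
            Err(RecvTimeoutError::Timeout) => wakeups += 1,
            Err(RecvTimeoutError::Disconnected) => break,
        }
    }
    wakeups
}
```

Calling `count_idle_wakeups(Duration::from_secs(1))` on an otherwise idle machine should report a count approaching the ~1000 wakeups/sec per thread mentioned above, though the exact figure depends on timer resolution and scheduling.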

The new approach moves batching to the tx side. This reduces CPU utilization because the receiver threads are woken only when messages arrive, so polling is removed from all threads. I see a visible improvement in CPU%: it is now at ~3%, compared to the 15-20% range in the earlier release version.

This is a trade-off between batching, accuracy, and CPU utilization on low-end systems that see fewer packets.
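
The shape of the change can be sketched roughly as follows. This is not the PR's code: it assumes a bounded `std::sync::mpsc` channel carrying `Vec<Vec<u8>>` batches, and `BATCH_SIZE`, `send_batched`, `consume`, and `run_pipeline` are hypothetical names. The sender groups packets into batches, and the receiver blocks in `recv()` with no timeout, so it wakes only when a batch arrives.

```rust
use std::sync::mpsc::{self, Receiver, SyncSender};
use std::thread;

/// Hypothetical batch size; the value used in the PR is not shown here.
const BATCH_SIZE: usize = 4;

/// Sender side: accumulate packets and send them one batch at a time.
fn send_batched(packets: Vec<Vec<u8>>, tx: SyncSender<Vec<Vec<u8>>>) {
    let mut batch = Vec::with_capacity(BATCH_SIZE);
    for packet in packets {
        batch.push(packet);
        if batch.len() == BATCH_SIZE {
            // Swap in a fresh buffer and hand the full batch to the channel.
            let full = std::mem::replace(&mut batch, Vec::with_capacity(BATCH_SIZE));
            if tx.send(full).is_err() {
                return; // receiver is gone
            }
        }
    }
    // Flush the trailing partial batch so no packet is left behind.
    if !batch.is_empty() {
        let _ = tx.send(batch);
    }
    // `tx` is dropped here, which closes the channel.
}

/// Receiver side: block until a batch arrives; no timeout, no polling.
fn consume(rx: Receiver<Vec<Vec<u8>>>) -> usize {
    let mut total = 0;
    while let Ok(batch) = rx.recv() {
        total += batch.len();
    }
    total
}

/// Wires both sides together and returns the number of packets received.
fn run_pipeline(n: usize) -> usize {
    let (tx, rx) = mpsc::sync_channel(8);
    let consumer = thread::spawn(move || consume(rx));
    let packets = (0..n).map(|i| vec![i as u8]).collect();
    send_batched(packets, tx);
    consumer.join().unwrap()
}
```

In this sketch a partial batch is flushed only at the end of input. A live capture loop needs some policy for when to flush a partial batch under light traffic, which is where the batching versus accuracy trade-off mentioned above comes from.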

After batching on the tx side:
insidecode@insidecode(master)# top
top - 12:38:43 up 1 day, 22:26,  1 user,  load average: 0.52, 0.62, 0.76
Tasks: 322 total,   1 running, 320 sleeping,   0 stopped,   1 zombie
%Cpu(s):  2.2 us,  0.9 sy,  0.0 ni, 60.6 id, 36.3 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   7681.5 total,    681.6 free,   5211.4 used,   2599.9 buff/cache
MiB Swap:   4096.0 total,     14.1 free,   4081.9 used.   2470.1 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
   1548 insidec+  20   0 4954748 190940  67664 S   7.3   2.4 106:41.37 gnome-shell
  95353 insidec+  20   0   12.3g 544516 123676 S   3.3   6.9  55:01.59 firefox
 118428 insidec+  20   0 2382504  72368  38252 S   3.0   0.9   3:31.92 ghostty
 148430 root      20   0 1008316 133744  32464 S   2.6   1.7   0:21.96 rustnet     <=====
  69805 insidec+  20   0   71.4g 193100  25996 S   1.3   2.5   4:23.72 claude
  95807 insidec+  20   0   18.9g 115344  42852 S   1.3   1.5   1:08.41 WebExtensions
 143691 insidec+  20   0   48816  37152   9076 S   1.3   0.5   0:13.96 nvim
 118630 insidec+  20   0  103340  54432   7212 S   1.0   0.7   2:37.53 nvim
   1691 root      20   0 3602876  20544   6540 S   0.7   0.3   0:16.01 dockerd
 118300 insidec+  20   0  569744  50952  40976 S   0.7   0.6   1:23.97 gnome-terminal-
 148462 insidec+  20   0   14528   5856   3620 R   0.7   0.1   0:03.48 top
    185 root       0 -20       0      0      0 I   0.3   0.0   0:05.09 kworker/2:1H-i915_cleanup
   1279 root      20   0 2097988  11944   4100 S   0.3   0.2   1:20.13 containerd
   1825 insidec+  20   0  459020   6104   5516 S   0.3   0.1   0:02.29 gsd-housekeepin
  12402 insidec+  20   0   71.7g 267856  23344 S   0.3   3.4  19:28.90 claude

Before optimization:

top - 12:51:27 up 1 day, 22:39,  1 user,  load average: 0.51, 0.72, 0.74
Tasks: 320 total,   3 running, 316 sleeping,   0 stopped,   1 zombie
%Cpu(s):  2.0 us,  1.1 sy,  0.0 ni, 60.6 id, 36.3 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   7681.5 total,    911.5 free,   4840.0 used,   2621.2 buff/cache
MiB Swap:   4096.0 total,      0.5 free,   4095.5 used.   2841.5 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  151662 root      20   0 1007772 130308  31960 S  20.9   1.7   0:19.13 rustnet <===
   1548 insidec+  20   0 4953516 198360  73244 S   1.3   2.5 108:24.76 gnome-shell
  95807 insidec+  20   0   18.9g 129604  57876 S   0.7   1.6   1:10.42 WebExtensions
 118428 insidec+  20   0 2382328  73652  39160 R   0.7   0.9   3:35.59 ghostty
  12402 insidec+  20   0   71.7g 258012  23332 S   0.3   3.3  19:35.05 claude
  95353 insidec+  20   0   12.4g 535584 131724 S   0.3   6.8  56:18.31 firefox
 116535 insidec+  20   0 2636996  78772  53492 S   0.3   1.0   0:11.81 Isolated Web Co
 118300 insidec+  20   0  569744  50964  40976 S   0.3   0.6   1:28.68 gnome-terminal-
 118630 insidec+  20   0  103348  53840   7212 R   0.3   0.7   2:45.11 nvim
 143691 insidec+  20   0   48760  37092   9076 S   0.3   0.5   0:21.18 nvim
 149155 root      20   0       0      0      0 I   0.3   0.0   0:00.34 kworker/u32:0-events_unbound
 151691 insidec+  20   0   14528   5844   3612 R   0.3   0.1   0:00.17 top
      1 root      20   0   23616  11812   7396 S   0.0   0.2   0:14.25 systemd
      2 root      20   0       0      0      0 S   0.0   0.0   0:00.11 kthreadd
      3 root      20   0       0      0      0 S   0.0   0.0   0:00.00 pool_workqueue_release
      4 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/R-rcu_gp
      5 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/R-sync_wq
      6 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/R-kvfree_rcu_reclaim
      7 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/R-slub_flushwq
      8 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/R-netns
     10 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/0:0H-events_highpri

@domcyrus domcyrus left a comment

Owner

@deepakpjose Thanks a lot for this change. I've tested it a bit and the CPU improvement looks good to me. One thing to maybe fix: the fetch_add(1, ...) on the Full error path undercounts dropped packets, since it now drops entire batches. We should probably use the batch length instead.
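
A minimal sketch of the suggested fix, assuming a bounded `std::sync::mpsc` channel and an `AtomicU64` drop counter. The names `try_send_batch` and `demo_dropped` are hypothetical and the actual code in src/app.rs may differ. The point is that `TrySendError::Full` hands the rejected batch back, so its length can be added to the counter instead of a fixed 1.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::mpsc::{self, SyncSender, TrySendError};

/// Tries to enqueue a batch without blocking. If the channel is full, the
/// whole batch is dropped and every packet in it is counted.
fn try_send_batch(tx: &SyncSender<Vec<Vec<u8>>>, batch: Vec<Vec<u8>>, dropped: &AtomicU64) {
    match tx.try_send(batch) {
        Ok(()) => {}
        Err(TrySendError::Full(batch)) => {
            // Count the packets in the batch, not the batch itself.
            dropped.fetch_add(batch.len() as u64, Ordering::Relaxed);
        }
        // Receiver is gone; left uncounted in this sketch.
        Err(TrySendError::Disconnected(_)) => {}
    }
}

/// Fills a capacity-1 channel, then overflows it with a 5-packet batch.
fn demo_dropped() -> u64 {
    let (tx, _rx) = mpsc::sync_channel(1);
    let dropped = AtomicU64::new(0);
    try_send_batch(&tx, vec![vec![0u8]; 3], &dropped); // fits in the buffer
    try_send_batch(&tx, vec![vec![0u8]; 5], &dropped); // channel full: 5 packets dropped
    dropped.load(Ordering::Relaxed)
}
```

With `fetch_add(1, ...)` the second call would have recorded a single drop even though five packets were lost.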

Comment thread src/app.rs
Comment thread src/app.rs Outdated

@domcyrus domcyrus left a comment

Owner

LGTM

@domcyrus domcyrus merged commit 7d96bd0 into domcyrus:main Mar 31, 2026
16 checks passed
domcyrus added a commit that referenced this pull request Apr 9, 2026
- Windows restricted token sandbox (#206)
- macOS Seatbelt sandboxing, later tightened (#196, #203)
- Linux sandbox hardening: drop capabilities and clear ambient set (#208)
- UI: process privilege shown in security section (#197)
- Filter: exact port matching and regex support (#195)
- VLAN support in PKTAP/SLL parsers and L3 extraction (#202, #199)
- IGMP protocol parsing (#209)
- Process name for wildcard /proc/net entries (#218)
- CPU efficiency improvements in sort/snapshot/rate/timeout paths (#213, #220, #212, #222) — thanks @deepakpjose
- FreeBSD platform cleanup (#205)
- Fix default interface selection (#194), root detection on Unix (#192)
- Dependency updates