Benchmarks

Methodology

Total data: 4,903 benchmark runs (3,703 relay throughput + 480 latency + 720 tunnel quality).

  • Traffic generator: custom distributed bench tool (derp_scale_test), 20 DERP peers across 4 client VMs, 10 sender/receiver pairs, ~1400-byte messages at WireGuard MTU, token-bucket pacing
  • Duration: 15 seconds per run (3s warmup, 12s measured)
  • Runs: 20 per data point, 95% CIs (Welch's t)
  • Latency: derp_test_client ping/echo, 5,000 samples per run, 2.16M total samples
  • Tunnel quality: iperf3 UDP + TCP + ICMP through WireGuard/Tailscale, 20 runs per point
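
The token-bucket pacing named above can be sketched in a few lines. This is an illustrative model only, not code from `derp_scale_test`; the class name, parameters, and injectable clock are assumptions made for the example.

```python
import time


class TokenBucket:
    """Pace sends to `rate` bytes/sec, allowing bursts up to `burst` bytes.

    Illustrative sketch -- not the actual derp_scale_test implementation.
    """

    def __init__(self, rate: float, burst: float, clock=time.monotonic):
        self.rate = rate      # refill rate, bytes per second
        self.burst = burst    # bucket capacity, bytes
        self.tokens = burst   # start with a full bucket
        self.clock = clock    # injectable for deterministic tests
        self.last = clock()

    def consume(self, n: float) -> float:
        """Spend n tokens if available; otherwise return seconds to wait."""
        now = self.clock()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return 0.0
        return (n - self.tokens) / self.rate


# Pace ~1400-byte messages at 1 MB/s, with a fixed clock for determinism:
bucket = TokenBucket(rate=1_000_000, burst=1400, clock=lambda: 0.0)
wait = bucket.consume(1400)  # 0.0 -- the bucket starts full
```

A sender sleeps for the returned wait time before retrying, which smooths the offered load instead of sending in bursts.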

Software

| Component | Version |
|---|---|
| Hyper-DERP | kTLS (TLS 1.3 AES-GCM), io_uring DEFER_TASKRUN |
| Go derper | v1.96.4, go1.26.1, release build |
| Kernel | 6.12.73+deb13-cloud-amd64 |

Infrastructure

| Role | Machine Type | Count | NIC BW |
|---|---|---|---|
| Relay | c4-highcpu-16 | 1 | 22 Gbps |
| Client | c4-highcpu-8 | 4 | 22 Gbps each |

GCP europe-west4-a. NIC bandwidth verified at 22 Gbps on all paths.

Peak Throughput

| Config | HD Peak (Mbps) | TS Ceiling (Mbps) | HD/TS |
|---|---|---|---|
| 2 vCPU (1w) | 3,730 | 1,870 | 2.0x |
| 4 vCPU (2w) | 6,091 | 2,798 | 2.2x |
| 8 vCPU (4w) | 12,316 | 4,670 | 2.6x |
| 16 vCPU (8w) | 16,545 | 7,834 | 2.1x |
*Peak throughput across all configs*
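
The HD/TS column is simply the ratio of the two throughput columns; recomputing it from the table values above is a quick sanity check (a back-of-envelope script, not part of the benchmark tooling):

```python
# Throughput values (Mbps) copied from the peak-throughput table.
hd_peak = {"2 vCPU": 3730, "4 vCPU": 6091, "8 vCPU": 12316, "16 vCPU": 16545}
ts_ceiling = {"2 vCPU": 1870, "4 vCPU": 2798, "8 vCPU": 4670, "16 vCPU": 7834}

# HD/TS ratio per config, rounded to one decimal place.
ratios = {cfg: round(hd_peak[cfg] / ts_ceiling[cfg], 1) for cfg in hd_peak}
# e.g. 16545 / 7834 ≈ 2.1
```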

HD's advantage grows as resources shrink. At 2 vCPU, TS collapses (92% packet loss at 5 Gbps offered load) while HD still delivers 3.5 Gbps.

The Cost Story

HD delivers the same throughput, or slightly more, on half the vCPUs:

| TS deployment | TS throughput | HD equivalent | HD throughput | Savings |
|---|---|---|---|---|
| TS on 16 vCPU | 7,834 Mbps | HD on 8 vCPU | 8,371 Mbps | 2x |
| TS on 8 vCPU | 4,670 Mbps | HD on 4 vCPU | 5,457 Mbps | 2x |
| TS on 4 vCPU | 2,798 Mbps | HD on 2 vCPU | 3,536 Mbps | 2x |
*VM cost comparison*
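
The 2x savings follows from per-vCPU efficiency: in every row, HD pushes roughly twice the throughput per core that TS does. A quick back-of-envelope check (values copied from the table above; not from the benchmark tooling):

```python
# ((throughput Mbps, vCPUs) for TS, same for HD), per row of the cost table.
pairs = [
    ((7834, 16), (8371, 8)),  # TS on 16 vCPU vs HD on 8 vCPU
    ((4670, 8),  (5457, 4)),  # TS on 8 vCPU  vs HD on 4 vCPU
    ((2798, 4),  (3536, 2)),  # TS on 4 vCPU  vs HD on 2 vCPU
]

for (ts_mbps, ts_cpu), (hd_mbps, hd_cpu) in pairs:
    ts_eff = ts_mbps / ts_cpu  # Mbps per vCPU, TS
    hd_eff = hd_mbps / hd_cpu  # Mbps per vCPU, HD
    # HD's per-vCPU throughput is ~2.1-2.5x TS's in every row
    print(round(hd_eff / ts_eff, 1))
```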

Full methodology, raw data, and tooling are in the benchmark repository.

Detailed Results