Disclaimer: kcptun maintains a single website — github.com/xtaci/kcptun. Any websites other than github.com/xtaci/kcptun are not endorsed by xtaci.
| Target | Supported | Recommended |
|---|---|---|
| System | darwin freebsd linux windows | freebsd linux |
| Memory | >32 MB | > 64 MB |
| CPU | ANY | amd64 with AES-NI & AVX2 |
NOTE: If you are using KVM, ensure that the guest OS supports AES instructions
Download:
curl -L https://raw.githubusercontent.com/xtaci/kcptun/master/download.sh | sh
Increase the maximum number of open files on your server, for example:
ulimit -n 65535 (add this line to ~/.bashrc to apply it to every new shell).
Suggested sysctl.conf parameters for Linux to improve UDP packet handling:
# BDP - Bandwidth Delay Product
net.core.rmem_max=26214400
net.core.rmem_default=26214400
net.core.wmem_max=26214400
net.core.wmem_default=26214400
# Proportional to -rcvwnd
net.core.netdev_max_backlog=2048
FreeBSD-related sysctl settings can be found here: https://github.com/xtaci/kcptun/blob/master/dist/freebsd/sysctl_freebsd
You can also increase the per-socket buffer by adding the parameter (default 4MB):
-sockbuf 16777217
For slow processors, increasing this buffer is CRITICAL for proper packet reception.
Download the appropriate binary from the precompiled Releases.
KCP Client: ./client_darwin_amd64 -r "KCP_SERVER_IP:4000" -l ":8388" -mode fast3 -nocomp -autoexpire 900 -sockbuf 16777217 -dscp 46
KCP Server: ./server_linux_amd64 -t "TARGET_IP:8388" -l ":4000" -mode fast3 -nocomp -sockbuf 16777217 -dscp 46
The above commands will establish a port forwarding channel for port 8388/tcp as follows:
Application -> KCP Client(8388/tcp) -> KCP Server(4000/udp) -> Target Server(8388/tcp)
which tunnels the original connection:
Application -> Target Server(8388/tcp)
OR START WITH THESE COMPLETE CONFIGURATION FILES: client --> server
$ git clone https://github.com/xtaci/kcptun.git
$ cd kcptun
$ ./build-release.sh
$ cd build
All precompiled releases are generated using the build-release.sh script.
Practical bandwidth graph with parameters: -mode fast3 -ds 10 -ps 3
Q: I have a high-speed network link. How can I maximize bandwidth?
A: Increase -rcvwnd on the KCP Client and -sndwnd on the KCP Server simultaneously and gradually. The smaller of these two values determines the maximum transfer rate of the link, using the formula wnd * mtu / rtt. Then test your connection by downloading content to verify it meets your requirements. (The MTU can be adjusted using the -mtu parameter.)
Q: I'm using kcptun for gaming and want to minimize latency.
A: Latency spikes often indicate packet loss. You can reduce lag by adjusting the -mode parameter.
For example:
-mode fast3
Retransmission aggressiveness/responsiveness for embedded modes:
fast3 > fast2 > fast > normal > default
Since streams are multiplexed into a single physical channel, head-of-line blocking may occur. Increasing -smuxbuf to a larger value (default is 4MB) can mitigate this issue, though it will consume more memory.
For versions >= v20190924, you can switch to smux version 2. Smux v2 provides options to limit per-stream memory usage. Set -smuxver 2 to enable smux v2, and adjust -streambuf to control per-stream memory consumption. For example: -streambuf 2097152 limits per-stream memory usage to 2MB. Limiting the stream buffer on the receiver side applies back-pressure to the sender, preventing buffer overflow along the link. (The -smuxver setting MUST be IDENTICAL on both sides; the help output below for version 20251124 lists the default as 2.)
kcptun uses Reed-Solomon Codes for packet recovery, which requires substantial computational resources. Low-end ARM devices may experience performance issues with kcptun. For optimal performance, a multi-core x86 server CPU such as AMD Opteron is recommended. If you must use ARM routers, it's advisable to disable FEC and use salsa20 for encryption.
$ ./client_freebsd_amd64 -h
NAME:
kcptun - client(with SMUX)
USAGE:
client_freebsd_amd64 [global options] command [command options] [arguments...]
VERSION:
20251124
COMMANDS:
help, h Shows a list of commands or help for one command
GLOBAL OPTIONS:
--localaddr value, -l value local listen address (default: ":12948")
--remoteaddr value, -r value kcp server address, eg: "IP:29900" a for single port, "IP:minport-maxport" for port range (default: "vps:29900")
--key value pre-shared secret between client and server (default: "it's a secrect") [$KCPTUN_KEY]
--crypt value aes, aes-128, aes-128-gcm, aes-192, salsa20, blowfish, twofish, cast5, 3des, tea, xtea, xor, sm4, none, null (default: "aes")
--mode value profiles: fast3, fast2, fast, normal, manual (default: "fast")
--QPP enable Quantum Permutation Pads(QPP)
--QPPCount value the prime number of pads to use for QPP: The more pads you use, the more secure the encryption. Each pad requires 256 bytes. (default: 61)
--conn value set num of UDP connections to server (default: 1)
--autoexpire value set auto expiration time(in seconds) for a single UDP connection, 0 to disable (default: 0)
--scavengettl value set how long an expired connection can live (in seconds) (default: 600)
--mtu value set maximum transmission unit for UDP packets (default: 1350)
--ratelimit value set maximum outgoing speed (in bytes per second) for a single KCP connection, 0 to disable. Also known as packet pacing. (default: 0)
--sndwnd value set send window size(num of packets) (default: 128)
--rcvwnd value set receive window size(num of packets) (default: 512)
--datashard value, --ds value set reed-solomon erasure coding - datashard (default: 10)
--parityshard value, --ps value set reed-solomon erasure coding - parityshard (default: 3)
--dscp value set DSCP(6bit) (default: 0)
--nocomp disable compression
--sockbuf value per-socket buffer in bytes (default: 4194304)
--smuxver value specify smux version, available 1,2 (default: 2)
--smuxbuf value the overall de-mux buffer in bytes (default: 4194304)
--framesize value smux max frame size (default: 8192)
--streambuf value per stream receive buffer in bytes, smux v2+ (default: 2097152)
--keepalive value seconds between heartbeats (default: 10)
--closewait value the seconds to wait before tearing down a connection (default: 0)
--snmplog value collect snmp to file, aware of timeformat in golang, like: ./snmp-20060102.log
--snmpperiod value snmp collect period, in seconds (default: 60)
--log value specify a log file to output, default goes to stderr
--quiet to suppress the 'stream open/close' messages
--tcp to emulate a TCP connection(linux)
-c value config from json file, which will override the command from shell
--pprof start profiling server on :6060
--help, -h show help
--version, -v print the version
$ ./server_freebsd_amd64 -h
NAME:
kcptun - server(with SMUX)
USAGE:
server_freebsd_amd64 [global options] command [command options] [arguments...]
VERSION:
20251124
COMMANDS:
help, h Shows a list of commands or help for one command
GLOBAL OPTIONS:
--listen value, -l value kcp server listen address, eg: "IP:29900" for a single port, "IP:minport-maxport" for port range (default: ":29900")
--target value, -t value target server address, or path/to/unix_socket (default: "127.0.0.1:12948")
--key value pre-shared secret between client and server (default: "it's a secrect") [$KCPTUN_KEY]
--crypt value aes, aes-128, aes-128-gcm, aes-192, salsa20, blowfish, twofish, cast5, 3des, tea, xtea, xor, sm4, none, null (default: "aes")
--QPP enable Quantum Permutation Pads(QPP)
--QPPCount value the prime number of pads to use for QPP: The more pads you use, the more secure the encryption. Each pad requires 256 bytes. (default: 61)
--mode value profiles: fast3, fast2, fast, normal, manual (default: "fast")
--mtu value set maximum transmission unit for UDP packets (default: 1350)
--ratelimit value set maximum outgoing speed (in bytes per second) for a single KCP connection, 0 to disable. Also known as packet pacing. (default: 0)
--sndwnd value set send window size(num of packets) (default: 1024)
--rcvwnd value set receive window size(num of packets) (default: 1024)
--datashard value, --ds value set reed-solomon erasure coding - datashard (default: 10)
--parityshard value, --ps value set reed-solomon erasure coding - parityshard (default: 3)
--dscp value set DSCP(6bit) (default: 0)
--nocomp disable compression
--sockbuf value per-socket buffer in bytes (default: 4194304)
--smuxver value specify smux version, available 1,2 (default: 2)
--smuxbuf value the overall de-mux buffer in bytes (default: 4194304)
--framesize value smux max frame size (default: 8192)
--streambuf value per stream receive buffer in bytes, smux v2+ (default: 2097152)
--keepalive value seconds between heartbeats (default: 10)
--closewait value the seconds to wait before tearing down a connection (default: 30)
--snmplog value collect snmp to file, aware of timeformat in golang, like: ./snmp-20060102.log
--snmpperiod value snmp collect period, in seconds (default: 60)
--pprof start profiling server on :6060
--log value specify a log file to output, default goes to stderr
--quiet to suppress the 'stream open/close' messages
--tcp to emulate a TCP connection(linux)
-c value config from json file, which will override the command from shell
--help, -h show help
--version, -v print the version
kcptun can dial across a port range to avoid ISP QoS throttling or port-based interference.
How it works:
- Address format: IP:min-max (e.g., 1.2.3.4:3000-4000).
- On each new connection, the client randomly picks one port in the given range and connects to it.
- The server must listen on the same port range so it can accept connections on any port within that range.
Under the hood, addresses are parsed and validated as [host, minPort, maxPort] (see std/multiport.go), and the client selects a random port in [minPort, maxPort] for each session (see client/dial.go).
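The parse-then-pick behaviour described above can be sketched as follows. This is an illustrative approximation, not the code from std/multiport.go or client/dial.go; the function names `parseRange` and `pickPort` are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
	"net"
	"strconv"
	"strings"
)

// parseRange splits "host:port" or "host:min-max" into a host and port bounds.
func parseRange(addr string) (string, int, int, error) {
	host, ports, err := net.SplitHostPort(addr)
	if err != nil {
		return "", 0, 0, err
	}
	lo, hi, isRange := strings.Cut(ports, "-")
	if !isRange {
		hi = lo // single port: min == max
	}
	minPort, err := strconv.Atoi(lo)
	if err != nil {
		return "", 0, 0, err
	}
	maxPort, err := strconv.Atoi(hi)
	if err != nil {
		return "", 0, 0, err
	}
	if minPort < 1 || maxPort > 65535 || minPort > maxPort {
		return "", 0, 0, fmt.Errorf("invalid port range %q", ports)
	}
	return host, minPort, maxPort, nil
}

// pickPort returns a uniformly random port in [minPort, maxPort].
func pickPort(minPort, maxPort int) int {
	return minPort + rand.Intn(maxPort-minPort+1)
}

func main() {
	host, minPort, maxPort, err := parseRange("1.2.3.4:3000-4000")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// One port is chosen per new session; it does not change mid-connection.
	fmt.Println(net.JoinHostPort(host, strconv.Itoa(pickPort(minPort, maxPort))))
}
```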
Usage:
- Server (listen on a range):
  ./server_linux_amd64 -l ":3000-4000" ...
  Note: open the UDP ports 3000–4000 in your firewall.
- Client (dial to a range):
  ./client_linux_amd64 -r "SERVER_IP:3000-4000" ...
Each new kcptun UDP/KCP session uses one randomly selected port from the range; sessions do not hop ports mid-connection.
Notes:
- Valid ranges are 1–65535 with min <= max.
- Single-port usage still works: IP:29900 (no hyphen).
- Works with --tcp mode as well; the remote port is still chosen from the range before initializing the connection.
kcptun supports userspace packet pacing to smooth out data transmission.
Why use it?
Without pacing, KCP may send data in large bursts (micro-bursts). These sudden spikes can overflow the network interface card (NIC) buffers or the OS kernel's UDP buffer, causing local packet drops before the data even leaves your server. This is especially common on high-speed links or restricted environments.
How to use:
Use --ratelimit <value> to set the maximum outgoing speed (in bytes per second) for a single KCP connection.
- Example: --ratelimit 1048576 limits the speed to 1 MB/s.
- Default: 0 (unlimited).
Benefits:
- Prevents Kernel Drops: Reduces the risk of ENOBUFS errors and kernel-level packet drops.
- Smoother Traffic: Creates a more consistent flow of packets, which is friendlier to intermediate routers and reduces jitter.
- Bandwidth Control: Useful for limiting upload speed on asymmetric networks (e.g., ADSL/Cable).
kcptun uses Reed-Solomon Codes to recover lost packets, which significantly improves data throughput on lossy networks.
You can configure the FEC parameters using the following flags:
- --datashard, -ds: Number of data shards (default: 10).
- --parityshard, -ps: Number of parity shards (default: 3).
How it works:
For every datashard packets sent, parityshard redundant packets are generated and sent. This allows the receiver to recover the original data even if up to parityshard packets are lost within the group of datashard + parityshard packets.
Overhead:
The bandwidth overhead can be calculated as: parityshard / datashard.
For the default setting (10 data, 3 parity), the overhead is 30%.
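The overhead formula and the recovery condition can be checked with a few lines of Go. This is a minimal sketch of the arithmetic only; the helper names `fecOverhead` and `recoverable` are hypothetical and are not part of kcptun.

```go
package main

import "fmt"

// fecOverhead returns the extra bandwidth used by parity: parityshard / datashard.
func fecOverhead(dataShards, parityShards int) float64 {
	return float64(parityShards) / float64(dataShards)
}

// recoverable reports whether a group can be rebuilt: at most parityShards
// packets may be lost out of the datashard + parityshard packets sent.
func recoverable(parityShards, lost int) bool {
	return lost <= parityShards
}

func main() {
	fmt.Printf("overhead: %.0f%%\n", fecOverhead(10, 3)*100)
	fmt.Println(recoverable(3, 3), recoverable(3, 4))
}
```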
Configuration Guide:
- AutoTune: The receiver automatically detects and adapts to the sender's FEC parameters (DataShard/ParityShard), so you can adjust them on one side without restarting the other.
- Tuning:
  - Increase -parityshard to improve reliability on highly lossy networks, at the cost of higher bandwidth usage.
  - Decrease -parityshard to reduce bandwidth overhead if the network quality is good.
- Disable FEC: Set --parityshard 0 to disable Forward Error Correction. This saves CPU and bandwidth but reduces reliability on unstable networks.
Long-Distance Communication:
In long-distance communication (e.g., cross-continent), the Round-Trip Time (RTT) is high. If a packet is lost, waiting for retransmission (RTO) incurs a significant latency penalty. FEC allows the receiver to reconstruct lost packets immediately without waiting for retransmission, making it highly effective for reducing latency in high-RTT environments.
Differentiated Services (DiffServ) is a computer networking architecture that specifies a simple, scalable, and coarse-grained mechanism for classifying and managing network traffic and providing Quality of Service (QoS) on modern IP networks. DiffServ can, for example, be used to provide low-latency service to critical network traffic such as voice or streaming media while providing simple best-effort service to non-critical traffic such as web browsing or file transfers.
DiffServ uses a 6-bit differentiated services code point (DSCP) in the 8-bit differentiated services field (DS field) in the IP header for packet classification purposes. The DS field and ECN field replace the outdated IPv4 TOS field.
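As a small illustration of the layout described above, the 6-bit DSCP occupies the upper six bits of the 8-bit DS field, and the lower two bits carry ECN. The sketch below shows the resulting byte for a given -dscp value; the helper name `dsField` is hypothetical, and this is not a claim about how kcptun sets the field internally.

```go
package main

import "fmt"

// dsField places a 6-bit DSCP value in the upper six bits of the DS byte,
// leaving the two ECN bits at zero.
func dsField(dscp uint8) (uint8, error) {
	if dscp > 63 {
		return 0, fmt.Errorf("dscp %d does not fit in 6 bits", dscp)
	}
	return dscp << 2, nil
}

func main() {
	// 46 is the value used in the example commands (-dscp 46).
	b, err := dsField(46)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("DS field byte: %d (0x%02x)\n", b, b) // prints 184 (0xb8)
}
```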
Set each side with -dscp value. Here are some commonly used DSCP values.
kcptun includes built-in packet encryption powered by various block encryption algorithms operating in Cipher Feedback Mode or AEAD. For each packet to be sent, the encryption process begins by encrypting a nonce from the system entropy, ensuring that encrypting identical plaintexts never produces identical ciphertexts.
Packet contents are fully encrypted, including headers (FEC, KCP), checksums, and data. Note that regardless of which encryption method you use in your upper layer, if you disable kcptun encryption by specifying -crypt none, the transmission will be insecure because the header remains PLAINTEXT, making it susceptible to tampering attacks such as manipulation of the sliding window size, round-trip time, FEC properties, and checksums. aes-128 is recommended for minimal encryption since modern CPUs include AES-NI instructions and perform even better than salsa20 (see the table below).
AEAD Support:
Starting from v20251212, kcptun supports aes-128-gcm, which is an Authenticated Encryption with Associated Data (AEAD) algorithm. It provides both confidentiality and data integrity, effectively preventing ciphertext tampering.
Other possible attacks against kcptun include:
- Traffic analysis: data flow patterns from specific websites may be identifiable during data exchange. This type of eavesdropping has been mitigated by adapting smux to mix data streams and introduce noise. A perfect solution has not yet emerged; theoretically, shuffling/mixing messages across a larger-scale network may further mitigate this problem.
- Replay attack: since asymmetric encryption has not been integrated into kcptun, capturing and replaying packets on a different machine is possible. (Note: hijacking sessions and decrypting contents remains impossible.) Therefore, upper layers must implement an asymmetric cryptosystem or a derived MAC to guarantee authenticity and prevent replay attacks (ensuring each message is processed exactly once). This vulnerability can only be eliminated by signing requests with private keys or employing an HMAC-based mechanism following initial authentication.
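An HMAC-based mechanism in an upper-layer protocol could look roughly like the sketch below. It is an assumption-laden illustration, not part of kcptun: the `seal` and `receiver.open` names are hypothetical, the key is assumed to come from a prior authentication step, and sequence numbers are assumed to start at 1 and arrive in order (as they would over a reliable stream).

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// seal builds seq(8 bytes) || payload || HMAC-SHA256(seq || payload).
func seal(key []byte, seq uint64, payload []byte) []byte {
	msg := make([]byte, 8, 8+len(payload)+sha256.Size)
	binary.BigEndian.PutUint64(msg, seq)
	msg = append(msg, payload...)
	mac := hmac.New(sha256.New, key)
	mac.Write(msg)
	return mac.Sum(msg) // appends the tag to msg
}

// receiver remembers the highest sequence number accepted so far.
type receiver struct {
	key     []byte
	lastSeq uint64
}

// open verifies the tag and rejects sequence numbers that do not increase.
func (r *receiver) open(msg []byte) ([]byte, bool) {
	if len(msg) < 8+sha256.Size {
		return nil, false
	}
	body, tag := msg[:len(msg)-sha256.Size], msg[len(msg)-sha256.Size:]
	mac := hmac.New(sha256.New, r.key)
	mac.Write(body)
	if !hmac.Equal(tag, mac.Sum(nil)) {
		return nil, false // tampered or wrong key
	}
	seq := binary.BigEndian.Uint64(body[:8])
	if seq <= r.lastSeq {
		return nil, false // replayed or stale message
	}
	r.lastSeq = seq
	return body[8:], true
}

func main() {
	key := []byte("session key from initial authentication (assumed)")
	r := &receiver{key: key}
	m := seal(key, 1, []byte("hello"))
	_, first := r.open(m)
	_, second := r.open(m)
	fmt.Println(first, second) // prints true false
}
```

Using a fresh key per session matters here: a message sealed under one session key fails verification if replayed into another session.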
Important:
- -crypt and -key must be identical on both the KCP Client and KCP Server.
- -crypt xor is insecure and vulnerable to known-plaintext attacks. Do not use this unless you fully understand the implications. (Cryptanalysis note: any type of counter mode is insecure for packet encryption due to shortened counter periods that lead to IV/nonce collisions.)
Benchmarks for crypto algorithms supported by kcptun:
BenchmarkSM4-4 50000 32087 ns/op 93.49 MB/s 0 B/op 0 allocs/op
BenchmarkAES128-4 500000 3274 ns/op 916.15 MB/s 0 B/op 0 allocs/op
BenchmarkAES192-4 500000 3587 ns/op 836.34 MB/s 0 B/op 0 allocs/op
BenchmarkAES256-4 300000 3828 ns/op 783.60 MB/s 0 B/op 0 allocs/op
BenchmarkTEA-4 100000 15359 ns/op 195.32 MB/s 0 B/op 0 allocs/op
BenchmarkXOR-4 20000000 90.2 ns/op 33249.02 MB/s 0 B/op 0 allocs/op
BenchmarkBlowfish-4 50000 26885 ns/op 111.58 MB/s 0 B/op 0 allocs/op
BenchmarkNone-4 30000000 45.8 ns/op 65557.11 MB/s 0 B/op 0 allocs/op
BenchmarkCast5-4 50000 34370 ns/op 87.29 MB/s 0 B/op 0 allocs/op
Benchmark3DES-4 10000 117893 ns/op 25.45 MB/s 0 B/op 0 allocs/op
BenchmarkTwofish-4 50000 33477 ns/op 89.61 MB/s 0 B/op 0 allocs/op
BenchmarkXTEA-4 30000 45825 ns/op 65.47 MB/s 0 B/op 0 allocs/op
BenchmarkSalsa20-4 500000 3282 ns/op 913.90 MB/s 0 B/op 0 allocs/op
Benchmark result from openssl
$ openssl speed -evp aes-128-cfb
Doing aes-128-cfb for 3s on 16 size blocks: 157794127 aes-128-cfb's in 2.98s
Doing aes-128-cfb for 3s on 64 size blocks: 39614018 aes-128-cfb's in 2.98s
Doing aes-128-cfb for 3s on 256 size blocks: 9971090 aes-128-cfb's in 2.99s
Doing aes-128-cfb for 3s on 1024 size blocks: 2510877 aes-128-cfb's in 2.99s
Doing aes-128-cfb for 3s on 8192 size blocks: 310865 aes-128-cfb's in 2.98s
OpenSSL 1.0.2p 14 Aug 2018
built on: reproducible build, date unspecified
options:bn(64,64) rc4(ptr,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: clang -I. -I.. -I../include -fPIC -fno-common -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -arch x86_64 -O3 -DL_ENDIAN -Wall -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
aes-128-cfb 847216.79k 850770.86k 853712.05k 859912.39k 854565.80k
Encryption performance in kcptun is comparable to that of the OpenSSL library, if not faster.
Quantum Resistance, also known as quantum-secure, post-quantum, or quantum-safe cryptography, refers to cryptographic algorithms that can withstand potential code-breaking attempts by quantum computers. Starting with version v20240701, kcptun adopts QPP based on Kuang's Quantum Permutation Pad for quantum-resistant communication.
To enable QPP in kcptun, set the following parameters:
--QPP Enable Quantum Permutation Pads (QPP)
--QPPCount value The prime number of pads to use for QPP. More pads provide greater encryption security. Each pad requires 256 bytes. (default: 61)
You can also specify
"qpp":true,
"qpp-count":61,
in your client and server-side JSON configuration files. These two parameters must be identical on both sides.
- To achieve effective quantum resistance, specify at least 211 bytes in the -key parameter and ensure -QPPCount is at least 7.
- Ensure that -QPPCount is COPRIME to 8 (or simply set it to a PRIME number), such as: 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199...
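The two constraints on -QPPCount stated above (at least 7, and coprime to 8) can be checked mechanically. The sketch below is a hypothetical helper, not a function from kcptun; `validPadCount` is an illustrative name.

```go
package main

import "fmt"

// gcd returns the greatest common divisor using Euclid's algorithm.
func gcd(a, b int) int {
	for b != 0 {
		a, b = b, a%b
	}
	return a
}

// validPadCount applies the two rules from the text:
// the count is at least 7 and shares no factor with 8.
func validPadCount(n int) bool {
	return n >= 7 && gcd(n, 8) == 1
}

func main() {
	for _, n := range []int{61, 64, 101, 5} {
		fmt.Println(n, validPadCount(n))
	}
}
```

Since 8 is a power of two, being coprime to 8 simply means the count is odd, which is why any prime of 7 or more qualifies.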
Routers and mobile devices are susceptible to memory constraints. Setting the GOGC environment variable (e.g., GOGC=20) will cause the garbage collector to recycle memory more aggressively. Reference: https://blog.golang.org/go15gc
Primary memory allocation is performed using a global buffer pool xmit.Buf in kcp-go. When bytes need to be allocated, they are obtained from this pool, and a fixed-capacity 1500-byte buffer (mtuLimit) is returned. The rx queue, tx queue, and fec queue all receive bytes from this pool and return them after use to prevent unnecessary zeroing of bytes.
The pool mechanism maintains a high watermark for slice objects. These in-flight objects from the pool survive periodic garbage collection, while the pool retains the ability to return memory to the runtime when idle. The parameters -sndwnd, -rcvwnd, -ds, and -ps affect this high watermark; larger values result in greater memory consumption.
The -smuxbuf parameter also affects maximum memory consumption and maintains a delicate balance between concurrency and resource usage. You can increase this value (default 4MB) to boost concurrency if you have many clients to serve and a powerful server. Conversely, you can decrease this value to serve only 1-2 clients if you're running the program on an embedded SoC system with limited memory. (Note that the -smuxbuf value is not directly proportional to concurrency; testing is required.)
kcptun has built-in Snappy compression for streams:
Snappy is a compression/decompression library. It does not aim for maximum compression, or compatibility with any other compression library; instead, it aims for very high speeds and reasonable compression. For instance, compared to the fastest mode of zlib, Snappy is an order of magnitude faster for most inputs, but the resulting compressed files are anywhere from 20% to 100% bigger.
Reference: http://google.github.io/snappy/
Compression can save bandwidth for PLAINTEXT data and is particularly useful for specific scenarios such as cross-datacenter replication. Compressing redologs in database management systems or Kafka-like message queues before transferring data streams across continents can significantly improve speed.
Compression is enabled by default. You can disable it by setting -nocomp on BOTH the KCP Client and KCP Server (the setting MUST be IDENTICAL on both sides).
type Snmp struct {
BytesSent uint64 // bytes sent from upper level
BytesReceived uint64 // bytes received to upper level
MaxConn uint64 // max number of connections ever reached
ActiveOpens uint64 // accumulated active open connections
PassiveOpens uint64 // accumulated passive open connections
CurrEstab uint64 // current number of established connections
InErrs uint64 // UDP read errors reported from net.PacketConn
InCsumErrors uint64 // checksum errors from CRC32
KCPInErrors uint64 // packet input errors reported from KCP
InPkts uint64 // incoming packets count
OutPkts uint64 // outgoing packets count
InSegs uint64 // incoming KCP segments
OutSegs uint64 // outgoing KCP segments
InBytes uint64 // UDP bytes received
OutBytes uint64 // UDP bytes sent
RetransSegs uint64 // accumulated retransmitted segments
FastRetransSegs uint64 // accumulated fast retransmitted segments
EarlyRetransSegs uint64 // accumulated early retransmitted segments
LostSegs uint64 // number of segs inferred as lost
RepeatSegs uint64 // number of segs duplicated
FECRecovered uint64 // correct packets recovered from FEC
FECErrs uint64 // incorrect packets recovered from FEC
FECParityShards uint64 // FEC segments received
FECShortShards uint64 // number of data shards that's not enough for recovery
}
Sending a SIGUSR1 signal to the KCP Client or KCP Server will dump SNMP information to the console, similar to /proc/net/snmp. You can use this information for fine-grained tuning.
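As one possible way to read these counters for tuning, the sketch below derives two ratios from a snapshot. The choice of ratios is a suggestion rather than something prescribed by kcptun, the `counters` type is a hypothetical subset of the Snmp fields, and the sample numbers are made up.

```go
package main

import "fmt"

// counters holds a hypothetical subset of the Snmp fields listed above.
type counters struct {
	OutSegs      uint64
	RetransSegs  uint64
	LostSegs     uint64
	FECRecovered uint64
}

// ratio returns part/total, or 0 when total is zero.
func ratio(part, total uint64) float64 {
	if total == 0 {
		return 0
	}
	return float64(part) / float64(total)
}

func main() {
	// Made-up sample values for illustration only.
	c := counters{OutSegs: 100000, RetransSegs: 2500, LostSegs: 1800, FECRecovered: 900}
	fmt.Printf("retransmitted share of outgoing segments: %.2f%%\n", ratio(c.RetransSegs, c.OutSegs)*100)
	fmt.Printf("FEC recoveries per inferred loss: %.2f\n", ratio(c.FECRecovered, c.LostSegs))
}
```

A rising retransmission share might suggest trying a larger -parityshard, while a share near zero might suggest that FEC overhead could be reduced; both are heuristics to verify by testing.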
These parameters MUST be IDENTICAL on BOTH sides:
- --key and --crypt
- --QPP and --QPPCount
- --nocomp
- --smuxver
https://github.com/skywind3000/kcp/blob/master/README.en.md#protocol-configuration
-mode manual -nodelay 1 -interval 20 -resend 2 -nc 1
Low-level KCP configuration can be modified using manual mode as shown above. Make sure you fully UNDERSTAND what these parameters mean before making ANY manual adjustments.
- https://github.com/skywind3000/kcp -- KCP - A Fast and Reliable ARQ Protocol.
- https://github.com/xtaci/kcp-go/ -- A Production-Grade Reliable-UDP Library for golang
- https://github.com/klauspost/reedsolomon -- Reed-Solomon Erasure Coding in Go.
- https://en.wikipedia.org/wiki/Differentiated_services -- DSCP.
- http://google.github.io/snappy/ -- A fast compressor/decompressor.
- https://www.backblaze.com/blog/reed-solomon/ -- Reed-Solomon Explained.
- http://www.qualcomm.cn/products/raptorq -- RaptorQ Forward Error Correction Scheme for Object Delivery.
- https://en.wikipedia.org/wiki/PBKDF2 -- Key stretching.
- http://blog.appcanary.com/2016/encrypt-or-compress.html -- Should you encrypt or compress first?
- https://github.com/hashicorp/yamux -- Connection multiplexing library.
- https://tools.ietf.org/html/rfc6937 -- Proportional Rate Reduction for TCP.
- https://tools.ietf.org/html/rfc5827 -- Early Retransmit for TCP and Stream Control Transmission Protocol (SCTP).
- http://http2.github.io/ -- What is HTTP/2?
- http://www.lartc.org/ -- Linux Advanced Routing & Traffic Control
- https://en.wikipedia.org/wiki/Noisy-channel_coding_theorem -- Noisy channel coding theorem
- https://zhuanlan.zhihu.com/p/53849089 -- kcptun development notes (in Chinese)
Click here to donate.
(Note: kcptun does not have an account on any social networking site; beware of scammers.)