A lightweight, production-ready machine learning system for detecting DNS tunneling attacks through traffic analysis, feature engineering, and automated classification. The system parses DNS traffic from PCAP/PCAPNG/CSV files, extracts behavioral features, and applies a trained ML model to identify covert data-exfiltration channels.
The project provides a complete pipeline:
- DNS traffic parsing (PCAP/PCAPNG/CSV)
- Feature extraction & preprocessing
- ML-based classification (normal vs tunneling)
- FastAPI backend for real-time prediction
- Research notebook for experimentation
src
┣ artifacts
┃ ┣ detector.pkl
┃ ┗ processor.pkl
┣ assets/files
┃ ┣ dnsfiltered/dnsfiltered.pcap
┃ ┣ dns_testing/dns_testing.csv
┃ ┗ malll/malll.pcapng
┣ controllers
┃ ┗ FeatureExtractorController.py
┣ data
┃ ┣ info/information.py
┃ ┣ processed/dns_train_dataset_processed.csv
┃ ┗ raw
┃   ┣ dns_testing.csv
┃   ┗ dns_tunneling_dataset.csv
┣ notebooks/DNS_Tunneling.ipynb
┣ routes
┃ ┣ data.py
┃ ┗ prediction.py
┣ services
┃ ┣ detector.py
┃ ┣ parser.py
┃ ┗ processor.py
┣ utilities/file.py
┣ app.py
┗ main.py
git clone https://github.com/USERNAME/DNS-Tunneling-Detection.git
cd DNS-Tunneling-Detection
pip install -r requirements.txt
$ uvicorn main:app --reload --port 5000
$ streamlit run app.py
Open API Docs:
http://127.0.0.1:5000/docs
Extracts DNS queries from PCAP/CSV using the internal parser.
Includes:
- Query length
- Entropy
- Subdomain depth
- Frequency patterns
- NXDomain ratios
- Time‑delta features
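The features listed above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the project's actual extractor: the function names, the feature keys, and the rule that the last two labels form the registered domain are all assumptions made for the example.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def lexical_features(qname: str) -> dict:
    """Per-query features; the key names here are hypothetical."""
    name = qname.rstrip(".").lower()
    labels = name.split(".") if name else []
    # Assumption: the last two labels are the registered domain.
    # A real extractor would likely consult the Public Suffix List instead.
    subdomain = ".".join(labels[:-2])
    return {
        "query_length": len(name),
        "entropy": shannon_entropy(subdomain),
        "subdomain_depth": max(len(labels) - 2, 0),
    }


def window_features(timestamps: list, rcodes: list) -> dict:
    """Aggregate features over a window of queries from one client."""
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    nx = sum(1 for r in rcodes if r == 3)  # DNS RCODE 3 is NXDOMAIN
    return {
        "mean_time_delta": sum(deltas) / len(deltas) if deltas else 0.0,
        "nxdomain_ratio": nx / len(rcodes) if rcodes else 0.0,
    }
```

Tunneled queries tend to carry long, high-entropy subdomains, which is why length, entropy, and depth are computed per query, while the NXDomain ratio and time deltas only make sense over a window of queries.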
Uses:
- detector.pkl (ML model)
- processor.pkl (scaler/encoder pipeline)
Output:
0 = Normal Traffic
1 = DNS Tunneling
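As a rough sketch of how the two artifacts could fit together at prediction time, the snippet below applies a preprocessing step and then a classifier, and maps the numeric output to the labels above. It assumes the objects follow the scikit-learn convention of `transform()` and `predict()`; the `classify` helper and both stand-in classes are hypothetical and exist only so the example runs without the real `.pkl` files.

```python
LABELS = {0: "Normal Traffic", 1: "DNS Tunneling"}


def classify(rows, processor, detector):
    """Transform raw feature rows, then map predicted class ids to labels.

    Assumption: `processor` exposes transform() and `detector` exposes
    predict(), as scikit-learn estimators do.
    """
    prepared = processor.transform(rows)
    return [LABELS[int(p)] for p in detector.predict(prepared)]


class IdentityProcessor:
    """Stand-in for the unpickled processor: returns rows unchanged."""

    def transform(self, rows):
        return rows


class ThresholdDetector:
    """Stand-in for the unpickled model: flags rows whose first feature is large."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, rows):
        return [1 if row[0] > self.threshold else 0 for row in rows]


# Two rows with a single hypothetical feature (query length).
result = classify([[12.0], [180.0]], IdentityProcessor(), ThresholdDetector(100.0))
print(result)
```

In the real pipeline the stand-ins would be replaced by the objects deserialized from `processor.pkl` and `detector.pkl`.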
DNS_Tunneling.ipynb contains:
- Data exploration
- Feature engineering
- ML training
- Visualization
| Format | Description |
|---|---|
| .pcap | Raw DNS packets |
| .pcapng | Modern PCAP |
| .csv | Tabular DNS data |
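One way to tell these formats apart without trusting the file extension is to inspect the first four bytes of the file. The sketch below is an assumption about how such a check could be written, not a description of the project's parser; `detect_format` is a hypothetical helper, and treating anything unrecognized as CSV is a simplification.

```python
import struct

# Classic pcap magic numbers, read as big-endian integers: microsecond and
# nanosecond variants, each in both byte orders.
PCAP_MAGICS = {0xA1B2C3D4, 0xD4C3B2A1, 0xA1B23C4D, 0x4D3CB2A1}
# pcapng files start with a Section Header Block whose type is a
# byte-order-independent palindrome.
PCAPNG_SHB = 0x0A0D0D0A


def detect_format(header: bytes) -> str:
    """Classify a file by its first four bytes; fall back to 'csv'."""
    if len(header) >= 4:
        (magic,) = struct.unpack(">I", header[:4])
        if magic == PCAPNG_SHB:
            return "pcapng"
        if magic in PCAP_MAGICS:
            return "pcap"
    # Assumption: anything that is not a capture file is tabular text.
    return "csv"
```

A caller would typically read the header with `open(path, "rb").read(4)` and dispatch to the matching parser.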
Pull requests are welcome.