A high-performance K-Nearest Neighbors classifier implementation in C, optimized for embedded systems like the ESP32. This project classifies ultrasonic signals from electrical insulators according to their surface condition (one baseline class and four defect/contamination classes).
```
knn_c_v2/
├── bin/                      # Compiled executables
│   ├── knn.exe               # Main classifier with accuracy testing
│   ├── time.exe              # Inference time measurement
│   └── main_esp.exe          # Embedded version (no printf)
├── obj/                      # Object files
│   ├── ft_knn.o
│   └── main.o
├── data/                     # Datasets
│   ├── dataset_features.csv  # Main feature dataset (1465 samples)
│   ├── dataset_improved.csv  # Enhanced dataset
│   └── dataset_lda.csv       # LDA-processed dataset
├── main.c                    # Main program with debugging
├── infer_time.c              # Performance measurement
├── main_esp.c                # Embedded system version
├── ft_knn.h                  # Header with function declarations
├── ft_knn.c                  # Core KNN implementation
├── Makefile                  # Multi-platform build system
└── README.md                 # This file
```
The project uses a smart Makefile that auto-detects your platform (Windows/Linux) and provides convenient aliases:
```
# Build main classifier
make

# Build inference time measurement
make time

# Build embedded version
make esp

# Clean all build artifacts
make clean

# Rebuild everything
make re

# Show all available targets
make help
```

```
# Run main classifier (shows dataset info and accuracy)
./bin/knn.exe

# Measure inference time and energy consumption
./bin/time.exe

# Run embedded version (silent, for microcontrollers)
./bin/main_esp.exe
```

- Total samples: 1,465
- Classes: 5 (Sem Corona [no corona], Oxidado [oxidized], Fezes [feces], Salgado [salty], Flashover)
- Features per sample: 12 (mean, std, and 10 largest peaks)
- Train/Test split: 80%/20% (1,172 train, 293 test)
- Class 0 (Sem Corona): 416 samples (28.4%)
- Class 1 (Oxidado): 290 samples (19.8%)
- Class 2 (Fezes): 282 samples (19.3%)
- Class 3 (Salgado): 238 samples (16.2%)
- Class 4 (Flashover): 239 samples (16.3%)
- Average inference time: ~0.287 ms (1000 executions)
- Estimated energy consumption: ~0.000076 joules per inference
- Optimal K value: 15 neighbors
- Platform: C implementation with Z-score normalization
- Voltage: 3.3V (typical for ESP32)
- Current: 0.08A (estimated during processing)
- Formula: E = V × I × t
- K-Nearest Neighbors: Euclidean distance with qsort optimization
- Z-score Normalization: Feature standardization for better accuracy
- Dataset Shuffling: Fisher-Yates algorithm for random data splitting
- CSV Parsing: Robust file loading with error handling
- Multi-platform support: Windows, Linux, ESP32
- Organized output: Executables in `bin/`, objects in `obj/`
- Convenient aliases: `time` and `esp` for quick building
- Dependency management: Automatic directory creation
- Clean targets: Complete artifact removal
- Modular design: Separate files for different purposes
- Embedded-ready: Version without stdio dependencies
- Memory efficient: Static arrays, no dynamic allocation
- Cross-platform: Standard C99 with minimal dependencies
```c
// Load and prepare dataset
Dataset dataset;
load_dataset_from_csv(DATASET_PATH, &dataset);

// Split and normalize
Dataset train, test;
shuffle_dataset(&dataset);
split_dataset(&dataset, &train, &test, 0.8f);

float mean[MAX_FEATURES], std[MAX_FEATURES];
compute_mean_std(&train, mean, std);
apply_normalization(&train, mean, std);
apply_normalization(&test, mean, std);

// Classify a sample
int k = 15;
int predicted_class = classify(&test.samples[0], &train, train.num_samples, k);
```

| Target | Purpose | Output | Use Case |
|---|---|---|---|
| `make` | Main classifier | `bin/knn.exe` | Development and testing |
| `make time` | Performance test | `bin/time.exe` | Benchmarking |
| `make esp` | Embedded version | `bin/main_esp.exe` | Microcontroller deployment |
| `make help` | Show help | Terminal output | Quick reference |
- Compiler: GCC with C99 support
- Platform: Windows, Linux, or ESP32 toolchain
- Dependencies: Standard C library, math library (-lm)
| File | Purpose |
|---|---|
| `ft_knn.c/h` | Core KNN algorithm implementation |
| `main.c` | Development version with debugging output |
| `infer_time.c` | Performance measurement and energy estimation |
| `main_esp.c` | Production version for embedded systems |
| `Makefile` | Cross-platform build automation |
- `-Wall -Wextra -Werror`: Strict error checking
- `-std=c99`: C99 standard compliance
- `-lm`: Math library linking
- Fast sorting: Uses standard library `qsort()` for neighbor ranking
- Memory layout: Contiguous arrays for cache efficiency
- Minimal dependencies: Only standard C library required
- Platform detection: Automatic Windows/Linux build configuration
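A minimal sketch of the platform-detection idea in GNU Make; the project's real Makefile has more targets (`time`, `esp`, `re`, `help`) and may detect platforms differently.

```makefile
# Windows sets the OS environment variable to Windows_NT
ifeq ($(OS),Windows_NT)
    EXE   := .exe
    RMDIR := rmdir /S /Q
else
    EXE   :=
    RMDIR := rm -rf
endif

CC     := gcc
CFLAGS := -Wall -Wextra -Werror -std=c99

bin/knn$(EXE): obj/ft_knn.o obj/main.o
	mkdir -p bin
	$(CC) $^ -o $@ -lm

obj/%.o: %.c ft_knn.h
	mkdir -p obj
	$(CC) $(CFLAGS) -c $< -o $@

clean:
	$(RMDIR) bin obj
```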
**Problem:** Build fails with "command not found"

**Solution:** Ensure GCC is installed and in your PATH

**Problem:** Dataset not found

**Solution:** Verify `data/dataset_features.csv` exists in the project root

**Problem:** Different platforms show different performance

**Solution:** Performance varies by hardware; times shown are for reference
```
# Show all available build targets and options
make help
```

- Average time per inference (`k = 15`): 0.286 ms
- Measurement done with `clock()` directly over the `classify()` function
- Executed 1000 times with the same sample (`test.samples[0]`)
- No I/O or normalization steps included
- Formula: E = V × I × t
- Voltage: 3.3 V
- Estimated current: 80 mA (0.08 A)
- Average energy per inference: ~0.000075 J
| Memory Type | Value | Measurement Method |
|---|---|---|
| Code memory (.text) | 10,480 bytes (~10.2 KB) | Via `size bin/time.exe` |
| Global memory (.bss + .data) | 2,388 bytes (~2.3 KB) | Via `size bin/time.exe` |
| Local memory (stack) | ~90.1 KB | Manual estimation (see Stack Breakdown table below) |
| Total estimated RAM | ~92.4 KB | Sum of global RAM + stack |
| Component | Estimated Size |
|---|---|
| Dataset (1500 samples × 52 B) | ~78,000 bytes (~78 KB) |
| `Neighbor neighbors[1500]` | ~12,000 bytes (~12 KB) |
| `mean[12]` + `std[12]` | 96 bytes |
| Total estimated stack | ~90.1 KB |
This project implements the KNN classifier described in the research paper about ultrasonic signal classification for electrical insulator condition monitoring. The C implementation provides:
- Portability: Runs on desktop and embedded systems
- Performance: Optimized for real-time classification
- Accuracy: Maintains classification quality from Python prototype
- Efficiency: Low memory footprint and fast inference
- Project: SEMB
- Course: PPGCC IFCE
- Name: Alley P.