This project demonstrates gesture recognition using the Infineon BGT60TR13C FMCW radar, implemented for a custom TPU board. It focuses on simple gestures such as up, down, and hold, which are used to control a PyGame Dino game in real time. The goal is to showcase real-time embedded inference with minimal latency.
The main input is a Range-Doppler map, which is projected into the time domain to capture temporal features. This projection enables efficient temporal learning while reducing input dimensionality. An additional range-angle map projection has also been implemented, but it currently introduces latency and affects real-time performance.
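For illustration, here is a minimal NumPy sketch of this projection. The shapes, axis order, and the max-reduction are assumptions for the example, not necessarily the exact processing used in the code: each per-frame Range-Doppler map is collapsed along one axis and the results are stacked over time, yielding a Range-Time Map (RTM) and a Doppler-Time Map (DTM).

```python
import numpy as np

def project_rdm_sequence(rdm_seq: np.ndarray):
    """Collapse a sequence of Range-Doppler maps into two time-domain maps.

    rdm_seq: assumed shape (num_frames, num_range_bins, num_doppler_bins).
    Returns (rtm, dtm):
      rtm -- (num_range_bins, num_frames), reduction over the Doppler axis per frame
      dtm -- (num_doppler_bins, num_frames), reduction over the range axis per frame
    """
    rtm = rdm_seq.max(axis=2).T   # Range-Time Map
    dtm = rdm_seq.max(axis=1).T   # Doppler-Time Map
    return rtm, dtm

# Example: 32 frames of a 64x32 Range-Doppler map
rtm, dtm = project_rdm_sequence(np.random.rand(32, 64, 32))
print(rtm.shape, dtm.shape)   # (64, 32) (32, 32)
```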
Data Collection
To collect gesture data, run the following script:
python3 -m src.utils.rawdata_collect

Before starting the recording, set the appropriate parameters in the script:
- self.recording_type: specify the target gesture class (e.g., 'push', 'pull', 'hold', 'nothing')
- self.num_frames: define the number of frames to record per sequence
Furthermore, ensure that the time_per_frame setting used during data collection matches the one used during inference, to maintain consistent temporal resolution across the pipeline.
Gestures can be performed repeatedly during a single recording session. The raw data is stored as a NumPy array to keep it flexible for future use; the recordings are annotated in a later step.
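As a rough illustration of the stored format (the file name, frame shape, and frame count below are placeholders, not the layout actually produced by the collection script), one session could be persisted like this:

```python
import numpy as np

# Illustrative only: stand-in for the frames delivered by the radar during one session.
frames = [np.zeros((3, 32, 64), dtype=np.float32) for _ in range(200)]   # assumed (rx, chirps, samples) per frame

recording = np.stack(frames)                 # (num_frames, rx, chirps, samples)
np.save("raw_push_session_001.npy", recording)
print(recording.shape)                       # (200, 3, 32, 64)
```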
Annotation
To annotate the recorded data, use the annotation tool:
python3 -m src.utils.annotation

This script lets you label individual gesture segments in the collected recordings and automatically stores them as a CSV file. Each annotation consists of file_name, gesture, start_frame (where the gesture starts), and the number of samples in the recording.
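As an example of how the annotations can be consumed later, the sketch below reads them back and slices each recording into gesture segments. The CSV file name and column names are assumptions based on the description above, and the sample count is assumed to be the segment length:

```python
import numpy as np
import pandas as pd

# Column names follow the description above; the annotation tool defines the actual header.
annotations = pd.read_csv("annotations.csv")

for _, row in annotations.iterrows():
    recording = np.load(row["file_name"])                  # raw frames saved during data collection
    start = int(row["start_frame"])
    segment = recording[start : start + int(row["num_samples"])]   # one labelled gesture (assumed segment length)
    print(row["gesture"], segment.shape)
```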
Using IFX Dataset
To build the training dataset from the IFX recordings, run:

python3 -m src.train_utils.build_dataset.build_transformed \
--data_dir /home/phd_li/dataset/radar_gesture_dataset/fulldataset/ \
--output_dir (output directory) \
--none_class \
--stepsize 3

- none_class: include the none class.
- stepsize: use only every stepsize-th frame to reduce the effective frame rate of the dataset.
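In effect, the stepsize option temporally subsamples each recording, roughly as in this simplified NumPy sketch (shapes are illustrative):

```python
import numpy as np

step_size = 3
frames = np.random.rand(120, 64, 32)   # assumed layout: (num_frames, range_bins, doppler_bins)
subsampled = frames[::step_size]       # keep every step_size-th frame
print(subsampled.shape)                # (40, 64, 32)
```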
A simple CNN model is used to allow efficient deployment on the custom TPU board.
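The exact architecture is defined in the training code; the following is only a sketch of the kind of compact CNN meant here, with illustrative layer sizes and an assumed class name GestureCNN:

```python
import torch
import torch.nn as nn

class GestureCNN(nn.Module):
    """Small CNN sketch: a few conv blocks followed by a linear classifier."""

    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),   # keeps the classifier independent of the input resolution
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(x.flatten(1))

# e.g. 2 input channels (RTM + DTM) and 4 gesture classes
model = GestureCNN(in_channels=2, num_classes=4)
print(model(torch.randn(1, 2, 64, 32)).shape)   # torch.Size([1, 4])
```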
Training
To train the model on annotated radar data, run from the root:
python3 -m src.train_distributed --config config/config_file.yaml

This script loads the preprocessed dataset, trains the CNN on the gesture classes, and saves the resulting model for inference.
Config File
input_channels corresponds to the range-Doppler(-angle) input channels; output_classes corresponds to the number of gesture classes.
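For example, these keys can be read straight from the config and passed to the model constructor. This is a minimal sketch; only the two keys described above are assumed to exist in the file:

```python
import yaml

with open("config/config_file.yaml") as f:
    cfg = yaml.safe_load(f)

in_channels = cfg["input_channels"]   # number of range/Doppler(/angle) input channels
num_classes = cfg["output_classes"]   # number of gesture classes the CNN predicts
print(in_channels, num_classes)
```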
Inference
To run real-time inference with radar input and visualize outputs:
python3 -m src.realtime_inference

This script handles live radar input and displays both the Range-Time Map (RTM) and the Doppler-Time Map (DTM) in real time. It also performs live classification and outputs the predicted gesture.
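Structurally, the realtime loop looks roughly like the sketch below. The frame source, window length, bin counts, class order, and the untrained stand-in classifier are all placeholder assumptions; the real script uses the Radar SDK for frame acquisition and the trained model for classification.

```python
from collections import deque

import numpy as np
import torch
import torch.nn as nn

GESTURES = ["push", "pull", "hold", "nothing"]   # assumed class order
WINDOW = 32                                      # assumed frames per classification window
BINS = 32                                        # assumed range and Doppler bin count (kept equal for brevity)

def get_range_doppler_map() -> np.ndarray:
    """Placeholder for grabbing one radar frame and processing it into a Range-Doppler map."""
    return np.random.rand(BINS, BINS).astype(np.float32)

# Untrained stand-in for the CNN loaded from the training run.
model = nn.Sequential(nn.Flatten(), nn.Linear(2 * BINS * WINDOW, len(GESTURES)))

buffer = deque(maxlen=WINDOW)
for _ in range(WINDOW):                          # the real script keeps looping over incoming frames
    buffer.append(get_range_doppler_map())

seq = np.stack(buffer)                           # (WINDOW, range_bins, doppler_bins)
rtm = seq.max(axis=2).T                          # Range-Time Map
dtm = seq.max(axis=1).T                          # Doppler-Time Map
x = torch.from_numpy(np.stack([rtm, dtm])[None]) # (1, 2, BINS, WINDOW)
with torch.no_grad():
    print("predicted gesture:", GESTURES[int(model(x).argmax(dim=1))])
```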
The trained PyTorch model is converted to ONNX or TFLite format, and then compiled using Apache TVM for deployment on a custom TPU board. The final runtime model is executed using the C++ runtime located in the cpp_inference/ directory.
To convert the PyTorch model to ONNX or TFLite format, change the run_id in the script and run:

python3 -m src.utils.runtime_convert

This creates an intermediate format suitable for further optimization.
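Under the hood this relies on the standard exporters. A minimal sketch of the ONNX path is shown below; the stand-in model, input shape, tensor names, and file name are assumptions made to keep the example self-contained, and the TFLite path goes through additional tooling that is not shown here.

```python
import torch
import torch.nn as nn

# Stand-in for the trained gesture CNN; in practice the model is loaded from the chosen run_id.
model = nn.Sequential(nn.Conv2d(2, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 4))
model.eval()

dummy_input = torch.randn(1, 2, 32, 32)        # assumed (batch, channels, bins, frames)
torch.onnx.export(model, dummy_input, "gesture_model.onnx",
                  input_names=["input"], output_names=["logits"],
                  opset_version=13)
```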
We use TVM v0.13.0 to compile the model because of its flexibility in targeting multiple hardware platforms. Instead of converting from ONNX, we use TFLite as the input format to TVM, since it allows faster deployment after the conversion.
To compile the model using TVM, run:
python3 -m src.utils.tvm_transform

You can configure the output format by setting the compile_to argument:
- so: produces a device-specific shared object (.so) compiled model (C/C++ runtime).
- c: produces a tar archive containing C source code, which can be compiled later for any target (not device-specific).
The resulting .so file or .tar file (extracted) should be placed in cpp_inference/.
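Roughly, the TVM compilation step looks like the sketch below. The TFLite file name, input tensor name and shape, and the llvm host target are assumptions for the example; the actual script selects the target appropriate for the custom TPU board.

```python
import tvm
from tvm import relay
import tflite   # flatbuffer bindings used by TVM's TFLite frontend

# Load the TFLite flatbuffer produced by the runtime conversion step.
with open("gesture_model.tflite", "rb") as f:
    tflite_model = tflite.Model.GetRootAsModel(f.read(), 0)

# Input name, shape, and dtype must match the converted model; these are assumed values.
mod, params = relay.frontend.from_tflite(
    tflite_model,
    shape_dict={"input": (1, 2, 32, 32)},
    dtype_dict={"input": "float32"},
)

# "llvm" builds for the host CPU; the real script would use the board-specific target.
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)

lib.export_library("tvm_model.so")   # the shared object consumed by cpp_inference/
```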
To compile the C++ runtime for the TVM-generated model, adapt the CMakeLists.txt in cpp_inference/ to point to your extracted model folder (e.g., model-micro), then create a build/ directory and run the following:
cd cpp_inference
mkdir build
cd build
cmake ..
make

Depending on the output format from TVM (.so or C source), run the corresponding executable:
- If the TVM output is a shared object (.so):

./run_so

- If the TVM output is C source code (.c in a tar archive):

./run_c

Before running, make sure the compiled artifacts and the tvm_model.so (or the generated C source) are correctly placed in the build directory; the model files themselves should be in cpp_inference/.
Infineon Radar SDK: download the release containing ifxAvian 3.3.1, then from radar_sdk/radar_sdk/libs/linux_x64/ run:
python3 -m pip install ifxAvian-3.3.1-py3-none-any.whl