I am a machine learning engineer with over three years of experience in applied ML and software development.
I currently work at EPR Labs, where I develop software and data pipelines for training, evaluating, and deploying predictive and generative ML models in Python.
Previously, during a three-year paid research internship at the PRODIS project, I built and maintained the full machine learning and data processing stack. My work included developing the first phoneme-level GPT model for Polish, CI pipelines for survey processing, GUI transcription QA tools, a batch ASR wrapper, and a web interface for data collection.
Outside of work, I develop cross-platform C++ applications and maintain a Linux home server for self-hosting.
ML & Data
Testing & Deployment
Highlights include:
| Name | Stack | Type | Description |
|---|---|---|---|
| model | Python, PyTorch, NumPy, Pandas | CLI tool | Pipeline for training a phoneme-level GPT model to predict surprisal in Polish. Custom IPA tokenizer, parallelized steps for formant extraction, alignment, and stress annotation. |
| vroom | C++20, SFML3, ImGui | Game engine | 2D racing game with arcade drift physics, procedurally generated tracks, and waypoint AI. |
| asr | Python, Whisper, FFmpeg | CLI tool | Pipeline for batch automatic speech recognition with stereo-to-mono conversion. |
| header-warden | C++17 | CLI tool | Multithreaded static analysis tool that reports missing standard library headers in C++ code. |
| aegyo | C++20, SFML3 | Desktop app | GUI app for learning Korean Hangul. |
Full list: ryouze.net/projects
Projects without a link are part of the research project and remain private for now.