Machine Learning & Data Science researcher with a Ph.D. in Medical Imaging and years of R&D and consulting experience. I solve real-world problems by crafting state-of-the-art algorithms and turning them into robust, community-driven Python libraries. Passionate about open source, reproducible research, and scalable ML infrastructure.
- Create & maintain several open-source Python packages used by thousands of developers
- Contributed code, CI/CD pipelines, issue reports & reviews across the ML ecosystem
- Strong focus on testing, automation, and developer experience, from pre-commit hooks to GitHub Actions
- Achieved top-tier rankings across notebooks, competitions, and datasets, applying practical ML skills to real-world challenge problems and sharing my findings
- Built and led a team to deliver a scalable video-analysis platform from prototype to production
- Director of Open Source at Lightning AI: led the OSS team for 3+ years, driving feature roadmaps, release cycles, and cross-team coordination across PyTorch Lightning, TorchMetrics, and the broader Lightning ecosystem. Mentored contributors, scaled community engagement, and ensured quality across 10+ active repositories
- LinkedIn Learning certified in Leadership Foundations, Leadership: Practical Skills, and Leading Your Team Through Change
- Ph.D. in Medical Imaging, Czech Technical University in Prague
- 15+ journal articles & 20+ conference papers (ISBI, ICIP, ACCV, MICCAI workshops)
- Reviewer for IEEE TMI, TCIA and major international conferences
- Co-organized the ANHIR challenge on histological image registration
Long-term open-source contributor and maintainer. My work spans ML frameworks, developer tooling, and computer vision, always aiming to make research more reproducible and engineering more enjoyable.
Active projects I maintain:
- supervision
  The go-to Python toolkit for plugging any detection or segmentation model into real-world CV pipelines. Unlike framework-specific tools, it works with YOLO, Transformers, or any custom model out of the box, providing a unified API for tracking, filtering, annotating, and chaining operations that would otherwise require glue code.
- RF-DETR
  A new take on real-time object detection that brings transformer accuracy to YOLO-level speeds. Stands out by matching or beating state-of-the-art on COCO while being straightforward to fine-tune on custom datasets, with no complex anchor tuning or NMS hacks needed.
- pyDeprecate
  Born from the pain of managing API changes in large libraries like PyTorch Lightning. A zero-dependency tool that lets library authors deprecate, rename, and redirect functions or classes with automatic call forwarding, so users get clear migration warnings instead of silent breakage.
- cachier
  Unlike `functools.lru_cache`, cachier persists results across sessions and even across machines. Ideal for caching expensive computations like API calls or data processing: it supports MongoDB and file-based backends with built-in staleness handling, so cached results stay fresh without manual invalidation.
- pyRepoStats
  Fills the gap between `git log` and full analytics platforms by generating quick contribution stats that include issues and PR activity. Built for maintainers who want a lightweight health check on their projects without setting up dashboards.
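
To make the "automatic call forwarding" idea behind pyDeprecate concrete, here is a minimal standard-library sketch. It is not pyDeprecate's actual API: the decorator `deprecated_alias`, its parameters, and the functions `scale` and `multiply` are hypothetical names chosen only for illustration.

```python
import functools
import warnings


def deprecated_alias(target, since, removed_in):
    """Hypothetical decorator: forward calls of a deprecated function to `target`."""

    def decorator(old_func):
        @functools.wraps(old_func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"`{old_func.__name__}` is deprecated since v{since} and will be "
                f"removed in v{removed_in}; use `{target.__name__}` instead.",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not at this wrapper
            )
            # The old body is never executed; arguments are forwarded unchanged.
            return target(*args, **kwargs)

        return wrapper

    return decorator


def scale(values, factor=2.0):
    """New, supported function."""
    return [v * factor for v in values]


@deprecated_alias(target=scale, since="0.3", removed_in="0.5")
def multiply(values, factor=2.0):
    """Old name kept only for backward compatibility."""


if __name__ == "__main__":
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = multiply([1, 2, 3], factor=3.0)
    print(result)
    print(caught[0].message)
```

The old function keeps its name and signature, so existing user code continues to work while each call emits a migration hint.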
Emeritus maintainer β projects I co-created and still partially supervise:
- PyTorch Lightning
  The most widely adopted framework for scaling PyTorch, used by thousands of teams from academic labs to Fortune 500 companies. Eliminates training loop boilerplate and lets the same code run on a laptop GPU or a 10,000-GPU cluster without changes, bridging the gap between research prototypes and production systems.
- TorchMetrics
  The standard metrics library for the PyTorch ecosystem, solving the surprisingly hard problem of computing correct metrics in distributed training. Ships 100+ metrics for classification, regression, NLP, and retrieval, all with automatic accumulation and device synchronization that just works across multi-GPU setups.
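
Why accumulation matters for metric correctness can be shown with a small pure-Python sketch. It mirrors the update/compute pattern that TorchMetrics exposes, but the `RunningAccuracy` class and its `merge` method are hypothetical stand-ins written for this example, with `merge` playing the role of cross-device synchronization.

```python
from dataclasses import dataclass


@dataclass
class RunningAccuracy:
    """Hypothetical accumulator: stores counts, not per-batch averages."""

    correct: int = 0
    total: int = 0

    def update(self, preds, targets):
        if len(preds) != len(targets):
            raise ValueError("preds and targets must have the same length")
        self.correct += sum(int(p == t) for p, t in zip(preds, targets))
        self.total += len(targets)

    def merge(self, other):
        # Stand-in for synchronization: sum the raw counts from another worker.
        self.correct += other.correct
        self.total += other.total

    def compute(self):
        if self.total == 0:
            raise ValueError("compute() called before any update()")
        return self.correct / self.total


def mean_of_batch_accuracies(batches):
    """Naive approach: average the per-batch accuracies (biased for uneven batches)."""
    scores = []
    for preds, targets in batches:
        scores.append(sum(int(p == t) for p, t in zip(preds, targets)) / len(targets))
    return sum(scores) / len(scores)


# Two batches of different size: 1/1 correct and 1/3 correct.
batches = [([1], [1]), ([0, 0, 0], [1, 1, 0])]

if __name__ == "__main__":
    metric = RunningAccuracy()
    for preds, targets in batches:
        metric.update(preds, targets)
    print("accumulated:", metric.compute())
    print("naive mean:", mean_of_batch_accuracies(batches))
```

With uneven batch sizes the naive mean of per-batch scores differs from the accuracy over all samples, which is the kind of error that state accumulation avoids.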
Past core maintainer projects:
- Lightning Utilities
  The shared foundation that keeps all Lightning projects consistent and maintainable. Extracts common patterns (packaging helpers, testing utilities, CLI tooling, and CI/CD workflows) into one place so that fixes and improvements propagate across the entire ecosystem automatically.
- Lightning Bolts
  A community-driven collection of reference implementations (VAEs, GANs, SimCLR, and more) built on PyTorch Lightning. Designed to give researchers battle-tested baselines they can reproduce in one command and extend for their own experiments.
- Lightning Flash
  Made transfer learning as simple as a few lines of code across 15+ tasks: image classification, object detection, text classification, tabular data, and more. Built on PyTorch Lightning, it let practitioners go from idea to baseline in minutes instead of hours.
- Lightning Thunder
  A source-to-source compiler for PyTorch that delivers up to 40% faster training and inference through kernel fusion, operator optimization, and GPU memory management. Unlike opaque compilers, Thunder provides a transparent, Pythonic IR that developers can inspect and customize, with composable plugins for distributed training, quantization, and CUDA Graphs.
- Lightning Tutorials
  The official tutorial collection powering the PyTorch Lightning documentation. Uses a script-based format instead of heavy notebooks: scripts are automatically converted to executable notebooks with full reproducibility tracking and CI-tested across CPU, GPU, and TPU to ensure every example actually runs.
- Ecosystem CI
  The safety net for the entire Lightning ecosystem: it automatically runs downstream test suites against every nightly build and release candidate. Catches breaking changes before they ship, ensuring that hundreds of dependent projects don't break on upgrade day.
- LitGPT
  An opinionated, hackable codebase for working with 20+ LLMs, including GPT, Llama, Mistral, and more. Unlike heavyweight frameworks, LitGPT uses plain PyTorch with no abstraction layers, making it easy to modify any part of the training pipeline while still getting optimized performance out of the box.
Past research projects:
- pyImSegm
  A complete image segmentation pipeline developed during Ph.D. research, combining superpixels, graph cuts, and region growing for medical imaging. Used in multiple published studies on histological tissue analysis and designed to be reproducible from raw data to final results.
- BIRL
  The benchmarking engine behind the ANHIR grand challenge at ISBI, which brought together teams worldwide to compare image registration methods on histological data. Automates the full pipeline from running registration to evaluating alignment accuracy using expert-annotated landmarks.
Notable contributions to other projects: ultralytics/YOLOv5, DIPY and more...
If you find my open-source work useful, consider sponsoring me. I'm also available for consulting & contract work in ML, MLOps, and Python engineering; see SUPPORT.md for details.