I'm a student at the Faculty of Mathematics and Computer Science, Warsaw University of Technology, working toward a B.Sc. in Data Science (Inżynieria i Analiza Danych).
My work sits at the intersection of machine learning, data engineering, and software development — I'm equally comfortable building an ELT pipeline on Azure, training a neural network from scratch, or scraping and analysing a messy real-world dataset.
Mathematically, I'm drawn to probability theory, statistics, and optimisation: I find them endlessly useful for understanding what data is actually trying to say. On the applied side, I'm currently most excited about deep reinforcement learning, and in the near future I plan to dive deeper into explainable AI (XAI).
Outside of code I spend time in the mountains — climbing, cycling, and kayaking.
Languages
Data Science & ML
Data Engineering & Cloud
Dev Tools
Deep RL · MCTS · PyTorch · Python
Designing a hybrid AI agent for the Catan board game (1v1 variant) by combining Monte Carlo Tree Search with deep reinforcement learning. The project explores policy and value network architectures capable of reasoning about long-horizon, multi-resource strategies under partial information.
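To make the MCTS-plus-network idea concrete, here is a minimal sketch of a PUCT-style selection step of the kind popularised by AlphaZero. It is an illustration only, not the project's code: the `Node` class, the `c_puct` constant, and the action names are hypothetical, and the sketch assumes each child's value is stored from the point of view of the player choosing at the parent.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One search-tree node; `prior` would come from a policy network."""
    prior: float
    visit_count: int = 0
    value_sum: float = 0.0
    children: dict = field(default_factory=dict)  # action -> Node

    def mean_value(self) -> float:
        # Average of backed-up values; 0.0 for nodes never visited.
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def select_child(node: Node, c_puct: float = 1.5):
    """Return the (action, child) pair with the highest PUCT score."""
    sqrt_total = math.sqrt(max(1, node.visit_count))

    def puct(child: Node) -> float:
        # Exploration bonus: large for high-prior, rarely visited children.
        exploration = c_puct * child.prior * sqrt_total / (1 + child.visit_count)
        return child.mean_value() + exploration

    return max(node.children.items(), key=lambda item: puct(item[1]))


# Hypothetical usage: one well-explored action and one unexplored action.
root = Node(prior=1.0, visit_count=10)
root.children = {
    "build_road": Node(prior=0.2, visit_count=8, value_sum=4.0),
    "trade": Node(prior=0.8, visit_count=0),
}
action, _ = select_child(root)
```

With these numbers the unexplored, high-prior action wins the selection, which is the behaviour that lets the policy network steer the search toward promising moves.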
Allegro Product Analysis (June 2026)
Web scraping · Computer Vision · NLP · Django · SQLite
A full data-science pipeline centred on road-bike listings from Allegro.pl. I scraped thousands of offers into a local SQLite database, then extracted multi-modal features from tabular data, text descriptions (NLP), and product images (CV) to enrich each record. Clustering and anomaly detection surface underpriced listings; everything is exposed through a Django dashboard with filtering and price-trend charts.
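As a rough illustration of how underpriced listings might be surfaced, the sketch below flags prices that sit far below the median of their group using a robust z-score (median and MAD). This is a simplified stand-in, not the clustering and anomaly-detection method used in the project; the function name, the `(listing_id, group, price)` tuple layout, and the 3.5 threshold are all assumptions.

```python
from statistics import median


def flag_underpriced(listings, threshold=3.5):
    """Return ids of listings priced anomalously low within their group.

    `listings` is an iterable of (listing_id, group, price) tuples, where
    `group` stands in for a cluster label (hypothetical layout).
    """
    by_group = {}
    for listing_id, group, price in listings:
        by_group.setdefault(group, []).append((listing_id, price))

    flagged = []
    for items in by_group.values():
        prices = [price for _, price in items]
        med = median(prices)
        mad = median(abs(price - med) for price in prices)
        if mad == 0:
            continue  # no spread in this group, nothing to compare against
        for listing_id, price in items:
            # 0.6745 rescales the MAD so the score is comparable to a z-score.
            robust_z = 0.6745 * (price - med) / mad
            if robust_z < -threshold:
                flagged.append(listing_id)
    return flagged


# Made-up example data: five similar prices and one suspiciously cheap offer.
sample = [
    ("a", "carbon", 5000), ("b", "carbon", 5200), ("c", "carbon", 4800),
    ("d", "carbon", 5100), ("e", "carbon", 4900), ("f", "carbon", 1500),
]
cheap = flag_underpriced(sample)
```

Using the median and MAD rather than the mean and standard deviation keeps a single extreme price from masking itself.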
Hierarchical Concepts in Images (March – June 2026)
XAI · ResNet18 · CLIP · SUN dataset · Python
Co-authored a study on disentangling hierarchical Concept Activation Vectors (CAVs) to improve interpretability in computer vision models. Extracted and compared empirical and semantic concept hierarchies from the SUN dataset using ResNet18 and CLIP backbones. Implemented hierarchy-aware CAV constructions that improved conditional AUC by ~1% over standard baselines.
End-to-End BI Pipeline (March – June 2026)
Azure Data Factory · Power BI · REST API · Star Schema · SCD2
Built a fully automated ELT pipeline using Azure Data Factory to ingest real-time job-market data from the Adzuna REST API. Modelled a Star Schema in Azure SQL Database with Slowly Changing Dimensions (SCD Type 2) to track historical role and salary trends. Delivered an interactive Power BI dashboard for slice-and-dice analysis of data-science job demand vs. cost-of-living indicators.
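The SCD Type 2 logic can be sketched in a few lines of plain Python. In the project this lives in Azure SQL Database, so the snippet below only illustrates the idea: the `RoleVersion` fields, the `salary_band` attribute, and the `apply_scd2` helper are hypothetical names, not the actual schema.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RoleVersion:
    """One row of a hypothetical SCD Type 2 dimension table."""
    role_key: str                     # business key, e.g. a normalised job title
    salary_band: str                  # the tracked attribute
    valid_from: date
    valid_to: Optional[date] = None   # None marks the open-ended (current) row
    is_current: bool = True


def apply_scd2(rows, role_key, salary_band, as_of):
    """Return a new row list that reflects one incoming source record."""
    current = next(
        (r for r in rows if r.role_key == role_key and r.is_current), None
    )
    if current is not None and current.salary_band == salary_band:
        return list(rows)  # attribute unchanged, history stays as it is

    updated = []
    for row in rows:
        if row is current:
            # Close the old version instead of overwriting it.
            updated.append(replace(row, valid_to=as_of, is_current=False))
        else:
            updated.append(row)
    # Open a new current version starting at the change date.
    updated.append(RoleVersion(role_key, salary_band, valid_from=as_of))
    return updated


history = apply_scd2([], "data scientist", "A", date(2026, 3, 1))
history = apply_scd2(history, "data scientist", "B", date(2026, 4, 1))
```

After the second call the table holds two rows for the same role: a closed version with band "A" and a current version with band "B", which is what makes historical trend queries possible.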
NumPy · MLP · Kohonen Network · Evolutionary Algorithms
Three linked implementations, all using only NumPy:
- MLP — fully-connected network with SGD/Adam/RMSProp, early stopping, and L1/L2 regularisation
- Kohonen Network — Self-Organising Map for unsupervised clustering
- Evolutionary Algorithms — genetic search as an alternative to backpropagation for network training
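To give a flavour of the NumPy-only style, here is a minimal sketch of a single Kohonen (SOM) update step with a Gaussian neighbourhood. It is an illustrative reconstruction rather than the code from the project; the function name `som_step`, the grid shape, and the `learning_rate` and `sigma` defaults are assumptions.

```python
import numpy as np


def som_step(weights, x, learning_rate=0.5, sigma=1.0):
    """Apply one Kohonen update.

    weights: array of shape (rows, cols, dim), one prototype per grid cell.
    x:       input vector of shape (dim,).
    Returns the updated weights and the grid index of the best-matching unit.
    """
    rows, cols, _ = weights.shape

    # Best-matching unit: the prototype closest to x in input space.
    distances = np.linalg.norm(weights - x, axis=2)
    bmu = np.unravel_index(np.argmin(distances), distances.shape)

    # Gaussian neighbourhood measured on the grid, centred on the BMU.
    grid_r, grid_c = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    grid_dist_sq = (grid_r - bmu[0]) ** 2 + (grid_c - bmu[1]) ** 2
    influence = np.exp(-grid_dist_sq / (2 * sigma ** 2))

    # Pull every prototype toward x, scaled by its neighbourhood influence.
    new_weights = weights + learning_rate * influence[..., None] * (x - weights)
    return new_weights, bmu


rng = np.random.default_rng(0)
weights = rng.random((4, 4, 3))
x = np.array([0.2, 0.4, 0.6])
new_weights, bmu = som_step(weights, x)
```

A full training loop would repeat this step over shuffled inputs while decaying `learning_rate` and `sigma`.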
AutoML System for Binary Classification (January 2026)
AutoML · Ensembles · Scikit-learn · Python
End-to-end AutoML class that takes any binary classification dataset from raw input to calibrated predictions in under 20 minutes: automated preprocessing → model screening across 8+ algorithms → stacked ensemble construction. Ranked 3rd out of 20 teams in a university competition evaluated across 3 benchmark datasets.
Hyperparameter Optimisation Study (December 2025)
Random Forest · LightGBM · SVM · Bayesian Search
Analysed the tunability of hyperparameters for Random Forest, LightGBM, and SVM across four datasets. Compared grid, random, and Bayesian sampling strategies; identified configurations that consistently outperformed library defaults and proposed new default hyperparameter sets with improved average performance, documented with ablation-style analysis.
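The difference between grid and random sampling can be shown on a toy problem. The sketch below is not taken from the study: `toy_score` is a made-up stand-in for a validation score, and the parameter names and ranges are assumptions chosen to resemble a LightGBM-style search space. Bayesian search is left out because it needs a surrogate model.

```python
import itertools
import random


def toy_score(learning_rate, num_leaves):
    """Hypothetical validation score peaked at (0.1, 31); not a real model."""
    return 1.0 - 10 * (learning_rate - 0.1) ** 2 - ((num_leaves - 31) / 100) ** 2


def grid_search(lr_values, leaf_values):
    """Evaluate every combination on a fixed grid and return the best one."""
    candidates = itertools.product(lr_values, leaf_values)
    return max(candidates, key=lambda c: toy_score(*c))


def random_search(n_trials, seed=0):
    """Sample configurations at random and return the best one."""
    rng = random.Random(seed)
    candidates = [
        # Log-uniform learning rate in [0.001, 1], uniform integer leaf count.
        (10 ** rng.uniform(-3, 0), rng.randint(8, 128))
        for _ in range(n_trials)
    ]
    return max(candidates, key=lambda c: toy_score(*c))


def gain_over_default(best, default):
    """Score improvement of a tuned configuration over a default one."""
    return toy_score(*best) - toy_score(*default)


best_grid = grid_search([0.01, 0.1, 1.0], [8, 31, 128])
best_random = random_search(n_trials=50)
gain = gain_over_default(best_grid, default=(1.0, 128))
```

The gain over a default configuration is one simple way to express how tunable a model is: a large gain means the defaults leave performance on the table.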
Science Club Project Manager (September 2025)
Django · AJAX · SQLite · Docker · RBAC
Full-stack web app for managing science-club projects with role-based access control across three user tiers (member, project leader, admin). Real-time UI updates via AJAX reduce page reloads and improve UX for project application and approval workflows. Fully containerised with Docker for reproducible deployment.
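The three-tier access check can be sketched framework-free as a decorator. This is an illustration under assumptions, not the app's Django code: it assumes the tiers are strictly ordered (member below project leader below admin), and the names `Role`, `require_role`, `PermissionDenied`, and `approve_application` are hypothetical.

```python
from enum import IntEnum
from functools import wraps


class Role(IntEnum):
    """User tiers, ordered so that a higher value implies more privileges."""
    MEMBER = 1
    PROJECT_LEADER = 2
    ADMIN = 3


class PermissionDenied(Exception):
    """Raised when the caller's role is below the required tier."""


def require_role(minimum):
    """Decorator that rejects calls from users below the `minimum` role."""
    def decorator(func):
        @wraps(func)
        def wrapper(user_role, *args, **kwargs):
            if user_role < minimum:
                raise PermissionDenied(
                    f"{func.__name__} requires {minimum.name}"
                )
            return func(user_role, *args, **kwargs)
        return wrapper
    return decorator


@require_role(Role.PROJECT_LEADER)
def approve_application(user_role, application_id):
    # Only project leaders and admins reach this point.
    return f"application {application_id} approved"


result = approve_application(Role.ADMIN, 7)
```

Calling `approve_application(Role.MEMBER, 7)` raises `PermissionDenied`, so the check sits in one place instead of being repeated in every view.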
R Shiny Dashboard (2025)
R · Shiny · ggplot2 · Data Visualisation
Interactive dashboard built in R Shiny to explore and visualise lifestyle data collected from a group of friends. Focused on making personal data explorable through clean, interactive charts.