Pablo1337PL/README.md

Hello, I'm Jan Taran! 👋

LinkedIn Email


About me

I'm a Data Science student at the Faculty of Mathematics and Computer Science, Warsaw University of Technology, working toward a B.Sc. in Data Science (Inżynieria i Analiza Danych).

My work sits at the intersection of machine learning, data engineering, and software development — I'm equally comfortable building an ELT pipeline on Azure, training a neural network from scratch, or scraping and analysing a messy real-world dataset.

Mathematically, I'm drawn to probability theory, statistics, and optimisation: I find them endlessly useful for understanding what data is actually trying to say. On the applied side, I'm currently most excited about deep reinforcement learning, and in the near future I plan to dive deeper into explainable AI (XAI).

Outside of code I spend time in the mountains — climbing, cycling, and kayaking.


Tools & Technologies

Languages

Python R SQL

Data Science & ML

NumPy Pandas scikit-learn PyTorch Matplotlib Seaborn Plotly R Shiny

Data Engineering & Cloud

Microsoft SQL Server Azure Power BI Docker

Dev Tools

Git Bash LaTeX Arch Linux


Some of my Projects

Bachelor's Thesis — Settlers of Catan Deep RL Agent (in progress, Feb 2027)

Deep RL MCTS PyTorch Python

Designing a hybrid AI agent for the Catan board game (1v1 variant) by combining Monte Carlo Tree Search with deep reinforcement learning. The project explores policy and value network architectures capable of reasoning about long-horizon, multi-resource strategies under partial information.
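The thesis code is still in progress, so the following is only a minimal sketch of one common way to combine MCTS with policy and value networks: AlphaZero-style PUCT selection. The `Node` class, the `c_puct` constant, and the action names are hypothetical, and the sign flip in `backup` assumes strictly alternating moves, which real Catan turns do not always follow.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One search-tree node; `prior` would come from a policy network."""
    prior: float
    visit_count: int = 0
    value_sum: float = 0.0
    children: dict = field(default_factory=dict)  # action -> Node

    def mean_value(self) -> float:
        # Average backed-up value; 0.0 for unvisited nodes.
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.5) -> float:
    """Exploitation term plus a prior-weighted exploration bonus."""
    exploration = c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    return child.mean_value() + exploration


def select_action(parent: Node, c_puct: float = 1.5):
    """Pick the child action with the highest PUCT score."""
    return max(parent.children, key=lambda a: puct_score(parent, parent.children[a], c_puct))


def backup(path: list, value: float) -> None:
    """Propagate a value estimate from the leaf back to the root.

    Assumption: players alternate every move, so the sign flips per level.
    """
    for node in reversed(path):
        node.visit_count += 1
        node.value_sum += value
        value = -value
```

Rarely visited children with a high prior get a large exploration bonus, which is how the policy network steers the search toward promising moves before the value estimates become reliable.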

Web scraping Computer Vision NLP Django SQLite

A full data-science pipeline centred on road-bike listings from Allegro.pl. I scraped thousands of offers into a local SQLite database, then extracted multi-modal features from tabular data, text descriptions (NLP), and product images (CV) to enrich each record. Clustering and anomaly detection surface underpriced listings; everything is exposed through a Django dashboard with filtering and price-trend charts.
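The anomaly-detection method used in the project is not described here, so this is a deliberately simple stand-in: a modified z-score based on the median and the median absolute deviation (MAD), applied to prices within one group of comparable listings. The function name, the threshold of 2.5, and the input shape (listing id mapped to price) are all assumptions.

```python
from statistics import median


def flag_underpriced(prices: dict, threshold: float = 2.5) -> list:
    """Return listing ids whose price is an unusually low outlier.

    Uses the modified z-score: 0.6745 * (price - median) / MAD.
    """
    values = list(prices.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # at least half the prices equal the median; the score is undefined
    return [
        listing_id
        for listing_id, price in prices.items()
        if 0.6745 * (price - med) / mad < -threshold
    ]
```

Median-based statistics are used instead of mean and standard deviation because a handful of extreme listings would otherwise distort the very baseline they are compared against.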

Hierarchical Concepts in Images (March – June 2026)

XAI ResNet18 CLIP SUN dataset Python

Co-authored a study on disentangling hierarchical Concept Activation Vectors (CAVs) to improve interpretability in computer vision models. Extracted and compared empirical and semantic concept hierarchies from the SUN dataset using ResNet18 and CLIP backbones. Implemented hierarchy-aware CAV constructions that improved conditional AUC by ~1% over standard baselines.
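For readers new to CAVs, here is a minimal NumPy sketch of the basic idea. It uses the simple mean-difference variant (the direction from random activations toward concept activations) rather than a trained linear classifier, and it is not the hierarchy-aware construction from the study. The function names and array shapes are assumptions.

```python
import numpy as np


def mean_difference_cav(concept_acts, random_acts):
    """Unit vector pointing from random activations toward concept activations.

    Both inputs have shape (n_examples, n_features), e.g. activations taken
    from one layer of a vision backbone.
    """
    direction = concept_acts.mean(axis=0) - random_acts.mean(axis=0)
    return direction / np.linalg.norm(direction)


def concept_score(activations, cav):
    """Projection of each activation vector onto the concept direction."""
    return activations @ cav
```

Activations of images that contain the concept should, on average, project further along the CAV than activations of unrelated images.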

End-to-End BI Pipeline (March – June 2026)

Azure Data Factory Power BI REST API Star Schema SCD2

Built a fully automated ELT pipeline using Azure Data Factory to ingest real-time job-market data from the Adzuna REST API. Modelled a Star Schema in Azure SQL Database with Slowly Changing Dimensions (SCD Type 2) to track historical role and salary trends. Delivered an interactive Power BI dashboard for slice-and-dice analysis of data-science job demand vs. cost-of-living indicators.
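The SCD Type 2 logic can be illustrated in plain Python. This is a sketch of the general pattern only: the real pipeline runs in Azure Data Factory and Azure SQL Database, and the column names (`role_id`, `title`, `salary_band`, `valid_from`, `valid_to`, `is_current`) are hypothetical.

```python
from datetime import date


def apply_scd2(dimension: list, incoming: dict, today: date) -> None:
    """Upsert one business-key record into a Type 2 dimension, in place.

    `dimension` is a list of row dicts; `incoming` carries the business key
    (`role_id`) and the tracked attributes.
    """
    tracked = ("title", "salary_band")
    current = next(
        (r for r in dimension if r["role_id"] == incoming["role_id"] and r["is_current"]),
        None,
    )
    if current is not None:
        if all(current[c] == incoming[c] for c in tracked):
            return  # nothing changed; keep the current version open
        # Close the old version instead of overwriting it.
        current["valid_to"] = today
        current["is_current"] = False
    # Open a new version that is valid from today onward.
    new_row = {k: incoming[k] for k in ("role_id",) + tracked}
    new_row.update({"valid_from": today, "valid_to": None, "is_current": True})
    dimension.append(new_row)
```

Because old versions are closed rather than overwritten, a query can reconstruct what a role looked like on any past date by filtering on `valid_from` and `valid_to`.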

Neural Networks from Scratch (April – June 2026)

NumPy MLP Kohonen Network Evolutionary Algorithms

Three linked implementations, all using only NumPy:

  • MLP — fully-connected network with SGD/Adam/RMSProp, early stopping, and L1/L2 regularisation
  • Kohonen Network — Self-Organising Map for unsupervised clustering
  • Evolutionary Algorithms — genetic search as an alternative to backpropagation for network training
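As an illustration of the NumPy-only style, here is a minimal sketch of a single Self-Organising Map update step. It is not taken from the project: the function name, the Gaussian neighbourhood, and the default `lr` and `sigma` values are assumptions.

```python
import numpy as np


def som_step(weights, x, lr=0.5, sigma=1.0):
    """One SOM update. `weights` has shape (rows, cols, dim); `x` has shape (dim,)."""
    rows, cols, _ = weights.shape
    # Best-matching unit (BMU): the grid cell whose weight vector is closest to x.
    dists = np.linalg.norm(weights - x, axis=2)
    bmu = np.unravel_index(np.argmin(dists), dists.shape)
    # Gaussian neighbourhood measured on the grid, centred at the BMU.
    grid_r, grid_c = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    grid_dist_sq = (grid_r - bmu[0]) ** 2 + (grid_c - bmu[1]) ** 2
    influence = np.exp(-grid_dist_sq / (2 * sigma ** 2))
    # Pull every unit toward x, scaled by its neighbourhood influence.
    updated = weights + lr * influence[..., None] * (x - weights)
    return updated, bmu
```

In a full training loop, `lr` and `sigma` would typically decay over time so that the map first organises globally and then fine-tunes locally.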

AutoML Ensembles Scikit-learn Python

End-to-end AutoML class that takes any binary classification dataset from raw input to calibrated predictions in under 20 minutes: automated preprocessing → model screening across 8+ algorithms → stacked ensemble construction. Ranked 3rd out of 20 teams in a university competition evaluated across 3 benchmark datasets.
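The stacking stage can be sketched with scikit-learn's `StackingClassifier`. This is not the competition code: the synthetic dataset, the two base models, and every parameter value are placeholders, and the preprocessing, model-screening, and calibration steps are left out.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for a binary classification dataset.
X, y = make_classification(n_samples=300, n_features=10, class_sep=2.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Base models produce out-of-fold predictions via 5-fold CV;
# the final estimator learns how to combine them.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("nb", GaussianNB()),
    ],
    final_estimator=LogisticRegression(),
    cv=5,
)
stack.fit(X_train, y_train)
probabilities = stack.predict_proba(X_test)  # shape: (n_test_samples, 2)
accuracy = stack.score(X_test, y_test)
```

Training the final estimator on out-of-fold predictions keeps it from simply trusting whichever base model overfits the training data most.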

Random Forest LightGBM SVM Bayesian Search

Analysed the tunability of hyperparameters for Random Forest, LightGBM, and SVM across four datasets, comparing grid, random, and Bayesian sampling strategies. Identified configurations that consistently outperformed library defaults and proposed new default hyperparameter sets with better average performance, supported by ablation-style analysis.
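One common way to quantify tunability is the gain of the best configuration over the default on each dataset; a candidate new default is then the configuration with the best mean score across datasets. The sketch below assumes that definition and a hypothetical data layout (dataset name mapped to a dict of configuration name and score); the project itself may define both differently.

```python
def tunability(scores: dict, default_config: str) -> float:
    """Gain of the best configuration over the default on one dataset."""
    return max(scores.values()) - scores[default_config]


def best_shared_default(per_dataset: dict) -> str:
    """Configuration with the highest mean score across all datasets.

    Assumes every dataset was evaluated on the same set of configurations.
    """
    configs = next(iter(per_dataset.values())).keys()
    return max(
        configs,
        key=lambda c: sum(s[c] for s in per_dataset.values()) / len(per_dataset),
    )
```

A tunability near zero means the library default is already close to the best configuration found, so extra search budget is better spent on a different algorithm.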

Django AJAX SQLite Docker RBAC

Full-stack web app for managing science-club projects with role-based access control across three user tiers (member, project leader, admin). Real-time UI updates via AJAX reduce page reloads and improve UX for project application and approval workflows. Fully containerised with Docker for reproducible deployment.
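The three-tier access model can be sketched framework-free. This is not the app's Django code: it assumes the tiers are strictly hierarchical (an admin can do everything a project leader can), and the `Role` enum, the `require_role` decorator, and `approve_application` are hypothetical names.

```python
from enum import IntEnum
from functools import wraps


class Role(IntEnum):
    MEMBER = 1
    PROJECT_LEADER = 2
    ADMIN = 3


def require_role(minimum: Role):
    """Decorator: allow the call only if the user's role is at least `minimum`."""
    def decorator(func):
        @wraps(func)
        def wrapper(user_role: Role, *args, **kwargs):
            if user_role < minimum:
                raise PermissionError(f"{minimum.name} role required")
            return func(user_role, *args, **kwargs)
        return wrapper
    return decorator


@require_role(Role.PROJECT_LEADER)
def approve_application(user_role: Role, application_id: int) -> str:
    # Reached only by project leaders and admins.
    return f"application {application_id} approved"
```

Keeping the check in a decorator puts the permission rule next to the action it guards, so a new workflow cannot be added without an explicit decision about who may run it.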

R Shiny ggplot2 Data Visualisation

Interactive dashboard built in R Shiny to explore and visualise lifestyle data collected from a group of friends. Focused on making personal data explorable through clean, interactive charts.

Contact

Pinned

  1. TWD-Project-2 (R)

    Project about us, tracking daily activities

  2. Web-App-in-Django (Python)

    Early-pass assignment for the Web Programming course.

  3. Wysocki-Piotr/Hyperparameter-optimization (Jupyter Notebook)

  4. Wysocki-Piotr/AutoML-system-binary-classification (Python)

  5. PrzetwarzanieObrazow (Python)