RNMP_homework1 Public
This project simulates message production and consumption using Kafka, with real-time data transformations via Flink, all running within a Docker environment. Requires: Docker, Git, and Python.
Python Updated Dec 23, 2024
RNMP_homework2 Public
A recommendation system project that uses Spark MLlib's ALS model, trained and evaluated on the MovieLens dataset. Includes Dockerized setup, hyperparameter tuning, and evaluation metrics (RMSE,…
Python Updated Dec 23, 2024
RNMP_homework3 Public
A Spark Streaming and Kafka-based project for processing health data in real time. Includes a machine learning pipeline for predictions, Dockerized infrastructure, and scripts for data ingestion, m…
Python Updated Dec 21, 2024
music_analytics Public
This project processes real-time music event data using Kafka, Apache Spark on Google Cloud Dataproc, and stores the transformed data in BigQuery for analytics, all orchestrated by Airflow and mana…
Jupyter Notebook Updated Dec 10, 2024
This project implements models for Protein-Protein Interaction (PPI) prediction, focusing on graph-based methods such as the state-of-the-art GNN model called GIPA.
crypto_stats Public
A project that provides a cloud-native solution for ingesting, transforming, and visualizing cryptocurrency data, utilizing modern tools and workflows for scalability and automation.
Python Updated Dec 7, 2024