Kubernetes-native platform to run massively parallel data/streaming jobs
Updated Dec 17, 2025 - Rust
A Java Stochastic Dynamic Programming Library
YouTube & URL summarizer using Whisper (audio→text), yt-dlp, LangChain, and Groq Llama models. Supports full Map-Reduce summarization for long videos and webpages.
Long-content summarizer using LangChain and Groq Llama models. Supports YouTube transcription, website scraping, text chunking, and Map-Reduce LLM summarization.
A LangChain-powered summarizer demonstrating advanced LLM summarization techniques (Stuff, Map-Reduce, Refine) using Groq’s high-speed Llama models for long-form speech and PDF processing.
Course repository for 95-885 Data Science & Big Data, Fall 2025. Contains Python implementations covering multiple classes in Data Science, Big Data, Machine Learning, and related topics. Includes notebooks, code, and practice exercises across probability, optimization, algorithms, and applied computing.
This project implements a distributed MapReduce system using Python and Flask REST APIs. It enables multiple client nodes to connect to a central server and collaboratively process large text files using the MapReduce paradigm.
Data analysis of Airbnb listings in Sydney & Montreal using MongoDB Aggregation Framework and Python. Insights on pricing, trends, and reviewer behaviour.
Iterable, Java 8-style Streams for Python
A consensus-based RAG pipeline in Python that isolates documents and applies semantic agreement across extracted evidence to identify and drop poisoned documents.
Implementations of MST calculation in a distributed setting, assuming super-linear memory.
Web Application Message Async Server and WAMP/MQTT bridge
Parallelized Base functions
Fast, efficient, and scalable distributed map/reduce system with DAG execution, in memory or on disk, written in pure Go; runs standalone or distributed.
Efficient transducers for Julia
slice, map, iter.Seq, iter.Seq2 APIs for transforming, filtering, reducing, summing and other iteration-based tasks.
Lab work for the CSE4252 Distributed Database Management System Lab course.
Search Assistant Built Using LangGraph
Miscellaneous practice exercises for Python
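Several of the repositories listed above apply the same map-reduce pattern: split the input into chunks, map each chunk to a partial result, then reduce the partials into one answer. The following is a minimal single-process sketch of that pattern as a word count, not code from any listed project. The names `chunk_text`, `map_chunk`, `reduce_counts`, and `word_count`, and the default chunk size, are hypothetical and chosen only for illustration.

```python
from collections import Counter
from functools import reduce


def chunk_text(text, chunk_size):
    """Split text into lists of at most chunk_size words (hypothetical helper)."""
    words = text.split()
    return [words[i:i + chunk_size] for i in range(0, len(words), chunk_size)]


def map_chunk(words):
    """Map step: count the lowercased words in a single chunk."""
    return Counter(word.lower() for word in words)


def reduce_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


def word_count(text, chunk_size=4):
    """Run map over every chunk, then fold the partial counts together."""
    partials = map(map_chunk, chunk_text(text, chunk_size))
    return reduce(reduce_counts, partials, Counter())


if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog the end"
    print(word_count(sample).most_common(1))  # [('the', 3)]
```

Because the reduce step is associative, the map calls could in principle be handed to separate workers (processes or machines) and merged in any grouping; that is the property the distributed systems in this list rely on.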