Bachelor of Science, Computer Science & Engineering · May 2022 – June 2026
Ahsanullah University of Science and Technology, Dhaka, Bangladesh
RAG · Vector Embeddings · Semantic Search · Compression · Multi-Agent Systems · LLM
Embedding-Driven Wind Forecasting with Semantic Tokenization, Phase Prediction, and Human-in-the-Loop Review
Ananto Nayan Bala, et al.
A wind forecasting system built on NOAA/METAR data that combines LSTM-based speed prediction with compressed semantic representations of wind states. Wind conditions are tokenized into discrete phases, and a GRU model learns to predict upcoming phases from those tokens. A Human-in-the-Loop interface lets domain experts review, correct, and confirm forecasts, so high-stakes outputs are checked by a person. The system also retrieves historically similar wind states and supports live data feeds.
Conference Paper · Under review at ACM RecSys 2026
Multi-Agent Routing as Set-Valued Prediction: A WildChat Benchmark and Cost-Aware Evaluation
Ananto Nayan Bala, et al.
Frames the choice of which AI agents should receive a query as a set-valued prediction task rather than single-choice classification. A benchmark is built from WildChat conversations, and five routing strategies (KNN, linear multilabel, dependency-aware, encoder-based, and zero-shot LLM) are compared on accuracy, utility, latency, and reproducibility. A cost-aware Weighted Agent Routing (WAR) layer is proposed to balance performance against compute cost.
Survey · Under review at EMNLP 2026
From Retrieval to Reasoning: Retrieval-Augmented Generation Architectures, Strategies, and Deployment Realities
Partha Sarker, Ananto Nayan Bala, et al.
A survey of 40 RAG systems organized not by benchmark rankings but by the design problems each generation of work set out to solve. Six evolutionary groups are identified — covering foundational retrieval architectures, context-window optimization, self-correcting pipelines, graph-based multi-hop reasoning, agentic and domain-specific variants, and efficiency-focused designs. The paper traces a causal thread through the field: what broke in earlier approaches, what insight fixed it, and what gap that fix left open.
Adversarial Forecasting with LSTM vs GAN-LSTM · GitHub
Standard LSTMs tend to produce over-smoothed long-horizon forecasts. This work adds a lightweight discriminator that scores how realistic each prediction looks compared to actual sequences, pushing the LSTM toward outputs that better preserve the structural patterns in the data.
Customer Segmentation using PySpark · GitHub
Applies unsupervised clustering to the Online Retail dataset at scale using PySpark. K-Means and a Gaussian Mixture Model (GMM) are run and compared to surface distinct customer groups, separating high-value repeat buyers from occasional ones.
Diabetes Prediction — Decision Tree vs KNN · GitHub
Side-by-side comparison of a Decision Tree and a K-Nearest Neighbours classifier on a diabetes dataset, evaluating where each model's decision boundaries hold up and where they break down.
Phishing Website Detection · GitHub
A WEKA-based ML pipeline that takes raw URLs and decides whether they are phishing attempts or legitimate sites. Beyond classification, the pipeline uncovers hidden clusters in URL structure and generates human-readable rules that explain which patterns signal risk.
| Period | Award |
|---|---|
| Ongoing | Competitive Programming & Problem Solving Excellence (LeetCode) · ~1000 problems solved · Top 8% Contest Rating · 11 badges, including the 500-day code submission badge · Expert-level data structures and algorithms (DSA) |
| Fall 24 / Spring 24 / Spring 23 / Fall 22 | Scholarship for outstanding academic performance · Tuition waiver for demonstrated academic excellence |
| Category | Tools |
|---|---|
| Programming | Python 3 (Anaconda), C, C#, Dart, C++, Java, PHP, JavaScript, MATLAB |
| Machine Learning & Deep Learning Libraries | PyTorch, TensorFlow/Keras, scikit-learn, NumPy, pandas |
| Data Processing | MapReduce, PySpark |
| LLM & RAG Libraries | LangChain, LangGraph, LlamaIndex, Claude Agent SDK, OpenAI |
| Embeddings | Sentence-transformers, Chroma vector database, FAISS |
| LLM Agent Implementation & Automation | Programming agent context, skills, memory, scope, and commands |
| Agent Integration & API Frameworks | Flask, FastAPI, Uvicorn, Pydantic |
| Tools & Platforms | Git, GPT Codex — Coding Agent, Jupyter Notebook, Docker, VS Code |