nayanananto/README.md


EDUCATION

Bachelor of Science, Computer Science & Engineering  ·  May 2022 – June 2026
Ahsanullah University of Science and Technology, Dhaka, Bangladesh


RESEARCH INTEREST

RAG   Vector Embeddings   Semantic Search   Compression   Multi-Agent Systems   LLM


THESIS

Embedding-Driven Wind Forecasting with Semantic Tokenization, Phase Prediction, and Human-in-the-Loop Review
Ananto Nayan Bala, et al.

A wind forecasting system built on NOAA/METAR data that combines LSTM-based speed prediction with compressed semantic representations of wind states. Wind conditions are tokenized into discrete phases, and a GRU model learns to predict upcoming regimes from those tokens. A Human-in-the-Loop interface lets domain experts review, correct, and confirm forecasts before high-stakes outputs are acted on. The system also retrieves historically similar wind states and supports live data feeds.
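
To illustrate the tokenization idea described above, here is a minimal sketch of mapping wind speed readings to discrete phase tokens and compressing consecutive repeats. The thresholds, label names, and the functions `tokenize_wind` and `compress_phases` are hypothetical; the thesis's actual binning scheme is not stated here.

```python
from bisect import bisect_right
from itertools import groupby

# Hypothetical speed thresholds in knots; the real phase boundaries are an assumption.
THRESHOLDS = [4.0, 11.0, 22.0]
LABELS = ["calm", "light", "moderate", "strong"]


def tokenize_wind(speeds_kt):
    """Map each wind speed reading to a discrete phase label."""
    return [LABELS[bisect_right(THRESHOLDS, s)] for s in speeds_kt]


def compress_phases(tokens):
    """Run-length encode consecutive identical tokens into (label, length) pairs."""
    return [(label, len(list(run))) for label, run in groupby(tokens)]


readings = [2.0, 3.5, 6.0, 12.5, 13.0, 25.0]
tokens = tokenize_wind(readings)
phases = compress_phases(tokens)
print(tokens)
print(phases)
```

A sequence model such as a GRU could then be trained on the token stream rather than on raw speeds, which is one plausible reading of "predicting upcoming regimes from those tokens".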


PUBLICATIONS UNDER REVIEW

Conference Paper  ·  Under review at ACM RecSys 2026
Multi-Agent Routing as Set-Valued Prediction: A WildChat Benchmark and Cost-Aware Evaluation
Ananto Nayan Bala, et al.

Frames agent routing (deciding which AI agents should receive a query) as a set-valued prediction task rather than single-choice classification. A benchmark is built from WildChat conversations, and five routing strategies (KNN, linear multilabel, dependency-aware, encoder-based, and zero-shot LLM) are compared on accuracy, utility, latency, and reproducibility. A cost-aware Weighted Agent Routing (WAR) layer is proposed to balance performance against compute cost.
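
As a rough illustration of set-valued, cost-aware routing, the sketch below selects every agent whose cost-adjusted score clears a threshold. This is not the paper's WAR layer: the scoring rule, the `route` function, and the agent names, scores, and costs are all assumptions made for the example.

```python
def route(scores, costs, cost_weight=0.5, threshold=0.2):
    """Return the set of agents whose cost-adjusted score exceeds the threshold.

    scores and costs are dicts keyed by agent name (hypothetical values).
    """
    def adjusted(agent):
        return scores[agent] - cost_weight * costs[agent]

    selected = {agent for agent in scores if adjusted(agent) > threshold}
    if not selected:
        # Fall back to the single best agent so every query is routed somewhere.
        selected = {max(scores, key=adjusted)}
    return selected


scores = {"search": 0.9, "code": 0.4, "math": 0.1}
costs = {"search": 0.2, "code": 0.6, "math": 0.1}

cost_aware = route(scores, costs, cost_weight=0.5)
cost_blind = route(scores, costs, cost_weight=0.0)
print(cost_aware, cost_blind)
```

With the cost weight set to zero the expensive `code` agent is included; with cost taken into account it is dropped, which is the kind of trade-off a cost-aware layer is meant to expose.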


Survey  ·  Under review at EMNLP 2026
From Retrieval to Reasoning: Retrieval-Augmented Generation Architectures, Strategies, and Deployment Realities
Partha Sarker, Ananto Nayan Bala, et al.

A survey of 40 RAG systems organized not by benchmark rankings but by the design problems each generation of work set out to solve. Six evolutionary groups are identified — covering foundational retrieval architectures, context-window optimization, self-correcting pipelines, graph-based multi-hop reasoning, agentic and domain-specific variants, and efficiency-focused designs. The paper traces a causal thread through the field: what broke in earlier approaches, what insight fixed it, and what gap that fix left open.


PROJECTS

Adversarial Forecasting with LSTM vs GAN-LSTM  ·  GitHub

Standard LSTMs tend to produce over-smoothed long-horizon forecasts. This work adds a lightweight discriminator that scores how realistic each prediction looks compared to actual sequences, pushing the LSTM toward outputs that better preserve the structural patterns in the data.
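
One common way to combine a forecasting loss with a discriminator signal is to add a weighted adversarial term to the mean squared error. The sketch below shows that arithmetic in plain Python; the non-saturating GAN term, the `adv_weight` parameter, and the function names are assumptions, since the project's actual loss is not described here.

```python
import math


def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def generator_loss(pred, target, d_score, adv_weight=0.1):
    """MSE plus a weighted adversarial penalty.

    d_score is the discriminator's probability (0 < d_score <= 1) that the
    predicted sequence is real; the penalty shrinks as d_score approaches 1.
    """
    adversarial = -math.log(d_score)
    return mse(pred, target) + adv_weight * adversarial


pred, target = [1.0, 2.0], [1.0, 4.0]
fooled = generator_loss(pred, target, d_score=1.0)
caught = generator_loss(pred, target, d_score=0.5)
print(fooled, caught)
```

The same prediction is penalized more when the discriminator finds it unrealistic, which is the pressure that can push a forecaster away from over-smoothed outputs.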

Customer Segmentation using PySpark  ·  GitHub

Applies unsupervised clustering to the Online Retail dataset at scale using PySpark. K-Means and Gaussian Mixture Model (GMM) clusterings are compared to surface distinct customer groups, separating high-value repeat buyers from occasional, low-frequency ones.

Diabetes Prediction — Decision Tree vs KNN  ·  GitHub

Side-by-side comparison of a Decision Tree and a K-Nearest Neighbours classifier on a diabetes dataset, evaluating where each model's decision boundaries hold up and where they break down.

Phishing Website Detection  ·  GitHub

A WEKA-based ML pipeline that takes raw URLs and decides whether they are phishing attempts or legitimate sites. Beyond classification, the pipeline uncovers hidden clusters in URL structure and generates human-readable rules that explain which patterns signal risk.


AWARDS & ACHIEVEMENTS

| Period | Award |
| --- | --- |
| Ongoing | Competitive Programming & Problem Solving Excellence (LeetCode) · ~1000 problems solved · Top 8% contest rating · 11 badges, including the 500-day code submission badge · Expert-level DSA |
| Fall 24 / Spring 24 / Spring 23 / Fall 22 | Scholarship for outstanding academic performance · Tuition waiver for demonstrated academic excellence |

IMPLEMENTATION SKILLS

| Category | Tools |
| --- | --- |
| Programming | Python 3 (Anaconda), C, C#, Dart, C++, Java, PHP, JavaScript, MATLAB |
| Deep Learning Libraries | PyTorch, NumPy, Scikit-learn, Pandas, TensorFlow-Keras |
| Data Processing | Map-reduce computing, PySpark |
| LLM, RAG Libraries | LangChain, LangGraph, LlamaIndex, Claude Agent SDK, OpenAI |
| Embeddings | Sentence-transformers, Chroma vector database, FAISS |
| LLM Agents Implementation & Automation | Programming agent context, skills, memory, scope, command |
| Agents Integration API Frameworks | Python-Flask, FastAPI, Uvicorn, Pydantic |
| Tools & Platforms | Git, GPT Codex (coding agent), Jupyter Notebook, Docker, VS Code |

Pinned

  1. leetcode-solutions · Public · Python 1
  2. Adversarial-Forecasting-with-LSTM-vs-GAN-LSTM · Public · Jupyter Notebook
  3. Customer-Segmentation-using-PySpark · Public · Jupyter Notebook
  4. Diabetes-Prediction-Decision-Tree-vs-KNN · Public · Jupyter Notebook
  5. Graph-Analytics-on-Employee-Collaboration-Network · Public · Jupyter Notebook
  6. java-weka-phishing-detection · Public · Java