AI & agents engineer. I build agent systems and LLM dev-tools, and ship products end-to-end.
Engineering at Rimo (Tokyo AI startup) · GSoC '25 @ Scala Center · LFX @ Open Mainframe Project.
Open-source by default.
TypeScript · Go · Scala · Python · LLMs · agents · backends
Open to interesting problems in AI / agents / dev-tools — say hi.
Final-year CS @ IIIT Jabalpur. I spend most of my time on AI agents and the tooling around them — orchestration loops, evals, Claude Code skills, and turning prompts into real products. I ship the whole thing: backend, frontend, and the deploy. Across GSoC and LFX I've also shipped production full-stack features into established open-source projects (Scala, mainframe tooling).
Google Summer of Code 2025 · Scala Center — workflows4s
Built a production full-stack Web UI for tracking & debugging workflows: a Scala.js + Tyrian (Elm-style) frontend, a type-safe Tapir REST/OpenAPI backend, client-side Mermaid execution-graph visualization, and a Dockerized Fly.io CI/CD deploy. 8 PRs on the project.
↳ GSoC project · write-up (Business4s Blog) · final report
LFX Mentorship 2024 · Open Mainframe Project (Linux Foundation) — Zowe
Worked on the Zowe App Store UI, Zowe server stability, the dev environment, and the Installation Wizard (zlux-app-server). Mentored by Leanid Astakou (Rocket Software).
↳ Open Mainframe Project write-up
Selected open-source contributions:
workflows4s (Scala, 8 PRs) ·
Zowe zlux-app-server ·
KubeArmor (build info → systemd packaging) ·
doodle (gradient fill/stroke on the Canvas backend) ·
cats-effect (Typelevel)
agentclash.dev ·
docs ·
npm i -g agentclash ·
github · 20 stars
Race agents against the same workload, capture exactly what they did, score the outcome, and turn failures into repeatable regression gates. Built for teams shipping agents, not leaderboard demos — it evaluates the whole run: final answer, tool choices, artifacts, latency, cost, and the evidence trail that explains why one agent passed and another failed.
- Challenge packs — package real tasks, inputs, validators, and scoring rules.
- Scorecards & replays — correctness, reliability, latency, cost + the step-by-step trajectory.
- Release gates & CI — compare a candidate against a saved baseline; gate PRs; promote escaped failures into regression suites.
- Try CLI — interactive in-browser terminal demos of real agent CLIs on disposable E2B sandboxes.
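The release-gate idea above (compare a candidate against a saved baseline) can be sketched as a small comparison function. This is only an illustration, not agentclash's actual API: the `Scorecard` shape, the `gate` function, and the tolerance values are all hypothetical.

```typescript
// Hypothetical scorecard shape; field names are illustrative,
// not agentclash's real schema.
interface Scorecard {
  correctness: number; // fraction of validators passed, 0..1
  latencyMs: number;   // end-to-end run latency
  costUsd: number;     // total spend for the run
}

interface GateResult {
  pass: boolean;
  regressions: string[];
}

// Compare a candidate run against a saved baseline and list every metric
// that regressed beyond its tolerance. The default tolerances are assumptions
// chosen for this sketch: no correctness drop, up to 10% slower or pricier.
function gate(
  baseline: Scorecard,
  candidate: Scorecard,
  tolerance = { correctness: 0, latency: 0.1, cost: 0.1 },
): GateResult {
  const regressions: string[] = [];
  if (candidate.correctness < baseline.correctness - tolerance.correctness) {
    regressions.push("correctness");
  }
  if (candidate.latencyMs > baseline.latencyMs * (1 + tolerance.latency)) {
    regressions.push("latency");
  }
  if (candidate.costUsd > baseline.costUsd * (1 + tolerance.cost)) {
    regressions.push("cost");
  }
  return { pass: regressions.length === 0, regressions };
}
```

In CI, a failing `GateResult` would block the PR, and the names in `regressions` would point at which part of the scorecard to inspect in the replay.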
| Project | What it is | Stack | Links |
|---|---|---|---|
| learnframe | YouTube-first learning toolkit — CLI + SDK that turns public videos into local courses with transcripts, study artifacts, and timestamp-cited Q&A. Open-source, local-first, no vendor lock-in. | TS · yt-dlp · OpenAI | repo · npm i -g learnframe |
| chalkboard | Open-source engine that turns a prompt into a narrated whiteboard explainer video — real images, diagrams, subtitles, music, vision self-correction. MIT, self-hostable, $0 with local models. | TS · Playwright · ffmpeg · LLMs | repo · demo |
| agentic-memory | Cognitive memory for AI agents — separate semantic / episodic / procedural stores, multimodal embeddings (Gemini), grounded in DeepMind's AGI cognitive framework. | Python · Gemini · embeddings | repo · live |
| skillware | An AI learning harness that orchestrates the learner: pick a topic, get a syllabus + guided path. (deployed; source private) | TS · Next.js · LLMs | live |
| agent-trace | Full observability into a Claude Code run — trace every step an agent takes. | TypeScript | repo |
| voicey | Real-time voice translation web app. | TypeScript | repo |
| e2b-go | Unofficial Go SDK for E2B sandboxes. | Go | repo |
Agent skills I built and use daily — drop the repo into ~/.claude/skills/.
- review-checkpoint — enforces a structured, self-reviewing implementation workflow (write expectations → implement → review → ship).
- grill-my-plan — stress-tests a technical plan against your codebase + outside engineering evidence.
- repo-standup — generates a standup from git history, branches, and TODOs.
- deep-research · founder-outreach · x-article-publisher — research fan-out, personalized outreach drafting, and Markdown → X Articles publishing.
- Handling LLM-generated code & vibe coding in 2025 — Medium
- Building a Web UI for Workflows4s with Scala.js and Tyrian — Business4s Blog (GSoC)
- I also post build logs and AI/agents takes on X @attharrva15.
Rimo — Software Engineer · Tokyo (AI startup)
Building Rimo Voice — AI that transcribes and summarizes business meetings. ~80 merged PRs across the backend (Go), the frontend (TypeScript/React), and the LLM gateway — shipping whole features end-to-end, not just tickets:
- Meeting Groups — designed and shipped the subsystem end-to-end: group CRUD + access-control APIs, per-group document templates, calendar events, and auto-applied note settings/titles — plus the participant-management and notes UI.
- Outgoing webhooks — built the outgoing-webhook platform from scratch: settings model, feature flag, action.completed events, a URL-validation test endpoint, and fire-and-forget dispatch — backend and settings UI.
- LLM prompt caching & cost tracking — multi-message prompt caching across Claude/Gemini in the LLM gateway, per-query AI-cost accounting via an SSE tee, and sequential-then-parallel template dispatch with cache warming to cut latency.
- Transcription quality — participant dictionary / pronunciation support so names transcribe correctly across the ElevenLabs and Soniox engines.
- Knowledge & desktop auth — folder upload + folder-tree grouping for linked knowledge, and a custom-token flow powering browser-based desktop-app sign-in.
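The sequential-then-parallel dispatch with cache warming mentioned above can be sketched roughly as follows. This illustrates the general pattern only, not Rimo's gateway code: `dispatchWarmThenParallel` and the `run` callback are hypothetical names, and the sketch assumes the first call populates a prompt cache that the later calls can reuse.

```typescript
// Hypothetical sketch: run one request first so it can warm a shared prompt
// cache, then fan the remaining requests out in parallel.
async function dispatchWarmThenParallel<T, R>(
  items: readonly T[],
  run: (item: T) => Promise<R>,
): Promise<R[]> {
  if (items.length === 0) return [];
  const first = items[0] as T;
  const rest = items.slice(1);
  // Sequential step: wait for the first call to finish (cache warming).
  const warmed = await run(first);
  // Parallel step: the remaining calls start together; under the stated
  // assumption they share the prefix cached by the first call.
  const others = await Promise.all(rest.map((item) => run(item)));
  // Results are returned in the same order as the input items.
  return [warmed, ...others];
}
```

The trade-off is one round trip of added latency up front in exchange for cache hits on every later call, which tends to pay off when the shared prompt prefix is long relative to each template's own suffix.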
IIIT Jabalpur (IIITDM-J) — B.Tech, Computer Science.