Master's-trained engineer turned AI/ML researcher, bridging the gap between rigorous engineering methodology and modern AI systems. Based in Kathmandu, Nepal 🇳🇵
I hold an M.S. in Civil Engineering (Earthquake Engineering) from Tribhuvan University, where my thesis combined Python, Finite Element Methods, and a modified Particle Swarm Optimization algorithm to reduce structural computation costs by 51.7%. That project sparked a deeper interest in machine learning, LLM evaluation, and agentic AI, which is where my focus lies today.
- LLM Evaluation: benchmarking reasoning quality, designing error taxonomies (hallucinations, reasoning gaps, instruction failures), and building human-in-the-loop workflows
- AI/ML Research: experiment design, model evaluation, feature engineering, and statistical analysis
- Agentic Systems: building document research agents with retrieval, prompt orchestration, and multi-step QA
- Optimization: metaheuristic methods (PSO), FEM-based frameworks, computational efficiency
Languages & Data
Python · NumPy · Pandas · SciPy · Matplotlib
ML & AI
scikit-learn · TensorFlow · Keras · PyTorch · supervised learning · regression · cross-validation
LLM & Evaluation
prompt design · LLM benchmarking · error analysis · structured annotation · quality assurance
Tools
Jupyter Notebook · Git/GitHub · VS Code · Docker
Advanced AI Video Trainer @ Verity Labs (2025–Present, Remote) Evaluate AI-generated technical content for factual accuracy, reasoning quality, and clarity. Provide structured feedback on model errors and contribute to human-in-the-loop evaluation pipelines.
Data Specialist @ CloudFactory (Apr 2022 β Jun 2024, Remote) Processed and validated large structured datasets with a focus on quality, consistency, and specification adherence.
Document Research Agent for Multi-Step QA (2026) LLM-based research assistant that retrieves information from documents and answers multi-step questions. Implements document chunking, retrieval, and prompt orchestration; evaluated on factual accuracy and reasoning reliability.
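A minimal sketch of the chunking step such an agent relies on, assuming fixed-size overlapping windows (the function name and parameters are illustrative, not taken from the project code):

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows for retrieval.

    Overlap keeps sentences that straddle a boundary visible in two
    adjacent chunks, which helps retrieval recall.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step) if text[i:i + size]]
```

Real pipelines usually chunk on token or sentence boundaries rather than raw characters; this sketch only shows the windowing idea.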
LLM Reasoning Benchmark & Error Analysis (2025) Built a benchmark of multi-step reasoning tasks (logical reasoning, quantitative problem-solving, instruction-following). Designed an error taxonomy for systematic model evaluation.
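The error taxonomy behind such a benchmark can be captured as a small enum plus a tally helper; this is a hypothetical sketch of the categories named above, not the project's actual schema:

```python
from enum import Enum

class ErrorType(Enum):
    HALLUCINATION = "hallucination"              # fabricated facts not in the source
    REASONING_GAP = "reasoning_gap"              # missing or invalid inference step
    INSTRUCTION_FAILURE = "instruction_failure"  # ignored or misread the prompt

def tally(labels: list[ErrorType]) -> dict[str, int]:
    """Count annotated errors per taxonomy category for a benchmark run."""
    counts = {e.value: 0 for e in ErrorType}
    for label in labels:
        counts[label.value] += 1
    return counts
```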
AI Output Evaluation Pipeline (2025) Structured workflow to assess AI responses for factual accuracy, reasoning quality, and instruction adherence. Includes evaluation templates, scoring rubrics, and error analysis reports.
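A scoring rubric like the one described can be reduced to a weighted average over criteria; the criterion names and weights below are illustrative assumptions, not the pipeline's actual rubric:

```python
def rubric_score(scores: dict[str, int], weights: dict[str, float]) -> float:
    """Weighted average of per-criterion scores (e.g. each on a 1-5 scale)."""
    total_weight = sum(weights.values())
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(scores[c] * w for c, w in weights.items()) / total_weight
```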
Structural Optimization via FEM + PSO (2025, M.S. Thesis) Python-based framework combining Finite Element Methods with modified Particle Swarm Optimization for planar truss size optimization. Reduced computational cost by 51.7%.
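For readers unfamiliar with PSO, here is a minimal textbook-style sketch of the algorithm on a toy objective; the thesis used a modified variant coupled to an FEM solver, which this deliberately does not reproduce:

```python
import random

def pso(f, dim, n_particles=20, iters=100, lo=-5.0, hi=5.0,
        w=0.7, c1=1.5, c2=1.5):
    """Minimise f over [lo, hi]^dim with a basic particle swarm.

    Each particle is pulled toward its own best position (c1) and the
    swarm's global best (c2), with inertia w damping the velocity.
    """
    pos = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```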
- Machine Learning Specialization – DeepLearning.AI (2025)
- ML for Engineering Applications – Skill Shiksha (2024)
- Python Programming – Skill Shiksha (2024)
- Advanced LLM reasoning evaluation and benchmark design
- Agentic AI systems and retrieval-augmented generation (RAG)
- SQL for data analysis workflows
Open to collaborations in LLM evaluation, AI research, and applied ML. Feel free to reach out!