Master's-trained engineer turned AI/ML researcher, bridging the gap between rigorous engineering methodology and modern AI systems. Based in Kathmandu, Nepal 🇳🇵
I hold an M.S. in Civil Engineering (Earthquake Engineering) from Tribhuvan University, where my thesis combined Python, Finite Element Methods, and a modified Particle Swarm Optimization algorithm to reduce structural computation costs by 51.7%. That project sparked a deeper interest in machine learning, LLM evaluation, and agentic AI, which is where my focus lies today.
- LLM Evaluation: benchmarking reasoning quality, designing error taxonomies (hallucinations, reasoning gaps, instruction failures), and building human-in-the-loop workflows
- AI/ML Research: experiment design, model evaluation, feature engineering, and statistical analysis
- Agentic Systems: building document research agents with retrieval, prompt orchestration, and multi-step QA
- Optimization: metaheuristic methods (PSO), FEM-based frameworks, computational efficiency
Languages & Data
Python · NumPy · Pandas · SciPy · Matplotlib
ML & AI
scikit-learn · TensorFlow · Keras · PyTorch · supervised learning · regression · cross-validation
LLM & Evaluation
prompt design · LLM benchmarking · error analysis · structured annotation · quality assurance
Tools
Jupyter Notebook · Git/GitHub · VS Code · Docker
Advanced AI Video Trainer @ Verity Labs (2025–Present, Remote) Evaluate AI-generated technical content for factual accuracy, reasoning quality, and clarity. Provide structured feedback on model errors and contribute to human-in-the-loop evaluation pipelines.
Data Specialist @ CloudFactory (Apr 2022 β Jun 2024, Remote) Processed and validated large structured datasets with a focus on quality, consistency, and specification adherence.
Document Research Agent for Multi-Step QA (2026) LLM-based research assistant that retrieves information from documents and answers multi-step questions. Implements document chunking, retrieval, and prompt orchestration; evaluated on factual accuracy and reasoning reliability.
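A minimal sketch of the chunking step such an agent relies on, assuming fixed-size overlapping windows (the function name and parameters are illustrative, not taken from the project code):

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows for retrieval.

    Overlap keeps sentences that straddle a boundary visible in two
    adjacent chunks, which helps retrieval recall.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step) if text[i:i + size]]
```

Real pipelines usually chunk on token or sentence boundaries rather than raw characters; this sketch only shows the windowing idea.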
LLM Reasoning Benchmark & Error Analysis (2025) Built a benchmark of multi-step reasoning tasks (logical reasoning, quantitative problem-solving, instruction-following). Designed an error taxonomy for systematic model evaluation.
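The error taxonomy behind such a benchmark can be captured as a small enum plus a tally helper; this is a hypothetical sketch of the categories named above, not the project's actual schema:

```python
from enum import Enum

class ErrorType(Enum):
    HALLUCINATION = "hallucination"              # fabricated facts not in the source
    REASONING_GAP = "reasoning_gap"              # missing or invalid inference step
    INSTRUCTION_FAILURE = "instruction_failure"  # ignored or misread the prompt

def tally(labels: list[ErrorType]) -> dict[str, int]:
    """Count annotated errors per taxonomy category for a benchmark run."""
    counts = {e.value: 0 for e in ErrorType}
    for label in labels:
        counts[label.value] += 1
    return counts
```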
AI Output Evaluation Pipeline (2025) Structured workflow to assess AI responses for factual accuracy, reasoning quality, and instruction adherence. Includes evaluation templates, scoring rubrics, and error analysis reports.
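A scoring rubric like the one described can be reduced to a weighted average over criteria; the criterion names and weights below are illustrative assumptions, not the pipeline's actual rubric:

```python
def rubric_score(scores: dict[str, int], weights: dict[str, float]) -> float:
    """Weighted average of per-criterion scores (e.g. each on a 1-5 scale)."""
    total_weight = sum(weights.values())
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(scores[c] * w for c, w in weights.items()) / total_weight
```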
Structural Optimization via FEM + PSO (2025, M.S. Thesis) Python-based framework combining Finite Element Methods with modified Particle Swarm Optimization for planar truss size optimization. Reduced computational cost by 51.7%.
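For readers unfamiliar with PSO, here is a minimal textbook-style sketch of the algorithm on a toy objective; the thesis used a modified variant coupled to an FEM solver, which this deliberately does not reproduce:

```python
import random

def pso(f, dim, n_particles=20, iters=100, lo=-5.0, hi=5.0,
        w=0.7, c1=1.5, c2=1.5):
    """Minimise f over [lo, hi]^dim with a basic particle swarm.

    Each particle is pulled toward its own best position (c1) and the
    swarm's global best (c2), with inertia w damping the velocity.
    """
    pos = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```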
- Machine Learning Specialization – DeepLearning.AI (2025)
- ML for Engineering Applications – Skill Shiksha (2024)
- Python Programming – Skill Shiksha (2024)
- Advanced LLM reasoning evaluation and benchmark design
- Agentic AI systems and retrieval-augmented generation (RAG)
- SQL for data analysis workflows
Open to collaborations in LLM evaluation, AI research, and applied ML. Feel free to reach out!