AI Quality & Evaluation Strategist | Building expertise in AI system reliability, safety, and trustworthiness
I'm on a systematic journey to deepen my expertise in AI quality and evaluation, combining:
- 12 years at Microsoft (Test → SDET → SDE)
- MS in Computer Science from Johns Hopkins University (2023)
- Outlier (Scale AI) – Expert evaluator for programming projects, ensuring high-quality outputs across C++, C#, Python, Java, and SQL
- Hands-on AI tool evaluation – documenting what works, what breaks, and why it matters
Systematically evaluating AI development tools through a quality engineering lens:
- Assessing AI coding assistants for reliability, safety, and real-world performance
- Documenting edge cases, failure modes, and unexpected behaviors
- Building reusable evaluation frameworks for AI-assisted development (a minimal sketch follows below)
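To make "reusable evaluation framework" concrete, here is a minimal sketch of the kind of structure I mean: a prompt paired with a programmatic pass/fail check, run against any code-generating function. All names here (EvalCase, run_eval, generate_code) are hypothetical placeholders for illustration, not the API of any real tool.

```python
# Minimal sketch of a reusable evaluation case for an AI coding assistant.
# EvalCase, run_eval, and generate_code are hypothetical names used only
# to illustrate the pattern.
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    """One reusable check: a prompt plus a predicate over the model's output."""
    name: str
    prompt: str
    check: Callable[[str], bool]  # returns True if the output passes


def run_eval(generate_code: Callable[[str], str],
             cases: list[EvalCase]) -> dict[str, bool]:
    """Run each case against a code-generating function and record pass/fail."""
    results: dict[str, bool] = {}
    for case in cases:
        output = generate_code(case.prompt)
        results[case.name] = case.check(output)
    return results


# Example edge-case check: does generated database code avoid naive string
# concatenation (a common injection-prone pattern)?
cases = [
    EvalCase(
        name="sql_parameterization",
        prompt="Write a Python function that looks up a user by email in SQLite.",
        check=lambda out: "?" in out and "+ email" not in out,
    ),
]
```

The same case list can then be replayed against different assistants or model versions, which is what makes the checks reusable rather than one-off manual reviews.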
This GitHub tracks my AI evaluation work:
- /ai-tool-evaluations – Structured assessments of AI tools
- /edge-cases – Documented failures and interesting behaviors
- /evaluation-frameworks – Reusable testing approaches