harshada-javeri/README.md

Harshada Javeri

Applied AI Engineer • AI Reliability • Agent Systems • ML Platform Engineering

Building production-grade AI systems that are scalable, observable, and trustworthy.


About Me

I am a Senior Applied AI Engineer with 7 years of experience spanning machine learning, software engineering, intelligent automation, and AI platform development.

My work focuses on building and evaluating production AI systems, with particular interest in:

  • Agentic AI Systems
  • LLM Evaluation & Benchmarking
  • AI Reliability Engineering
  • Production ML Platforms
  • AI Governance & Safety
  • Model Observability
  • Enterprise AI Deployment

I enjoy operating at the intersection of:

AI Research × Engineering × Product × Deployment


Current Focus

Agentic AI & LLM Systems

  • Multi-agent orchestration
  • RAG architectures
  • Tool-using AI agents
  • LangGraph & CrewAI systems
  • Agent evaluation frameworks
  • Behavioral consistency testing

AI Reliability Engineering

  • Model validation pipelines
  • Drift detection
  • Inference monitoring
  • Reproducibility testing
  • Deployment quality gates
  • AI observability

Applied AI Platforms

  • Production ML workflows
  • Evaluation infrastructure
  • Enterprise AI systems
  • MLOps and CI/CD
  • Compliance-aware AI systems
  • Governance and auditability

Technical Expertise

AI & Agent Systems

Python • LLMs • RAG • LangGraph • LangChain • CrewAI • Multi-Agent Systems • AI Evaluation • LLM Benchmarking • AI Safety • AI Governance

Machine Learning

Scikit-Learn • NLP • Classification • Anomaly Detection • Feature Engineering • Explainable AI • Model Monitoring

MLOps & Infrastructure

MLflow • Docker • Kubernetes • GitHub Actions • Jenkins • AWS • CI/CD • Experiment Tracking

Observability

Prometheus • Grafana • Drift Detection • Inference Monitoring • Reliability Metrics

Data & Backend

SQL • PostgreSQL • Snowflake • IBM DB2 • REST APIs


Selected Projects

Multi-Agent LLM Evaluation System

Built an evaluation and observability platform for agentic systems that:

  • Tracks agent behavior across runs
  • Detects failure patterns and regressions
  • Measures consistency and reliability
  • Supports large-scale benchmarking

Tech: Python, LangGraph, CrewAI, OpenAI APIs, AgentOps, MLflow
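
One way to read "measures consistency" is as agreement across repeated runs of the same task. The sketch below is illustrative only and is not taken from the project: the function name `consistency_rate` and the sample run outputs are assumptions, and it presumes each run's final output has already been normalized to a comparable string.

```python
from collections import Counter


def consistency_rate(outputs):
    """Return the fraction of runs that agree with the most common output.

    `outputs` is a hypothetical list of normalized final answers,
    one per repeated run of the same agent task.
    """
    if not outputs:
        raise ValueError("need at least one run")
    counts = Counter(outputs)
    # most_common(1) yields [(value, count)] for the modal output.
    _, top_count = counts.most_common(1)[0]
    return top_count / len(outputs)


# Four hypothetical runs of the same task; three agree.
runs = ["refund", "refund", "escalate", "refund"]
print(consistency_rate(runs))  # 0.75
```

A rate of 1.0 means every run produced the same output; lower values flag tasks worth inspecting for non-deterministic behavior.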


Pi-Bench: Policy Intelligence Benchmark

Designed a benchmarking framework for evaluating policy adherence in agentic AI systems.

Capabilities include:

  • Tool-call validation
  • Escalation verification
  • Safety policy enforcement
  • Deterministic evaluation workflows
  • Compliance-focused testing
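
To make tool-call validation concrete, here is a minimal sketch of a deterministic check of an agent trace against a policy. It is not Pi-Bench's actual API: `ToolCall`, `POLICY`, `validate_trace`, and the tool names are all hypothetical, and the policy is assumed to be a simple mapping from allowed tool names to their required argument names.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    """One step in a hypothetical agent trace."""
    name: str
    args: dict = field(default_factory=dict)


# Assumed policy shape: allowed tool name -> set of required argument names.
POLICY = {
    "lookup_order": {"order_id"},
    "issue_refund": {"order_id", "amount"},
}


def validate_trace(trace, policy):
    """Return a list of human-readable policy violations, in trace order."""
    violations = []
    for step, call in enumerate(trace):
        required = policy.get(call.name)
        if required is None:
            violations.append(f"step {step}: tool '{call.name}' is not allowed")
            continue
        missing = required - set(call.args)
        if missing:
            violations.append(
                f"step {step}: '{call.name}' is missing {sorted(missing)}"
            )
    return violations


trace = [
    ToolCall("lookup_order", {"order_id": "A1"}),
    ToolCall("issue_refund", {"order_id": "A1"}),  # no amount supplied
    ToolCall("delete_account", {}),                # not in the policy
]
for violation in validate_trace(trace, POLICY):
    print(violation)
```

Because the check depends only on the trace and the policy, the same input always yields the same verdict, which is the property a deterministic evaluation workflow relies on.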

ML Reliability & Observability Platform

Built validation pipelines and monitoring systems for production ML models.

Focus areas:

  • Drift detection
  • Latency monitoring
  • Reproducibility checks
  • Deployment validation
  • Automated quality gates
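
As a rough illustration of drift detection feeding a quality gate, the sketch below computes the Population Stability Index (PSI) between a reference sample and a current sample of one numeric feature. This is a swapped-in, commonly used technique rather than the platform's confirmed method; the function name `psi`, the bin count, and the 0.2 gate threshold (a widely quoted rule of thumb) are assumptions.

```python
import math


def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index between two samples of one feature.

    Bins are equal-width over the reference range; values outside that
    range are clipped into the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    ref, cur = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


reference = [i / 100 for i in range(100)]
shifted = [v + 0.5 for v in reference]

DRIFT_THRESHOLD = 0.2  # assumed gate value, tune per feature
print(psi(reference, reference))               # 0.0 for identical samples
print(psi(reference, shifted) > DRIFT_THRESHOLD)
```

A deployment gate could then block promotion when any monitored feature exceeds the chosen threshold.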

Professional Interests

  • Applied AI Engineering
  • Forward Deployed AI
  • Agent Infrastructure
  • AI Reliability Engineering
  • AI Safety & Governance
  • Evaluation Systems
  • Production LLM Applications
  • Human-AI Collaboration

Philosophy

The next generation of AI systems will not be defined by larger models alone.

It will be defined by teams that can build systems that are:

  • Reliable
  • Observable
  • Auditable
  • Safe
  • Useful in production

I enjoy building the infrastructure and evaluation systems that make this possible.


Connect

📧 harshada.javeri@gmail.com

💼 LinkedIn: linkedin.com/in/harshada-javeri-mle

💻 GitHub: github.com/harshada-javeri


Open to discussions around:

Applied AI • Forward Deployed Engineering • Agent Systems • AI Infrastructure • LLM Evaluation • AI Reliability

Pinned

  1. multiagent-ops-orchestrator Public

    Automated CI/CD Failure Triage & Remediation using AI Agents for Enterprise Operations

    Python 3 1

  2. printiq Public

    AI-Driven Print Failure & Quality Intelligence Platform

    Python 1

  3. gaia-agentbeats Public

    Forked from pradeepdas/gaia-agentbeats

    GAIA Benchmark on AgentBeats - General AI Assistants evaluation with multi-step reasoning, web search, and tool use

    Python 1

  4. cosmic-trails Public

    "Discover, Explore, Conquer: Forge Your Trail Among the Stars."

    C#

  5. smart-content-app Public

    Python

  6. radar-pay Public

    A blockchain-backed trust layer that verifies QR codes, barcodes, tickets, and coupons before humans or AI agents act on them.

    TypeScript