dylanbouchard

Dylan Bouchard dylanbouchard

AI Researcher, Author of UQLM & LangFair

66 followers · 47 following

CVS Health
United States
in/dylan-bouchard-phd-52594664

Achievements

x2 x3 x3

Achievements

x2 x3 x3

Stars

k4black / codebleu

Pip compatible CodeBLEU metric implementation available for linux/macos/win

Python 129 28 Updated Mar 31, 2025

LiveCodeBench / LiveCodeBench

Official repository for the paper "LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code"

Python 786 165 Updated Jul 16, 2025

langchain-ai / langgraph

Build resilient language agents as graphs.

Python 24,248 4,221 Updated Feb 4, 2026

badges / shields

Concise, consistent, and legible badges in SVG and raster format

JavaScript 26,034 5,581 Updated Feb 3, 2026

networkx / networkx

Network Analysis in Python

Python 16,586 3,460 Updated Feb 2, 2026

excalidraw / excalidraw

Virtual whiteboard for sketching hand-drawn like diagrams

TypeScript 116,021 12,384 Updated Feb 4, 2026

martin-majlis / Wikipedia-API

Python wrapper for Wikipedia

Python 714 86 Updated Feb 2, 2026

shmsw25 / FActScore

A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"

Python 415 63 Updated Apr 13, 2025

jlko / semantic_uncertainty

Codebase for reproducing the experiments of the semantic uncertainty paper (short-phrase and sentence-length experiments).

Python 404 57 Updated Apr 12, 2024

IINemo / lm-polygraph

Python 431 57 Updated Feb 3, 2026

huggingface / datasets

🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools

Python 21,171 3,091 Updated Feb 4, 2026

huggingface / transformers

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 156,147 31,958 Updated Feb 4, 2026

KRLabsOrg / verbatim-rag

Python 171 21 Updated Feb 2, 2026

KRLabsOrg / LettuceDetect

LettuceDetect is a hallucination detection framework for RAG applications.

Python 529 37 Updated Sep 9, 2025

vibrantlabsai / ragas

Supercharge Your LLM Application Evaluations 🚀

Python 12,500 1,228 Updated Jan 31, 2026

Arize-ai / phoenix

AI Observability & Evaluation

Jupyter Notebook 8,454 709 Updated Feb 4, 2026

deepchecks / deepchecks

Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling to thoroughly test your data and mo…

Python 3,974 290 Updated Dec 28, 2025

python-poetry / poetry

Python packaging and dependency management made easy

Python 34,178 2,394 Updated Feb 1, 2026

langchain-ai / open_deep_research

Python 10,473 1,534 Updated Aug 27, 2025

bluewave-labs / verifywise

Complete AI governance and LLM Evals platform with support for EU AI Act, ISO 42001, ISO 27001 and NIST AI RMF. Join our Discord channel: https://discord.com/invite/d3k3E4uEpR

TypeScript 218 83 Updated Feb 4, 2026

facebookresearch / anli

Adversarial Natural Language Inference Benchmark

Python 397 46 Updated May 12, 2022

vectara / open-rag-eval

RAG evaluation without the need for "golden answers"

Python 338 21 Updated Dec 15, 2025

vectara / vectara-answer

LLM-powered Conversational AI experience using Vectara

TypeScript 270 71 Updated May 8, 2025

scikit-learn / scikit-learn

scikit-learn: machine learning in Python

Python 64,899 26,668 Updated Feb 4, 2026

IBM / UQ360

Uncertainty Quantification 360 (UQ360) is an extensible open-source toolkit that can help you estimate, communicate and use uncertainty in machine learning model predictions.

Python 268 64 Updated Sep 17, 2025

Trusted-AI / AIX360

Interpretability and explainability of data and machine learning models

Python 1,759 325 Updated Feb 26, 2025

ibm-granite / granite-guardian

The Granite Guardian models are designed to detect risks in prompts and responses.

Jupyter Notebook 130 13 Updated Oct 8, 2025

ENSTA-U2IS-AI / awesome-uncertainty-deeplearning

This repository contains a collection of surveys, datasets, papers, and codes, for predictive uncertainty estimation in deep learning models.

783 76 Updated Dec 5, 2025

jxzhangjhu / Awesome-LLM-Uncertainty-Reliability-Robustness

Awesome-LLM-Robustness: a curated list of Uncertainty, Reliability and Robustness in Large Language Models

811 52 Updated May 21, 2025

NannyML / nannyml

nannyml: post-deployment data science in python

Python 2,124 178 Updated Jul 12, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Dylan Bouchard dylanbouchard

Achievements

Achievements

Block or report dylanbouchard

Stars

k4black / codebleu

LiveCodeBench / LiveCodeBench

langchain-ai / langgraph

badges / shields

networkx / networkx

excalidraw / excalidraw

martin-majlis / Wikipedia-API

shmsw25 / FActScore

jlko / semantic_uncertainty

IINemo / lm-polygraph

huggingface / datasets

huggingface / transformers

KRLabsOrg / verbatim-rag

KRLabsOrg / LettuceDetect

vibrantlabsai / ragas

Arize-ai / phoenix

deepchecks / deepchecks

python-poetry / poetry

langchain-ai / open_deep_research

bluewave-labs / verifywise

facebookresearch / anli

vectara / open-rag-eval

vectara / vectara-answer

scikit-learn / scikit-learn

IBM / UQ360

Trusted-AI / AIX360

ibm-granite / granite-guardian

ENSTA-U2IS-AI / awesome-uncertainty-deeplearning

jxzhangjhu / Awesome-LLM-Uncertainty-Reliability-Robustness

NannyML / nannyml