dylanbouchard


Pip compatible CodeBLEU metric implementation available for linux/macos/win

Python 129 28 Updated Mar 31, 2025

Official repository for the paper "LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code"

Python 786 165 Updated Jul 16, 2025

Build resilient language agents as graphs.

Python 24,248 4,221 Updated Feb 4, 2026

Concise, consistent, and legible badges in SVG and raster format

JavaScript 26,034 5,581 Updated Feb 3, 2026

Network Analysis in Python

Python 16,586 3,460 Updated Feb 2, 2026

Virtual whiteboard for sketching hand-drawn like diagrams

TypeScript 116,021 12,384 Updated Feb 4, 2026

Python wrapper for Wikipedia

Python 714 86 Updated Feb 2, 2026

A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"

Python 415 63 Updated Apr 13, 2025

Codebase for reproducing the experiments of the semantic uncertainty paper (short-phrase and sentence-length experiments).

Python 404 57 Updated Apr 12, 2024
Python 431 57 Updated Feb 3, 2026

🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools

Python 21,171 3,091 Updated Feb 4, 2026

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 156,147 31,958 Updated Feb 4, 2026
Python 171 21 Updated Feb 2, 2026

LettuceDetect is a hallucination detection framework for RAG applications.

Python 529 37 Updated Sep 9, 2025

Supercharge Your LLM Application Evaluations 🚀

Python 12,500 1,228 Updated Jan 31, 2026

AI Observability & Evaluation

Jupyter Notebook 8,454 709 Updated Feb 4, 2026

Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling to thoroughly test your data and mo…

Python 3,974 290 Updated Dec 28, 2025

Python packaging and dependency management made easy

Python 34,178 2,394 Updated Feb 1, 2026

Complete AI governance and LLM Evals platform with support for EU AI Act, ISO 42001, ISO 27001 and NIST AI RMF. Join our Discord channel: https://discord.com/invite/d3k3E4uEpR

TypeScript 218 83 Updated Feb 4, 2026

Adversarial Natural Language Inference Benchmark

Python 397 46 Updated May 12, 2022

RAG evaluation without the need for "golden answers"

Python 338 21 Updated Dec 15, 2025

LLM-powered Conversational AI experience using Vectara

TypeScript 270 71 Updated May 8, 2025

scikit-learn: machine learning in Python

Python 64,899 26,668 Updated Feb 4, 2026

Uncertainty Quantification 360 (UQ360) is an extensible open-source toolkit that can help you estimate, communicate and use uncertainty in machine learning model predictions.

Python 268 64 Updated Sep 17, 2025

Interpretability and explainability of data and machine learning models

Python 1,759 325 Updated Feb 26, 2025

The Granite Guardian models are designed to detect risks in prompts and responses.

Jupyter Notebook 130 13 Updated Oct 8, 2025

This repository contains a collection of surveys, datasets, papers, and codes, for predictive uncertainty estimation in deep learning models.

783 76 Updated Dec 5, 2025

Awesome-LLM-Robustness: a curated list of Uncertainty, Reliability and Robustness in Large Language Models

811 52 Updated May 21, 2025

nannyml: post-deployment data science in python

Python 2,124 178 Updated Jul 12, 2025