benchmark-datasets

Here are 43 public repositories matching this topic...

dreadnode / AIRTBench-Code

Code Repository for: AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models

security benchmarking benchmark research ai evaluations hacking artificial-intelligence cybersecurity ctf agents offensive-security ai-agents benchmark-datasets llm cyber-evals

Updated Dec 14, 2025
Jupyter Notebook

Event-AHU / OpenPAR

[OpenPAR] An open-source framework for Pedestrian Attribute Recognition, based on PyTorch

mamba benchmark-datasets pedestrian-attribute-recognition msp60k-dataset

Updated Dec 12, 2025
Python

opendatalab-raiser / Envision

Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights

benchmark vlm benchmark-datasets t2i mllm

Updated Dec 2, 2025
JavaScript

North-Shore-AI / crucible_datasets

Dataset management and caching for AI research benchmarks

Updated Dec 1, 2025
Elixir

gagolews / clustering-results-v1

A framework for benchmarking clustering algorithms – Benchmark results (for version 1 of the Suite)

data benchmark machine-learning clustering dataset datasets benchmark-datasets

Updated Nov 5, 2025
Python

Yusen-Peng / CE-Bench

[BlackboxNLP Workshop @ EMNLP, 2025] CE-Bench: A Contrastive Evaluation Benchmark of LLM Interpretability with Sparse Autoencoders

sparse-autoencoder benchmark-datasets mechanistic-interpretability

Updated Nov 4, 2025
Jupyter Notebook

tlu-dt-nlp / EstGEC-L2-Corpus

Estonian Grammatical Error Correction (GEC) test and development corpus that contains L2 learner texts error-annotated in the M2 format.

annotation corpus error-corpora estonian-language language-resources benchmark-datasets gold-standard grammatical-error-correction

Updated Sep 28, 2025
Python

arka-lsik / Statistical-Model-for-Power-Consumption-of-VLSI-Circuits-and-Effect-of-Quantized-Audio-Signal

My Master's thesis project at IIT Kharagpur (May '24 - June '25), carried out at the IPCV Lab, E&ECE, IIT Kharagpur

machine-learning-algorithms expectation-maximization audio-processing sequential-logic benchmark-datasets vlsi-circuits gmm-clustering powerconsumption

Updated Sep 21, 2025
MATLAB

madhav1ag / CDeCNet

CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images

table pytorch object-detection sota benchmark-datasets table-detection table-detection-using-deep-learning cdec-net

Updated Sep 11, 2025
Python

rohit901 / VANE-Bench

[NAACL'25] Contains code and documentation for our VANE-Bench paper.

benchmark-datasets multimodal-deep-learning video-anomaly-detection large-language-models multimodal-large-language-models large-multimodal-models

Updated Aug 19, 2025
Python

krishnanlab / obnb

A Python toolkit for setting up benchmarking datasets using biomedical networks

machine-learning computational-biology benchmark-datasets network-biology

Updated Dec 8, 2025
Python

futianfan / clinical-trial-outcome-prediction

Benchmark dataset and deep learning method (Hierarchical Interaction Network, HINT) for clinical trial approval probability prediction, published in Cell Patterns 2022.

benchmark machine-learning deep-learning dataset clinical-trials dataset-generation drug-design life-sciences clinical-research clinical-data benchmark-datasets graph-neural-networks drug-development therapeutics clinical-research-data-warehouse

Updated Jun 24, 2025
Python

AI-team-UoA / GeoQuestions1089

Crowdsourced geospatial question-answering dataset containing question-query-answer triples.

sparql geospatial-data dataset knowledge-graph question-answering benchmark-datasets geosparql

Updated Jun 9, 2025
PostScript

Belluxx / LocalAIME

Test your local LLMs on the AIME problems

benchmark-datasets local-llm llm-benchmarking

Updated Jun 7, 2025
Python

gagolews / clustering-data-v1

A framework for benchmarking clustering algorithms – Benchmark suite, version 1

data benchmark machine-learning clustering dataset datasets benchmark-datasets

Updated May 21, 2025
Jupyter Notebook

google-deepmind / forest_typology

Datasets to protect Earth's forests and biodiversity

machine-learning sustainability deep-learning forest remote-sensing biodiversity earth-observation benchmark-datasets habitat-mapping

Updated May 6, 2025
Jupyter Notebook

tznurmin / TEA_datasets

Pathogen Identifier and Strain Tagger datasets

machine-learning taxonomy open-data biomedical benchmark-datasets nlp-datasets pathogen-detection ner-datasets strain-recognition

Updated Mar 22, 2025

soubhiksanyal / now_evaluation

This is the official repository for evaluation on the NoW Benchmark Dataset. The goal of the NoW benchmark is to introduce a standard evaluation metric to measure the accuracy and robustness of 3D face reconstruction methods from a single image under variations in viewing angle, lighting, and common occlusions.

python computer-vision python3 face flame 3d-reconstruction face-reconstruction face-alignment triplet-loss 3d-data 3d-face-alignment 2d-3d 3d-mesh benchmark-datasets 3d-landmarks flame-model single-image-reconstruction now-evaluation face-reconstruction-challenge

Updated Feb 23, 2025
Python

PasanBhanu / time-series-forcasting-benchmark-dataset-preprocessing

Benchmark Datasets for Time Series Forecasting Preprocessing - NASA HTTP Dataset, WorldCup98 Dataset

machine-learning datasets benchmark-datasets

Updated Feb 19, 2025
Jupyter Notebook

ali-vilab / IDEA-Bench

Official repository of IDEA-Bench

image-editing text-to-image image-to-image benchmark-datasets text-to-image-generation text-to-image-evaluation

Updated Jan 24, 2025
Python
