Stars
Decoder model experimentation for PARTAGES WP4 phase 1
A full spaCy pipeline and models for scientific/biomedical documents.
System for Medical Concept Extraction and Linking
🪄 Interpreto is an interpretability toolbox for LLMs
🩺 MedInjection-FR — A French biomedical instruction dataset and model suite for analyzing how data origin affects LLM adaptation.
Unlock your displays on your Mac! Flexible HiDPI scaling, XDR/HDR extra brightness, virtual screens, DDC control, extra dimming, PIP/streaming, EDID override and lots more!
IntentGrasp: A Comprehensive Benchmark for Intent Understanding
A Dataset for Discourse-Level Temporal Ordering of Events
The Benchmark of Discourse Understanding in the Era of Reasoning Language Models (EACL 2026)
Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models
Source code for the paper Dialogue Discourse Parsing as Generation: a Sequence-to-Sequence LLM-based Approach (SIGDial 2024).
Source code for the paper Delta-KNN: Improving Demonstration Selection in In-Context Learning for Alzheimer’s Disease Detection (ACL 2025).
ICIP 2019: Frame Attention Networks for Facial Expression Recognition in Videos
[ICLR 2025] Automated Design of Agentic Systems
The official repository for ERNIE 4.5 and ERNIEKit – its industrial-grade development toolkit based on PaddlePaddle.
τ-Bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
Official implementation of MAIA, A Multimodal Automated Interpretability Agent
Source code for the paper Topic-Guided Reinforcement Learning with LLMs for Enhancing Multi-Document Summarization (Findings EMNLP 2025).
A library for mechanistic interpretability of GPT-style language models
Testing Theory of Mind (ToM) in language models with epistemic logic
An extremely fast Python package and project manager, written in Rust.
BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent (ACL 2026 Main)
The official GitHub repo for the survey paper "A Survey on Diffusion Language Models".
The Dataset and Official Implementation for <Discursive Socratic Questioning: Evaluating the Faithfulness of Language Models’ Understanding of Discourse Relations> @ ACL 2024
The Dataset and Official Implementation for <Discursive Circuits: How Do Language Models Understand Discourse Relations?> @ EMNLP 2025