danqi

Organizations

@stanfordnlp @mrqa @princeton-nlp

Code for the preprint "Cache Me If You Can: How Many KVs Do You Need for Effective Long-Context LMs?"

Python 48 4 Updated Jul 29, 2025

Easily fine-tune, evaluate and deploy Gemma 4, Qwen3.5, Qwen3.6, gpt-oss, DeepSeek-R1, or any open source LLM / VLM!

Python 9,312 777 Updated Jun 16, 2026

[NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333

Python 1,168 88 Updated Jan 11, 2024

The HELMET Benchmark

Jupyter Notebook 217 42 Updated Apr 17, 2026

Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"

Python 254 13 Updated Sep 12, 2025

[NeurIPS 2024 Spotlight] Code and data for the paper "Finding Transformer Circuits with Edge Pruning".

Python 68 15 Updated Aug 15, 2025

[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward

Python 955 77 Updated Feb 16, 2025

[ACL 2024] Long-Context Language Modeling with Parallel Encodings

Python 169 11 Updated Jun 13, 2024

[ICML 2024] Selecting High-Quality Data for Training Language Models

Python 204 14 Updated Dec 8, 2025

https://acl2023-retrieval-lm.github.io/

JavaScript 157 15 Updated Oct 18, 2023

Official implementation of Privacy Implications of Retrieval-Based Language Models (EMNLP 2023). https://arxiv.org/abs/2305.14888

Python 37 3 Updated Jun 10, 2024

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

Python 643 59 Updated Mar 4, 2024

[ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following

Python 138 11 Updated Jul 8, 2024

[EMNLP 2023] C-STS: Conditional Semantic Textual Similarity

Python 74 7 Updated May 23, 2024

[EMNLP 2023] Adapting Language Models to Compress Long Contexts

Python 333 25 Updated Sep 9, 2024

[EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions

Jupyter Notebook 125 14 Updated Sep 12, 2024

[EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627

Python 518 51 Updated Oct 9, 2024

[NeurIPS 2023] Learning Transformer Programs

Python 166 24 Updated May 21, 2024

Findings of ACL'2023: Optimizing Test-Time Query Representations for Dense Retrieval

Python 30 2 Updated Oct 24, 2023

EMNLP 2022: "MABEL: Attenuating Gender Bias using Textual Entailment Data" https://arxiv.org/abs/2210.14975

Python 38 2 Updated Dec 14, 2023

EMNLP 2022: Finding Dataset Shortcuts with Grammar Induction https://arxiv.org/abs/2210.11560

Jupyter Notebook 59 1 Updated Feb 28, 2025

A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643

Python 78 5 Updated Sep 4, 2023

[EMNLP 2022] Training Language Models with Memory Augmentation https://arxiv.org/abs/2205.12674

Python 193 13 Updated Jun 14, 2023

Official repo for the paper: Recovering Private Text in Federated Learning of Language Models (in NeurIPS 2022)

Python 61 8 Updated Mar 13, 2023

EMNLP 2022: Generating Natural Language Proofs with Verifier-Guided Search https://arxiv.org/abs/2205.12443

Python 86 15 Updated Sep 15, 2024

NAACL 2022: Can Rationalization Improve Robustness? https://arxiv.org/abs/2204.11790

Python 27 1 Updated Nov 21, 2022

Princeton NLP's pre-training library based on fairseq with DeepSpeed kernel integration 🚃

Python 117 9 Updated Oct 27, 2022

[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408

Python 198 31 Updated May 9, 2023

[ACL 2022] Ditch the Gold Standard: Re-evaluating Conversational Question Answering

Python 43 2 Updated Jun 18, 2022

[NAACL 2021] Factual Probing Is [MASK]: Learning vs. Learning to Recall https://arxiv.org/abs/2104.05240

Python 168 22 Updated Oct 7, 2022