GitHub - YecanLee/Mink: [ACL 2026 Main] Official PyTorch Implementation of "Min-k Sampling: Decoupling Truncation from Temperature Scaling via Relative Logit Dynamics"

Min-k Sampling: Decoupling Truncation from Temperature Scaling via Relative Logit Dynamics

[Project Page] | Run Analysis Baseline


🌠 Datasets

The datasets used in this paper are located in the project's datasets folder, including AQuA, GPQA-main, GSM8K, MATH500, and Alpaca Creative Writing.

datasets/
-aqua.parquet
-gpqa-main.csv
-gsm8k.parquet
-math500.jsonl
-alpaca_eval.json
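The files above come in four formats (Parquet, CSV, JSON Lines, and JSON). As a minimal sketch, assuming you only want to inspect the raw records, a small loader can dispatch on the file extension. The helper name load_records is hypothetical and not part of this repository, and the Parquet branch assumes pandas with a Parquet engine such as pyarrow is installed.

```python
import csv
import json
from pathlib import Path


def load_records(path):
    """Load a dataset file into a list of dict records, chosen by file extension.

    Hypothetical helper for inspection only; the repository's own dataset
    scripts may read these files differently.
    """
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".jsonl":
        # One JSON object per non-empty line.
        with path.open(encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]
    if suffix == ".json":
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        return data if isinstance(data, list) else [data]
    if suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as f:
            return list(csv.DictReader(f))
    if suffix == ".parquet":
        # Assumption: pandas and a Parquet engine (e.g. pyarrow) are available.
        import pandas as pd
        return pd.read_parquet(path).to_dict(orient="records")
    raise ValueError(f"Unsupported dataset format: {suffix}")
```

For example, load_records("datasets/math500.jsonl") would return one dict per problem; the field names depend on the dataset and are not assumed here.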

🛸 Dependency Installation

To install all the dependencies for our paper, run the following command (the dependency file in this repository is named requirement.txt):

pip install -r requirement.txt

We recommend creating a new conda environment before installing:

conda create -n mink python=3.11
conda activate mink
pip install -r requirement.txt

🚀 Run Paper Inference Experiments

The inference experiments for our proposed method can be run as follows.

Run with the Hugging Face Transformers library

To run the inference experiments for our proposed method with the Hugging Face Transformers library, run the following commands:

conda activate mink

python llm_mink.py \
--τ 3.0 \
--temperature 1.0 \
--model_name Qwen3-4B-Instruct

To switch to a different dataset, replace the line from AQuA import * in the script with the import for the corresponding dataset module (GPQA, GSM8K, or MATH500), and update the calls to the associated dataset functions accordingly.
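As an alternative to editing the import line by hand, the dataset module could be selected by name at run time. The following is only a sketch: load_dataset_module is a hypothetical helper, not part of this repository, and the scripts would still need to call each module's functions through the returned module object rather than relying on a star import.

```python
import importlib

# Module names taken from the dataset scripts in this repository.
DATASET_MODULES = ("AQuA", "GPQA", "GSM8K", "MATH500")


def load_dataset_module(name):
    """Import and return the dataset module with the given name.

    Hypothetical helper: rejects unknown names so that a typo fails early
    instead of importing an unrelated module.
    """
    if name not in DATASET_MODULES:
        raise ValueError(
            f"Unknown dataset {name!r}; expected one of {DATASET_MODULES}"
        )
    return importlib.import_module(name)
```

With this approach, the dataset could be chosen from a command-line argument, and the module's functions would be accessed as attributes of the returned object.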

🚀 Run Benchmark Inference Reasoning Experiments

In our paper we compare our proposed method against four decoding methods: Top-k Sampling, Top-p Sampling, Min-p Sampling, and Top-nσ Sampling. We use the following hyperparameters for these baselines:

Top-k Sampling: k=20
Top-p Sampling: p=0.9
Min-p Sampling: p=0.1
Top-nσ Sampling: n=1.0
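To make the four baseline hyperparameters concrete, the sketch below applies each truncation rule to a single vector of logits and returns the indices of the tokens that survive. It follows the commonly used definitions of these methods and is written in plain Python for readability. It is not the code in this repository, the function names are hypothetical, and using the population standard deviation for Top-nσ is an assumption.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_keep(logits, k=20):
    """Keep the k tokens with the largest logits."""
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return set(order[:k])


def top_p_keep(logits, p=0.9):
    """Keep the smallest set of most probable tokens whose mass reaches p."""
    probs = softmax(logits)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = set(), 0.0
    for i in order:
        kept.add(i)
        mass += probs[i]
        if mass >= p:
            break
    return kept


def min_p_keep(logits, p=0.1):
    """Keep tokens whose probability is at least p times the top probability."""
    probs = softmax(logits)
    threshold = p * max(probs)
    return {i for i, q in enumerate(probs) if q >= threshold}


def top_n_sigma_keep(logits, n=1.0):
    """Keep tokens whose logit lies within n standard deviations of the maximum.

    Assumption: sigma is the population standard deviation of the logits.
    """
    mean = sum(logits) / len(logits)
    sigma = math.sqrt(sum((x - mean) ** 2 for x in logits) / len(logits))
    threshold = max(logits) - n * sigma
    return {i for i, x in enumerate(logits) if x >= threshold}
```

Top-k and Top-nσ operate directly on logits, while Top-p and Min-p operate on probabilities, so the latter two see a different distribution whenever the logits are rescaled by a temperature before the softmax.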

We run the decoding methods on the following 4 models:

We then benchmark the decoding quality of these methods. The experiments use the datasets from the model comparison in our paper.

To run the LLM inference experiments for the top-k sampling decoding method, run the following command:

python llm_topk.py \
--k 20 \
--temperature 1.0 \
--model_name Qwen3-4B-Instruct

To run the LLM inference experiments for the top-p sampling decoding method, run the following command:

python llm_topp.py \
--p 0.9 \
--temperature 1.0 \
--model_name Qwen3-4B-Instruct

To run the LLM inference experiments for the min-p sampling decoding method, run the following command:

python llm_minp.py \
--p 0.1 \
--temperature 1.0 \
--model_name Qwen3-4B-Instruct

To run the LLM inference experiments for the top-nσ sampling decoding method, run the following command:

python llm_topnσ.py \
--n 1.0 \
--temperature 1.0 \
--model_name Qwen3-4B-Instruct
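The four baseline commands differ only in the script name and one hyperparameter flag, so they can be driven from a small table. The sketch below is a hypothetical convenience wrapper, not part of this repository. The script names follow the repository's file listing, and the flags and values follow the commands above. By default it only prints the commands (dry run) instead of launching them.

```python
import shlex
import subprocess
import sys

# method name -> (script, hyperparameter flag, value), as in the commands above.
BASELINES = {
    "top-k": ("llm_topk.py", "--k", "20"),
    "top-p": ("llm_topp.py", "--p", "0.9"),
    "min-p": ("llm_minp.py", "--p", "0.1"),
    "top-nσ": ("llm_topnσ.py", "--n", "1.0"),
}


def build_command(method, model_name="Qwen3-4B-Instruct", temperature=1.0):
    """Return the argument list for one baseline run."""
    script, flag, value = BASELINES[method]
    return [
        sys.executable, script,
        flag, value,
        "--temperature", str(temperature),
        "--model_name", model_name,
    ]


def run_all(dry_run=True):
    """Print every baseline command; execute them only if dry_run is False."""
    commands = [build_command(method) for method in BASELINES]
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
    return commands
```

Calling run_all() lists the four commands; run_all(dry_run=False) would execute them one after another from the repository root and stop at the first failure.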

To switch to a different dataset, replace the line from AQuA import * in the script with the import for the corresponding dataset module (GPQA, GSM8K, or MATH500), and update the calls to the associated dataset functions accordingly.

🚀 Run Benchmark Creative Writing Experiments

In our paper we compare our proposed method against six decoding methods: Top-k Sampling, Top-p Sampling, Mirostat, η-Sampling, Min-p Sampling, and Top-nσ Sampling. We use the following hyperparameters for these baselines:

Top-k Sampling: k=20
Top-p Sampling: p=0.9
Mirostat: τ=5.0
η-Sampling: η=9×10^-4
Min-p Sampling: p=0.1
Top-nσ Sampling: n=1.0
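Of the two baselines that appear only in the creative writing comparison, η-Sampling has a compact truncation rule that can be sketched in a few lines. The version below follows the commonly cited formulation, in which the cutoff is min(ε, √ε · exp(−H)) with H the entropy of the next-token distribution. Treating the value 9×10^-4 listed above as ε is an assumption, the function name is hypothetical, and this is not the code used in this repository. Mirostat is omitted because it adapts its cutoff across steps and does not reduce to a single-step rule.

```python
import math


def eta_keep(probs, epsilon=9e-4):
    """Return the indices kept by an entropy-dependent probability cutoff.

    Sketch of eta-sampling under the stated assumptions:
    cutoff = min(epsilon, sqrt(epsilon) * exp(-entropy)).
    """
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    cutoff = min(epsilon, math.sqrt(epsilon) * math.exp(-entropy))
    kept = {i for i, p in enumerate(probs) if p >= cutoff}
    # Always keep the most probable token so the set is never empty.
    kept.add(max(range(len(probs)), key=probs.__getitem__))
    return kept
```

On a flat distribution every token clears the cutoff, while on a sharply peaked one the low-probability tail is removed.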

We run the decoding methods on the following 2 models:

We use DeepSeek V3.2-Exp as the LLM-as-a-judge.

To run the creative writing experiments, run the following command (the script name contains a space, so it must be quoted):

python "creative writing.py" \
--model_name Qwen3-4B-Instruct \
--num_prompt 500

🧪 Benchmark Decoding Methods

To benchmark the decoding methods, please make sure you have all the dependencies installed.
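A quick way to confirm that the environment is ready is to check that the key packages can be found before launching a long run. The sketch below uses only the standard library. The helper name missing_packages is hypothetical, and the package names in the usage note are assumptions based on the PyTorch and Hugging Face Transformers dependencies mentioned in this README, not a copy of the repository's dependency file.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]
```

For example, missing_packages(["torch", "transformers"]) returns an empty list when both packages are installed in the active environment.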

💪 Enhancements

Generation can likely be sped up by:

Using torch.compile (available since PyTorch 2.0). We enable it with the max-autotune mode in the generation scripts; you may need to modify the torch.compile calls to fit your needs.

TF32 note (important for users of Ampere, Hopper, and other recent NVIDIA GPUs):
When we ran the above generation scripts, TF32 matmuls were disabled, per PyTorch's defaults.
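The two settings above can be sketched as follows. This is a minimal illustration, not the code in the generation scripts, and the helper names are hypothetical. It uses torch.compile with mode="max-autotune" and the torch.backends.cuda.matmul.allow_tf32 switch, which is False by default in current PyTorch releases.

```python
import torch


def set_tf32_matmul(enabled: bool) -> None:
    """Toggle TF32 for float32 matrix multiplications on CUDA devices.

    Leaving it off matches the setting described above; turning it on trades
    some float32 precision for speed on GPUs that support TF32.
    """
    torch.backends.cuda.matmul.allow_tf32 = enabled


def maybe_compile(model: torch.nn.Module, enabled: bool = True):
    """Wrap the model with torch.compile in max-autotune mode if enabled."""
    if not enabled:
        return model
    # Compilation itself happens lazily, on the first forward call.
    return torch.compile(model, mode="max-autotune")
```

Enabling TF32 may change numerical results slightly, so outputs might not match runs made with the default setting exactly.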

Repository contents:

datasets
human-result
llm-result
AQuA.py
GPQA.py
GSM8K.py
MATH500.py
README.md
creative writing.py
llm_mink.py
llm_minp.py
llm_topk.py
llm_topnσ.py
llm_topp.py
requirement.txt
