Hierarchical Policy Optimization for Simultaneous Translation of Unbounded Speech

This is the repo for the ACL 2026 paper "Hierarchical Policy Optimization for Simultaneous Translation of Unbounded Speech". It contains only the RL post-training part. For SFT, please refer to the InfiniSST repo.

How to run the HPO training?

We provide a Slurm script to run it on three 8xH100 nodes.

bash docker_sbatch_3node.sh YAML_NAME

YAML_NAME is one of the following YAML files under examples/configs:

For En-Zh

grpo_infinisst_4b_laal0.5_as_tgtq-5.0 (multiplier=1)
grpo_infinisst_4b_laal0.5_as_tgtq-5.0_m{2...6} (multiplier=2...6)

For En-De

grpo_infinisst_4b_laal0.5_as_tgtq-5.0_de (multiplier=1)
grpo_infinisst_4b_laal0.5_as_tgtq-5.0_de_m{2...6} (multiplier=2...6)

For En-Ja

grpo_infinisst_4b_laal0.5_as_tgtq-5.0_ja (multiplier=1)
grpo_infinisst_4b_laal0.5_as_tgtq-5.0_ja_m{2...6} (multiplier=2...6)
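
The naming pattern above can be sketched as a small helper that lists all config names for one language pair and prints the matching launch commands. This is only an illustration: `list_configs` and `print_launch_commands` are hypothetical helpers, not part of this repo, and the sketch assumes YAML_NAME is passed exactly as listed above (whether a `.yaml` extension is expected is not stated here). Note that `m{2...6}` in the list is shorthand for `_m2` through `_m6`, not literal bash brace expansion, which uses two dots.

```shell
#!/usr/bin/env bash
# Print the config names for one language pair, one per line.
# $1: language suffix: "" (En-Zh), "_de" (En-De), or "_ja" (En-Ja).
list_configs() {
  local base="grpo_infinisst_4b_laal0.5_as_tgtq-5.0${1:-}"
  local m
  echo "${base}"                # multiplier=1 has no _m suffix
  for m in 2 3 4 5 6; do
    echo "${base}_m${m}"        # multiplier=2..6
  done
}

# Print (not run) the launch command for each config of one language pair.
print_launch_commands() {
  local name
  list_configs "${1:-}" | while read -r name; do
    echo "bash docker_sbatch_3node.sh ${name}"
  done
}

# Dry run for En-De: shows the seven commands without submitting anything.
print_launch_commands "_de"
```

Printing the commands first makes it easy to check the names against examples/configs before submitting any job.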

Running this requires the SFT model checkpoint, the speech data pre-encoded into features by the speech encoder, and a Docker container.

We provide an example SFT checkpoint for En-Zh [here], example pre-encoded data ([manifest], [encoded feature]), and the Docker image [here].

