
Pull requests: huggingface/nanotron


Pull requests list

Making nanotron working out of the box
#399 opened Mar 11, 2026 by giux78
a
#397 opened Feb 25, 2026 by RaghavSinghal10
6 tasks
Add validation loss for the carbon runs
#396 opened Feb 18, 2026 by loubnabnl Contributor Draft
WIP: local fixes for carbon training
#395 opened Feb 18, 2026 by loubnabnl Contributor Draft
[Carbon] hybrid loss function
#394 opened Feb 16, 2026 by kashif
6 tasks
SmolLM3 HF-to-nanotron conversion
#393 opened Feb 11, 2026 by loubnabnl Contributor
Fixes #384
#385 opened Jul 29, 2025 by EliMCosta
6 tasks done
SmolLM3 nanotron->hf converter
#382 opened Jul 7, 2025 by anton-l Member
6 tasks
Removed assertion for s3 datasets and handled string and object cases
#381 opened Jul 3, 2025 by SulRash
2 of 6 tasks
Fixed nanoset data stage handling during pretraining
#380 opened Jul 3, 2025 by SulRash
2 of 6 tasks
Fix issue while running tiny llama script on ADA 4000 gpu
#379 opened Jul 2, 2025 by chetandhembre
2 of 6 tasks
Extra name argument to select configuration of hf dataset
#378 opened Jun 30, 2025 by SulRash
1 of 6 tasks
Fixed llama parameterization config use
#377 opened Jun 30, 2025 by SulRash
2 of 6 tasks
lighteval fixes
#374 opened Jun 23, 2025 by NouamaneTazi Member
6 tasks
Expert Parallelism
#373 opened Jun 11, 2025 by xrsrke Contributor
[WIP] Fix Llama inference
#370 opened May 29, 2025 by duynht
2 of 6 tasks
Hynky/lighteval fix
#367 opened May 16, 2025 by hynky1999
6 tasks
Expert Parallelism
#361 opened Apr 29, 2025 by xrsrke Contributor
6 tasks
quicks
#338 opened Apr 4, 2025 by NouamaneTazi Member Draft
6 tasks
calcuate mean token accuracy metric while training
#337 opened Apr 4, 2025 by kashif
[WIP] Add multilingual evals
#336 opened Apr 2, 2025 by anton-l Member
6 tasks