🫡
Hardship and adversity refine you, as jade is polished to perfection.
Student @ USTC BDAA-BASE, currently interested in LLMs, specifically post-training, reasoning (math/code), and agent4se.
- University of Science and Technology of China
- Hefei, Anhui, China
Highlights
- Pro
Stars
3 stars written in TeX
A Survey of Reinforcement Learning for Large Reasoning Models