ZihanWang314/README.md

Pinned

  1. mll-lab-nu/RAGEN Public

    RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

    Python · 2.3k stars · 185 forks

  2. deepseek-ai/ESFT Public

    Expert Specialized Fine-Tuning

    Python · 705 stars · 259 forks

  3. CoE Public

    Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models

    Python · 220 stars · 27 forks

  4. xingyaoww/mint-bench Public

    Official Repo for ICLR 2024 paper MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback by Xingyao Wang*, Zihan Wang*, Jiateng Liu, Yangyi Chen, Lifan Yuan, Hao Peng and …

    Python · 130 stars · 8 forks

  5. mll-lab-nu/TStar Public

    TStar is a unified temporal search framework for long-form video question answering

    Python · 68 stars · 1 fork

  6. yeruimeng/TraTree Public

    Trajectory optimization methods for improving LLM agents via weak-to-strong learning.

    Python · 3 stars · 1 fork