si0wang

Xiyao Wang si0wang

15 followers · 1 following

University of Maryland, College Park
https://si0wang.github.io/

Stars

tyxiong23 / Multi-Crit

Python 14 1 Updated Feb 21, 2026

si0wang / ViCrit

Python 24 1 Updated Jun 18, 2025

morse-benchmark / morse-500

Jupyter Notebook 31 3 Updated Feb 26, 2026

AnsonZnl / RehabilitationGuide

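A rehabilitation guide for cervical spondylosis and lumbar disc herniation, offering programmers simple and reliable recovery guidance.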

Python 3,425 218 Updated Dec 25, 2023

si0wang / ThinkLite-VL

Python 107 6 Updated Jun 10, 2025

2U1 / Qwen-VL-Series-Finetune

An open-source implementation for fine-tuning the Qwen-VL series by Alibaba Cloud.

Python 1,817 210 Updated Apr 10, 2026

OpenRLHF / OpenRLHF

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)

Python 9,372 920 Updated Apr 18, 2026

EvolvingLMMs-Lab / open-r1-multimodal

A fork to add multimodal model training to open-r1

Python 1,528 72 Updated Feb 8, 2025

hkust-nlp / simpleRL-reason

Simple RL training for reasoning

Python 3,847 289 Updated Dec 23, 2025

si0wang / VisVM

Python 48 5 Updated Dec 30, 2024

Julia-LiuJ / NLFT

The official implementation of Natural Language Fine-Tuning

Python 54 4 Updated Jan 7, 2025

HJYao00 / Mulberry

[NIPS'25 Spotlight] Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS

Python 1,243 113 Updated Jan 16, 2026

tianyi-lab / Cherry_LLM

[NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other models

Python 413 27 Updated Jun 25, 2025

JiuhaiChen / CVPR2025-Florence-VL

Python 245 10 Updated Dec 7, 2024

hiyouga / LlamaFactory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 70,270 8,600 Updated Apr 12, 2026

YuxiXie / MCTS-DPO

This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO.

Jupyter Notebook 329 36 Updated Jan 29, 2026

umd-huang-lab / SIMA

Forked from si0wang/SIMA

Python 9 Updated Apr 30, 2025

shenmishajing / toy_project

Python 8 1 Updated Jul 1, 2024

umd-huang-lab / Mementos

Forked from si0wang/Mementos

Jupyter Notebook 32 Updated Feb 8, 2024

si0wang / COPlanner

Python 23 2 Updated Apr 2, 2024

si0wang / Mementos

Jupyter Notebook 7 1 Updated Feb 28, 2024

weipu-zhang / STORM

Python 132 23 Updated Mar 18, 2026

burchim / DreamerV3-PyTorch

PyTorch implementation of DreamerV3, Mastering Diverse Domains through World Models.

Python 11 2 Updated Feb 16, 2024

apache / singa

A distributed deep learning platform

C++ 3,606 1,270 Updated Mar 23, 2026

kngwyu / mujoco-maze

Simple maze environments using mujoco-py

Python 60 12 Updated Dec 27, 2023

NM512 / dreamerv3-torch

Implementation of Dreamer v3 in PyTorch.

Python 836 213 Updated Mar 8, 2026

ARISE-Initiative / robosuite

robosuite: A Modular Simulation Framework and Benchmark for Robot Learning

Python 2,374 700 Updated Mar 3, 2026

opendilab / awesome-model-based-RL

A curated list of awesome model based RL resources (continually updated)

1,334 76 Updated Dec 20, 2025

schroederdewitt / multiagent_mujoco

Benchmark for Continuous Multi-Agent Robotic Control, based on OpenAI's Mujoco Gym environments.

Python 371 35 Updated Mar 16, 2023
