Ethylyikes

Stars

33 results for source starred repositories

NVlabs / GDPO

Official implementation of GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Python 376 21 Updated Jan 9, 2026

ekonwang / GeometryZero

GeometryZero: Improving Geometry Solving for LLM with Group Contrastive Policy Optimization

Python 9 Updated Sep 1, 2025

zfr00 / Fact-R1

The official code for Fact-R1: Towards Explainable Video Misinformation Detection with Deep Reasoning

Python 41 Updated Nov 26, 2025

Hui-design / TSPO

[AAAI 2026] ✨ TSPO: Temporal Sampling Policy Optimization for Long-form Video Language Understanding

Python 117 8 Updated Nov 12, 2025

hiyouga / EasyR1

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 4,571 354 Updated Jan 29, 2026

NOVAglow646 / Monet

Official code for "Monet: Reasoning in Latent Visual Space Beyond Image and Language"

Python 126 2 Updated Feb 4, 2026

genvidbench / GenVidBench

【AAAI 2026】GenVidBench: A 6-Million Benchmark for AI-Generated Video Detection

Python 62 2 Updated Dec 26, 2025

UMass-Embodied-AGI / Mirage

Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens (arXiv 2025)

Python 240 16 Updated Aug 2, 2025

hwanyu112 / Latent-Sketchpad

Python 63 2 Updated Feb 1, 2026

shijian2001 / Video-Thinker

Sparking "Thinking with Videos" via Reinforcement Learning

Python 143 6 Updated Oct 30, 2025

MiniMax-AI / MiniMax-M2

MiniMax-M2, a model built for Max coding & agentic workflows.

2,356 187 Updated Nov 13, 2025

ThinkMorph / ThinkMorph

[ICLR 2026] The official repository for paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning"

Jupyter Notebook 143 3 Updated Jan 26, 2026

ByteDance-Seed / Bagel

Open-source unified multimodal model

Python 5,641 501 Updated Oct 27, 2025

shiwk24 / MathCanvas

This is the official repository for the paper "MathCanvas: Intrinsic Visual Chain-of-Thought for Multimodal Mathematical Reasoning"

Python 59 Updated Dec 29, 2025

verl-project / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 19,051 3,205 Updated Feb 6, 2026

RUC-NLPIR / Tool-Star

🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning

Python 315 20 Updated Jan 3, 2026

OpenRLHF / OpenRLHF

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL)

Python 8,965 873 Updated Feb 6, 2026

ICTMCG / FakingRecipe

Official Repository for "FakingRecipe: Detecting Fake News on Short Video Platforms from the Perspective of Creative Process", ACM MM 2024

Python 61 5 Updated Oct 5, 2025

jd-opensource / OxyGent

Multi-agent collaboration framework

Python 1,893 272 Updated Feb 4, 2026

agno-agi / agno

Build multi-agent systems that learn and improve with every interaction.

Python 37,643 4,984 Updated Feb 8, 2026

unslothai / unsloth

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.

Python 51,724 4,271 Updated Feb 7, 2026

modelscope / ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …

Python 12,584 1,195 Updated Feb 7, 2026

QiWang98 / VideoRFT

[NeurIPS 2025] VideoRFT: Incentivizing Video Reasoning Capability in MLLMs via Reinforced Fine-Tuning

Python 63 2 Updated Jan 6, 2026

StarsfieldAI / R1-V

Witness the aha moment of VLM with less than $3.

Python 4,029 287 Updated May 19, 2025

Cartus / Automated-Fact-Checking-Resources

Links to conference/journal publications in automated fact-checking (resources for the TACL22/EMNLP23 paper).

555 60 Updated Feb 23, 2025

QwenLM / Qwen3-VL

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 18,192 1,582 Updated Jan 30, 2026

xinyan-cxy / MINT-CoT

[NeurIPS 2025] MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning

Python 99 4 Updated Sep 19, 2025

hiyouga / LlamaFactory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 67,020 8,144 Updated Feb 4, 2026

Sun-Haoyuan23 / Awesome-RL-based-Reasoning-MLLMs

This repository provides a valuable reference for researchers in the field of multimodality. Start your exploration of RL-based reasoning MLLMs here!

1,349 60 Updated Dec 7, 2025

MLNLP-World / DeepLearning-MuLi-Notes

Notes about courses Dive into Deep Learning by Mu Li

Jupyter Notebook 3,746 592 Updated Apr 11, 2023
