33 results for source starred repositories

Official implementation of GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Python 376 21 Updated Jan 9, 2026

GeometryZero: Improving Geometry Solving for LLM with Group Contrastive Policy Optimization

Python 9 Updated Sep 1, 2025

The official code for Fact-R1: Towards Explainable Video Misinformation Detection with Deep Reasoning

Python 41 Updated Nov 26, 2025

[AAAI 2026] ✨ TSPO: Temporal Sampling Policy Optimization for Long-form Video Language Understanding

Python 117 8 Updated Nov 12, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 4,571 354 Updated Jan 29, 2026

Official codes of "Monet: Reasoning in Latent Visual Space Beyond Image and Language"

Python 126 2 Updated Feb 4, 2026

【AAAI 2026】GenVidBench: A 6-Million Benchmark for AI-Generated Video Detection

Python 62 2 Updated Dec 26, 2025

Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens (arXiv 2025)

Python 240 16 Updated Aug 2, 2025
Python 63 2 Updated Feb 1, 2026

Sparking "Thinking with Videos" via Reinforcement Learning

Python 143 6 Updated Oct 30, 2025

MiniMax-M2, a model built for Max coding & agentic workflows.

2,356 187 Updated Nov 13, 2025

[ICLR 2026] The official repository for paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning"

Jupyter Notebook 143 3 Updated Jan 26, 2026

Open-source unified multimodal model

Python 5,641 501 Updated Oct 27, 2025

This is the official repository for the paper "MathCanvas: Intrinsic Visual Chain-of-Thought for Multimodal Mathematical Reasoning"

Python 59 Updated Dec 29, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 19,051 3,205 Updated Feb 6, 2026

🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning

Python 315 20 Updated Jan 3, 2026

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL)

Python 8,965 873 Updated Feb 6, 2026

Official Repository for "FakingRecipe: Detecting Fake News on Short Video Platforms from the Perspective of Creative Process", ACM MM 2024

Python 61 5 Updated Oct 5, 2025

Multi-agent collaboration framework

Python 1,893 272 Updated Feb 4, 2026

Build multi-agent systems that learn and improve with every interaction.

Python 37,643 4,984 Updated Feb 8, 2026

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.

Python 51,724 4,271 Updated Feb 7, 2026

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …

Python 12,584 1,195 Updated Feb 7, 2026

[NeurIPS 2025] VideoRFT: Incentivizing Video Reasoning Capability in MLLMs via Reinforced Fine-Tuning

Python 63 2 Updated Jan 6, 2026

Witness the aha moment of VLM with less than $3.

Python 4,029 287 Updated May 19, 2025

Links to conference/journal publications in automated fact-checking (resources for the TACL22/EMNLP23 paper).

555 60 Updated Feb 23, 2025

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 18,192 1,582 Updated Jan 30, 2026

[NeurIPS 2025] MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning

Python 99 4 Updated Sep 19, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 67,020 8,144 Updated Feb 4, 2026

This repository provides valuable reference for researchers in the field of multimodality, please start your exploratory travel in RL-based Reasoning MLLMs!

1,349 60 Updated Dec 7, 2025

Notes on the course "Dive into Deep Learning" by Mu Li

Jupyter Notebook 3,746 592 Updated Apr 11, 2023