🐕
Working on something

Organizations

@buaase @sail-sg @MLNLP-World @sea-sailor


A benchmark for LLMs on complicated tasks in the terminal

Python 1,238 438 Updated Dec 20, 2025
Python 71 3 Updated Nov 17, 2025

verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in-Group Policy Optimization for LLM Agent Training"

Python 1,298 116 Updated Dec 11, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 21,885 3,825 Updated Dec 22, 2025

Laser: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping

Python 61 4 Updated May 22, 2025

Defeating the Training-Inference Mismatch via FP16

Python 165 13 Updated Nov 14, 2025

slime is an LLM post-training framework for RL Scaling.

Python 2,928 353 Updated Dec 22, 2025

The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution

Python 187 14 Updated Dec 16, 2025

Open-source Trading OS with pluggable AI brain | From market data → AI reasoning → Trade execution | Self-hosted & Multi-exchange

Go 9,200 2,393 Updated Dec 21, 2025

MiniMax-M2, a model built for Max coding & agentic workflows.

2,053 156 Updated Nov 13, 2025

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Python 56 3 Updated Oct 13, 2025

Post-training with Tinker

Python 2,595 253 Updated Dec 20, 2025

The official GitHub repo for "Training Optimal Large Diffusion Language Models", the first-ever large-scale scaling law for diffusion language models.

Python 45 1 Updated Nov 6, 2025

All-in-One Sandbox for AI Agents that combines Browser, Shell, File, MCP and VSCode Server in a single Docker container.

Python 1,752 150 Updated Dec 16, 2025

Checkpoint-engine is a simple middleware to update model weights in LLM inference engines

Python 868 72 Updated Dec 22, 2025

Official Code for "Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search"

Python 377 15 Updated Sep 15, 2025
Python 50 5 Updated Jun 7, 2025

End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Python 341 20 Updated Sep 22, 2025

[NeurIPS 2025] The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond

Python 188 21 Updated Jul 7, 2025
Python 48 7 Updated Aug 21, 2025

GLM-4.6V/4.5V/4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Python 2,066 140 Updated Dec 18, 2025

SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution

Python 101 5 Updated Sep 24, 2025

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,448 1,999 Updated Nov 1, 2025

Renderer for the harmony response format to be used with gpt-oss

Rust 4,083 240 Updated Dec 15, 2025

The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores >74% on SWE-bench verified!

Python 2,347 299 Updated Dec 21, 2025

SkyRL: A Modular Full-stack RL Library for LLMs

Python 1,394 204 Updated Dec 20, 2025

The absolute trainer to light up AI agents.

Python 9,776 790 Updated Dec 22, 2025

Qwen Code is a coding agent that lives in the digital world.

TypeScript 16,626 1,422 Updated Dec 22, 2025
Python 44 8 Updated Oct 28, 2025