A lightweight, AI-native training framework for large language models. Designed for fast iteration, reproducible experiments, and modular configuration across SFT, RLVR, and evaluation workflows.

Python 575 43 Updated May 18, 2026

BIT-DataLab / Edit-Banana

Edit Banana: A framework for converting statistical formats into editable formats.

Python 5,361 363 Updated Jun 23, 2026

blader / humanizer

Claude Code skill that removes signs of AI-generated writing from text

25,751 2,428 Updated Jun 7, 2026

Leey21 / awesome-ai-research-writing

Elevate your AI research writing, no more tedious polishing ✨

29,324 2,257 Updated May 18, 2026

Orchestra-Research / AI-Research-SKILLs

Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your claude code/codex/gemini agent will be an AI research agent with full horsepowe…

TeX 10,002 745 Updated Jun 16, 2026

MurrayTom / ToolSafe

Official Implementation of "ToolSafe: Enhancing Tool Invocation Safety of LLM-based Agents via Proactive Step-level Guardrail and Feedback"

Python 67 5 Updated Mar 25, 2026

agentscope-ai / OpenJudge

OpenJudge: A Unified Framework for Holistic Evaluation and Quality Rewards

Python 679 57 Updated Jun 17, 2026

OpenDCAI / Paper2Any

Turn paper/text/topic into editable research figures, technical route diagrams, and presentation slides.

Python 2,641 184 Updated Jun 17, 2026

opendatalab / MinerU-HTML

MinerU-HTML: An SLM-powered HTML main content extractor that outputs clean HTML bodies. Perfect for Deep Research Agents, RAG applications, and training data generation.

Python 262 26 Updated Mar 27, 2026

llm2014 / llm_benchmark

1,370 14 Updated Jun 23, 2026

MigoXLab / dingo

Dingo: A Comprehensive AI Data, Model and Application Quality Evaluation Tool

Python 717 74 Updated Jun 18, 2026

OpenMOSS / Llamascopium

Performant framework for training, analyzing and visualizing Sparse Autoencoders (SAEs) and their frontier variants.

Python 222 29 Updated Jun 23, 2026

szn-nzs / ElGamal

C++ 1 Updated Oct 17, 2025

JackHCC / PKU-Lessons-Summary

Summary of knowledge points, assignments, and related materials for master's courses at the School of Software and Microelectronics, Peking University

169 22 Updated Apr 19, 2022

opendatalab / Meta-rater

[ACL 2025 Best Theme Paper] This is the official implementation for the paper: "Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models"

Python 195 15 Updated Aug 29, 2025

OpenDataArena / OpenDataArena-Tool

Tools for OpenDataArena: Fair, Open, and Transparent Arena for Data

Python 144 16 Updated Mar 15, 2026

decoderesearch / SAELens

Training Sparse Autoencoders on Language Models

Python 1,435 241 Updated Jun 23, 2026

THU-KEG / RM-Bench

[ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style

Python 83 3 Updated Jul 18, 2025

oslook / cursor-ai-downloads

All Cursor AI's official download links for both the latest and older versions, making it easy for you to update, downgrade, and choose any version. 🚀

TypeScript 3,205 173 Updated Jun 23, 2026

google-gemini / gemini-cli

An open-source AI agent that brings the power of Gemini directly into your terminal.

TypeScript 105,511 14,153 Updated Jun 23, 2026

textstat / textstat

📝 python package to calculate readability statistics of a text object - paragraphs, sentences, articles.

Python 1,373 183 Updated Feb 18, 2026

jxmorris12 / language_tool_python

a free, non-AI python grammar checker 📝✅

Python 524 72 Updated Jun 22, 2026

NVIDIA-NeMo / Curator

Scalable data preprocessing and curation toolkit for LLMs

Python 1,629 289 Updated Jun 23, 2026

Sun Mengyuan MySunX

Stars

Leey21 / data-lineage

planepig / rubricbench

Leey21 / arxiv-translator

aiming-lab / AutoResearchClaw

claw-eval / claw-eval

petergpt / bullshit-benchmark

facebookresearch / AbstentionBench

stepfun-ai / SteptronOss