
Train the smallest LM you can that fits in 16MB. Best model wins!

Python 4,980 3,320 Updated Apr 28, 2026

🌐 Make websites accessible for AI agents. Automate tasks online with ease.

Python 91,058 10,375 Updated Apr 26, 2026

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 366,030 75,046 Updated Apr 29, 2026

Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…

Shell 118,979 19,750 Updated Apr 29, 2026

The LLM Evaluation Framework

Python 15,049 1,390 Updated Apr 28, 2026

Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning

Python 1,395 93 Updated Apr 27, 2026
Python 345 18 Updated May 24, 2025

Abseil Common Libraries (C++)

C++ 17,223 3,006 Updated Apr 29, 2026

Benchmarking long-form factuality in large language models. Original code for our paper "Long-form factuality in large language models".

Python 684 83 Updated Apr 18, 2026
Python 4 1 Updated Nov 20, 2025

A lightweight suffix-sorting library

C 406 93 Updated Mar 25, 2020

High-Performance Text Deduplication Toolkit

C++ 62 3 Updated Aug 25, 2025

Tooling for exact and MinHash deduplication of large-scale text datasets

Rust 81 8 Updated Mar 24, 2026

[CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation

Python 1,692 169 Updated Apr 20, 2026

PyTorch-native post-training at scale

Python 675 98 Updated Apr 28, 2026

Fast State-of-the-Art Static Embeddings

Python 2,040 121 Updated Apr 21, 2026

🚀🚀 Train a small 64M-parameter GPT completely from scratch in just 2 hours! 🌏

Python 48,525 6,125 Updated Apr 28, 2026

[NeurIPS'25] Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution"

Python 692 59 Updated Mar 16, 2025

Train transformer language models with reinforcement learning.

Python 18,199 2,675 Updated Apr 29, 2026

Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.

Python 3,024 257 Updated Apr 20, 2026

AI memory OS for LLM and Agent systems (moltbot, clawdbot, openclaw), enabling persistent Skill memory for cross-task skill reuse and evolution.

TypeScript 8,791 789 Updated Apr 29, 2026

The true story of the hardship and darkness behind the development of the Noah Pangu large model.

11,415 1,324 Updated Jul 9, 2025

Fast Multimodal Semantic Deduplication & Filtering

Python 916 56 Updated Jan 20, 2026

A PyTorch native platform for training generative AI models

Python 5,279 801 Updated Apr 29, 2026

[ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning

Python 189 16 Updated Jun 25, 2025

🔥 The API to search, scrape, and interact with the web for AI

TypeScript 112,947 7,196 Updated Apr 28, 2026

🛏 An HTML to Markdown converter written in JavaScript

HTML 11,120 981 Updated Apr 3, 2026