Stars
A configuration framework that enhances Claude Code with specialized commands, cognitive personas, and development methodologies.
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) across 100+ datasets.
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance
A lightweight framework for building LLM-based agents
MoBA: Mixture of Block Attention for Long-Context LLMs
Deep learning with cats (^._.^)
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
[ACL 2024] Official GitHub repo for OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems.
Megvii FILE Library - Work with files in Python the same way as with the standard library
[NIPS 2025 DB Oral] Official repository of the paper: Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing
The official implementation of "Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks"
An object detection codebase based on MegEngine.
A multi-language code evaluation tool.