Organizations

@VQAssessment @Q-Future


Arena-Hard-Auto: An automatic LLM benchmark.

Python 972 137 Updated Jun 21, 2025

The official repository of the dots.vlm1 instruct models proposed by rednote-hilab.

Dockerfile 276 7 Updated Sep 26, 2025

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,438 1,991 Updated Nov 1, 2025

Kimi K2 is the large language model series developed by Moonshot AI team

9,734 703 Updated Nov 7, 2025

The true story of the hardship and darkness behind the development of the Noah Pangu large model.

11,371 1,348 Updated Jul 9, 2025

An open-source AI agent that brings the power of Gemini directly into your terminal.

TypeScript 87,998 10,077 Updated Dec 19, 2025

open-source coding LLM for software engineering tasks

Python 1,070 127 Updated Sep 30, 2025

A benchmark for evaluating vision-centric, complex video reasoning.

Python 35 1 Updated Aug 26, 2025

Muon is Scalable for LLM Training

1,384 78 Updated Aug 3, 2025

Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities

1,129 69 Updated Jul 15, 2025

A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)

Python 3,009 220 Updated Nov 17, 2025

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

Python 66,538 9,521 Updated Dec 16, 2025

PyTorch code for our paper "Grounding-IQA: Grounding Multimodal Language Model for Image Quality Assessment"

54 2 Updated Oct 5, 2025
Python 35 Updated Nov 8, 2024

[ACMMM2025] Official released code for VQA² series models

Python 60 2 Updated Oct 19, 2025
Python 105 2 Updated Dec 30, 2024

[CVPR 2025] Official Dataloader and Evaluation Scripts for VideoAutoArena.

Python 5 3 Updated Nov 29, 2024

[CVPR 2025] Official Dataloader and Evaluation Scripts for VideoAutoBench.

Python 11 1 Updated Nov 28, 2024

MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts

Jupyter Notebook 348 51 Updated Sep 29, 2025

[CVPR 2025 paper with perfect review scores, Ratings: 555]

36 Updated May 9, 2025

[ACL 2024 Findings] "TempCompass: Do Video LLMs Really Understand Videos?", Yuanxin Liu, Shicheng Li, Yi Liu, Yuxiang Wang, Shuhuai Ren, Lei Li, Sishuo Chen, Xu Sun, Lu Hou

Python 126 4 Updated Apr 4, 2025

[NeurIPS 2025] EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation

Python 126 Updated Jul 31, 2025

A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability

104 Updated Nov 28, 2024

VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs

Python 52 1 Updated Mar 9, 2025

🔥🔥MLVU: Multi-task Long Video Understanding Benchmark

Python 237 5 Updated Aug 21, 2025

PyTorch code for our paper "Dog-IQA: Standard-guided Zero-shot MLLM for Mix-grain Image Quality Assessment"

26 Updated Oct 7, 2024

Codebase for Aria - an Open Multimodal Native MoE

Jupyter Notebook 1,085 91 Updated Jan 22, 2025

Code for DeCo: Decoupling token compression from semantic abstraction in multimodal large language models

Python 75 3 Updated Jul 14, 2025

[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning

Python 663 61 Updated Jun 1, 2024