  • chengdu.CHINA

Starred repositories

Vision-OPD is a regional-to-global on-policy self-distillation framework that transfers a model's own privileged crop-conditioned perception to its full-image policy, enabling fine-grained visual u…

Python 118 3 Updated Jun 14, 2026

【ACL 2026】LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety

Python 8 1 Updated May 30, 2026

We introduce 'Thinking with Video', a new paradigm leveraging video generation for multimodal reasoning. Our VideoThinkBench shows that Sora-2 surpasses GPT5 by 10% on eyeballing puzzles and reache…

Python 306 6 Updated Jun 3, 2026

Code for CVPR 2026 paper "MSRL: Scaling Generative Multimodal Reward Modeling via Multi-Stage Reinforcement Learning"

Python 10 Updated Mar 27, 2026

[CVPR 2025] DEIM: DETR with Improved Matching for Fast Convergence

Python 1,550 200 Updated Mar 24, 2026

[ICCV 2023] DETRs with Collaborative Hybrid Assignments Training

Python 1,348 177 Updated Dec 29, 2024

Andrej Karpathy's cognitive operating system. Not a collection of quotes, but a runnable thinking framework. Made with 女娲.skill

231 70 Updated May 28, 2026

LLM Wiki is a cross-platform desktop application that turns your documents into an organized, interlinked knowledge base — automatically. Instead of traditional RAG (retrieve-and-answer from scratc…

TypeScript 11,591 1,411 Updated Jun 14, 2026
Python 70 3 Updated May 8, 2026

A single CLAUDE.md file to improve Claude Code behavior, derived from Andrej Karpathy's observations on LLM coding pitfalls.

176,187 17,989 Updated Apr 20, 2026

ICM-Assistant: Instruction-tuning Multimodal Large Language Models for Rule-based Explainable Image Content Moderation. AAAI, 2025

Python 16 2 Updated Aug 25, 2025

Clone of DeepSeek Thinking-with-Visual-Primitives

Makefile 139 109 Updated Apr 30, 2026
Python 9 7 Updated Dec 22, 2025

ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors [EMNLP 2024 Findings]

Python 231 10 Updated Sep 29, 2024

The evaluation benchmark on MCP servers

Python 247 16 Updated Sep 3, 2025

Storybook plugin for Roblox UI

Luau 120 10 Updated Jun 16, 2026

Streaming Thinking for VideoLLM Streaming Video Understanding

Python 105 1 Updated May 21, 2026
TypeScript 1 Updated Jun 4, 2026

AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension

Python 131 5 Updated Dec 9, 2024

An agentic framework for omni-modal question-answer tasks.

Python 13 Updated Mar 30, 2026
Python 1,383 113 Updated Feb 12, 2026

Fully autonomous & self-evolving research from idea to paper. Chat an Idea. Get a Paper. 🦞

Python 13,427 1,574 Updated Jun 3, 2026
Python 282 12 Updated Mar 4, 2026

Keep tabs on your tabs. Turn your "New tabs" page into a mission control, so you can close them easily. Built for people who open too many tabs and never close them.

JavaScript 1,478 426 Updated Apr 14, 2026

The agent that grows with you

Python 194,494 34,102 Updated Jun 16, 2026

AI coding assistant skill (Claude Code, Codex, OpenCode, Cursor, Gemini CLI, and more). Turn any folder of code, SQL schemas, R scripts, shell scripts, docs, papers, images, or videos into a querya…

Python 67,712 6,848 Updated Jun 16, 2026

reverse engineering Gemini's SynthID detection

Python 4,383 476 Updated Apr 29, 2026

Video dataset dedicated to portrait-mode video recognition.

Python 58 1 Updated Oct 13, 2025