Gavinic

Richard Tseng Gavinic

Master from UESTC

15 followers · 31 following

Chengdu, China

Starred repositories

VisionOPD / Vision-OPD

Vision-OPD is a regional-to-global on-policy self-distillation framework that transfers a model's own privileged crop-conditioned perception to its full-image policy, enabling fine-grained visual u…

Python 118 3 Updated Jun 14, 2026

thu-coai / lasa-multilingual-safety

[ACL 2026] LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety

Python 8 1 Updated May 30, 2026

tongjingqi / Thinking-with-Video

We introduce 'Thinking with Video', a new paradigm leveraging video generation for multimodal reasoning. Our VideoThinkBench shows that Sora-2 surpasses GPT5 by 10% on eyeballing puzzles and reache…

Python 306 6 Updated Jun 3, 2026

wangclnlp / MSRL

Code for CVPR 2026 paper "MSRL: Scaling Generative Multimodal Reward Modeling via Multi-Stage Reinforcement Learning"

Python 10 Updated Mar 27, 2026

Intellindust-AI-Lab / DEIM

[CVPR 2025] DEIM: DETR with Improved Matching for Fast Convergence

Python 1,550 200 Updated Mar 24, 2026

Sense-X / Co-DETR

[ICCV 2023] DETRs with Collaborative Hybrid Assignments Training

Python 1,348 177 Updated Dec 29, 2024

alchaincyf / karpathy-skill

Andrej Karpathy's cognitive operating system. Not a collection of quotes, but a runnable thinking framework. Made with 女娲.skill

231 70 Updated May 28, 2026

nashsu / llm_wiki

LLM Wiki is a cross-platform desktop application that turns your documents into an organized, interlinked knowledge base — automatically. Instead of traditional RAG (retrieve-and-answer from scratc…

TypeScript 11,591 1,411 Updated Jun 14, 2026

KnowledgeXLab / EvolveR

Python 70 3 Updated May 8, 2026

EinsiaLab / Frontier-Engineering

Python 91 16 Updated May 24, 2026

multica-ai / andrej-karpathy-skills

A single CLAUDE.md file to improve Claude Code behavior, derived from Andrej Karpathy's observations on LLM coding pitfalls.

176,187 17,989 Updated Apr 20, 2026

zhaoyuzhi / ICM-Assistant

ICM-Assistant: Instruction-tuning Multimodal Large Language Models for Rule-based Explainable Image Content Moderation. AAAI, 2025

Python 16 2 Updated Aug 25, 2025

mitkox / Thinking-with-Visual-Primitives

Clone of DeepSeek Thinking-with-Visual-Primitives

Makefile 139 109 Updated Apr 30, 2026

Tele-EVOL / Aetheria

Python 9 7 Updated Dec 22, 2025

thu-coai / ShieldLM

ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors [EMNLP 2024 Findings]

Python 231 10 Updated Sep 29, 2024

modelscope / MCPBench

The evaluation benchmark on MCP servers

Python 247 16 Updated Sep 3, 2025

flipbook-labs / flipbook

Storybook plugin for Roblox UI

Luau 120 10 Updated Jun 16, 2026

1ranGuan / VST

Streaming Thinking for VideoLLM Streaming Video Understanding

Python 105 1 Updated May 21, 2026

catatsumuri / thinkstream

TypeScript 1 Updated Jun 4, 2026

OFA-Sys / AIR-Bench

AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension

Python 131 5 Updated Dec 9, 2024

longvideoagent / LongVideoAgent

Python 116 5 Updated Apr 8, 2026

jackefn / OmniRAG-Agent

An agentic framework for omni-modal question-answer tasks.

Python 13 Updated Mar 30, 2026

ByteDance-Seed / m3-agent

Python 1,383 113 Updated Feb 12, 2026

aiming-lab / AutoResearchClaw

Fully autonomous & self-evolving research from idea to paper. Chat an Idea. Get a Paper. 🦞

Python 13,427 1,574 Updated Jun 3, 2026

FireRedTeam / FireRed-OCR

Python 282 12 Updated Mar 4, 2026

zarazhangrui / tab-out

Keep tabs on your tabs. Turn your "New tabs" page into a mission control, so you can close them easily. Built for people who open too many tabs and never close them.

JavaScript 1,478 426 Updated Apr 14, 2026

NousResearch / hermes-agent

The agent that grows with you

Python 194,494 34,102 Updated Jun 16, 2026

safishamsi / graphify

AI coding assistant skill (Claude Code, Codex, OpenCode, Cursor, Gemini CLI, and more). Turn any folder of code, SQL schemas, R scripts, shell scripts, docs, papers, images, or videos into a querya…

Python 67,712 6,848 Updated Jun 16, 2026

aloshdenny / reverse-SynthID

Reverse engineering Gemini's SynthID detection

Python 4,383 476 Updated Apr 29, 2026

bytedance / Portrait-Mode-Video

Video dataset dedicated to portrait-mode video recognition.

Python 58 1 Updated Oct 13, 2025

chinese-fonts

document-layout-analysis

React

traffic-sign-recognition

traffic-sign-detection