Ray-ui

Tianyu Zhang Ray-ui

3 followers · 3 following

Stars

mbzuai-oryx / VideoGPT-plus

Official repository of the paper "VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding"

Python 293 20 Updated Aug 5, 2025

mbzuai-oryx / Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted fo…

Python 1,490 127 Updated Aug 5, 2025

activitynet / ActivityNet

This repository is intended to host tools and demos for ActivityNet

Jupyter Notebook 967 328 Updated Mar 21, 2024

mystorm16 / FastVGGT

[ICLR 2026] FastVGGT: Fast Visual Geometry Transformer

Python 677 39 Updated Jan 28, 2026

TIGER-AI-Lab / VLM2Vec

This repo contains the code for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" [ICLR 2025]

Python 572 50 Updated Feb 11, 2026

jiyanggao / TALL

TALL: Temporal Activity Localization via Language Query

Python 217 49 Updated Mar 15, 2018

OptimusPrimus / tacos

Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining

Python 14 1 Updated Oct 12, 2025

facebookresearch / vjepa2

PyTorch code and models for VJEPA2 self-supervised learning from video.

Python 3,003 338 Updated Aug 28, 2025

showlab / UniVTG

[ICCV 2023] UniVTG: Towards Unified Video-Language Temporal Grounding

Python 374 34 Updated May 8, 2024

NVlabs / VILA

VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.

Python 3,757 313 Updated Nov 28, 2025

google-deepmind / perception_test

Jupyter Notebook 242 14 Updated Jun 4, 2025

doc-doc / NExT-QA

NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)

Python 184 16 Updated Aug 2, 2025

llyx97 / TempCompass

[ACL 2024 Findings] "TempCompass: Do Video LLMs Really Understand Videos?", Yuanxin Liu, Shicheng Li, Yi Liu, Yuxiang Wang, Shuhuai Ren, Lei Li, Sishuo Chen, Xu Sun, Lu Hou

Python 128 4 Updated Apr 4, 2025

Luodian / nano-hevc

A minimal, educational HEVC (H.265) encoder written in Python.

Python 40 1 Updated Feb 10, 2026

openclaw / openclaw

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 201,473 36,068 Updated Feb 16, 2026

facebookresearch / TimeSformer

The official pytorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"

Python 1,824 245 Updated Apr 9, 2024

facebookresearch / Action100M

A Large-scale Video Action Dataset

Python 400 11 Updated Jan 16, 2026

NJU-LINK / MVU-Eval

Python 13 2 Updated Nov 11, 2025

yunlong10 / Awesome-LLMs-for-Video-Understanding

🔥🔥🔥 [IEEE TCSVT] Latest Papers, Codes and Datasets on Vid-LLMs.

3,077 139 Updated Dec 20, 2025

EvolvingLMMs-Lab / OneVision-Encoder

Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence

Python 225 5 Updated Feb 13, 2026

QwenLM / Qwen3-VL-Embedding

Python 1,016 75 Updated Feb 2, 2026

rasbt / LLMs-from-scratch

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 85,386 12,920 Updated Feb 9, 2026

QwenLM / Qwen3-VL

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 18,303 1,596 Updated Jan 30, 2026

luhengshiwo / LLMForEverybody

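LLM knowledge sharing that everyone can understand; a must-read before LLM interviews in the spring/autumn campus recruiting seasons, so you can talk confidently with your interviewer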

Jupyter Notebook 5,504 531 Updated Feb 5, 2026

datawhalechina / so-large-lm

Foundations of Large Models: learn the basics of large models in one article

6,755 569 Updated Dec 18, 2025

deepseek-ai / DeepSeek-MoE

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Python 1,893 301 Updated Jan 16, 2024

gofish2020 / algorithm-go

Let's work through the CodeTop problems together

40 3 Updated Oct 13, 2025

youngyangyang04 / leetcode-master

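《代码随想录》 (Code Thoughts) LeetCode practice guide: a recommended solving order for 200 classic problems, 600,000 words of detailed illustrated explanations, video breakdowns of the difficult points, more than 50 mind maps, and versions in C++, Java, Python, Go, JavaScript, and other languages, so that studying algorithms is no longer confusing! 🔥🔥 Take a look; you will wish you had found it sooner! 🚀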

Shell 60,338 12,338 Updated Jan 27, 2026

XiaoMi / xiaomi-miloco

Xiaomi Miloco

Python 2,336 159 Updated Feb 14, 2026

alonj / Same-Task-More-Tokens

The code for the paper: "Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models"

Jupyter Notebook 56 4 Updated Oct 24, 2025