Hongyu Xiang JacksonRed

🎯

Focusing

15 followers · 263 following

Stars

502 stars written in Python

MCG-NJU / VideoMAE

[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

Python 1,604 157 Updated Dec 8, 2023

lyuchenyang / Macaw-LLM

Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration

Python 1,587 132 Updated Jan 1, 2025

GestaltCogTeam / BasicTS

A Fair and Scalable Time Series Forecasting Benchmark and Toolkit.

Python 1,535 187 Updated Nov 10, 2025

RLHFlow / RLHF-Reward-Modeling

Recipes to train reward model for RLHF.

Python 1,476 102 Updated Apr 24, 2025

tinyvision / SOLIDER

A Semantic Controllable Self-Supervised Learning Framework to learn general human representations from massive unlabeled human images, which can benefit downstream human-centric tasks to the maximu…

Python 1,463 233 Updated Jul 21, 2023

bytedance / Sa2VA

Official Repo For "Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos"

Python 1,396 97 Updated Nov 4, 2025

[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch implementation of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"

Python 1,361 84 Updated Jan 23, 2024

om-ai-lab / OmDet

Real-time and accurate open-vocabulary end-to-end object detection

Python 1,344 111 Updated Dec 18, 2024

fudan-generative-vision / hallo3

[CVPR 2025] Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer

Python 1,328 175 Updated Mar 13, 2025

lxtGH / OMG-Seg

Official Repo For OMG-LLaVA and OMG-Seg codebase [CVPR-24 and NeurIPS-24]

Python 1,327 53 Updated Oct 15, 2025

facebookresearch / uco3d

Uncommon Objects in 3D dataset

Python 1,304 181 Updated Mar 17, 2025

facebookresearch / vggsfm

VGGSfM: Visual Geometry Grounded Deep Structure From Motion

Python 1,301 106 Updated Mar 11, 2025

MasterBin-IIAU / UNINEXT

[CVPR'23] Universal Instance Perception as Object Discovery and Retrieval

Python 1,277 121 Updated Jul 18, 2023

Zefan-Cai / KVCache-Factory

Unified KV Cache Compression Methods for Auto-Regressive Models

Python 1,275 160 Updated Jan 4, 2025

longguikeji / arkid

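一账通 (ArkID) is an open-source unified identity authentication and authorization management solution. It supports multiple standard protocols (LDAP, OAuth2, SAML, OpenID), fine-grained permission control, full web-based administration, and integration with DingTalk, WeCom, and others. QQ group: 167885406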

Python 1,274 255 Updated Oct 4, 2023

WPeace-HcH / WPeChatGPT

A plugin for IDA that helps analyze binary files; it can be backed by commonly used large AI models such as OpenAI and DeepSeek.

Python 1,251 192 Updated Mar 28, 2025

zhouxr6066 / Res-SAM

Res-SAM Framework for GPR Underground Hazard Detection

Python 1,241 62 Updated Sep 23, 2025

HJYao00 / Mulberry

[NIPS'25 Spotlight] Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS

Python 1,224 110 Updated Sep 19, 2025

alibaba / Tora

[CVPR'25]Tora: Trajectory-oriented Diffusion Transformer for Video Generation

Python 1,208 56 Updated Jul 9, 2025

Replicable-MARL / MARLlib

One repository is all that is necessary for Multi-agent Reinforcement Learning (MARL)

Python 1,204 186 Updated Nov 28, 2024

shallowdream204 / DreamClear

[NeurIPS 2024] DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation

Python 1,184 50 Updated Mar 21, 2025

Simpleyyt / ai-manus

AI Manus is a general-purpose AI Agent system that supports running various tools and operations in a sandbox environment.

Python 1,165 274 Updated Nov 5, 2025

FoundationVision / GLEE

[CVPR2024 Highlight]GLEE: General Object Foundation Model for Images and Videos at Scale

Python 1,156 74 Updated Oct 21, 2024

Text-to-Audio / AudioLCM

PyTorch implementation of AudioLCM (ACM-MM'24): efficient and high-quality text-to-audio generation with a latent consistency model.

Python 1,154 158 Updated Jul 1, 2025

Zefan-Cai / R-KV

[NeurIPS 2025] R-KV: Redundancy-aware KV Cache Compression for Reasoning Models

Python 1,145 185 Updated Oct 16, 2025

PantoMatrix / PantoMatrix

PantoMatrix: Generating Face and Body Animation from Speech

Python 1,135 181 Updated Jan 16, 2025

RLinf / RLinf

RLinf is a flexible and scalable open-source infrastructure designed for post-training foundation models (LLMs, VLMs, VLAs) via reinforcement learning.

Python 1,133 107 Updated Nov 10, 2025

CyberAgentAILab / TANGO

[ICLR 2025 Oral] TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio-Motion Embedding and Diffusion Interpolation

Python 1,125 148 Updated Aug 24, 2025

rhymes-ai / Allegro

Allegro is a powerful text-to-video model that generates high-quality videos up to 6 seconds at 15 FPS and 720p resolution from simple text input.

Python 1,105 71 Updated Feb 7, 2025
