502 stars written in Python

[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

Python 1,604 157 Updated Dec 8, 2023

Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration

Python 1,587 132 Updated Jan 1, 2025

A Fair and Scalable Time Series Forecasting Benchmark and Toolkit.

Python 1,535 187 Updated Nov 10, 2025

Recipes to train reward model for RLHF.

Python 1,476 102 Updated Apr 24, 2025

A Semantic Controllable Self-Supervised Learning Framework to learn general human representations from massive unlabeled human images, which can benefit downstream human-centric tasks to the maximu…

Python 1,463 233 Updated Jul 21, 2023

Official Repo For "Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos"

Python 1,396 97 Updated Nov 4, 2025
Python 1,372 16 Updated Oct 9, 2024

[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch implementation of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"

Python 1,361 84 Updated Jan 23, 2024

Real-time and accurate open-vocabulary end-to-end object detection

Python 1,344 111 Updated Dec 18, 2024

[CVPR 2025] Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer

Python 1,328 175 Updated Mar 13, 2025

Official Repo For OMG-LLaVA and OMG-Seg codebase [CVPR-24 and NeurIPS-24]

Python 1,327 53 Updated Oct 15, 2025

Uncommon Objects in 3D dataset

Python 1,304 181 Updated Mar 17, 2025

VGGSfM: Visual Geometry Grounded Deep Structure From Motion

Python 1,301 106 Updated Mar 11, 2025

[CVPR'23] Universal Instance Perception as Object Discovery and Retrieval

Python 1,277 121 Updated Jul 18, 2023

Unified KV Cache Compression Methods for Auto-Regressive Models

Python 1,275 160 Updated Jan 4, 2025

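一账通 is an open-source unified identity authentication and authorization management solution. It supports multiple standard protocols (LDAP, OAuth2, SAML, OpenID), fine-grained permission control, full web-based management, and integration with DingTalk, WeCom (企业微信), and others. QQ group: 167885406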

Python 1,274 255 Updated Oct 4, 2023

A plugin for IDA that helps analyze binary files; it can be backed by commonly used large AI models such as those from OpenAI and DeepSeek.

Python 1,251 192 Updated Mar 28, 2025

Res-SAM Framework for GPR Underground Hazard Detection

Python 1,241 62 Updated Sep 23, 2025

[NeurIPS'25 Spotlight] Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS

Python 1,224 110 Updated Sep 19, 2025

[CVPR'25] Tora: Trajectory-oriented Diffusion Transformer for Video Generation

Python 1,208 56 Updated Jul 9, 2025

One repository is all that is necessary for Multi-agent Reinforcement Learning (MARL)

Python 1,204 186 Updated Nov 28, 2024

[NeurIPS 2024] DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation

Python 1,184 50 Updated Mar 21, 2025

AI Manus is a general-purpose AI Agent system that supports running various tools and operations in a sandbox environment.

Python 1,165 274 Updated Nov 5, 2025

[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale

Python 1,156 74 Updated Oct 21, 2024

PyTorch Implementation of AudioLCM (ACM-MM'24): efficient and high-quality text-to-audio generation with a latent consistency model.

Python 1,154 158 Updated Jul 1, 2025

[NeurIPS 2025] R-KV: Redundancy-aware KV Cache Compression for Reasoning Models

Python 1,145 185 Updated Oct 16, 2025

PantoMatrix: Generating Face and Body Animation from Speech

Python 1,135 181 Updated Jan 16, 2025

RLinf is a flexible and scalable open-source infrastructure designed for post-training foundation models (LLMs, VLMs, VLAs) via reinforcement learning.

Python 1,133 107 Updated Nov 10, 2025

[ICLR 2025 Oral] TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio-Motion Embedding and Diffusion Interpolation

Python 1,125 148 Updated Aug 24, 2025

Allegro is a powerful text-to-video model that generates high-quality videos up to 6 seconds at 15 FPS and 720p resolution from simple text input.

Python 1,105 71 Updated Feb 7, 2025