Nankai University
Tianjin
UTC +08:00
https://montaellis.github.io
Starred repositories
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
🍀 PyTorch implementations of various attention mechanisms, MLPs, re-parameterization, and convolution modules, helpful for understanding the papers in more depth.⭐⭐⭐
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
StyleGAN2 - Official TensorFlow Implementation
Statsmodels: statistical modeling and econometrics in Python
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
Semantic segmentation models with 500+ pretrained convolutional and transformer-based backbones.
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25 Spotlight).
PyTorch package for the discrete VAE used for DALL·E.
A collaboration friendly studio for NeRFs
Python bindings for FFmpeg - with complex filtering support
🐍 Geometric Computer Vision Library for Spatial AI
StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation
Implementation of Denoising Diffusion Probabilistic Model in PyTorch
PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
A PyTorch implementation of the Transformer model in "Attention is All You Need".
LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support
Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)
Make bilingual EPUB books using AI translation
An elegant PyTorch deep reinforcement learning library.
A sound cloning tool with a web interface, using your voice or any sound to record audio
Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
Implementation of Imagen, Google's Text-to-Image Neural Network, in PyTorch