StevenLiuWen

Organizations

@NeuralCarver


Nano vLLM

Python 9,883 1,243 Updated Nov 3, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,949 288 Updated May 15, 2025
Jupyter Notebook 154 17 Updated Mar 4, 2025

Code release of our paper "DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation".

Python 133 3 Updated Mar 23, 2025

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Python 5,155 1,807 Updated Feb 26, 2025
Python 253 13 Updated Dec 1, 2024

[NeurIPS 2024] Classification Done Right for Vision-Language Pre-Training

Python 220 7 Updated Mar 20, 2025

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 17,643 2,233 Updated Feb 1, 2025

Implementation of Recurrent Interface Network (RIN), for highly efficient generation of images and video without cascading networks, in Pytorch

Python 207 15 Updated Feb 14, 2024

Code for "Don’t drop your samples! Coherence-aware training benefits Conditional diffusion" CVPR 2024 Highlight

Python 57 2 Updated Jul 24, 2025

Code for PointInfinity: Resolution-Invariant Point Diffusion Models

Python 35 3 Updated Jun 19, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 18,077 2,287 Updated Dec 25, 2024

Minimal implementation of scalable rectified flow transformers, based on SD3's approach

Jupyter Notebook 623 62 Updated Jul 1, 2024

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Python 2,068 117 Updated Jul 29, 2024
Python 64 1 Updated Oct 15, 2024

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

4,986 530 Updated Sep 25, 2024

Multimodal Models in Real World

Jupyter Notebook 551 23 Updated Feb 24, 2025

InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models

Python 4,186 465 Updated Jan 3, 2025

[SIGGRAPH Asia'24 & TOG] Gaussian Opacity Fields: Efficient Adaptive Surface Reconstruction in Unbounded Scenes

Python 965 64 Updated Nov 15, 2024

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Python 4,030 581 Updated Apr 24, 2024

MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning

Python 136 4 Updated Oct 10, 2025

A Pytorch Implementation of Finite Scalar Quantization

Python 167 5 Updated Nov 29, 2023

AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.

Python 6,389 718 Updated Mar 19, 2025

[ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.

Python 1,887 144 Updated Dec 30, 2024

This is the implementation of our paper in ECCV 2020.

Python 65 13 Updated Jun 8, 2021

An anomaly detection library comprising state-of-the-art algorithms and features such as experiment management, hyper-parameter optimization, and edge inference.

Python 5,239 853 Updated Dec 19, 2025

A framework to enable multimodal models to operate a computer.

Python 10,013 1,386 Updated Sep 19, 2025