- @Westlake-AI Zhejiang University & Westlake University
- Hangzhou, Zhejiang, China
- https://lupin1998.github.io/
- @LupinLSY
- in/siyuan-li-lupin1998
Lists (11)
- AI4Sci: AI for Science Applications
- Awesome List: Summarization of papers on a certain topic.
- Benchmarks: LLMs / MLLMs / AIGC benchmarks
- Computer Vision: Computer Vision Fundamental Researches and Applications
- Graph: Graph Representation Learning
- LLM&MLLM: Large Language Modeling and Applications
- Manifold Learning: Manifold Learning for Dimension Reduction
- Multi-Agent: Multi-agent project like OpenClaw
Stars
Implementation of RankE: End-to-End Discrete Text-to-Image Post-Training via Rank-Consistent Alignment
Free open-source AI text humanizer to convert AI-generated content into undetectable, human-like writing. Bypass Turnitin, GPTZero, and all major AI detectors. No sign-up required. Try our unlimite…
Cognitive runtime for language models with memory, metacognition, multimodal channels, native plugins, and a self-evolving Executive.
YAML-native agent workflow execution engine, written in Rust
[RSS26'] Welcome to Psi-Zero, a Humanoid VLA towards Universal Humanoid Intelligence.
LVOmniBench: Pioneering Long Audio-Video Understanding Evaluation for Omnimodal LLMs
NanoResearch: The Autonomous AI Research Assistant
Awesome curated collection of images and prompts generated by GPT-4o and gpt-image-1. Explore AI generated visuals created with ChatGPT and Sora, showcasing OpenAI's advanced image generation capab…
Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights
Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward
The official implementation for [NeurIPS2025 Oral] Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free
Unofficial implementation of Titans, SOTA memory for transformers, in Pytorch
MMaDA - Open-Sourced Multimodal Large Diffusion Language Models (dLLMs with block diffusion, mixed-CoT, unified RL)
This is a repo to track the latest autoregressive visual generation papers.
[CVPR 2026] GGBench: A Geometric Generative Reasoning Benchmark for Unified Multimodal Models
a family of versatile and state-of-the-art video tokenizers.
Official Pytorch implementation for LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior (ICLR 2025 Oral).
[NeurIPS'2025] Official implementation of MGUP, a momentum-gradient greedy alignment update policy for stochastic optimization.
[Survey] A curated collection of research papers, models, and resources tracing the evolution from specialized models to unified world models.
Awesome Deep Research list! For more details, please refer to our survey paper -- A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications
Code for explaining and evaluating late chunking (chunked pooling)
[ICLR 2026] MergeMix: A Unified Augmentation Paradigm for Visual and Multi-Modal Understanding
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
Public release of the code for "Accelerating Vision Transformers with Adaptive Patches"
Being-VL-0.5: Unified Multimodal Understanding via Byte-Pair Visual Encoding (ICCV 2025, Highlight)