Stars
Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your Claude Code/Codex/Gemini agent will be an AI research agent with full horsepowe…
Academic Research Skills for Claude Code: research → write → review → revise → finalize
[ICLR 2026 Oral] FlashVID: Efficient Video Large Language Models via Training-free Tree-based Spatiotemporal Token Merging
[ICLR26] Official implementation of Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling
CVPR and NeurIPS poster examples and templates
[CVPR2026] LeapAlign: Post-Training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories
MMV-Lab / Agentic-J
Forked from LJMedPhys/Imagent_J
AI agent for Microscopy Image Analysis
A lightweight napari plugin that exposes the viewer over MCP (Model Context Protocol) via a Python socket server. Built on top of FastMCP, it lets external MCP-speaking clients—such as autonomous…
Experiments for bioagent benchmark
Benchmark for evaluating LLM agents in bioinformatics
[ECCV 2026] MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE
Official implementation of UnifiedReward & [NeurIPS 2025] UnifiedReward-Think & UnifiedReward-Flex
Official implementation of Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
This repo is meant to serve as a guide for Machine Learning/AI technical interviews.
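This project covers the key concepts and code implementations commonly tested in Machine Learning, Deep Learning, and NLP interviews; it is also the essential theoretical foundation every algorithm engineer should know.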
[ICLR'26] Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework
[CVPR2026] PosterOmni: One model for poster creation—unifying local edits and global design for generalized multi-task image/poster-to-poster generation.
[AAAI 2026] VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
An official implementation of DanceGRPO: Unleashing GRPO on Visual Generation
[CVPR 2026 Highlight] SGI: Structured 2D Gaussians for Efficient and Compact Large Image Representation
Machine Learning and Computer Vision Engineer - Technical Interview Questions
[AAAI 2026] SlideTailor: Personalized Presentation Slide Generation for Scientific Papers
TradingAgents: Multi-Agents LLM Financial Trading Framework
"Paper2Slides: From Paper to Presentation in One Click"
Automatic Video Generation from Scientific Papers
Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.