NYU
- New York
- https://jason-cs18.github.io/
- https://yanlu.substack.com/
Starred repositories
An easy, reliable, fluid template for Python packages, complete with docs, testing suites, READMEs, GitHub workflows, linting, and much more
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
The absolute trainer to light up AI agents.
A powerful tool for creating fine-tuning datasets for LLMs
This repository delivers end-to-end, code-first tutorials covering every layer of production-grade GenAI agents, guiding you from spark to scale with proven patterns and reusable blueprints for re…
Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini, and open-source models.
[NeurIPS 2025] Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation
Unlimited-length talking video generation that supports image-to-video and video-to-video generation
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
The development and future prospects of large multimodal reasoning models.
slime is an LLM post-training framework for RL Scaling.
NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.
An open-source AI agent that brings the power of Gemini directly into your terminal.
CycleResearcher: Improving Automated Research via Automated Review
A compilation of the best multi-agent papers
Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation
This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025
DeerFlow is a community-driven Deep Research framework, combining language models with tools like web search, crawling, and Python execution, while contributing back to the open-source community.
A Python library for self-supervised learning on images.
For developers who are building real-time data-driven applications, Redis is the preferred, fastest, and most feature-rich cache, data structure server, and document and vector query engine.
A curated list of awesome Multimodal studies.
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
🔥🔥🔥 [IEEE TCSVT] Latest Papers, Codes and Datasets on Vid-LLMs.
Accompanying material for the sleep-time compute paper
🚀 Cookiecutter Template for FastAPI + React Projects, using PostgreSQL, SQLAlchemy, and Docker
Bridging LLM and Recommender System.
A subjective learning guide for generative AI research