- Duke University, Durham, NC
- yueqianlin.com
- @YueqianL
- in/yueqian-lin
Starred repositories
Qwen-TTS: a voice synthesis service built on FastAPI, supporting bilingual and dialect options.
Text-audio foundation model from Boson AI
This repository contains the code and tables for land-use change and land-occupation emissions.
[HPCA 2026] FractalCloud: A Fractal-Inspired Architecture for Efficient Large-Scale Point Cloud Processing
Baselines for ARC-Challenge-Interspeech2026
A framework for efficient model inference with omni-modality models
Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…
Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org
BirdNET analyzer for scientific audio data processing.
Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models
[NeurIPS'25] KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems
A comprehensive framework to test audio comprehension of Large Audio Language Models.
StreamingVLM: Real-Time Understanding for Infinite Video Streams
2026 AI/ML internship & new graduate job list updated daily
Post-training with Tinker
This is the official Python version of Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play.
Lightweight coding agent that runs in your terminal
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
A Benchmark for Evaluating Turn-Taking and Overlap Handling in Full-Duplex Spoken Dialogue Models
Reference PyTorch implementation and models for DINOv3
Kimi K2 is the large language model series developed by Moonshot AI team
Elucidated Text-To-Audio (ETTA) is a SOTA text-to-audio model with a holistic understanding of the design space and trained with synthetic captions.
Use Claude Code as the foundation for coding infrastructure, allowing you to decide how to interact with the model while enjoying updates from Anthropic.