Tsinghua University
China, Beijing (UTC +08:00)
Stars
ICLR 2026 Oral: WAVE: Learning Unified & Versatile Audio-Visual Embeddings with Multimodal LLM
StreamingVLM: Real-Time Understanding for Infinite Video Streams
Fused Qwen3 MoE layer for faster training, compatible with Transformers, LoRA, bnb 4-bit quant, Unsloth. Also possible to train LoRA over GGUF
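Assistant for Tsinghua University's "荷塘雨课堂" (Hetang Rain Classroom), with automatic sign-in, question answering, roll-call voice alerts, and other features.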
video-SALMONN 2 is a powerful audio-visual large language model (LLM) that generates high-quality audio-visual video captions, developed by the Department of Electronic Engineering at Tsin…
An LLM-based Agent for the New Automation Paradigm - Agentic Process Automation
Shaders for MagicaVoxel including Terrain Generator, Game of Life, Waterflow Simulator, Progressive Flood Shader etc.
SALMONN family: A suite of advanced multi-modal LLMs
Guidance for courses in the Department of Computer Science and Technology, Tsinghua University
Batch download helper for Tsinghua Cloud (清华大学云盘), for cases where shared files are too large to download directly; this script also adds several handy extra features.
Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory
ChatGLM-6B: An Open Bilingual Dialogue Language Model
simple_logger - A simple, multifunctional and header-only logging library for C++17.
Open Video Downloader - A cross-platform GUI for youtube-dl made in Rust with Tauri and Vue + TypeScript.
Repo for counting stars and contributing. Press F to pay respect to glorious developers.