📚 "Building Agents from Scratch": a from-scratch tutorial on agent principles and practice

Python 10,922 1,133 Updated Dec 17, 2025

The first Large Audio Language Model that enables native in-depth thinking, which is trained on large-scale audio Chain-of-Thought data.

Python 272 24 Updated May 15, 2025

[NeurIPS 2025] PyTorch implementation of [ThinkSound], a unified framework for generating audio from any modality, guided by Chain-of-Thought (CoT) reasoning.

Python 1,107 65 Updated Nov 25, 2025
Python 421 28 Updated Nov 27, 2025

A Fully Self-Hosted Solution for Full-Duplex Voice Interaction

Python 452 34 Updated Sep 28, 2025

Efficient audio understanding with general audio captions

Python 390 39 Updated Nov 3, 2025

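Course materials for Modern Digital Signal Processing, a core course at the School of Electronics, University of Chinese Academy of Sciences, taught by Zhang Hao (张颢)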

17 1 Updated Jun 5, 2025

WeNet hands-on course assignments

Python 20 5 Updated Oct 7, 2022

Production First and Production Ready End-to-End Speech Recognition Toolkit

Python 4,953 1,171 Updated Dec 19, 2025

Open-Source Turn-Taking Detection Model and Dataset for Full-Duplex Spoken Dialogue Systems

Python 61 6 Updated Oct 12, 2025

😼 Elegantly use a proxy environment based on clash/mihomo

Shell 6,984 873 Updated Dec 19, 2025

ICT-STAR Group Survival Guide | A website for sharing information about how to become a qualified Master/Ph.D.

Python 7 Updated Apr 18, 2025

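Cybersecurity and computer science course resources from the University of Chinese Academy of Sciences: Advanced Artificial Intelligence, Deep Learning, Applied Cryptography, Machine Learning, Information Hiding, Information Theory and Coding, Multimedia Coding, and more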

120 8 Updated Nov 9, 2022

Some useful things for the centralized teaching period at Yan Qi Lake, Beijing: a script for UCAS humanities lectures

Python 10 Updated Jul 23, 2024

Lean Side Business: how programmers can gracefully run a side business

11,568 924 Updated Mar 28, 2024

PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides [EMNLP 2025]

Python 2,901 346 Updated Dec 18, 2025

Code for Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation (CVPR 2021)

Python 957 175 Updated Jan 6, 2024

CVPR 2019

Python 257 54 Updated May 24, 2023
Jupyter Notebook 519 314 Updated Aug 14, 2025

Code for the paper "Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion" (IJCAI 2021)

Python 352 64 Updated Feb 15, 2024

This repository contains the code for LipGAN. LipGAN was published as part of the paper titled "Towards Automatic Face-to-Face Translation".

Python 614 126 Updated Jun 22, 2025

Typesetting for Beihang University undergraduate theses

TypeScript 23 Updated Dec 10, 2022

ABAW3 (CVPRW): A Joint Cross-Attention Model for Audio-Visual Fusion in Dimensional Emotion Recognition

Python 48 9 Updated Jan 15, 2024

Speech enhancement / speech separation / sound source localization

1,209 224 Updated Nov 14, 2023

Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations welcome

588 69 Updated Nov 13, 2024

A wav2lip Web UI using Gradio

Python 72 11 Updated Nov 2, 2023

Tutorial source code for drawing neural network architecture diagrams with PlotNeuralNet

TeX 235 38 Updated Jun 21, 2019

LaTeX code for making neural network diagrams

TeX 24,251 3,032 Updated Aug 21, 2023
Python 4,573 370 Updated Dec 19, 2025

Out of time: automated lip sync in the wild

Python 852 184 Updated Jan 23, 2024