41 results for forked starred repositories (voidful)

[ACL 2025 Main] UniCodec: a unified audio codec with a single codebook to support multi-domain audio data, including speech, music, and sound

Python 5 3 Updated Jun 23, 2025

Official GitHub repo for TMMLU+: large-scale Traditional Chinese massive multitask language understanding

Python 1 Updated Aug 18, 2025

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Python 2,189 365 Updated Aug 14, 2025

ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'

Python 2 1 Updated Dec 11, 2024

A collection of prompts, system prompts and LLM instructions

HTML 634 75 Updated Sep 5, 2024

Embedding model evaluation using Traditional Chinese datasets

Python 1 1 Updated Jul 7, 2024

Transcription, forced alignment, and audio indexing with OpenAI's Whisper

Python 1 Updated Apr 9, 2024

Awesome speech/audio LLMs, representation learning, and codec models

1 Updated Oct 18, 2024

Megatron-LM setup in the smol-cluster

Python 3 Updated Jan 19, 2024

CnOCR: Awesome Chinese/English OCR Python toolkits based on PyTorch. It comes with 20+ well-trained models for different application scenarios and can be used directly after installation. [Based on PyTor…

Python 3,688 533 Updated Sep 21, 2025

🔊 Text-prompted Generative Audio Model - With the ability to clone voices

Jupyter Notebook 3,332 451 Updated Aug 24, 2025

Python tools for processing the stackexchange data dumps into a text dataset for Language Models

Python 1 Updated Mar 8, 2023

A crawler for https://ndltd.ncl.edu.tw

Python 3 Updated Apr 14, 2023

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Python 81 7 Updated Mar 17, 2022

PodcastMix A dataset for separating music and speech in podcasts.

Jupyter Notebook 44 4 Updated Aug 20, 2024

RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, …

Python 412 39 Updated Jul 11, 2023

Python tools for processing the stackexchange data dumps into a text dataset for Language Models

Python 4 1 Updated Feb 18, 2023

Streamlit APPs that leverage the power of spaCy to assist language learning

Python 6 12 Updated Mar 7, 2023

Jieba Chinese word segmentation, Taiwan Traditional Chinese version

Python 109 30 Updated Nov 3, 2017

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 80 9 Updated Apr 4, 2022
JavaScript 3 Updated Aug 16, 2025

StyleGAN3 + Inversion

Python 96 11 Updated May 17, 2022

Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch

Python 88 2 Updated Dec 3, 2021

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Python 4 Updated Sep 14, 2021

Open-source KVM software

C 29,821 1,599 Updated Jun 22, 2024

Write a large text on your GitHub profile, with your commits history (contribution graph).

Python 382 25 Updated Feb 17, 2025

Pre-trained ELECTRA from Hong Kong data

Python 29 1 Updated Jul 7, 2020