voidful

🎯

Focusing

Eric Lam voidful

👩‍🎓PhD@NTU Speech Lab. Formerly, Microsoft Research Intern.

385 followers · 322 following

Achievements

x3 x2

Highlights

Developer Program Member
Pro

Lists (1)

instruction dataset

8 repositories

Stars

41 results for forked starred repositories

adamlin120 / Awesome-LLM-Training-System

Forked from InternLM/Awesome-LLM-Training-System

1 Updated Jul 17, 2025

mesolitica / UniCodec-fix

Forked from Jiang-Yidi/UniCodec

[ACL 2025 Main] UniCodec: a unified audio codec with a single codebook to support multi-domain audio data, including speech, music, and sound

Python 5 3 Updated Jun 23, 2025

theblackcat102 / ievals

Forked from iKala/ievals

Official GitHub repo for TMMLU+: large-scale Traditional Chinese massive multitask language understanding

Python 1 Updated Aug 18, 2025

deepspeedai / Megatron-DeepSpeed

Forked from NVIDIA/Megatron-LM

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Python 2,189 365 Updated Aug 14, 2025

ericsunkuan / TalkNet-ASD

Forked from TaoRuijie/TalkNet-ASD

ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'

Python 2 1 Updated Dec 11, 2024

lucasmrdt / TheBigPromptLibrary

Forked from 0xeb/TheBigPromptLibrary

A collection of prompts, system prompts and LLM instructions

HTML 634 75 Updated Sep 5, 2024

adamlin120 / zh-tw-embedding-model-benchmark

Forked from ihower/zh-tw-embedding-model-benchmark

Embedding model benchmark using Traditional Chinese datasets

Python 1 1 Updated Jul 7, 2024

adamlin120 / needle-haystack

Forked from winglian/needle-haystack

Python 2 1 Updated Jun 13, 2024

George0828Zhang / stable-ts

Forked from jianfch/stable-ts

Transcription, forced alignment, and audio indexing with OpenAI's Whisper

Python 1 Updated Apr 9, 2024

hbwu-ntu / speech-trident

Forked from ga642381/speech-trident

Awesome speech/audio LLMs, representation learning, and codec models

1 Updated Oct 18, 2024

thomwolf / megatron-smol-cluster

Forked from loubnabnl/nanotron-smol-cluster

Megatron-LM setup in the smol-cluster

Python 3 Updated Jan 19, 2024

breezedeus / CnOCR

Forked from diaomin/crnn-mxnet-chinese-text-recognition

CnOCR: Awesome Chinese/English OCR Python toolkits based on PyTorch. It comes with 20+ well-trained models for different application scenarios and can be used directly after installation. [Based on PyTor…

Python 3,688 533 Updated Sep 21, 2025

serp-ai / bark-with-voice-clone

Forked from suno-ai/bark

🔊 Text-prompted Generative Audio Model - With the ability to clone voices

Jupyter Notebook 3,332 451 Updated Aug 24, 2025

voidful / stackexchange-dataset

Forked from sgunasekar/stackexchange-dataset

Python tools for processing the stackexchange data dumps into a text dataset for Language Models

Python 1 Updated Mar 8, 2023

voidful / paperCrawler

Forked from paulpeng-popo/paperCrawler

A crawler for https://ndltd.ncl.edu.tw

Python 3 Updated Apr 14, 2023

cimeister / typical-sampling

Forked from huggingface/transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 81 7 Updated Mar 17, 2022

MTG / Podcastmix

Forked from nschmidtg/Podcastmix

PodcastMix: A dataset for separating music and speech in podcasts.

Jupyter Notebook 44 4 Updated Aug 20, 2024

Blealtan / RWKV-LM-LoRA

Forked from BlinkDL/RWKV-LM

RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, …

Python 412 39 Updated Jul 11, 2023