- Ho Chi Minh, Vietnam
- in/lltlien
Stars
Docker image with Uvicorn managed by Gunicorn for high-performance FastAPI web applications in Python with performance auto-tuning.
A library for mechanistic interpretability of GPT-style language models
Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and agent workflows with explicit control over retrieval, routing…
A lightweight yet powerful audio-to-MIDI converter with pitch bend detection
Repository for the paper "ViHateT5: Enhancing Hate Speech Detection in Vietnamese with A Unified Text-to-Text Transformer Model" (ACL'2024 - Findings)
A Vietnamese natural language processing toolkit (NAACL 2018)
PhoWhisper: Automatic Speech Recognition for Vietnamese (2024)
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…
XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech (INTERSPEECH 2023)
An autonomous agent that conducts deep research on any data using any LLM providers
Systems submitted to IWSLT 2022 by the MT-UPC group.
Code for ACL 2023 main conference paper "Back Translation for Speech-to-text Translation Without Transcripts".
Tracking the progress in end-to-end speech translation
A modern replacement for Redis and Memcached
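A GUI tool for extracting hard-coded subtitles (hardsub) from videos and generating srt files. No third-party API is required; text recognition runs locally. A deep-learning-based video subtitle extraction framework that includes subtitle region detection and subtitle content extraction.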
Simple Python script to split a video into chunks of equal length, size, duration, etc.
21 Lessons, Get Started Building with Generative AI
Making large AI models cheaper, faster and more accessible
A curated list of modern Generative Artificial Intelligence projects and services
10 Weeks, 20 Lessons, Data Science for All!
A collection of resources and papers on Diffusion Models
Label Studio is a multi-type data labeling and annotation tool with standardized output format
200+ detailed flashcards useful for reviewing topics in machine learning, computer vision, and computer science.
Source code for the X Recommendation Algorithm
A roadmap for those looking to start or expand a career in the data community
Benchmarks of approximate nearest neighbor libraries in Python
Dense image captioning in Torch
I decided to sync this repo with self-critical.pytorch. (The old master is archived in the old master branch.)