ThinhPTran
  • Ho Chi Minh City, Vietnam

95 stars written in Python

Robust Speech Recognition via Large-Scale Weak Supervision

Python 90,432 11,325 Updated Sep 8, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 61,958 7,494 Updated Nov 6, 2025

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 49,035 8,214 Updated Dec 9, 2024

Ultralytics YOLO 🚀

Python 48,348 9,326 Updated Nov 6, 2025

All of the n8n workflows I could find (also from the site itself)

Python 38,731 3,696 Updated Nov 3, 2025

Opinionated RAG for integrating GenAI in your apps 🧠 Focus on your product rather than the RAG. Easy integration in existing products with customisation! Any LLM: GPT4, Groq, Llama. Any Vectorstore: …

Python 38,581 3,686 Updated Jul 9, 2025

one-click face swap

Python 30,337 6,900 Updated Aug 19, 2024

We have made you a wrapper you can't refuse

Python 28,366 5,905 Updated Nov 3, 2025

State-of-the-art 2D and 3D Face Analysis Project

Python 26,955 5,814 Updated Sep 27, 2025

FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.

Python 26,392 5,442 Updated Nov 20, 2023

Official inference framework for 1-bit LLMs

Python 24,356 1,888 Updated Jun 3, 2025

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 21,815 2,664 Updated Jul 3, 2025

Intelligent automation and multi-agent orchestration for Claude Code

Python 19,989 2,226 Updated Nov 1, 2025

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 18,605 1,970 Updated Oct 21, 2025

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 16,050 3,179 Updated Nov 6, 2025

pix2tex: Using a ViT to convert images of equations into LaTeX code.

Python 15,916 1,268 Updated Jan 18, 2025

LLM agents built for control. Designed for real-world use. Deployed in minutes.

Python 15,885 1,311 Updated Nov 6, 2025

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 13,539 1,988 Updated Nov 3, 2025

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 13,351 1,352 Updated Oct 1, 2025

Wan: Open and Advanced Large-Scale Video Generative Models

Python 11,444 1,269 Updated Oct 12, 2025

This repository contains code examples for Stanford's course: TensorFlow for Deep Learning Research.

Python 10,373 4,285 Updated Dec 22, 2020

TensorFlow-based neural network library

Python 9,890 1,309 Updated Aug 4, 2025

End-to-End Speech Processing Toolkit

Python 9,566 2,343 Updated Nov 5, 2025

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Python 7,725 1,382 Updated Dec 6, 2023

Text-audio foundation model from Boson AI

Python 7,573 560 Updated Sep 15, 2025

Numenta Platform for Intelligent Computing is an implementation of Hierarchical Temporal Memory (HTM), a theory of intelligence based strictly on the neuroscience of the neocortex.

Python 6,352 1,550 Updated Dec 3, 2024

Translate darknet to tensorflow. Load trained weights, retrain/fine-tune using tensorflow, export constant graph def to mobile devices

Python 6,154 2,049 Updated Oct 23, 2023

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 6,036 629 Updated Aug 10, 2024

Real-time face detection and emotion/gender classification using fer2013/imdb datasets with a keras CNN model and openCV.

Python 5,700 1,612 Updated Mar 8, 2024

Multilingual Document Layout Parsing in a Single Vision-Language Model

Python 5,593 562 Updated Oct 31, 2025