Robust Speech Recognition via Large-Scale Weak Supervision
Contexts Optical Compression
Port of OpenAI's Whisper model in C/C++
Run local LLMs like llama, deepseek, kokoro etc. inside your browser
Video understanding codebase from FAIR for reproducing video models
The no-nonsense RAG chunking library
Foundational Models for State-of-the-Art Speech and Text Translation
UI Automation Framework for Games and Apps
A fast, powerful, and simple hierarchical vision transformer
In-App assistant SDK to build a multimodal conversational UX for websites
Beautiful, fast and modern React UI library
Code release for Cut and Learn for Unsupervised Object Detection
A library to generate LaTeX expressions from Python code
Refer and Ground Anything Anywhere at Any Granularity
Kubernetes Native Edge Computing Framework (project under CNCF)
PyTorch code and models for VJEPA2 self-supervised learning from video
Language modeling in a sentence representation space
Fast C++ library for linear algebra & scientific computing
Download, save and convert multiple subtitles from YouTube videos
The ExDARK dataset is the largest collection of low-light images
Detect faces in an image
The free computer aided translation (CAT) tool for professionals
Blazeface is a lightweight model that detects faces in images
Web-based, open source alternative to Remark OMR or Teleform