Robust Speech Recognition via Large-Scale Weak Supervision
Contexts Optical Compression
A fast, powerful, and simple hierarchical vision transformer
The no-nonsense RAG chunking library
Video understanding codebase from FAIR for reproducing video models
A library to generate LaTeX expressions from Python code
Refer and Ground Anything Anywhere at Any Granularity
Code release for Cut and Learn for Unsupervised Object Detection
PyTorch code and models for VJEPA2 self-supervised learning from video
Language modeling in a sentence representation space
Resources, corpora, and tools for Chinese natural language processing
Air traffic control tower and radar simulator (solo + multi-player)
Code release for ConvNeXt V2 model
FaceXlib aims at providing ready-to-use face-related functions
Code release for ConvNeXt model
A deep learning library for video understanding research
The official PyTorch implementation of our paper
Efficient 3D human pose estimation in video using 2D keypoint
VGGFace2 Dataset for Face Recognition
Non-local Neural Networks for Video Classification
Open-source Python packages for X-ray MicroLaue Diffraction analysis
Optical Music Recognition for Tablature Notations
A pygame music library
Efficient Approximate Nearest Neighbors for General Metric Spaces