Robust Speech Recognition via Large-Scale Weak Supervision
Contexts Optical Compression
Video understanding codebase from FAIR for reproducing video models
The no-nonsense RAG chunking library
A fast, powerful, and simple hierarchical vision transformer
Code release for Cut and Learn for Unsupervised Object Detection
A library to generate LaTeX expressions from Python code
Refer and Ground Anything Anywhere at Any Granularity
PyTorch code and models for VJEPA2 self-supervised learning from video
Language modeling in a sentence representation space
Air traffic control tower and radar simulator (solo + multi-player)
Resources, corpora, and tools for Chinese natural language processing
Code release for ConvNeXt V2 model
FaceXlib aims at providing ready-to-use face-related functions
Code release for ConvNeXt model
A deep learning library for video understanding research
The official PyTorch implementation of our paper
Efficient 3D human pose estimation in video using 2D keypoint
VGGFace2 Dataset for Face Recognition
Open-source Python packages for X-ray MicroLaue Diffraction analysis
Non-local Neural Networks for Video Classification
Optical Music Recognition for Tablature Notations
A pygame music library
Efficient Approximate Nearest Neighbors for General Metric Spaces