recognition free download

38 projects for "recognition" with 2 filters applied:

Filters: Python, BSD

1

Whisper

Robust Speech Recognition via Large-Scale Weak Supervision

OpenAI Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented...

Downloads: 75 This Week

Last Update: 2025-06-26
2

DeepSeek-OCR

Contexts Optical Compression

DeepSeek-OCR is an open-source optical character recognition solution built as part of the broader DeepSeek AI vision-language ecosystem. It is designed to extract text from images, PDFs, and scanned documents, and integrates with multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition. The system treats OCR not simply as “read the text” but as “understand what the text is doing in the image”—for example distinguishing captions from body...

Downloads: 101 This Week

Last Update: 2025-10-25
3

Hiera

A fast, powerful, and simple hierarchical vision transformer

Hiera is a hierarchical vision transformer designed to be fast, simple, and strong across image and video recognition tasks. The core idea is to use straightforward hierarchical attention with a minimal set of architectural “bells and whistles,” achieving competitive or superior accuracy while being markedly faster at inference and often faster to train. The repository provides installation options (from source or Torch Hub), a model zoo with pre-trained checkpoints, and code for evaluation...

Downloads: 4 This Week

Last Update: 2025-10-08
4

Chonkie

The no-nonsense RAG chunking library

Chonkie is a lightweight library for splitting text into chunks for retrieval-augmented generation (RAG) pipelines.

Downloads: 2 This Week

Last Update: 2025-03-01
5

SlowFast

Video understanding codebase from FAIR for reproducing video models

... excessive computational cost. The architecture is modular and supports tasks like action recognition, temporal localization, and video segmentation, performing strongly on benchmarks like Kinetics and AVA. The repository provides training recipes, pretrained models, and distributed pipelines optimized for large-scale video datasets.

Downloads: 0 This Week

Last Update: 5 days ago
6

latexify

A library to generate LaTeX expression from Python code

latexify_py converts small, math-heavy pieces of Python code into human-readable LaTeX that mirrors the intent of the computation, not just its surface syntax. It parses Python functions and expressions into an abstract syntax tree (AST), applies symbolic rewrites for common mathematical constructs, and then emits LaTeX that compiles cleanly in standard environments. Typical use cases include turning analytical utilities—like probability mass functions, activation formulas, or recurrence...

Downloads: 0 This Week

Last Update: 2025-10-09
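The parse, rewrite, and emit flow described for latexify can be illustrated with a toy converter built on Python's standard `ast` module. This is only a sketch under stated assumptions: the helper names `expr_to_latex` and `function_to_latex` are hypothetical and are not latexify's own API, and only a handful of operators plus a `sqrt` call are handled.

```python
import ast

# Hypothetical helpers for illustration only; this is NOT latexify's API.


def _is_sum(node):
    """True for a + b or a - b, which need parentheses in some contexts."""
    return isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Sub))


def _paren(text):
    return rf"\left({text}\right)"


def expr_to_latex(node):
    """Recursively turn a small subset of Python expression nodes into LaTeX."""
    if isinstance(node, ast.Constant):
        return str(node.value)
    if isinstance(node, ast.Name):
        return node.id
    if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            and node.func.id == "sqrt" and len(node.args) == 1):
        # Symbolic rewrite: sqrt(x) becomes \sqrt{x}
        return rf"\sqrt{{{expr_to_latex(node.args[0])}}}"
    if isinstance(node, ast.BinOp):
        left, right = expr_to_latex(node.left), expr_to_latex(node.right)
        if isinstance(node.op, ast.Div):
            return rf"\frac{{{left}}}{{{right}}}"
        if isinstance(node.op, ast.Pow):
            simple = isinstance(node.left, (ast.Name, ast.Constant))
            base = left if simple else _paren(left)
            return f"{base}^{{{right}}}"
        if isinstance(node.op, ast.Mult):
            if _is_sum(node.left):
                left = _paren(left)
            if _is_sum(node.right):
                right = _paren(right)
            return rf"{left} \cdot {right}"
        if isinstance(node.op, ast.Add):
            return f"{left} + {right}"
        if isinstance(node.op, ast.Sub):
            if _is_sum(node.right):
                right = _paren(right)
            return f"{left} - {right}"
    raise ValueError(f"unsupported syntax: {ast.dump(node)}")


def function_to_latex(source):
    """Convert a one-function source string ending in `return <expr>`."""
    func = ast.parse(source).body[0]
    if not isinstance(func, ast.FunctionDef):
        raise ValueError("expected a function definition")
    ret = func.body[-1]
    if not isinstance(ret, ast.Return) or ret.value is None:
        raise ValueError("expected the function to end with 'return <expr>'")
    args = ", ".join(a.arg for a in func.args.args)
    return rf"\mathrm{{{func.name}}}({args}) = {expr_to_latex(ret.value)}"


print(function_to_latex("def f(x, y):\n    return (x + y) / sqrt(x**2 + 1)\n"))
# prints: \mathrm{f}(x, y) = \frac{x + y}{\sqrt{x^{2} + 1}}
```

Working on the syntax tree rather than on the source text is what lets the output follow the structure of the computation: the division becomes a fraction and the `sqrt` call becomes a radical, regardless of how the original line was spaced or parenthesized.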
7

ML Ferret

Refer and Ground Anything Anywhere at Any Granularity

Ferret is Apple’s end-to-end multimodal large language model designed specifically for flexible referring and grounding: it can understand references of any granularity (boxes, points, free-form regions) and then ground open-vocabulary descriptions back onto the image. The core idea is a hybrid region representation that mixes discrete coordinates with continuous visual features, so the model can fluidly handle “any-form” referring while maintaining precise spatial localization. The repo...

Downloads: 0 This Week

Last Update: 2025-10-08
8

CutLER

Code release for Cut and Learn for Unsupervised Object Detection

... to benchmarking results that report large gains over prior unsupervised baselines. It’s intended for researchers exploring self-supervised and unsupervised recognition, offering a practical path to scale beyond costly labeled corpora. The README links papers and gives a high-level overview of components and expected outputs, with pointers to demos and assets. The repository is actively starred and structured as a typical research release with license, contribution guidelines, and security policy.

Downloads: 0 This Week

Last Update: 2025-10-09
9

vJEPA-2

PyTorch code and models for VJEPA2 self-supervised learning from video

... is designed to scale: spatiotemporal ViT backbones, flexible masking schedules, and efficient sampling let it train on long clips while remaining stable. Trained representations transfer well to downstream tasks such as action recognition, temporal localization, and video retrieval, often with simple linear probes or light fine-tuning. The repository typically includes end-to-end recipes—data pipelines, augmentation policies, training scripts, and evaluation harnesses.

Downloads: 0 This Week

Last Update: 2025-10-07
10

Large Concept Model

Language modeling in a sentence representation space

... large image–text or weakly supervised corpora. It includes utilities to build concept vocabularies, map supervision signals to those vocabularies, and measure zero-shot or few-shot generalization. Probing tools help diagnose what the model knows—e.g., attribute recognition, relation understanding, or compositionality—so you can iterate on data and objectives. The design is modular, making it straightforward to swap backbones, change objectives, or integrate retrieval components.

Downloads: 0 This Week

Last Update: 2025-10-07
11

funNLP

Resources, corpora, and tools for Chinese natural language processing

FunNLP is a large, curated collection of resources, corpora, and tools for Chinese natural language processing (NLP). It aggregates datasets, lexicons, wordlists, sentiment dictionaries, knowledge graphs, and pretrained model references, serving as a one-stop resource hub for Chinese NLP practitioners. The repository is organized into categories such as sentiment analysis, text classification, named entity recognition, knowledge graphs, and various lexicons (e.g. sensitive words, emotion...

Downloads: 0 This Week

Last Update: 2025-10-01
12

ATC-pie

Air traffic control tower and radar simulator (solo + multi-player)

ATC-pie is an air traffic control simulation program. It features solo, multi-player and teacher-student sessions, rendering 3D views of airports through FlightGear. It is essentially designed for realism, and simulates real-life ATC tasks and equipment such as strip racks and sequence management, handovers to/from neighbouring controllers, flight plans, primary & secondary radars, RDF, CPDLC, ATIS recording...

3 Reviews

Downloads: 72 This Week

Last Update: 2025-10-09
13

ConvNeXt V2

Code release for ConvNeXt V2 model

... competition across channels. The result is a convnet that competes strongly with transformer architectures on recognition benchmarks while being efficient and hardware-friendly. The repository provides official PyTorch implementations for multiple model sizes (Atto, Femto, Pico, up through Huge), conversion from JAX weights, code for pretraining/fine-tuning, and pretrained checkpoints. It supports both self-supervised pretraining and supervised fine-tuning.

Downloads: 0 This Week

Last Update: 2025-10-07
14

Facexlib

FaceXlib aims at providing ready-to-use face-related functions

facexlib is a PyTorch-based library providing ready-to-use face-related functions, including detection, alignment, recognition, and more. It integrates state-of-the-art open-source methods for various face processing tasks.

Downloads: 2 This Week

Last Update: 2025-04-24
15

ConvNeXt

Code release for ConvNeXt model

... it efficient for both pretraining and fine-tuning across a wide range of visual recognition tasks. It achieves competitive or superior results on ImageNet and downstream datasets while being easier to deploy and train than transformers. The repository provides pretrained models, training recipes, and ablation studies demonstrating how incremental design choices collectively yield state-of-the-art performance.

Downloads: 0 This Week

Last Update: 2025-10-06
16

PyTorchVideo

A deep learning library for video understanding research

PyTorchVideo is a deep learning library for video understanding, providing modular components and pretrained models for tasks like action recognition, video classification, detection, and self-supervised learning. It is tightly integrated with PyTorch and PyTorch Lightning, offering flexible APIs for building and training spatiotemporal networks. The library includes efficient implementations of state-of-the-art architectures such as SlowFast, X3D, and MViT, optimized for both research...

Downloads: 0 This Week

Last Update: 2025-10-07
17

TimeSformer

The official PyTorch implementation of the TimeSformer paper

TimeSformer is a vision transformer architecture for video that extends the standard attention mechanism into spatiotemporal attention. The model alternates attention along spatial and temporal dimensions (or designs variants like divided attention) so that it can capture both appearance and motion cues in video. Because the attention is global across frames, TimeSformer can reason about dependencies across long time spans, not just local neighborhoods. The official implementation in PyTorch...

Downloads: 3 This Week

Last Update: 2025-10-07
18

VideoPose3D

Efficient 3D human pose estimation in video using 2D keypoint

... detections (such as those from OpenPose or Detectron), it enables markerless 3D pose estimation with relatively lightweight computational requirements. The framework includes pretrained models, data preprocessing utilities, visualization tools, and evaluation scripts for standard benchmarks like Human3.6M. VideoPose3D has been used widely in computer vision research for human motion understanding, activity recognition, and animation generation.

Downloads: 0 This Week

Last Update: 2025-10-07
19

VGGFace2

VGGFace2 Dataset for Face Recognition

VGGFace2 is a large-scale face recognition dataset developed to support research on facial recognition across variations in pose, age, illumination, and identity. It consists of 3.31 million images covering 9,131 subjects, with an average of over 360 images per subject. The dataset was collected from Google Image Search, ensuring a wide diversity in ethnicity, profession, and real-world conditions. It is split into a training set with 8,631 identities and a test set with 500 identities, making...

Downloads: 18 This Week

Last Update: 16 hours ago
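The figures quoted for VGGFace2 can be checked against each other with a few lines of arithmetic. This is only a consistency check on the numbers in the description; the image count is treated as the rounded 3.31 million stated there, and the variable names are illustrative.

```python
# Figures as quoted in the description (image count is rounded).
total_images = 3_310_000
total_subjects = 9_131
train_identities = 8_631
test_identities = 500

# The train/test split should account for every subject.
split_total = train_identities + test_identities

# Average images per subject, to compare with the "over 360" claim.
images_per_subject = total_images / total_subjects

print(split_total == total_subjects)
print(round(images_per_subject, 1))
```

The split sums to the full subject count, and the average comes out at roughly 362 images per subject, in line with the "over 360" figure.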
20

Video Nonlocal Net

Non-local Neural Networks for Video Classification

video-nonlocal-net implements Non-local Neural Networks for video understanding, adding long-range dependency modeling to 2D/3D ConvNet backbones. Non-local blocks compute attention-like responses across all positions in space-time, allowing a feature at one frame and location to aggregate information from distant frames and regions. This formulation improves action recognition and spatiotemporal reasoning, especially for classes requiring context beyond short temporal windows. The repo...

Downloads: 0 This Week

Last Update: 2025-10-07
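The idea of a non-local response, where each space-time position aggregates features from every other position, can be sketched in plain Python. This is a simplified illustration and not the repository's code: it assumes a dot-product similarity normalized with softmax, uses the raw features in place of the learned projections a real block would apply, and flattens all space-time positions into one list. The function name `nonlocal_response` is hypothetical.

```python
import math


def nonlocal_response(features):
    """Apply a simplified non-local operation.

    `features` is a list of N feature vectors, one per space-time position
    (frames and locations flattened together). Each output is the input
    vector plus a similarity-weighted average over ALL positions.
    """
    outputs = []
    for x_i in features:
        # Similarity of position i to every position j (dot product).
        scores = [sum(a * b for a, b in zip(x_i, x_j)) for x_j in features]
        # Numerically stable softmax over all positions.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        # Weighted average of features from every position.
        aggregated = [
            sum(w * x_j[d] for w, x_j in zip(weights, features)) / total
            for d in range(len(x_i))
        ]
        # Residual connection keeps the original feature.
        outputs.append([a + b for a, b in zip(x_i, aggregated)])
    return outputs


# Two positions: an all-zero feature and a distant, informative one.
result = nonlocal_response([[0.0, 0.0], [2.0, 4.0]])
print(result[0])
```

In the example, the all-zero position has equal similarity to both positions, so it picks up the mean of the two features. The computation never uses the distance between positions, which is how a feature can draw on distant frames and regions.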
21

LaueTools

Open-source Python packages for X-ray MicroLaue diffraction analysis

LaueTools is an open-source project for analysing white-beam Laue X-ray microdiffraction data. It includes tools for image processing, peak searching and indexing, crystal structure solving (orientation and strain), and data and grain mapping visualisation. Python 3 code and new features are now at: https://gitlab.esrf.fr/micha/lauetools

2 Reviews

Downloads: 3 This Week

Last Update: 2019-09-12
22

Lute Tablature Toolkit for Gamera

Optical Music Recognition for Tablature Notations

A toolkit for the optical recognition of 16th-century lute tablature prints. It is based on, and requires, the Gamera document image analysis framework (http://gamera.sf.net).

Downloads: 0 This Week

Last Update: 2016-05-13
23

Gamera

Gamera is a framework for the creation of structured document analysis applications by domain experts. It combines a programming library with GUI tools for the training and interactive development of recognition systems.

Downloads: 0 This Week

Last Update: 2016-05-11
24

Tygamusic

A pygame music lib.

This library was written while I was programming another program/game. I was tired of pygame's poor playlist handling and of its music management in general. With this library I want to create a layer that lets you interact with music the way you would expect. Current features:
- Playlists
- Normal pausing and resuming (played time isn't lost when a new song is loaded)
- Automatic recognition of songs, which are added to a separate list

Downloads: 0 This Week

Last Update: 2015-04-10
25

ProximityForest

Efficient Approximate Nearest Neighbors for General Metric Spaces

A proximity forest is a data structure that allows for efficient computation of approximate nearest neighbors of arbitrary data elements in a metric space. See: O'Hara and Draper, "Are You Using the Right Approximate Nearest Neighbor Algorithm?", WACV 2013 (best student paper award). One application of a ProximityForest is given in the following CVPR publication: Stephen O'Hara and Bruce A. Draper, "Scalable Action Recognition with a Subspace Forest," IEEE Conference on Computer Vision...

Downloads: 0 This Week

Last Update: 2015-03-26