Stars
Minimal and educational implementation of an LLM agent.
This code is an edited version of code written by Mohamed Aly (California Institute of Technology) for road detection.
The Hailo Model Zoo includes pre-trained models and a full building and evaluation environment
An open-source repository for a Bluetooth remote controller for reel-to-reel tape recorders and vintage cassette decks.
A Python script to decode Google News article URLs.
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Open-source desktop app for local LLMs. Text, vision, tool-calling, OpenAI/Anthropic-compatible API. 100% private.
Train transformer language models with reinforcement learning.
[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
Measurement and processing of binaural impulse responses for personalized surround virtualization on headphones.
Export Segment-Anything into ONNX format and use it from C++ with OpenCV and OpenVINO
This repository shows how to solve the ONNX export issue in the Segment Anything model
Exporting Segment Anything, MobileSAM, and Segment Anything 2 into ONNX format for easy deployment
📚 Jupyter notebook tutorials for OpenVINO™
Refine high-quality datasets and visual AI models
An app for collecting raw RGB-D scans on iOS devices.
The web version of Tidal running in Electron, with HiFi support thanks to Widevine.
Total Text Dataset. A one-of-a-kind dataset consisting of 1555 images with more than 3 different text orientations, including horizontal, multi-oriented, and curved.
Visualize the ICDAR 2015 dataset. Currently, only Challenge 4 Task 1 is available.
The HierText dataset contains ~12k images from the Open Images dataset v6 with a large number of text entities. We provide word-, line-, and paragraph-level annotations.
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
Videos, notes and experiments to understand deep learning
Scene Text Recognition with Permuted Autoregressive Sequence Models (ECCV 2022)
Official implementations of PSENet, PAN and PAN++.
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
A PyTorch implementation of "TextFuseNet: Scene Text Detection with Richer Fused Features".