- Bosch Rexroth AG
- Ulm, Germany
- in/sanjay-parajuli
Highlights
- Pro
Starred repositories
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
This repository is an official implementation of the paper "LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection".
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
🎒 Token-Oriented Object Notation (TOON) – Compact, human-readable, schema-aware JSON for LLM prompts. Spec, benchmarks, TypeScript SDK.
RF-DETR is a real-time object detection and segmentation model architecture developed by Roboflow, SOTA on COCO and designed for fine-tuning.
An open-source, code-first Python toolkit for building, evaluating, and deploying sophisticated AI agents with flexibility and control.
Framework-agnostic sliced/tiled inference + interactive UI + error analysis plots
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
End-to-end realtime stack for connecting humans and AI
Open Vision Agents by Stream. Build Vision Agents quickly with any model or video provider. Uses Stream's edge network for ultra-low latency.
[ICCV2023] DETR Doesn’t Need Multi-Scale or Locality Design
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
A Python library for self-supervised learning on images.
PyTorch implementation of MAE https://arxiv.org/abs/2111.06377
Reference PyTorch implementation and models for DINOv3
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
MixMIM: Mixed and Masked Image Modeling for Efficient Visual Representation Learning
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
This repository categorizes the papers about masked image modeling according to their main contributions. The classification is based on our survey: https://arxiv.org/abs/2408.06687.
Object Detection Metrics. 14 object detection metrics: mean Average Precision (mAP), Average Recall (AR), Spatio-Temporal Tube Average Precision (STT-AP). This project supports different bounding b…
Detect Anything via Next Point Prediction (Based on Qwen2.5-VL-3B)
A framework for building, orchestrating and deploying AI agents and multi-agent workflows with support for Python and .NET.
State-of-the-Art Text Embeddings