Enterprise vision-query technical architecture focusing on scalability and high performance.
Updated Mar 8, 2026
The Breakup Recovery Agent is an AI-powered emotional support system built with Streamlit and Agno that coordinates a specialized team of Gemini agents to provide empathetic guidance and personalized healing plans. By analyzing user feelings and chat screenshots, it offers a multi-perspective approach to recovery through supportive therapy, closure
A full-stack AI document intelligence app built with React, FastAPI, and Google Gemini Vision. Supports instant extraction and chat for PDFs, images, and text files.
Using Google Vision AI
This repository explores OpenAI’s o1 model, a cutting-edge AI designed for abstract reasoning, coding, and vision-based tasks. It provides insights into o1’s strengths, advanced prompting techniques, task delegation, and real-world applications, enabling developers to build intelligent, high-performance AI-driven solutions.
"Nicola Blind Assistant" is a mobile app that helps visually impaired people navigate their surroundings and recognize text, objects, and faces using modern technologies.
Web scraping and machine learning for sentiment analysis over the history of a term's usage on Twitter.
Transformer OCR by Torch Lightning
Calorie Tracker Pro is a modern web app that analyzes food images to estimate calories and macronutrients using AI. Built with HTML, Tailwind CSS, Chart.js, Firebase, and Groq Vision models, it supports image upload/camera capture, interactive nutrient charts, history tracking, CSV export, and theme customization—all in a sleek glassmorphic UI.
AI-powered image analysis tool — upload any image and get smart, context-aware prompts generated instantly using vision LLMs
AI-powered invoice understanding system using Vision + LLMs (Gemini API). Extracts structured fields from multilingual invoice images and enables intelligent natural language querying through a Streamlit application.
AI-driven object-aware image colorization system that restores grayscale images with realistic, context-sensitive color mapping.
PyCVision is a Python-based real-time object detection system powered by the YOLOv3 (You Only Look Once) algorithm. This project leverages the efficiency and accuracy of YOLOv3 for detecting and classifying multiple objects in live video streams or static images.
Multimodal RAG pipeline to transform static PDF manuals into image-aware AI chatbots. Local, secure, and transparent.
Chrome extension that uses Vision AI to solve captchas. Supports Claude, GPT-4o, Gemini, Qwen-VL, and local Ollama models.
Production REST API with autonomous Vision AI classification for infrastructure inspections. Explainable AI rationale, structured LLM output. FastAPI · Groq Vision AI · LangChain · LangSmith · PostgreSQL · Docker.
Multimodal Vision-AI: CLIP eyes + Qwen2.5 brain, 155 K-step pipeline & demo.