This project captures a live video stream from an IP camera(in my case I used my phone with an app called IP camera), detects text using OCR, identifies questions, and answers them using a local LLM (Mistral 7B in GGUF format) running on CPU.
- Real-time video capture from an IP camera
- Optical Character Recognition (OCR) with Tesseract
- Automatic detection of questions in the video
- Local LLM (no internet required) for answering questions
- Live visual feedback using OpenCV
- Python 3.8+
- OpenCV
- pytesseract
- llama-cpp-python
- A GGUF LLM model like
mistral-7b-instruct-v0.1.Q4_K_M.gguf - Tesseract OCR installed (with
tesseractaccessible in your PATH)
git clone https://github.com/psycho237-prog/Quizbox-AI-
cd Quizbox-AI-
pip install -r requirements.txtsudo apt update
sudo apt install tesseract-ocr- Update the IP camera URL in
main.py:
ip_camera_url = 'http://your-camera-ip:port/video'- Set the correct path to your local
.ggufmodel:
llm = Llama(model_path="path/to/your-model.gguf")- Run the script:
python main.pyPress q to quit the video window.
You can download the Mistral model (GGUF format) from HuggingFace:
Place it in your project folder or adjust the path accordingly.
QUESTION: What is machine learning?
ANSWER : Machine learning is a subset of AI that enables systems to learn from data...
MIT License. Feel free to use, modify, and share.
Onana Gregoire Legrand