- Neronov Roman
- Fazlyev Albert
KONSPECTO is an intelligent agent based on a local LLM model, offering the following capabilities:
🔍 Search Through Notes
- Semantic search across the notes database
- Generation of structured responses based on the retrieved information
- Ability to view original documents
🎥 Video Processing
- Extraction of keyframes from YouTube videos
- Creation of DOCX documents with images
- Filtering of similar frames
🎤 Voice Input
- Transcription of voice messages using Whisper
- Support for the Russian language
- Ability to combine voice and text input
- ⚛️ React + Vite
- 🎨 TailwindCSS
- 🔄 React Router
- ✨ React Icons
- 🚀 FastAPI
- 🤖 LangChain
- 🔍 LlamaIndex
- 📝 Whisper
- 🎥 OpenCV
- 🗄️ Redis Stack
- Docker and Docker Compose
- Node.js 18+
- Python 3.11+
- Poetry
- pre-commit
- LM Studio - Download from https://lmstudio.ai
- Download and install LM Studio from the official website
- In LM Studio:
- Go to "Search" tab
- Find and download
IlyaGusev/saiga_nemo_12b_gguf/saiga_nemo_12b.Q8_0.ggufmodel - Go to "Local Server" tab
- Select the downloaded model from the dropdown menu
- Start the server (it will run on http://localhost:1234/v1)
- Keep the server running while using KONSPECTO
⚠️ Note: Make sure the LM Studio server is running before starting the application, as KONSPECTO relies on it for text generation.
git clone https://github.com/RomiconEZ/KONSPECTO
cd KONSPECTOCreate configuration files in the backend/app/config/ directory:
.env
FOLDER_ID=your_google_drive_folder_id
GOOGLE_SERVICE_ACCOUNT_KEY_PATH=config/service_account_key.json
TRANSCRIPTION_MODEL=whisper
WHISPER_MODEL_SIZE=large-v3
LLM_STUDIO_BASE_URL=http://localhost:1234/v1
EMBEDDING_MODEL_NAME="intfloat/multilingual-e5-large"
EMBEDDING_BATCH_SIZE=16
EMBEDDING_DIMENSION=1024service_account_key.json
{
// Your Google service account credentials
// Obtain them from the Google Cloud Console
}Frontend:
cd frontend
npm installBackend:
cd backend
poetry installpre-commit install --install-hooks
pre-commit run --all-filesFrontend tests:
cd frontend
npm run testBackend tests:
cd backend
bash tests/run_tests.shdocker compose up --buildThe application will be available at the following addresses:
- Frontend: http://localhost:80
- Backend API: http://localhost:8000
- Redis Stack: http://localhost:8001
-
Information Search
- The user sends a request through the UI
- The agent analyzes the request and determines the necessary tools
- A search is performed across the knowledge base and a response is generated
-
Video Processing
- Uploading a YouTube video
- Extracting frames every 5 seconds
- Filtering similar images
- Creating a DOCX document
-
Voice Input
- Recording audio via the browser
- Transcription using Whisper
- Adding the text to the current query
It is not possible to produce a deterministic assessment of the agent’s performance because its effectiveness depends on the unique data serving as its knowledge base. In our case, this knowledge base consists of user-generated notes, which are different for every individual. Consequently, any quality measurement will vary significantly from one user’s environment to another.
In this project, we tested the agent on two specific documents: one explaining gradient descent and another explaining stochastic gradient descent. The system demonstrated consistent accuracy in retrieving relevant information from these documents during the queries shown in the demo video. However, because user notes can differ in style, depth, and content, the same agent might show varied results when applied to an entirely different set of documents.
This inherent reliance on specialized, user-specific data makes it impossible to generalize the agent’s quality or establish a uniform benchmark. The system’s performance is inseparable from the nuances of the data it is provided with, preventing any deterministic evaluation of its capabilities.
Apache License
If you like the project, give it a star on GitHub!