A Telegram bot that helps refugees find relevant information from multiple sources.
This project aims to create a system that:
- Collects information from refugee-related websites and Telegram groups
- Processes and indexes this information for efficient retrieval
- Provides a Telegram bot interface for refugees to query this information
- Offers an admin interface for managing and updating the information
The system uses a functional approach with clear separation of concerns:
- Storage Layer: MongoDB for document storage and Milvus for vector embeddings
- Processing Layer: Text chunking, embedding generation, and search
- Interface Layer: Telegram bot and admin interface
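As an illustration of the processing layer, text chunking could look roughly like the sketch below. This is a minimal example under stated assumptions: the function name `chunk_text`, the word-based splitting, and the `chunk_size` and `overlap` parameters are hypothetical and are not taken from the project's `chunking.py`, which may chunk by tokens or sentences instead.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into overlapping chunks of at most `chunk_size` words.

    Hypothetical helper for illustration only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

Overlapping windows are a common choice because a sentence cut at a chunk boundary still appears whole in at least one chunk, which tends to help retrieval.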
Prerequisites:
- Python 3.8+
- Docker and Docker Compose
- VoyageAI API key (for embeddings)
- Telegram API credentials (for scraping Telegram groups)
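Before starting anything, it can help to confirm that the credentials listed above are actually present in the environment. The sketch below shows one way to do that. The variable names `VOYAGE_API_KEY`, `TELEGRAM_API_ID`, and `TELEGRAM_API_HASH` are assumptions made for illustration; the names the project really expects are defined in `.env.example`.

```python
import os
from typing import Iterable, List, Mapping, Optional

# Hypothetical variable names; check .env.example for the real ones.
REQUIRED_VARS = ("VOYAGE_API_KEY", "TELEGRAM_API_ID", "TELEGRAM_API_HASH")


def missing_env_vars(required: Iterable[str] = REQUIRED_VARS,
                     env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names in `required` that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing credentials:", ", ".join(missing))
    else:
        print("All required credentials are set.")
```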
The project uses Docker Compose to manage the following services:
- MongoDB: Document database for storing content
  - Port: 27017
  - Admin UI: Mongo Express on port 8081
- Milvus: Vector database for storing embeddings
  - Port: 19530
  - Admin UI: Attu on port 8000
To set up the project:
- Clone the repository

      git clone https://github.com/your-username/ino-refugee-bot.git
      cd ino-refugee-bot

- Create a virtual environment and install dependencies

      python -m venv venv
      source venv/bin/activate  # On Windows: venv\Scripts\activate
      pip install -r requirements.txt

- Set up environment variables (create a .env file)

      # Copy the example file
      cp .env.example .env
      # Edit the file to add your credentials
      nano .env  # or use any text editor

- Start the database services

      docker compose up -d

- Set up the database collections

      python main.py setup
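If the setup step fails, a quick first check is whether the services are listening on the ports listed earlier. The following sketch uses only the standard library. The helper name `is_port_open` is hypothetical, and `localhost` assumes the containers run on the same machine as the script.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports as documented in this README; the host is an assumption.
    services = {"MongoDB": 27017, "Mongo Express": 8081, "Milvus": 19530, "Attu": 8000}
    for name, port in services.items():
        status = "reachable" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} (port {port}): {status}")
```

A successful TCP connection only shows that something is listening on the port; it does not prove that the database has finished initializing.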
The admin interfaces are available at:
- Mongo Express: http://your-server-ip:8081
  - Username and password are configured in the .env file
- Milvus Attu: http://your-server-ip:8000
To index documents from a JSON file:

    python main.py index --file data/documents.json --text-field content

To start the Telegram bot:

    python main.py bot

To start the admin interface:

    python main.py admin

Project structure:

refugee-bot/
├── app/ # Application code
│ ├── storage/ # Storage abstractions
│ │ ├── interfaces.py # Storage interfaces
│ │ ├── mongodb.py # MongoDB implementation
│ │ └── milvus.py # Milvus implementation
│ ├── processing/ # Text processing
│ │ ├── chunking.py # Text chunking functions
│ │ └── embedding.py # Embedding generation
│ └── indexer.py # Document indexing
├── data/ # Data storage
├── tests/ # Test files
├── docker-compose.yml # Docker Compose configuration
├── main.py # Main entry point
└── requirements.txt # Python dependencies
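The commands shown earlier (`setup`, `index`, `bot`, `admin`) suggest that `main.py` dispatches on a subcommand. The sketch below shows how such an entry point might be wired with `argparse`. It is not the project's actual `main.py`: the function names `build_parser`, `load_texts`, and `main` are hypothetical, the handlers are placeholders, and the JSON file is assumed to be a list of objects in which `--text-field` names the key that holds each document's text.

```python
import argparse
import json
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the four subcommands mentioned in this README."""
    parser = argparse.ArgumentParser(prog="main.py")
    sub = parser.add_subparsers(dest="command", required=True)

    sub.add_parser("setup", help="Create the database collections")
    index = sub.add_parser("index", help="Index documents from a JSON file")
    index.add_argument("--file", required=True, help="Path to the JSON file")
    index.add_argument("--text-field", default="content",
                       help="Key that holds each document's text")
    sub.add_parser("bot", help="Start the Telegram bot")
    sub.add_parser("admin", help="Start the admin interface")
    return parser


def load_texts(path: str, text_field: str) -> List[str]:
    """Read a JSON array of objects and return the non-empty text fields.

    Assumes the file is a list of dicts; the real layout may differ.
    """
    with open(path, encoding="utf-8") as handle:
        documents = json.load(handle)
    return [doc[text_field] for doc in documents if doc.get(text_field)]


def main(argv: Optional[List[str]] = None) -> None:
    """Parse arguments and dispatch to a (placeholder) handler."""
    args = build_parser().parse_args(argv)
    if args.command == "index":
        texts = load_texts(args.file, args.text_field)
        print(f"Loaded {len(texts)} documents from {args.file}")
    else:
        # Placeholder: the real handlers live in the project's own modules.
        print(f"Would run the '{args.command}' command here")
```

For example, `main(["index", "--file", "data/documents.json", "--text-field", "content"])` mirrors the indexing command shown earlier. Calling `main()` under an `if __name__ == "__main__":` guard would connect the sketch to the real command line.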
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License.