This is a demo image search engine powered by BGE-VL and Faiss. It supports two retrieval modes:
- Text-to-Image Retrieval: Search for images using a textual query.
- Image+Text Retrieval: Search for images based on both an image and a textual query.
The demo uses BAAI/BGE-VL-large to extract image embeddings and leverages Faiss for efficient similarity search.
Image embeddings are extracted and stored as a file for fast future retrieval.
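As a rough illustration of the search step, the sketch below ranks stored vectors against a query by inner product after L2 normalization. This is the kind of exhaustive search a flat Faiss index such as IndexFlatIP performs, with normalization left to the caller. It is a dependency-free stand-in, not the code in demo.py: the function names, the toy 3-dimensional vectors, and the cosine-style scoring are all assumptions, since this README does not say which Faiss index or metric the demo uses.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def search(query, index_vectors, k):
    """Return the top-k (position, score) pairs by inner product, best first."""
    q = l2_normalize(query)
    scored = [
        (pos, sum(a * b for a, b in zip(q, l2_normalize(vec))))
        for pos, vec in enumerate(index_vectors)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings" (hypothetical); the real ones are 768-dimensional.
toy_index = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
toy_hits = search([0.9, 0.1, 0.0], toy_index, k=2)
print([pos for pos, _ in toy_hits])
```

In the real demo, Faiss replaces the Python loop, and the stored embedding file avoids re-encoding every image on each run.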
- Install all requirements.
pip install -r requirements.txt
- Modify line 21 of demo.py to specify your device (GPU or CPU).
Note: The model requires ~2GB VRAM for inference.
- Modify line 48 to the folder containing your images. The embeddings for these images will be generated and added to the index.
- Run demo.py with Python. You may use a Jupyter interactive window for a better experience.
Embeddings were generated for 8,319 images. Each embedding is a 768-dimensional vector, requiring ~25MiB of storage for all vectors (~3MiB per 1,000 embeddings). Embedding extraction speed varies by hardware:
- Intel i7-1255U: ~1.5 embeddings/sec.
- NVIDIA RTX 2050: ~8.2 embeddings/sec.
- NVIDIA A800: ~39.1 embeddings/sec.
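The storage and timing figures above can be sanity-checked with a little arithmetic. The sketch below assumes each embedding value is stored as a 4-byte float32, which is typical for Faiss but not stated in this README, and it ignores any file or index overhead. The helper names are hypothetical.

```python
NUM_IMAGES = 8319
DIM = 768
BYTES_PER_VALUE = 4  # assumption: float32 storage


def storage_mib(num_vectors, dim=DIM, bytes_per_value=BYTES_PER_VALUE):
    """Raw size of the embedding matrix in MiB, excluding any file overhead."""
    return num_vectors * dim * bytes_per_value / 2**20


def extraction_minutes(num_images, embeddings_per_sec):
    """Time to embed num_images at a steady throughput, in minutes."""
    return num_images / embeddings_per_sec / 60


total_mib = storage_mib(NUM_IMAGES)
per_thousand_mib = storage_mib(1000)
print(f"all vectors: {total_mib:.1f} MiB, per 1,000: {per_thousand_mib:.2f} MiB")

# Throughput figures are the ones reported above.
for device, rate in [("Intel i7-1255U", 1.5), ("NVIDIA RTX 2050", 8.2), ("NVIDIA A800", 39.1)]:
    print(f"{device}: {extraction_minutes(NUM_IMAGES, rate):.1f} min")
```

Under these assumptions the raw vectors come to about 24.4 MiB, in line with the approximate figures quoted. Embedding the full collection takes on the order of an hour and a half on the CPU, versus a few minutes on the A800.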
For both retrieval modes, I used the same target image (a color wheel of Vocaloid characters) to test the performance.
For Text-to-Image Retrieval, I used the query "rainbow, color wheel, colorful, color circle", and the target image ranked at #79.
This ranking is quite low, with many less relevant images ranked higher, causing me to miss the target image initially.
For Image+Text Retrieval, I used the query "Find a picture that also a color wheel", along with another picture of a color wheel of Vocaloid characters.
In this case, the target image ranked at #12, which is better but still not ideal, as many less relevant images were ranked higher.
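For anyone repeating this comparison, the rank of a known target can be read off the ordered result list instead of scrolling through the results. The sketch below is a minimal helper. The function name and file names are hypothetical, and it assumes the results are already sorted best-first.

```python
def rank_of(target, ranked_results):
    """Return the 1-based rank of target in ranked_results, or None if absent."""
    for rank, item in enumerate(ranked_results, start=1):
        if item == target:
            return rank
    return None


# Hypothetical result list, ordered from most to least similar.
results = ["cat.png", "sunset.jpg", "color_wheel.png", "logo.png"]
target_rank = rank_of("color_wheel.png", results)
print(target_rank)
```

Asking the index for more neighbors than you plan to display (for example, the top 100) makes it possible to report a rank like #79 even when the target falls outside the first page of results.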
This demo was inspired by bd4sur/meme-search. Much of the code is adapted from the official demo of BGE-VL.
This demo was created to help me search for this color wheel of Vocaloid characters. Special thanks to @lapis_jam2 for drawing this colorful work and motivating me to build this demo :)