On-device emoji semantic lookup CLI powered by vector search
It's a simple CLI that uses a quantized ONNX build of sentence-transformers/all-MiniLM-L6-v2 and vector search powered by sqlite-vec to help you find the emoji you are looking for.
Install directly from GitHub:
uv tool install https://github.com/subpath/Emji.git
emji wave

Or install from a local clone:
git clone https://github.com/subpath/Emji.git
cd Emji
uv tool install -e .
emji wave

All data is stored under ~/.emji:
- Model file: ~/.emji/model_qint8_arm64.onnx
- Default emoji data: ~/.emji/shortnames.json
- Optional override data: ~/.emji/shortnames_override.json (takes precedence if present)
- Vector index: ~/.emji/emoji_index.db
- Config: ~/.emji/.config
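
The override precedence in the list above can be sketched in Python. This is a minimal illustration, not Emji's actual code: the helper name `resolve_emoji_data` is hypothetical, and only the directory and file names come from the list.

```python
from pathlib import Path

# Directory and file names are taken from the list above.
EMJI_HOME = Path.home() / ".emji"


def resolve_emoji_data(home: Path = EMJI_HOME) -> Path:
    """Hypothetical helper: choose which emoji data file to load.

    The override file wins whenever it exists; otherwise the default
    shortnames file is used.
    """
    override = home / "shortnames_override.json"
    default = home / "shortnames.json"
    return override if override.is_file() else default
```

Calling `resolve_emoji_data()` with no arguments checks `~/.emji`; passing another directory makes the helper easy to exercise in tests.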
On first run, the CLI will automatically download:
- The emoji data from this gist
- The quantized ONNX model from sentence-transformers/all-MiniLM-L6-v2
The CLI downloads the quantized ONNX model from sentence-transformers/all-MiniLM-L6-v2 automatically when needed. You can also use any other bi-encoder that fits your device by replacing the ONNX file. If your model requires different prompts for encoding a query and a searched entity (emojis, in our case), you will need to modify the source code.
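
As a rough idea of what such a modification might involve, the sketch below prepends a role-specific prompt before text is encoded. It is an assumption-laden illustration, not part of Emji: `prepare_text` and the two constants are hypothetical names, and the `query: ` / `passage: ` prefixes are the ones used by the E5 model family, shown only as an example of an asymmetric bi-encoder.

```python
from typing import Optional

# Hypothetical prompt prefixes (assumption). E5-style models expect these;
# all-MiniLM-L6-v2 needs none, so both would be None for the default model.
QUERY_PREFIX: Optional[str] = "query: "
ENTITY_PREFIX: Optional[str] = "passage: "


def prepare_text(text: str, is_query: bool) -> str:
    """Prepend the role-specific prompt before the text is tokenized."""
    prefix = QUERY_PREFIX if is_query else ENTITY_PREFIX
    return f"{prefix or ''}{text.strip()}"
```

With a model like this, the index must be rebuilt after the swap, because emoji descriptions encoded without the entity prefix would no longer be comparable to prefixed queries.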
A tiny, dedicated fine-tuned MiniLM is coming soon.
- Search for emojis (the CLI will automatically download dependencies and build the index if needed):
emji happy birthday
emji coffee break
emji celebration party
- Select and copy: Choose from the interactive list, and the emoji will be copied to your clipboard!
- Force rebuild index (if needed):
emji --build-index
- Clean up all Emji data (removes ~/.emji):
emji --cleanup

- emji <text>: Search for emojis matching your description
- emji --build-index: Force rebuild the semantic search index
- emji --cleanup: Delete all Emji data and config under ~/.emji
- emji --show-stats: Show emoji popularity statistics

Options:
--n <number>: Number of results to return (default: 3). Also limits rows for --show-stats.
# Find celebration emojis
emji party celebration
# Find food-related emojis
emji delicious food
# Find weather emojis
emji sunny day
# Get more results
emji animals --n 5
# Force rebuild the index
emji --build-index

- Automatic Setup: On first run, the CLI automatically downloads the required model and emoji data
- Embedding Generation: Uses a quantized version of sentence-transformers/all-MiniLM-L6-v2 to convert emoji names and descriptions into 384-dimensional vectors
- Vector Database: Stores embeddings in SQLite with the sqlite-vec extension for fast similarity search
- Semantic Matching: When you search, your query is converted to an embedding and compared against all emoji embeddings
- Personalized Re-ranking: Results are re-ranked using a blend of cosine similarity and historical click-through rates (CTR), controlled by ALPHA in the config. CTR impact increases with impressions and is discounted by rank.
- Interactive Selection: The best matches are presented in an interactive menu for you to choose from
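
The matching and re-ranking steps above can be illustrated with a small pure-Python sketch. The exact formula Emji uses is not documented here, so everything below is an assumption: the `Candidate` class, the `prior` smoothing constant, the `impressions / (impressions + prior)` confidence weight, and the `1 / log2(rank + 2)` rank discount are hypothetical choices that merely satisfy the description (a blend controlled by ALPHA, CTR influence growing with impressions, and a discount by rank).

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    emoji: str
    cosine: float          # similarity between query and emoji embeddings
    clicks: int = 0        # times the user picked this emoji
    impressions: int = 0   # times this emoji was shown


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(candidates, alpha=0.2, prior=5.0):
    """Blend cosine similarity with CTR (assumed formula, not Emji's own)."""
    by_cosine = sorted(candidates, key=lambda c: c.cosine, reverse=True)
    scored = []
    for rank, cand in enumerate(by_cosine):
        ctr = cand.clicks / cand.impressions if cand.impressions else 0.0
        # More impressions -> more trust in the observed CTR.
        confidence = cand.impressions / (cand.impressions + prior)
        # Lower-ranked candidates get a smaller CTR boost.
        discount = 1.0 / math.log2(rank + 2)
        score = (1 - alpha) * cand.cosine + alpha * ctr * confidence * discount
        scored.append((score, cand))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [cand for _, cand in scored]
```

With `alpha=0` the order is purely semantic; raising `alpha` lets a frequently chosen emoji overtake a slightly closer semantic match.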
A JSON config is stored at ~/.emji/.config and is created on first run:
- ALPHA (float): Balance between cosine similarity and CTR in ranking (default: 0.2)
- MODEL_URL (string): URL to download the ONNX model
- EMOJI_URL (string): URL to download the emoji shortnames JSON
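
For scripts that want to inspect these settings, reading the config might look like the sketch below. It is illustrative only: `load_config` is a hypothetical helper, the only default it knows is the documented ALPHA value of 0.2, and the check that ALPHA lies between 0 and 1 is an assumption about what a sensible blend weight looks like.

```python
import json
from pathlib import Path

CONFIG_PATH = Path.home() / ".emji" / ".config"

# Only ALPHA's default is documented; the URL keys come from the file itself.
DEFAULTS = {"ALPHA": 0.2}


def load_config(path: Path = CONFIG_PATH) -> dict:
    """Hypothetical helper: read the JSON config, falling back to defaults."""
    config = dict(DEFAULTS)
    if path.is_file():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    alpha = float(config["ALPHA"])
    # Assumption: a blend weight outside [0, 1] is treated as a mistake.
    if not 0.0 <= alpha <= 1.0:
        raise ValueError(f"ALPHA must be between 0 and 1, got {alpha}")
    config["ALPHA"] = alpha
    return config
```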
To override the emoji dataset entirely, place your file at ~/.emji/shortnames_override.json (it takes precedence over the default).
# create virtual env
uv venv
# activate
source .venv/bin/activate
# sync dependencies
uv sync --all-groups
# Set up pre-commit hooks
pre-commit install

Licensed under the Apache License 2.0. See the repository's license file for details.
Contributions are welcome! Please feel free to submit a Pull Request.