A modification of the CX_DB8 project that refactors the code, adds a user interface, and adds minor functional features.
- Asymmetric or symmetric semantic search, with average attention computed from n-gram scores using a sliding window algorithm.
- Plug-and-Play Retriever Model & Reranker Model.
- Text, web, and PDF input; text and PDF output.
- Output Highlighter.
- Processing time & Score statistics.
- Caching to speed up reprocessing (click the "Git remote repository sync" button or rerun the notebook cell to clear unused data in RAM after repeated unique processing).
- Raw results for inspecting.
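
The sliding-window scoring named in the feature list can be illustrated with a minimal sketch. It assumes that each n-gram window is scored against the query and that a token's score is the average over every window containing it. The names `sliding_windows`, `average_token_scores`, and `overlap_score` are hypothetical, and the word-overlap scorer is only a toy stand-in for a real retriever model; this is not necessarily how the repository implements it.

```python
from typing import Callable, List, Sequence


def sliding_windows(tokens: Sequence[str], size: int) -> List[range]:
    """Return the index range of every window of `size` consecutive tokens."""
    if size <= 0:
        raise ValueError("size must be positive")
    if len(tokens) <= size:
        # Text shorter than one window: use a single window over all tokens.
        return [range(len(tokens))] if tokens else []
    return [range(start, start + size) for start in range(len(tokens) - size + 1)]


def average_token_scores(
    tokens: Sequence[str],
    query: str,
    size: int,
    score_fn: Callable[[str, str], float],
) -> List[float]:
    """Average, per token, the scores of all windows that contain that token."""
    totals = [0.0] * len(tokens)
    counts = [0] * len(tokens)
    for window in sliding_windows(tokens, size):
        score = score_fn(query, " ".join(tokens[i] for i in window))
        for i in window:
            totals[i] += score
            counts[i] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


def overlap_score(query: str, passage: str) -> float:
    """Toy scorer (assumption): Jaccard overlap of lowercase word sets."""
    q, p = set(query.lower().split()), set(passage.lower().split())
    return len(q & p) / len(q | p) if q | p else 0.0


tokens = "the cat sat on the mat".split()
scores = average_token_scores(tokens, "cat mat", 2, overlap_score)
```

Because `score_fn` is just a callable, a different scorer can be swapped in without touching the windowing logic, which is one way to read the "plug-and-play" idea above.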
- @Hellisotherpeople (Base idea and implementation)
- @muazhari (Modification)
- Get your ngrok Authentication Token.
- Create a cell from the Jupyter Notebook script below in Kaggle or an alternative notebook service.
#@title Semantic Search App
NGROK_TOKEN = "" #@param {type:"string"}
%cd ~
!git clone https://github.com/muazhari/semantic-search.git
%cd ~/semantic-search/
!git fetch --all
!git reset --hard origin
!apt-get update -y
!yes | DEBIAN_FRONTEND=noninteractive apt-get install -yqq wkhtmltopdf xvfb libopenblas-dev libomp-dev poppler-utils openjdk-8-jdk jq
!pip install -r requirements.txt
!pip install pyngrok
!nvidia-smi
get_ipython().system_raw(f'ngrok authtoken {NGROK_TOKEN}')
get_ipython().system_raw('ngrok http 8501 &')
print("Open public URL:")
# Give ngrok a few seconds to start before querying its local API.
!sleep 5
!curl -s http://localhost:4040/api/tunnels | jq ".tunnels[0].public_url"
!streamlit run ~/semantic-search/app.py
!sleep 10000000
- Submit your ngrok Authentication Token to the `NGROK_TOKEN` field in the cell form.
- Enable GPU in the Notebook.
- Run the cell.
- Wait until the setup finishes.
- Open ngrok public URL.
- Use the app.
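
If the public URL is not printed, ngrok may not have been ready when the script queried it. The following sketch polls the same local endpoint that the `curl` command in the script uses and reads the same `tunnels[0].public_url` field. The function names, retry count, and delay are assumptions for illustration and are not part of the repository.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Optional

# Same local endpoint that the curl command in the script queries.
NGROK_API = "http://localhost:4040/api/tunnels"


def first_public_url(payload: str) -> Optional[str]:
    """Extract tunnels[0].public_url from the JSON payload, if present."""
    tunnels = json.loads(payload).get("tunnels") or []
    return tunnels[0].get("public_url") if tunnels else None


def wait_for_public_url(attempts: int = 10, delay: float = 1.0) -> Optional[str]:
    """Poll the local ngrok API until a tunnel URL appears or attempts run out."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(NGROK_API, timeout=5) as response:
                url = first_public_url(response.read().decode("utf-8"))
            if url:
                return url
        except (urllib.error.URLError, OSError, ValueError):
            pass  # ngrok is not listening yet, or returned an unexpected body
        time.sleep(delay)
    return None
```

Calling `print(wait_for_public_url())` in a separate cell after ngrok has been started should print the URL, or `None` if no tunnel appeared in time.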
- This repository has not yet been peer reviewed, so be careful when using it.