noah-art3mis/intersect

Intersect - Personalized job matching

Find the job you actually want using AI.

Access here: https://intersect.streamlit.app

Intersect is a web app for job searching that uses NLP to reorder job postings by semantic similarity rather than by keyword matching alone. Unlike lexical search (BM25), which relies on exact word matches, semantic search uses dense vectors to represent meaning (Boykis, 2023; Mitchell, 2019; Schmidt, 2015), providing more personalized results when used with user-provided text. Intersect offers several information retrieval methods (original ranking, semantic search, lexical search, semantic delta, reranking) to enhance job discovery and reduce manual effort.

Intersect uncovers non-obvious job opportunities by enhancing traditional search methods with NLP. The varied outcomes across methods suggest that a hybrid approach, combining keyword, semantic, and reranking techniques, could yield the best results.

It involves:

  • Fetching job listings via APIs (currently Reed API) and vectorizing results with OpenAI's text-embedding-3-small.
  • Capturing user input (text or PDF CV) and reordering results by computing similarity via dot product.
  • Visualizing clusters using UMAP and HDBSCAN.
  • Displaying original ranking from the job API.
  • Reordering results using BM25 (lexical search).
  • Reordering results using semantic search (embedding similarity).
  • Identifying semantic delta (jobs that rank differently between lexical and semantic search).
  • Reranking with Cohere's cross-encoder.
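
The dot-product reordering and the semantic delta listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not Intersect's actual code: the function names, the toy 3-dimensional vectors, and the hard-coded lexical ordering are all assumptions, and real embeddings from text-embedding-3-small have far more dimensions. Defining the delta as a difference in rank positions is also only one possible reading of "jobs that rank differently".

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_by_similarity(query_vec, job_vecs):
    """Return job indices sorted by descending dot product with the query."""
    scores = [dot(query_vec, v) for v in job_vecs]
    return sorted(range(len(job_vecs)), key=lambda i: scores[i], reverse=True)


def semantic_delta(lexical_order, semantic_order):
    """Map each job index to (lexical rank - semantic rank).

    A positive value means semantic search places the job higher
    than lexical search does (hypothetical definition).
    """
    lex_rank = {job: r for r, job in enumerate(lexical_order)}
    sem_rank = {job: r for r, job in enumerate(semantic_order)}
    return {job: lex_rank[job] - sem_rank[job] for job in lex_rank}


# Toy data standing in for a CV embedding and three job embeddings.
cv_vec = [0.6, 0.8, 0.0]
job_vecs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

semantic_order = rank_by_similarity(cv_vec, job_vecs)
lexical_order = [0, 2, 1]  # hypothetical BM25 ordering for the same jobs
delta = semantic_delta(lexical_order, semantic_order)
```

When the vectors are unit length, the dot product equals cosine similarity, so no extra normalization step is needed for ranking.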

Implementation details:

  • web development
    • uv: environment and dependency management
    • streamlit: web framework (frontend and backend) and hosting
    • pypdf: pdf cv parsing
  • data science
    • semantic search: OpenAI's text-embedding-3-small
    • lexical search: bm25s (Lucene method)
      • preprocessing (tokenizer, stemmer, stop words)
    • visualization: UMAP + HDBSCAN (umap-learn, hdbscan)
    • reranker: Cohere's reranking model (rerank-v3.5)
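
To make the lexical search entry more concrete, here is a pure-Python sketch of BM25 scoring in the Lucene variant as it is commonly described (an IDF of log(1 + (N - df + 0.5) / (df + 0.5)) and a term-frequency factor without the (k1 + 1) numerator). It does not use the bm25s API; the function names, the whitespace tokenizer, the toy job titles, and the k1 and b values are assumptions for illustration, and the stemming and stop-word steps are omitted.

```python
import math
from collections import Counter


def tokenize(text):
    """Naive tokenizer: lowercase and split on whitespace (no stemming)."""
    return text.lower().split()


def bm25_lucene_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score every document against the query with Lucene-style BM25."""
    n_docs = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in docs_tokens:
        df.update(set(d))

    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        # Length normalization: longer documents are penalized.
        norm = k1 * (1 - b + b * len(d) / avgdl)
        score = 0.0
        for term in set(query_tokens):
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * tf[term] / (tf[term] + norm)
        scores.append(score)
    return scores


# Hypothetical job titles and query.
jobs = [
    "python data engineer",
    "senior java developer",
    "data analyst python sql",
]
scores = bm25_lucene_scores(tokenize("python data"), [tokenize(j) for j in jobs])
ranking = sorted(range(len(jobs)), key=lambda i: scores[i], reverse=True)
```

In this toy example the first and third titles both contain the two query terms, and the shorter one scores higher because of length normalization; the second title shares no terms with the query and scores zero, which is the kind of miss that semantic search is meant to catch.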

References

About

Job board which uses NLP to find more relevant roles
