DeepMatch is a pipeline for matching resumes to job descriptions using Named Entity Recognition (NER) and transformer-based embeddings. It extracts structured data, computes semantic similarity, and supports automated candidate screening. 🚀
- Overview
- Project Structure
- Features
- Example Outputs
- Setup Instructions
- Data Notes
- Use Cases
- Tech Stack
- License
- Author
- Acknowledgements
DeepMatch uses NLP models to extract structured entities (e.g., skills, experience, degrees) from resumes and job descriptions, then compares them using dense vector embeddings. It supports semantic matching, skill relevance scoring, and automated candidate ranking, which makes it a fit for HR automation workflows.
Extract structured information from resumes and job descriptions with high accuracy.
Models Used:
- spaCy
  - en_core_web_sm (pretrained, lightweight)
  - en_core_web_trf (transformer-based, high accuracy)
  - Custom-trained spaCy model on resume NER data
- Hugging Face Transformers
  - bert-base-cased
  - distilbert-base-uncased
Entities Extracted:
- Name
- Phone
- Location
- Degree
- Designation
- Company
- Years of Experience
- Skills
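To make the idea of "structured" output concrete, here is a minimal sketch of how NER spans could be grouped into a per-document record. It is not the project's actual code: the label strings in `ENTITY_LABELS` and the helper `group_entities` are hypothetical names chosen for illustration, and the input is assumed to be `(text, label)` pairs such as those you would get from `[(ent.text, ent.label_) for ent in doc.ents]` in spaCy.

```python
from collections import defaultdict

# Hypothetical label names for the eight entity types listed above;
# the labels used by the trained models may differ.
ENTITY_LABELS = {
    "NAME", "PHONE", "LOCATION", "DEGREE",
    "DESIGNATION", "COMPANY", "YEARS_OF_EXPERIENCE", "SKILLS",
}


def group_entities(spans):
    """Group (text, label) pairs into {label: [unique texts, first-seen order]}."""
    grouped = defaultdict(list)
    for text, label in spans:
        # Collapse stray whitespace, which is common after PDF/DOCX extraction.
        cleaned = " ".join(text.split())
        if label in ENTITY_LABELS and cleaned and cleaned not in grouped[label]:
            grouped[label].append(cleaned)
    return dict(grouped)


# Example spans, shaped like spaCy's (ent.text, ent.label_) pairs.
spans = [
    ("Jane Doe", "NAME"),
    ("Python", "SKILLS"),
    ("SQL", "SKILLS"),
    ("Python", "SKILLS"),                   # duplicate mention, dropped
    ("B.Sc.  Computer Science", "DEGREE"),  # extra space is normalized
    ("Monday", "DATE"),                     # label outside the schema, ignored
]
profile = group_entities(spans)
print(profile)
```

Keeping one list per entity type makes the later per-entity and combined embedding steps straightforward, since each type can be embedded on its own or joined with the others.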
Converts entity-level text into dense vectors for semantic comparison.
Models Supported:
- all-MiniLM-L6-v2
- paraphrase-MiniLM-L12-v2
- sentence-t5-base
- sentence-t5-large
Embedding Modes:
- Per-Entity: Individual embeddings for each entity
- Combined: Joint embeddings for concatenated entities
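The difference between the two modes can be sketched as follows. This is an illustration under assumptions, not the repository's implementation: the function names, the way entity values are joined into text, and the `toy_embed` stand-in are all hypothetical. In practice `embed` would likely be something like the `encode` method of a SentenceTransformers model; the toy embedder is used here only so the sketch runs without downloading a model.

```python
from typing import Callable, Dict, List

Vector = List[float]
Entities = Dict[str, List[str]]


def per_entity_embeddings(entities: Entities, embed: Callable[[str], Vector]) -> Dict[str, Vector]:
    """Per-Entity mode: one vector per entity type, from that type's values only."""
    return {
        label: embed(", ".join(values))
        for label, values in entities.items()
        if values  # skip entity types with nothing extracted
    }


def combined_embedding(entities: Entities, embed: Callable[[str], Vector]) -> Vector:
    """Combined mode: a single vector for all entity text joined together."""
    parts = [
        f"{label}: {', '.join(values)}"
        for label, values in sorted(entities.items())
        if values
    ]
    return embed("; ".join(parts))


def toy_embed(text: str) -> Vector:
    """Stand-in embedder (letter counts a-z); replace with a real model."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


resume = {"SKILLS": ["Python", "SQL"], "DEGREE": ["BSc Computer Science"], "COMPANY": []}
per_entity = per_entity_embeddings(resume, toy_embed)
combined = combined_embedding(resume, toy_embed)
print(sorted(per_entity))  # entity types that received a vector
print(len(combined))       # dimensionality of the single profile vector
```

Per-entity vectors allow skills to be compared against skills and degrees against degrees, while the combined vector gives one representation of the whole profile.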
Measures alignment between resumes and job descriptions.
Metrics:
- Cosine Similarity (default)
- Dot Product (alternative)
- Euclidean Distance (optional)
Scoring Options:
- Per-entity similarity for granular insights
- Joint profile-level comparison for overall match
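The three metrics and the two scoring options can be sketched in plain Python as below. This is a minimal illustration, not the project's code (which lists scikit-learn and SciPy for this step). The helper names, the zero-vector handling, and the toy vectors are assumptions. Note that Euclidean distance is a distance, so lower values mean a closer match, whereas higher is better for cosine similarity and dot product.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products; sensitive to vector length."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    norm = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    return dot_product(a, b) / norm if norm else 0.0


def euclidean_distance(a, b):
    """Straight-line distance; lower means closer."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def per_entity_scores(resume_vecs, job_vecs, metric=cosine_similarity):
    """Per-entity scoring: compare each entity type present on both sides."""
    shared = resume_vecs.keys() & job_vecs.keys()
    return {label: metric(resume_vecs[label], job_vecs[label]) for label in sorted(shared)}


# Toy per-entity vectors (real embeddings have hundreds of dimensions).
resume_vecs = {"SKILLS": [3.0, 4.0], "DEGREE": [1.0, 0.0]}
job_vecs = {"SKILLS": [4.0, 3.0], "DEGREE": [0.0, 1.0], "LOCATION": [1.0, 1.0]}
scores = per_entity_scores(resume_vecs, job_vecs)
print(scores)

# Joint profile-level comparison: one combined vector per document.
resume_profile = [1.0, 2.0, 2.0]
job_profile = [2.0, 1.0, 2.0]
joint = cosine_similarity(resume_profile, job_profile)
print(round(joint, 4))
```

Passing `metric=dot_product` or `metric=euclidean_distance` to `per_entity_scores` swaps the metric without changing the rest of the scoring code.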
- Sample NER Output
- Similarity Score Heatmap
(Files available in the output/ folder.)
Get DeepMatch up and running with these simple steps:
git clone https://github.com/prakadeesh01/deepmatch.git
cd deepmatch
pip install -r requirements.txt
jupyter notebook
Input:
Place resumes and job descriptions in the data/ folder.
Supported Formats: .pdf, .docx
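As a small illustration of the input convention, the sketch below collects the supported files from the data/ folder. The helper name `find_documents`, the recursive search, and the case-insensitive extension check are assumptions made for this example; the notebooks may load files differently.

```python
from pathlib import Path

SUPPORTED_SUFFIXES = {".pdf", ".docx"}


def find_documents(data_dir="data"):
    """Return supported files under data_dir (searched recursively), sorted by path."""
    root = Path(data_dir)
    if not root.is_dir():
        return []  # nothing to process if the folder is missing
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_SUFFIXES
    )


for path in find_documents():
    print(path)
```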
Output:
NER results, embeddings, and similarity scores are saved in output/.
Privacy:
To protect personal information, most of the repository contains no real resume data.
DeepMatch powers a range of HR and recruitment solutions:
- ✅ Resume Screening Systems: Automate candidate evaluation with precision.
- ✅ Job Recommendation Engines: Match candidates to ideal roles.
- ✅ Candidate–Job Fit Matching: Rank candidates by semantic alignment.
- ✅ Automated Skill Gap Analysis: Identify areas for upskilling.
- Languages: Python 3.9+
- NER: spaCy, Hugging Face Transformers
- Embeddings: SentenceTransformers, T5
- Similarity Metrics: scikit-learn, SciPy
- Environment: Jupyter Notebooks, VS Code
This project is licensed under the MIT License.
Prakadeesh K S
GitHub: @prakadeesh01
- spaCy for robust NER capabilities
- Hugging Face Transformers for pretrained language models
- SentenceTransformers for efficient semantic embeddings