Named Entity Recognition

Objective: Build a web application that automatically extracts and identifies named entities in text using Natural Language Processing (NLP) and Machine Learning (ML) models. The system detects entities such as persons, organizations, locations, products, dates, and monetary values, and also provides analytics, sentiment analysis, contextual categorization, and multi-document processing. Features include text highlighting, entity aggregation, and multi-document analysis. The backend uses spaCy and HuggingFace Transformers to handle news, business, academic, and social media text.

Entity Extraction

People, organizations, countries, cities, dates, money, events, products, etc.
Color-coded highlighting in extracted text
Interactive sidebar with click-to-highlight feature
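
As a rough illustration of the color-coded highlighting, the sketch below wraps entity spans in colored <mark> tags. It assumes entities arrive as non-overlapping (start, end, label) character offsets; the `LABEL_COLORS` palette and the `highlight` function are hypothetical names, not taken from the project.

```python
from html import escape

# Hypothetical palette; the project's actual colors are not documented here.
LABEL_COLORS = {"PERSON": "#fde68a", "ORG": "#bfdbfe", "GPE": "#bbf7d0"}
DEFAULT_COLOR = "#e5e7eb"


def highlight(text, entities):
    """Wrap each (start, end, label) span in a colored <mark> tag.

    Spans are assumed to be non-overlapping character offsets into `text`.
    """
    parts, cursor = [], 0
    for start, end, label in sorted(entities):
        parts.append(escape(text[cursor:start]))  # plain text before the span
        color = LABEL_COLORS.get(label, DEFAULT_COLOR)
        parts.append(
            f'<mark style="background:{color}" title="{escape(label)}">'
            f"{escape(text[start:end])}</mark>"
        )
        cursor = end
    parts.append(escape(text[cursor:]))  # trailing plain text
    return "".join(parts)


marked = highlight("Tim Cook leads Apple.", [(15, 20, "ORG"), (0, 8, "PERSON")])
print(marked)
```

Escaping the text before inserting tags keeps user-supplied input from being interpreted as markup in the browser.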

Advanced Features

Sentiment Analysis
Contextual Categorization (e.g., grouping similar entities)
Analytical Dashboard with charts & entity frequency
Model Switching (spaCy small / Transformer model)
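
The entity frequency view on the dashboard could be fed by a small aggregation step like the one below. This is a minimal sketch, assuming entities are available as (text, label) pairs; `entity_frequencies` and the case-insensitive matching rule are illustrative choices, not the project's documented behavior.

```python
from collections import Counter


def entity_frequencies(entities):
    """Count (text, label) pairs, ignoring case and surrounding whitespace."""
    counts = Counter((text.strip().lower(), label) for text, label in entities)
    # most_common() orders by count, descending; ties keep first-seen order.
    return [
        {"text": text, "label": label, "count": n}
        for (text, label), n in counts.most_common()
    ]


rows = entity_frequencies(
    [("Apple", "ORG"), ("apple", "ORG"), ("Paris", "GPE"), ("Apple", "ORG")]
)
print(rows)
```

The same function works for multi-document analysis if the entity lists of all documents are concatenated before counting.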

Backend

Python 3.10+
FastAPI
spaCy NER / HuggingFace Transformers
Uvicorn

Frontend

React 18
Vite
Tailwind CSS
Axios

Prerequisites

Install: Python, Node.js, npm, Git

Setup

1. Clone Repo

git clone https://github.com/YOUR_USER_NAME/ner-ml-project.git
cd ner-ml-project

2. Backend

cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

3. Download a NER Model

Small Model:

python -m spacy download en_core_web_sm

Transformer Model:

python -m spacy download en_core_web_trf

4. Start Backend

uvicorn app.main:app --reload --port 8000

Backend URL: http://localhost:8000
Health check: http://localhost:8000/health

5. Frontend

cd ../frontend  # run from the backend directory, in a second terminal
npm install
npm run dev

App URL: http://localhost:5173

Datasets

CoNLL-2003

Entity types: PER, ORG, LOC, MISC

WNUT-17

3,394 social media texts
Noisy, real-world anomalies
Useful for informal language

OntoNotes 5.0

Large, multi-genre corpus
18 entity categories

Model

NER Pipeline

Tokenization
Feature representation
Named Entity Recognition
Post-processing
JSON output
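
The last two pipeline stages can be sketched without any NLP dependency. The example assumes the recognition step yields dicts with `text`, `label`, `start`, and `end` keys (with spaCy these could be built from `ent.text`, `ent.label_`, `ent.start_char`, and `ent.end_char` for each `ent` in `doc.ents`). The `postprocess` and `to_json` helpers and the output schema are hypothetical.

```python
import json


def postprocess(entities):
    """Drop empty spans and exact duplicates, then sort by position."""
    seen, cleaned = set(), []
    for ent in entities:
        key = (ent["start"], ent["end"], ent["label"])
        if ent["end"] <= ent["start"] or key in seen:
            continue
        seen.add(key)
        cleaned.append(ent)
    return sorted(cleaned, key=lambda e: (e["start"], e["end"]))


def to_json(text, entities):
    """Serialize the text and its cleaned entities as a JSON string."""
    payload = {"text": text, "entities": postprocess(entities)}
    return json.dumps(payload, ensure_ascii=False)


raw = [
    {"text": "Paris", "label": "GPE", "start": 13, "end": 18},
    {"text": "Ada", "label": "PERSON", "start": 0, "end": 3},
    {"text": "Ada", "label": "PERSON", "start": 0, "end": 3},  # duplicate
]
result = json.loads(to_json("Ada moved to Paris.", raw))
print(result["entities"])
```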

Models

spaCy en_core_web_sm

Fast, lightweight
Suitable for real-time UI

spaCy transformer en_core_web_trf

RoBERTa-based
Higher accuracy (~91% F1)
Higher latency
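
Switching between the two models might look like the sketch below. The package names come from the setup steps and `spacy.load` is spaCy's standard loader; the short keys, `MODEL_NAMES`, `resolve_model`, and `get_pipeline` are hypothetical names chosen for illustration.

```python
from functools import lru_cache

# Package names come from the setup steps; the short keys are hypothetical.
MODEL_NAMES = {"small": "en_core_web_sm", "transformer": "en_core_web_trf"}


def resolve_model(choice: str) -> str:
    """Map a UI-facing key to a spaCy package name, rejecting unknown keys."""
    try:
        return MODEL_NAMES[choice]
    except KeyError:
        raise ValueError(
            f"unknown model {choice!r}; expected one of {sorted(MODEL_NAMES)}"
        ) from None


@lru_cache(maxsize=None)
def get_pipeline(choice: str):
    """Load a pipeline once per key and reuse it on later calls."""
    import spacy  # imported lazily so the mapping works without spaCy installed

    return spacy.load(resolve_model(choice))
```

Caching the loaded pipeline matters most for the transformer model, since loading it on every request would add to its already higher latency.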

Custom Model Use

import spacy
nlp = spacy.load("./custom_ner_model")

Result

Performance Metrics

Model	Precision	Recall	F1 Score	Latency
spaCy small	89%	90%	89.5%	120 ms
Transformer	92%	91%	91.5%	430 ms
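
The F1 column is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short check below recomputes it from the table's precision and recall values; `f1_score` is an illustrative helper, not part of the project.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


for name, p, r in [("spaCy small", 0.89, 0.90), ("Transformer", 0.92, 0.91)]:
    print(f"{name}: F1 = {f1_score(p, r):.1%}")
```

Rounded to one decimal place, both results agree with the table (89.5% and 91.5%).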

Achievements

~90% F1 Score
<500 ms average latency
Supports documents up to 10k words
