noxhex/PoC-ML-AI

PoC AI — Text Classifier + API

This is a proof-of-concept AI/ML project that trains a simple text classifier, in the style of support-ticket triage, on a subset of the 20 Newsgroups dataset, then serves predictions through a FastAPI HTTP service.

What it does

  • Trains a TF‑IDF + Logistic Regression model to classify text into one of 4 categories:
    • comp.sys.mac.hardware
    • rec.autos
    • sci.med
    • talk.politics.misc
  • Exposes a POST /predict endpoint that takes free‑text and returns the predicted label and class probabilities.

This is intended as a compact, inspectable PoC you can extend (swap datasets, add pre/post‑processing, log metrics, etc.).
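As a rough illustration of the TF-IDF + Logistic Regression approach described above, here is a minimal sketch using scikit-learn. The `build_pipeline` name, the hyperparameters, and the tiny in-memory corpus are assumptions made for the example. They are not taken from `src/model.py`, which may be organized differently.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def build_pipeline():
    """Hypothetical factory: TF-IDF features feeding a logistic regression."""
    return make_pipeline(
        TfidfVectorizer(stop_words="english"),
        LogisticRegression(max_iter=1000),
    )


# Tiny made-up corpus standing in for the 20 Newsgroups subset.
texts = [
    "my mac will not boot after the memory upgrade",
    "external display flickers on my powerbook",
    "the car engine stalls when the brakes are applied",
    "which tires are best for a sedan in winter",
]
labels = [
    "comp.sys.mac.hardware",
    "comp.sys.mac.hardware",
    "rec.autos",
    "rec.autos",
]

model = build_pipeline()
model.fit(texts, labels)

query = ["my mac freezes with an external display"]
probabilities = model.predict_proba(query)[0]       # one probability per class
scores = dict(zip(model.classes_, probabilities))   # class name -> probability
predicted = model.predict(query)[0]
```

The real training run uses four classes and far more text. The sketch only shows how the vectorizer and classifier chain together and how per-class probabilities can be read back.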

Quickstart

1) Setup

python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements.txt

2) Train

python -m src.train

This downloads (via scikit‑learn) a subset of 20 Newsgroups, trains a model, prints a report, and writes:

  • artifacts/model.joblib
  • artifacts/labels.json
  • artifacts/metrics.json
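The two JSON artifacts could be written along the following lines. This is a sketch only: the `save_metadata` helper, the list format for `labels.json`, and the metrics keys are assumptions, and the actual layout produced by `src/train.py` may differ. The model file itself would typically be written separately, for example with `joblib.dump`.

```python
import json
import tempfile
from pathlib import Path


def save_metadata(out_dir, labels, metrics):
    """Hypothetical helper: write labels.json and metrics.json into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "labels.json").write_text(json.dumps(labels, indent=2))
    (out / "metrics.json").write_text(json.dumps(metrics, indent=2))
    return out


labels = ["comp.sys.mac.hardware", "rec.autos", "sci.med", "talk.politics.misc"]

# Demo against a temporary directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    # The metric value is a placeholder, not a real result.
    out = save_metadata(tmp, labels, {"accuracy": None})
    loaded_labels = json.loads((out / "labels.json").read_text())
    loaded_metrics = json.loads((out / "metrics.json").read_text())
```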

3) Serve

uvicorn src.serve:app --reload

Now send a request:

curl -X POST http://127.0.0.1:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"text":"My Mac keeps freezing when I plug in an external display"}'
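The same request can be sent from Python with only the standard library. The helper names `build_predict_request` and `predict` are made up for this sketch. The URL and JSON body mirror the curl command above, and the server is assumed to be running locally.

```python
import json
import urllib.request


def build_predict_request(text, base_url="http://127.0.0.1:8000"):
    """Build (but do not send) a POST /predict request with a JSON body."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(text, base_url="http://127.0.0.1:8000"):
    """Send the request and decode the JSON response (needs a running server)."""
    request = build_predict_request(text, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `predict("My Mac keeps freezing when I plug in an external display")` while the service is up should return a dictionary of the shape described in this README.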

Example response:

{
  "label": "comp.sys.mac.hardware",
  "scores": {"comp.sys.mac.hardware": 0.62, "rec.autos": 0.08, "sci.med": 0.05, "talk.politics.misc": 0.25}
}
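A response of this shape can be assembled from a list of class names and a matching list of probabilities. The `to_response` function below is a hypothetical helper written for illustration, not necessarily how `src/serve.py` builds its output.

```python
def to_response(labels, probabilities):
    """Pair each label with its probability and pick the most likely one."""
    if len(labels) != len(probabilities):
        raise ValueError("labels and probabilities must have the same length")
    scores = {label: float(p) for label, p in zip(labels, probabilities)}
    best = max(scores, key=scores.get)  # label with the highest probability
    return {"label": best, "scores": scores}


# Values copied from the example response above.
example = to_response(
    ["comp.sys.mac.hardware", "rec.autos", "sci.med", "talk.politics.misc"],
    [0.62, 0.08, 0.05, 0.25],
)
```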

4) Run tests

pytest -q

5) Docker (optional)

docker build -t poc-ai .
docker run -p 8000:8000 poc-ai

Project Layout

poc-ai-ml/
├─ src/
│  ├─ train.py          # train + evaluate + save artifacts
│  ├─ serve.py          # FastAPI inference app
│  ├─ model.py          # model/pipeline factory
│  ├─ schema.py         # pydantic request/response models
│  └─ __init__.py
├─ tests/
│  └─ test_predict.py   # quick smoke test for the trained model
├─ requirements.txt
├─ Dockerfile
├─ .gitignore
└─ README.md

Notes

  • The project keeps dependencies light and uses plain Python throughout for clarity.
  • You can switch to other datasets (e.g., SMS spam) or add a database/logger with minimal code changes.
  • For production use, harden the service: add stricter input validation, monitoring, retries, and a CI/CD pipeline.
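As one small example of the input validation mentioned above, a request payload could be checked before it reaches the model. This is a plain-Python sketch with a made-up `validate_payload` helper and an arbitrary length limit. In this project the same rules would more naturally live in the pydantic models in `src/schema.py`.

```python
MAX_CHARS = 10_000  # hypothetical limit, chosen only for illustration


def validate_payload(payload):
    """Return the cleaned text, or raise ValueError for an unusable payload."""
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")
    text = payload.get("text")
    if not isinstance(text, str):
        raise ValueError("'text' must be a string")
    text = text.strip()
    if not text:
        raise ValueError("'text' must not be empty")
    if len(text) > MAX_CHARS:
        raise ValueError(f"'text' must be at most {MAX_CHARS} characters")
    return text
```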

About

A compact AI/ML PoC that trains a TF-IDF + Logistic Regression text classifier on a four-class newsgroups subset and serves it with FastAPI. Includes training, saved artifacts, tests, and a ready-to-run Docker container.
