Alkhioz/CatDogPractice

Dog vs Cat Classifier 🐶😺

Tiny end-to-end project to:

  • Train a cats vs dogs classifier with PyTorch

  • Serve it via FastAPI

  • Test it using a small HTML UI

This README explains:

  • How to set up Python 3.12.3 with pyenv

  • How to create and install the virtualenv with make

  • How to add data (cats/dogs) into the data/ folder

  • How to split the dataset into train/val

  • How to check the data for corrupt images

  • How to train the model and run the API

  1. Prerequisites

macOS + Homebrew

Make sure you have Homebrew installed.

Install the system dependency for LZMA (used by Pillow/torchvision):

brew install xz

pyenv

Install pyenv:

brew install pyenv

Add this to your shell config (~/.zshrc or ~/.bashrc):

export PYENV_ROOT="$HOME/.pyenv"
command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"

Reload your shell:

source ~/.zshrc # or ~/.bashrc
  2. Python 3.12.3 via pyenv (per-project)

Install Python 3.12.3 with pyenv:

pyenv install 3.12.3

Then in this project folder:

cd /catdog # or wherever you cloned it
pyenv local 3.12.3
python -V # should show Python 3.12.3

This creates a .python-version file so this directory always uses Python 3.12.3.

  3. Virtualenv + dependencies

The project uses a Makefile to manage the venv.

From the project root:

make install

This will:

Create .venv/ with python -m venv .venv

Activate it

Install dependencies from requirements.txt

You only need make install once (or when you change requirements.txt).

  4. Dataset layout (cats & dogs)

The project expects a structure like:

data/
    train/
        cats/
            cat001.jpg
            cat002.jpg
            ...
        dogs/
            dog001.jpg
            dog002.jpg
            ...
    val/
        cats/ # initially empty OR you can skip this folder
        dogs/ # initially empty OR you can skip this folder

Step-by-step

Create folders:

mkdir -p data/train/cats data/train/dogs
mkdir -p data/val/cats data/val/dogs

Put all your labeled images into:

data/train/cats/ for cat images

data/train/dogs/ for dog images

Leave data/val/* empty if you want the project to generate the 80/20 split for you.

If you already have your own train/val split, you can skip the make split step and just ensure the folders follow the structure above.
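As an optional sanity check before splitting (this helper is not part of the repo's scripts; the function name is illustrative), you can count how many images landed in each class folder:

```python
from pathlib import Path

def count_images(root="data/train", exts={".jpg", ".jpeg", ".png"}):
    """Return {class_name: image_count} for each class subfolder of root."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir() if f.suffix.lower() in exts
            )
    return counts
```

A large imbalance between cats and dogs here is worth fixing before training.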

  5. Split data into train (80%) and val (20%)

The split target runs split.py (a small helper script) which:

Looks at data/train/*/

Randomly keeps ~80% of each class in train/

Moves ~20% of each class to data/val/*/

Run:

make split

After this, you’ll have an ~80/20 split for each class:

data/
    train/
        cats/ # ~80% of original cats
        dogs/ # ~80% of original dogs
    val/
        cats/ # ~20% of original cats
        dogs/ # ~20% of original dogs
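split.py itself is not reproduced in this README; the following is a minimal sketch of the behavior described above (random per-class 80/20 move; the function and parameter names are illustrative, not the script's actual API):

```python
import random
import shutil
from pathlib import Path

def split_dataset(train_dir="data/train", val_dir="data/val",
                  val_frac=0.2, seed=42):
    """Move ~val_frac of each class's images from train_dir to val_dir."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    for class_dir in Path(train_dir).iterdir():
        if not class_dir.is_dir():
            continue
        images = sorted(p for p in class_dir.iterdir() if p.is_file())
        rng.shuffle(images)
        n_val = int(len(images) * val_frac)
        dest = Path(val_dir) / class_dir.name
        dest.mkdir(parents=True, exist_ok=True)
        for img in images[:n_val]:
            shutil.move(str(img), dest / img.name)
```

Note that the split moves files rather than copying them, which is why `make split` should only be run once on a fresh data/train/.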

  6. Check for corrupt / bad images

Real datasets often contain broken files. The check target runs check.py, which:

Walks over data/train and data/val

Attempts to open each image with PIL

Moves any corrupt or unreadable images into data/bad/ (preserving subfolder structure)

Run:

make check

You’ll see logs for any bad files, and they’ll be moved out of the train/val sets.

If you see UserWarning: Truncated File Read or UnidentifiedImageError during training, it’s a sign you should re-run make check to quarantine bad images.
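check.py is not shown here either; a minimal sketch of the behavior described (open each file with PIL, quarantine anything unreadable into data/bad/ while preserving the subfolder structure; function name and exact exception handling are assumptions):

```python
import shutil
from pathlib import Path

from PIL import Image, UnidentifiedImageError

def quarantine_bad_images(roots=("data/train", "data/val"), bad_dir="data/bad"):
    """Move unreadable images out of roots into bad_dir."""
    for root in roots:
        root = Path(root)
        for path in root.rglob("*"):
            if not path.is_file():
                continue
            try:
                with Image.open(path) as img:
                    img.verify()  # cheap integrity check, no full decode
            except (UnidentifiedImageError, OSError):
                # e.g. data/train/cats/x.jpg -> data/bad/train/cats/x.jpg
                dest = Path(bad_dir) / root.name / path.relative_to(root)
                dest.parent.mkdir(parents=True, exist_ok=True)
                shutil.move(str(path), dest)
                print(f"Quarantined: {path}")
```

`Image.verify()` catches most corrupt headers cheaply, though truncated-but-openable files may only surface later during decoding.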

  7. Train the model

Once:

Python 3.12.3 is set via pyenv (pyenv local 3.12.3)

Dependencies are installed (make install)

Data is split (make split)

Data is cleaned (make check)

you can train the model:

make train

This will:

Use the images from data/train/ and data/val/

Train a small ResNet18-based classifier for ['cats', 'dogs']

Save the best checkpoint (by validation accuracy) to:

dogcat_model.pth

You’ll see logs like:

Using device: cpu
Classes: ['cats', 'dogs']
Epoch 1/5
Train loss: ... acc: ... | Val loss: ... acc: ...
New best val acc: 0.9472, saving model...
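The "save the best checkpoint by validation accuracy" behavior seen in those logs is easy to state in isolation. A framework-free sketch (the `save` callback stands in for something like `torch.save(model.state_dict(), "dogcat_model.pth")`; all names here are illustrative, not train.py's actual code):

```python
def train_with_best_checkpoint(epoch_val_accs, save):
    """Call save() whenever validation accuracy sets a new best.

    epoch_val_accs: iterable of per-epoch validation accuracies.
    save: callback that persists the current model weights.
    """
    best_acc = 0.0
    for epoch, val_acc in enumerate(epoch_val_accs, start=1):
        if val_acc > best_acc:
            best_acc = val_acc
            print(f"New best val acc: {val_acc:.4f}, saving model...")
            save(val_acc)
    return best_acc
```

Saving only on improvement means dogcat_model.pth always holds the epoch with the best validation accuracy, even if later epochs overfit.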

  8. Run the FastAPI service + UI

Start the FastAPI app:

make run

This runs:

. .venv/bin/activate && uvicorn service:app --reload --port 9000

The app will:

Load dogcat_model.pth

Expose:

GET / – serves static/index.html (simple test UI)

POST /predict – takes an uploaded image (file) and returns JSON like:

{ "label": "cats", "confidence": 0.97 }
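That JSON shape corresponds to a softmax over the model's two output logits. A stdlib-only sketch of the post-processing step (the function name is illustrative; the real service applies this to the model's output tensor):

```python
import math

def to_prediction(logits, classes=("cats", "dogs")):
    """Convert raw class logits to the API's {"label", "confidence"} shape."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": classes[best], "confidence": round(probs[best], 2)}
```

For example, equal logits yield confidence 0.5, while a large gap between them pushes confidence toward 1.0.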

Testing via browser

Open:

http://localhost:9000/

Upload a cat or dog image.

The UI will:

Show a preview of the image at 300px width

Color the border:

Green (2px) if confidence ≥ 70%

Red (2px) otherwise

Show the label + confidence percentage in the same color

Dump the raw JSON response below

Testing via curl

You can also test the API directly:

curl -X POST "http://localhost:9000/predict" \
  -H "Accept: application/json" \
  -F "file=@/full/path/to/your_image.jpg"

  9. Cleaning up

To remove the virtualenv and Python caches:

make clean

This deletes:

.venv/

__pycache__/ folders in subdirectories

Your code, data/, and dogcat_model.pth remain untouched.

  10. Makefile summary

For reference, the current Makefile:

.PHONY: venv install train run clean

# Create Python virtualenv
venv:
	python -m venv .venv

# Install dependencies into the venv
install: venv
	. .venv/bin/activate && pip install --upgrade pip && pip install -r requirements.txt

# Train the dog/cat classifier
train:
	. .venv/bin/activate && python train.py

split:
	. .venv/bin/activate && python split.py

check:
	. .venv/bin/activate && python check.py

# Run the FastAPI inference service
run:
	. .venv/bin/activate && uvicorn service:app --reload --port 9000

# Clean venv and any cached stuff
clean:
	rm -rf .venv __pycache__ */__pycache__

Use it as your main “command center”:

make install – setup env

make split – 80/20 split

make check – clean bad images

make train – train model

make run – start API + UI

make clean – reset venv/caches

About

Practice project: training a model, serving it from within FastAPI, and sending images from a simple UI.
