This is a Next.js project bootstrapped with create-next-app.
First, run the development server:
npm run dev
# or
yarn dev
# or
pnpm dev
# or
bun dev
The dev runner automatically picks the first available port, preferring 3000. If 3000 is busy, check the terminal output to see which port it settled on, then open that URL in your browser.
You can start editing the page by modifying app/page.tsx. The page auto-updates as you edit the file.
This project uses next/font to automatically optimize and load Geist, a new font family for Vercel.
- Python 3.10+
- pip install -r python/requirements.txt (installs pdfplumber and the scikit-learn stack)
- Optional: mineru package if you want to experiment with the full extraction pipeline
- Install Python dependencies: ensure you have Python 3.8+ installed, then install the required libraries:
  pip install -r python/requirements.txt
  This installs pdfplumber for PDF extraction, scikit-learn for transaction categorization fallbacks, and deep-translator so the worker automatically translates non-English descriptions to English.
- Start the Next.js dev server:
  npm run dev
- Visit http://localhost:3000/transactions/import
- Drag a PDF bank statement (for example src/lib/credo_statement.pdf) into the drop zone or click Select PDF
- The UI shows the upload pipeline (queued → uploading → processing → categorizing → ready)
- Review the editable table, tweak any fields, and press Confirm Import to POST to /api/transactions/import
- Confirmed imports appear in the history table with search, filters, and pagination
- Entry point: python/process_pdf.py
- Accepts PDF_PATH and optional MODEL_PATH (python/models/categories.ftz by default)
- Uses pdfplumber heuristics for GEL statements; falls back to MinerU/sample data if parsing fails
- Outputs JSON payload: { transactions: [...], metadata: { currency, source, periodStart, periodEnd } }
- Run locally:
  python python/process_pdf.py src/lib/credo_statement.pdf
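As a rough illustration of consuming the worker's output, the sketch below parses the JSON and checks only the top-level shape documented above. The function name `parse_worker_output` and the error messages are hypothetical, and the fields inside each transaction are not specified in this README, so they are deliberately left unchecked.

```python
import json
from typing import Any, Dict

# Metadata keys listed in the documented payload shape.
REQUIRED_METADATA_KEYS = ("currency", "source", "periodStart", "periodEnd")


def parse_worker_output(raw: str) -> Dict[str, Any]:
    """Parse the worker's JSON and verify the documented top-level structure.

    Hypothetical helper: only the keys named in the README are checked.
    """
    payload = json.loads(raw)
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")

    if not isinstance(payload.get("transactions"), list):
        raise ValueError("'transactions' must be a list")

    metadata = payload.get("metadata")
    if not isinstance(metadata, dict):
        raise ValueError("'metadata' must be an object")

    missing = [key for key in REQUIRED_METADATA_KEYS if key not in metadata]
    if missing:
        raise ValueError("metadata is missing keys: " + ", ".join(missing))

    return payload
```

A caller could pipe the output of python/process_pdf.py into this function and surface the ValueError message in the UI when the shape is off.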
- A placeholder model lives at python/models/transactions_model.joblib; replace with a trained model when available
- Override its location with the CATEGORIES_MODEL_PATH environment variable if needed
- A placeholder python/models/README.md explains how to train and save a scikit-learn model as transactions_model.joblib.
- If transactions_model.joblib is not present, the Python worker will use keyword-based heuristics for categorization. It matches translated descriptions against weighted keywords (e.g. "conversion", "withdrawal", "fee") and derives a confidence score from the total weight it finds.
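The keyword fallback could look roughly like the sketch below. This is not the worker's actual code: the category names, the keyword weights, and the rule of capping confidence at 1.0 are all assumptions chosen for illustration. Only the general idea (sum the weights of matched keywords, derive a confidence from the total) comes from the description above.

```python
from typing import Dict, Tuple

# Hypothetical categories and weights; the real table lives in the worker.
KEYWORD_WEIGHTS: Dict[str, Dict[str, float]] = {
    "currency_exchange": {"conversion": 0.9, "exchange": 0.6},
    "cash": {"withdrawal": 0.9, "atm": 0.7},
    "fees": {"fee": 0.8, "commission": 0.8},
}


def categorize(description: str) -> Tuple[str, float]:
    """Return (category, confidence) for an already-translated description."""
    text = description.lower()
    best_category, best_weight = "uncategorized", 0.0

    for category, keywords in KEYWORD_WEIGHTS.items():
        # Total weight of every keyword that occurs in the description.
        total = sum(w for keyword, w in keywords.items() if keyword in text)
        if total > best_weight:
            best_category, best_weight = category, total

    # Assumed rule: confidence is the matched weight, capped at 1.0.
    return best_category, min(best_weight, 1.0)
```

Note that plain substring matching is crude ("fee" also matches "coffee"), so a real implementation would likely tokenize the description first.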
- Temporary PDFs are saved under tmp/uploads and removed once processing finishes
- Override the temp directory with TMPDIR
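The save-then-clean-up behaviour can be sketched as a context manager. This README does not say which component writes the temporary files, so treat the helper names `upload_dir` and `temporary_pdf` and the random file naming as assumptions. The sketch only shows the pattern: honour TMPDIR, fall back to tmp/uploads, and always delete the file when processing ends.

```python
import os
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator
from uuid import uuid4


def upload_dir() -> Path:
    """Directory for temporary PDFs: TMPDIR if set, else tmp/uploads."""
    return Path(os.environ.get("TMPDIR") or "tmp/uploads")


@contextmanager
def temporary_pdf(data: bytes) -> Iterator[Path]:
    """Write uploaded bytes to a temp file and remove it afterwards."""
    directory = upload_dir()
    directory.mkdir(parents=True, exist_ok=True)

    # Random name (an assumption) so concurrent uploads do not collide.
    path = directory / f"{uuid4().hex}.pdf"
    path.write_bytes(data)
    try:
        yield path
    finally:
        # Runs on success and on error, so failed jobs leave nothing behind.
        path.unlink(missing_ok=True)
```

Wrapping the processing call in `with temporary_pdf(data) as path:` keeps the clean-up in one place even when extraction raises.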
Python dependencies (pdfplumber, scikit-learn, deep-translator) are required for PDF processing. They are automatically installed when you run npm install (via the postinstall script), but you can also install them manually:
pip install -r python/requirements.txt
- Deploy your Next.js app to Vercel (frontend + API routes that don't need Python)
- Deploy a separate Python service (e.g., on Render, Railway, or Fly.io) for PDF processing
- Update /api/transactions/upload-bank-statement to call the external Python service via HTTP
- Set the PYTHON_SERVICE_URL environment variable in Vercel
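For the separate Python service, a minimal standard-library sketch might look like the following. Everything here is an assumption made for illustration: the handler names, the raw-PDF request body, port 8000, and the placeholder response. A real service would call the extraction pipeline where the placeholder is, and would probably use a proper web framework.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any, Dict, Tuple


def handle_pdf(body: bytes) -> Tuple[int, Dict[str, Any]]:
    """Return (HTTP status, JSON body) for an uploaded PDF."""
    if not body.startswith(b"%PDF-"):
        return 400, {"error": "request body is not a PDF"}
    # Placeholder result; a real service would run the extraction here.
    metadata = {"currency": None, "source": "placeholder",
                "periodStart": None, "periodEnd": None}
    return 200, {"transactions": [], "metadata": metadata}


class PdfHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Read exactly the number of bytes the client announced.
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_pdf(self.rfile.read(length))
        encoded = json.dumps(payload).encode("utf-8")

        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(encoded)))
        self.end_headers()
        self.wfile.write(encoded)


def serve(port: int = 8000) -> None:
    """Start the service; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), PdfHandler).serve_forever()
```

With something like this deployed, PYTHON_SERVICE_URL would point at the service's public address and the Next.js route would POST the PDF bytes to it.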
- Use Vercel's Docker support to run a container with both Node.js and Python
- Requires custom Dockerfile and more complex setup
- Deploy the full Next.js app to Render or Railway (both support Python)
- Install Python dependencies in the build step: pip install -r python/requirements.txt
- Set environment variables: PYTHON_PATH, CATEGORIES_MODEL_PATH, TMPDIR
For any deployment, set these environment variables:
- PYTHON_PATH - Path to Python executable (default: python3 on Linux/Mac, python on Windows)
- CATEGORIES_MODEL_PATH - Path to the ML model file (optional, defaults to python/models/categories.ftz)
- TMPDIR - Temporary directory for uploaded PDFs (optional)
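The defaults above can be summarised in a small resolver. This is only a sketch of the documented fallback rules; the `WorkerConfig` class and `load_config` function are hypothetical names, and the project may resolve these variables elsewhere (for example in the Node.js code that launches Python).

```python
import os
import sys
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class WorkerConfig:
    python_path: str
    model_path: str
    tmp_dir: Optional[str]


def load_config(
    env: Mapping[str, str] = os.environ,
    platform: str = sys.platform,
) -> WorkerConfig:
    """Apply the documented defaults to the three environment variables."""
    # Documented default: python on Windows, python3 on Linux/Mac.
    default_python = "python" if platform.startswith("win") else "python3"
    return WorkerConfig(
        python_path=env.get("PYTHON_PATH") or default_python,
        model_path=env.get("CATEGORIES_MODEL_PATH")
        or "python/models/categories.ftz",
        # Optional: None means "use the application's own default".
        tmp_dir=env.get("TMPDIR") or None,
    )
```

Passing `env` and `platform` as parameters keeps the defaults easy to check without touching the real environment.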
- Replace python/models/categories.ftz with the trained model or download from object storage at runtime
To learn more about Next.js, take a look at the following resources:
- Next.js Documentation - learn about Next.js features and API.
- Learn Next.js - an interactive Next.js tutorial.
You can check out the Next.js GitHub repository - your feedback and contributions are welcome!
The easiest way to deploy your Next.js app is to use the Vercel Platform from the creators of Next.js.
Check out our Next.js deployment documentation for more details.