
Assess product toxicity from images of ingredient labels by OCR-extracting text, normalizing ingredient names, and matching them against a local toxicology database to compute per-ingredient and overall risk scores.


muntisa/ToxScan2

 
 


ToxScan2

Programmed by Cristian R. Munteanu

Affiliations:

  • RNASA-IMEDIR, CITIC, Universidade da Coruña (UDC)
  • QANAP - Applied Analytical Chemistry, UDC
  • ECOTOX - Ecotoxicology and Marine Chemical Pollution, University of Vigo (UVIGO)

What is ToxScan2?

ToxScan2 is a privacy-first, offline web application that analyzes product ingredient lists to detect potential toxicity. Unlike apps that require an internet connection and send your photos to the cloud, ToxScan2 runs entirely on your device using a local expert system.

It empowers consumers with accessible information, helping them make more informed decisions about the products they use every day, without compromising their data privacy.

The Embedded Expert System

ToxScan2 does not call cloud models or external services that would require API keys. Instead, it uses a specialized "Embedded Expert System" architecture: rather than searching the web on every scan, a comprehensive "Toxicology Dictionary" is embedded directly in the application code.

The serverless, no-API-key architecture of ToxScan2 has three parts:

1. The "Eye" (OCR)

  • Tool: Tesseract.js (WebAssembly)
  • Why: It is a widely used library for browser-based Optical Character Recognition (OCR). It downloads a small language file (~4MB) once, then runs entirely offline to read text from your photos and camera captures.
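Raw OCR output is one block of text, so it has to be split into individual ingredient names before any lookup. The following is a minimal sketch of that normalization step, not the app's actual code: the `splitIngredients` helper and its cleanup rules are assumptions for illustration. The input is assumed to be the recognized text string, such as the `data.text` field returned by Tesseract.js.

```typescript
// Hypothetical helper (not from the ToxScan2 source): turns raw OCR text
// into a list of lowercase candidate ingredient names.
function splitIngredients(ocrText: string): string[] {
  return ocrText
    // Rejoin words hyphenated across a line break ("Sul-\nfate" -> "Sulfate").
    // Caveat: this also merges genuinely hyphenated names split at a line end.
    .replace(/-\s*\n\s*/g, "")
    // Collapse newlines and repeated spaces into single spaces.
    .replace(/\s+/g, " ")
    // Drop a leading "Ingredients:" label if present.
    .replace(/^\s*ingredients\s*:/i, "")
    // Labels usually separate items with commas; accept a few other separators.
    .split(/[,;•·]/)
    // Lowercase, strip trailing periods or asterisks, trim whitespace.
    .map((item) => item.toLowerCase().replace(/[.*]+\s*$/, "").trim())
    // Discard empty fragments and single stray characters.
    .filter((item) => item.length > 1);
}

const names = splitIngredients(
  "INGREDIENTS: Aqua, Sodium Laureth Sul-\nfate, Parfum."
);
console.log(names); // ["aqua", "sodium laureth sulfate", "parfum"]
```

Lowercasing at this stage lets later lookups compare names without worrying about the capitalization printed on the label.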

2. The "Brain" (Knowledge Base)

  • Tool: A custom, comprehensive ingredients_db.ts file.
  • Why: The file embeds a JSON-style database of 1,000+ common cosmetic and food ingredients. Each entry carries hard-coded data: Risk Level, Function, Affected Organs, and Description.
  • Benefit: Zero latency, works offline, 100% privacy, no API keys required.
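An entry in such a database could be typed as sketched below. The four fields mirror the ones listed above, but the property names, the three-level risk scale, and the placeholder entries are assumptions for illustration. They are not taken from the real ingredients_db.ts and contain no real toxicology data.

```typescript
// Assumed risk scale; the real database may use different levels.
type RiskLevel = "low" | "moderate" | "high";

// Assumed entry shape, based on the four fields the README lists.
interface IngredientEntry {
  name: string;             // normalized, lowercase ingredient name
  riskLevel: RiskLevel;
  function: string;         // role in the product, e.g. "solvent"
  affectedOrgans: string[];
  description: string;
}

// Placeholder entries only, not real toxicology data.
const INGREDIENTS_DB: IngredientEntry[] = [
  {
    name: "placeholder solvent",
    riskLevel: "low",
    function: "solvent",
    affectedOrgans: [],
    description: "Placeholder entry for illustration.",
  },
  {
    name: "placeholder preservative",
    riskLevel: "high",
    function: "preservative",
    affectedOrgans: ["skin"],
    description: "Placeholder entry for illustration.",
  },
];

// Index the entries by name once so exact lookups take constant time.
const byName = new Map<string, IngredientEntry>();
for (const entry of INGREDIENTS_DB) {
  byName.set(entry.name, entry);
}

function lookupExact(name: string): IngredientEntry | undefined {
  return byName.get(name.trim().toLowerCase());
}

console.log(lookupExact("  Placeholder Solvent ")?.riskLevel); // "low"
```

Because the data ships inside the application bundle, a lookup like this is an in-memory operation with no network request.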

3. The "Logic" (Fuzzy Matching)

  • Tool: Fuse.js (Fuzzy Search Algorithm).
  • Why: OCR often introduces typos (e.g., reading "Sulfates" as "Su1fates"), so an exact-match database search would fail. Fuse.js uses approximate string matching to score how closely the messy OCR text resembles each entry, then returns the closest ingredient in our local database.
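To show the idea behind fuzzy matching without pulling in a dependency, the sketch below uses a plain Levenshtein edit distance. This is a stand-in technique, not Fuse.js's own algorithm. The `bestMatch` helper and the 0.3 cutoff are assumptions for illustration.

```typescript
// Classic Levenshtein distance: the minimum number of single-character
// insertions, deletions, or substitutions needed to turn `a` into `b`.
function editDistance(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: returns the closest known name, or undefined when
// nothing is close enough. Score 0 is a perfect match, 1 a total mismatch.
function bestMatch(
  query: string,
  names: string[],
  maxScore = 0.3 // assumed cutoff, chosen for illustration
): string | undefined {
  const q = query.trim().toLowerCase();
  let best: string | undefined;
  let bestScore = Infinity;
  for (const name of names) {
    const score = editDistance(q, name) / Math.max(q.length, name.length, 1);
    if (score < bestScore) {
      best = name;
      bestScore = score;
    }
  }
  return bestScore <= maxScore ? best : undefined;
}

const known = ["sulfates", "parabens"];
console.log(bestMatch("Su1fates", known)); // "sulfates" (one substitution, score 0.125)
console.log(bestMatch("xyz", known));      // undefined (nothing close enough)
```

With Fuse.js itself, a roughly comparable setup would be `new Fuse(entries, { keys: ["name"], threshold: 0.3, includeScore: true })` followed by `fuse.search(query)`. The option values the app actually uses are not stated here.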

How to Use It

  1. Scan or Upload: Tap "Scan ingredients" to use your camera or "Upload from Gallery" to select a photo.
  2. Capture: Frame the ingredients list clearly. Good lighting is essential for accurate on-device OCR.
  3. Local Analysis: The app processes the image directly on your device. No data leaves it.
  4. Review Results: View the Toxicity Score, Summary, and detailed breakdown of ingredients and affected systems.
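How the overall Toxicity Score is derived from per-ingredient risk levels is not documented here. Purely as an illustration, one simple scheme assigns points to each risk level and scales the average to 0–100. The point values, the `overallScore` name, and the averaging rule below are all assumptions, not the app's actual formula.

```typescript
// Assumed risk scale and point values, for illustration only.
type RiskLevel = "low" | "moderate" | "high";

const RISK_POINTS: Record<RiskLevel, number> = {
  low: 1,
  moderate: 5,
  high: 10,
};

// Hypothetical scoring rule: average the points of all matched ingredients
// and scale to 0-100, where 100 means every ingredient is high risk.
function overallScore(levels: RiskLevel[]): number {
  if (levels.length === 0) return 0; // nothing matched, nothing to score
  const total = levels.reduce((sum, level) => sum + RISK_POINTS[level], 0);
  return Math.round((total / (levels.length * RISK_POINTS.high)) * 100);
}

console.log(overallScore(["low", "high"])); // 55
```

An average dilutes a single high-risk ingredient in a long list, so a real scoring rule might instead weight the worst ingredient more heavily.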

Run Locally

Prerequisites: Node.js

  1. Install dependencies:
    npm install
  2. Run the app:
    npm run dev

Note: No API keys or environment variables are needed.

Limitations

  • Database Scope: The app can only identify ingredients that exist in its internal database. Rare or new chemicals may not be detected.
  • OCR Sensitivity: On-device OCR needs clear, printed text. Blurry images or handwriting may result in poor detection.
  • Not Medical Advice: Information is provided for educational purposes only.

Disclaimer

Our database is regularly updated, but its information may contain errors or be out of date. Please verify critical results with a qualified professional.


Languages

  • TypeScript 97.8%
  • HTML 2.2%