Modern Document Scanner for the Web
Scanic is a blazing-fast, lightweight, and modern document scanner library written in JavaScript and Rust (WASM). It enables developers to detect, scan, and process documents from images directly in the browser or Node.js, with no dependencies or external services.
For years I wanted to use document scanning features in web environments. OpenCV makes this easy, but it comes at the cost of a 30+ MB download.
Scanic combines pure JavaScript algorithms with Rust-compiled WebAssembly for performance-critical operations like Gaussian blur, Canny edge detection, and gradient calculations. This hybrid approach delivers near-native performance while maintaining JavaScript's accessibility and a lightweight footprint.
Performance is an ongoing area of improvement: I'm working to match OpenCV-based solutions while keeping the footprint small.
This library is heavily inspired by jscanify.
- 🔍 Document Detection: Accurately finds and extracts document contours from images
- ⚡ Pure JavaScript: Works everywhere JavaScript runs
- 🦀 Rust WebAssembly: Performance-critical operations optimized with Rust-compiled WASM
- 🛠️ Easy Integration: Simple API for web apps, Electron, or Node.js applications
- 🏷️ MIT Licensed: Free for personal and commercial use
- 📦 Lightweight: Small bundle size (< 100 KB) compared to OpenCV-based solutions (30+ MB)
Try the live demo: Open Demo
npm install scanic
Or use via CDN:
<script src="https://unpkg.com/scanic/dist/scanic.js"></script>

import { scanDocument, extractDocument } from 'scanic';
// Simple usage - just detect document
const result = await scanDocument(imageElement);
if (result.success) {
  console.log('Document found at corners:', result.corners);
}
// Extract the document (with perspective correction)
const extracted = await scanDocument(imageElement, { mode: 'extract' });
if (extracted.success) {
  document.body.appendChild(extracted.output); // Display extracted document
}
// Manual extraction with custom corner points (for image editors)
const corners = {
  topLeft: { x: 100, y: 50 },
  topRight: { x: 400, y: 60 },
  bottomRight: { x: 390, y: 300 },
  bottomLeft: { x: 110, y: 290 }
};
const manualExtract = await extractDocument(imageElement, corners);
if (manualExtract.success) {
  document.body.appendChild(manualExtract.output);
}

import { scanDocument } from 'scanic';
async function processDocument() {
  // Get image from file input or any source
  const imageFile = document.getElementById('fileInput').files[0];
  const img = new Image();
  img.onload = async () => {
    try {
      // Extract and display the scanned document
      const result = await scanDocument(img, {
        mode: 'extract',
        output: 'canvas'
      });
      if (result.success) {
        // Add the extracted document to the page
        document.getElementById('output').appendChild(result.output);
        // Or get as data URL for download/display
        const dataUrl = result.output.toDataURL('image/png');
        console.log('Extracted document as data URL:', dataUrl);
      }
    } catch (error) {
      console.error('Error processing document:', error);
    }
  };
  img.src = URL.createObjectURL(imageFile);
}
// HTML setup
// <input type="file" id="fileInput" accept="image/*" onchange="processDocument()">
// <div id="output"></div>

scanDocument(image, options)
Main entry point for document scanning with flexible modes and output options.
Parameters:
- image: HTMLImageElement, HTMLCanvasElement, or ImageData
- options: Optional configuration object
  - mode: String - 'detect' (default) or 'extract'
    - 'detect': Only detect the document and return corners/contour info (no image processing)
    - 'extract': Extract/warp the document region
  - output: String - 'canvas' (default), 'imagedata', or 'dataurl'
  - debug: Boolean (default: false) - Enable debug information
  - Detection options:
    - maxProcessingDimension: Number (default: 800) - Maximum dimension for processing in pixels
    - lowThreshold: Number (default: 75) - Lower threshold for Canny edge detection
    - highThreshold: Number (default: 200) - Upper threshold for Canny edge detection
    - dilationKernelSize: Number (default: 3) - Kernel size for dilation
    - dilationIterations: Number (default: 1) - Number of dilation iterations
    - minArea: Number (default: 1000) - Minimum contour area for document detection
    - epsilon: Number - Epsilon for polygon approximation
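To give a feel for what maxProcessingDimension trades off, here is a minimal sketch of how an image could be downscaled before detection and how corners found at the reduced size could be mapped back to full resolution. The helpers `computeProcessingSize` and `scaleCorners` are hypothetical names used for illustration only; they are not part of Scanic's API, and the library's internal resizing may work differently.

```javascript
// Hypothetical helpers (not part of Scanic) illustrating the idea
// behind a "maximum processing dimension" option.

// Returns the size an image would be processed at, plus the scale factor.
function computeProcessingSize(width, height, maxProcessingDimension = 800) {
  const longestSide = Math.max(width, height);
  // Only downscale; small images are processed at their original size.
  const scale = longestSide > maxProcessingDimension
    ? maxProcessingDimension / longestSide
    : 1;
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale),
    scale,
  };
}

// Maps corner points found on the downscaled image back to original pixels.
function scaleCorners(corners, scale) {
  const mapped = {};
  for (const [name, point] of Object.entries(corners)) {
    mapped[name] = { x: point.x / scale, y: point.y / scale };
  }
  return mapped;
}

// A 4000x3000 photo processed with the default limit of 800
const size = computeProcessingSize(4000, 3000);
console.log(size.width, size.height); // 800 600
```

A larger limit keeps more detail for edge detection but means more pixels to process, which is the quality/speed trade-off the option exposes.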
Returns: Promise<{ output, corners, contour, debug, success, message }>
- output: Processed image (null for 'detect' mode)
- corners: Object with { topLeft, topRight, bottomRight, bottomLeft } coordinates
- contour: Array of contour points
- success: Boolean indicating whether a document was detected
- message: Status message
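The corners object can be consumed without any further library calls, for example to draw an overlay or to size a target canvas. The sketch below works only on the documented { topLeft, topRight, bottomRight, bottomLeft } shape. The helpers `estimateOutputSize` and `cornersToPolygon` are hypothetical, and taking the longer of two opposing edges is a common convention assumed here, not necessarily how Scanic sizes its own output.

```javascript
// Hypothetical helpers (not part of Scanic) that operate on a corners object.

function distance(a, b) {
  return Math.hypot(b.x - a.x, b.y - a.y);
}

// Estimates the size of the flattened document from its four corners,
// using the longer of each pair of opposing edges.
function estimateOutputSize(corners) {
  const { topLeft, topRight, bottomRight, bottomLeft } = corners;
  const width = Math.max(
    distance(topLeft, topRight),
    distance(bottomLeft, bottomRight)
  );
  const height = Math.max(
    distance(topLeft, bottomLeft),
    distance(topRight, bottomRight)
  );
  return { width: Math.round(width), height: Math.round(height) };
}

// Returns the corners as an ordered point list (clockwise from top-left),
// convenient for drawing a closed outline.
function cornersToPolygon(corners) {
  return [
    corners.topLeft,
    corners.topRight,
    corners.bottomRight,
    corners.bottomLeft,
  ];
}

const exampleCorners = {
  topLeft: { x: 100, y: 50 },
  topRight: { x: 400, y: 60 },
  bottomRight: { x: 390, y: 300 },
  bottomLeft: { x: 110, y: 290 },
};
console.log(estimateOutputSize(exampleCorners)); // { width: 300, height: 240 }
```

In a browser, the point list from `cornersToPolygon` can be traced on a 2D canvas context with `moveTo`/`lineTo` to highlight the detected document.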
const options = {
  mode: 'extract',
  maxProcessingDimension: 1000, // Higher quality, slower processing
  lowThreshold: 50, // More sensitive edge detection
  highThreshold: 150,
  dilationKernelSize: 5, // Larger dilation kernel
  minArea: 2000, // Larger minimum document area
  debug: true // Enable debug information
};
const result = await scanDocument(imageElement, options);

// Just detect (no image processing)
const detection = await scanDocument(imageElement, { mode: 'detect' });
// Extract as canvas
const extracted = await scanDocument(imageElement, {
  mode: 'extract',
  output: 'canvas'
});
// Extract as ImageData
const rawData = await scanDocument(imageElement, {
  mode: 'extract',
  output: 'imagedata'
});
// Extract as data URL
const dataUrl = await scanDocument(imageElement, {
  mode: 'extract',
  output: 'dataurl'
});

Clone the repository and set up the development environment:
git clone https://github.com/marquaye/scanic.git
cd scanic
npm install
Start the development server:
npm run dev
Build for production:
npm run build
The built files will be available in the dist/ directory.
The Rust WASM module is pre-compiled and included in the repository. If you need to rebuild it:
npm run build:wasm
This uses Docker to build the WASM module without requiring a local Rust installation.
Scanic uses a hybrid JavaScript + WebAssembly approach:
- JavaScript Layer: High-level API, DOM manipulation, and workflow coordination
- WebAssembly Layer: CPU-intensive operations like:
  - Gaussian blur with SIMD optimizations
  - Canny edge detection with hysteresis thresholding
  - Gradient calculations using Sobel operators
  - Non-maximum suppression for edge thinning
  - Morphological operations (dilation/erosion)
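As an illustration of the kind of work delegated to the WebAssembly layer, here is a minimal plain-JavaScript sketch of a Sobel gradient magnitude pass over a grayscale image. It is not Scanic's implementation (which lives in the Rust module and may differ in border handling, data types, and optimizations); the function name `sobelMagnitude` and the flat row-major array layout are assumptions made for this example.

```javascript
// Illustrative only: Sobel gradient magnitude on a grayscale image stored
// as a flat, row-major array (one value per pixel). Border pixels stay 0.
function sobelMagnitude(gray, width, height) {
  const magnitude = new Float32Array(width * height);
  for (let y = 1; y < height - 1; y++) {
    for (let x = 1; x < width - 1; x++) {
      const i = y * width + x;
      // 3x3 neighbourhood around the current pixel
      const tl = gray[i - width - 1], t = gray[i - width], tr = gray[i - width + 1];
      const l = gray[i - 1], r = gray[i + 1];
      const bl = gray[i + width - 1], b = gray[i + width], br = gray[i + width + 1];
      // Horizontal and vertical Sobel kernels
      const gx = (tr + 2 * r + br) - (tl + 2 * l + bl);
      const gy = (bl + 2 * b + br) - (tl + 2 * t + tr);
      magnitude[i] = Math.hypot(gx, gy);
    }
  }
  return magnitude;
}

// 5x5 test image with a vertical edge: columns 0-2 are black, 3-4 are white
const width = 5, height = 5;
const gray = new Uint8Array(width * height);
for (let y = 0; y < height; y++) {
  for (let x = 3; x < width; x++) gray[y * width + x] = 255;
}
const mag = sobelMagnitude(gray, width, height);
console.log(mag[2 * width + 2]); // 1020 (strong response next to the edge)
console.log(mag[2 * width + 1]); // 0 (flat region)
```

Every output pixel reads nine inputs in a tight loop, which is why per-pixel passes like this are natural candidates for a compiled, SIMD-capable module.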
Contributions are welcome! Here's how you can help:
- Report Issues: Found a bug? Open an issue with details and reproduction steps
- Feature Requests: Have an idea? Create an issue to discuss it
- Pull Requests: Ready to contribute code?
  - Fork the repository
  - Create a feature branch (git checkout -b feature/amazing-feature)
  - Commit your changes (git commit -m 'Add amazing feature')
  - Push to the branch (git push origin feature/amazing-feature)
  - Open a Pull Request
Please ensure your code follows the existing style.
Special thanks to our amazing sponsors who make this project possible!
- ZeugnisProfi: Professional certificate and document services
- ZeugnisProfi.de: German document processing specialists
- Verlingo: Language and translation services
- Performance optimizations to match OpenCV speed
- Enhanced WASM module with additional Rust-optimized algorithms
- SIMD vectorization for more image processing operations
- TypeScript definitions
- Additional image enhancement filters
- Mobile-optimized processing
- WebGPU acceleration for supported browsers
MIT License © marquaye