π 100% Private β’ π Works Offline β’ πͺ Join Files Instantly
Powerful client-side data analysis tool that lets you join, query, and analyze CSV/Excel files using full SQL syntax - your data never leaves your computer!
β‘οΈ Live Site: https://jim-my.github.io/duck-tools/
- Your sensitive data never leaves your computer
- No cloud servers, no data breaches, no privacy concerns
- Perfect for confidential business data, personal finances, or sensitive research
- No internet connection required after initial page load
- Perfect for air-gapped environments or restricted networks
- Analyze data anywhere - plane, train, or secure facility
- Latest SQL features: Window functions, CTEs, JSON processing, regex
- PostgreSQL compatibility: Use familiar SQL syntax and functions
- Analytics powerhouse: Built for data analysis, not just storage
- Zero setup: Enterprise database features without the enterprise complexity
- Drag & drop CSV and Excel files (.xlsx, .csv)
- Automatic table creation with smart aliasing (t1, t2, t3...)
- Real-time file validation and error handling
- Header detection with toggle option for CSV files
- Complete DuckDB SQL engine running in your browser
- Syntax highlighting with CodeMirror editor
- JOINs, aggregations, window functions - everything you need
- Query examples to get you started quickly
- Responsive table with smooth scrolling
- Export to CSV with one click
- Performance optimized for large datasets
- Real-time query statistics
- 100% client-side processing - your data never leaves your computer
- No server required - works completely offline
- No data collection - total privacy guaranteed
Simply open index.html in any modern browser - no installation, no signup, no data upload!
Drag and drop multiple CSV or Excel files. Each file instantly becomes a table (t1, t2, t3...) - all processing happens locally!
This is where the magic happens! Write SQL to join your files - something that normally requires a database setup:
-- π JOIN TWO FILES INSTANTLY!
SELECT
p.name as product_name,
p.category,
s.sales as sales_amount,
s.region
FROM t1 p
JOIN t2 s ON p.id = s.product_id
ORDER BY s.sales DESC;Boom! Your files are joined and analyzed - all without uploading a single byte to any server!
Download your results as CSV. Share the insights, not the sensitive data.
-- Find top-selling products by category
SELECT category, name, SUM(sales) as total_sales
FROM products p
JOIN sales s ON p.id = s.product_id
GROUP BY category, name
ORDER BY total_sales DESC
LIMIT 10;-- Calculate customer lifetime value
SELECT
customer_id,
COUNT(*) as orders,
SUM(amount) as total_spent,
AVG(amount) as avg_order_value,
MAX(order_date) as last_order
FROM orders
GROUP BY customer_id
HAVING total_spent > 1000
ORDER BY total_spent DESC;-- Combine data from multiple sources
SELECT
u.name,
u.email,
p.plan_name,
COUNT(o.id) as order_count,
SUM(o.amount) as revenue
FROM users u
LEFT JOIN plans p ON u.plan_id = p.id
LEFT JOIN orders o ON u.id = o.customer_id
GROUP BY u.name, u.email, p.plan_name
ORDER BY revenue DESC;# Serve the files (any static server works)
python -m http.server 8000
# or
npx serve .# Update all screenshots automatically (works anywhere!)
just screenshots
# Run tests with browser automation
just test
# Serve locally from project directory
just serveπ― Fully Portable: The automation uses {{justfile_directory()}} so it works immediately after cloning - no path configuration needed!
duck-tools/
βββ index.html # Main application
βββ css/
β βββ style.css # Styling
βββ js/
β βββ main.js # Application coordinator
β βββ duckdb-manager.js # Database operations
β βββ file-handler.js # File processing
β βββ sql-editor.js # SQL editor component
β βββ results-table.js # Results display
β βββ csv-exporter.js # Export functionality
βββ tests/
β βββ test.html # Test runner
β βββ *.js # Test suites
βββ demo/ # Sample data files
βββ screenshots/ # Documentation images
# Run all tests
just test
# Run specific test suite
open tests/test.html- Automatically detects multiple sheets
- Load specific sheets on demand
- Switch between sheets seamlessly
- Virtual scrolling for large datasets
- Web Workers for file processing
- Memory management for long sessions
- Intelligent error messages with suggestions
- Automatic retry for transient failures
- Graceful degradation when features unavailable
The CSS uses CSS custom properties for easy theming:
:root {
--primary-color: #2563eb;
--success-color: #10b981;
--error-color: #ef4444;
/* ... customize colors ... */
}Add your own example queries in js/sql-editor.js:
const customExamples = [
{
name: "My Custom Query",
sql: "SELECT * FROM my_table WHERE condition = 'value'"
}
];| Feature | Duck Tools | Cloud Analytics | Traditional Database |
|---|---|---|---|
| Data Upload | β Never | β Always | β Required |
| Internet Required | β No (after load) | β Always | β For setup |
| Data Breach Risk | β Zero | ||
| Compliance Issues | β None | ||
| Setup Time | β‘ 0 seconds | β±οΈ Hours | β±οΈ Days |
| Cost | π° Free | π°π° $/month | π°π°π° $$ |
| SQL Features | β Modern | β Full | |
| File Joining | β Instant | β Yes | β Yes |
π― Perfect for:
- Financial analysts working with sensitive data
- Healthcare researchers with patient information
- Government agencies with classified data
- Consultants analyzing client confidential information
- Anyone who values data privacy and security
| Browser | Support | Notes |
|---|---|---|
| Chrome 90+ | β Full | Recommended |
| Firefox 88+ | β Full | WebAssembly support |
| Safari 14+ | β Full | Modern WebKit |
| Edge 90+ | β Full | Chromium-based |
- Use descriptive filenames - they become your table identifiers
sales_2024.csvβ accessible assales_2024table- Spaces in filenames are converted to underscores
- For large files (>5MB), consider splitting into smaller chunks
- Use
LIMITin your queries while exploring data - Export uses the full dataset regardless of display limits
Duck Tools supports cutting-edge SQL features that rival expensive enterprise databases:
-- πͺ Window Functions
SELECT
customer_id,
order_date,
amount,
SUM(amount) OVER (PARTITION BY customer_id ORDER BY order_date) as running_total,
LAG(amount, 1) OVER (PARTITION BY customer_id ORDER BY order_date) as prev_order
FROM orders;
-- π Common Table Expressions (CTEs)
WITH monthly_sales AS (
SELECT
DATE_TRUNC('month', order_date) as month,
SUM(amount) as total
FROM orders
GROUP BY month
)
SELECT month, total, total - LAG(total) OVER (ORDER BY month) as growth
FROM monthly_sales;
-- π Advanced Analytics
SELECT
product_name,
category,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY price) as median_price,
STDDEV(price) as price_volatility,
CORR(price, sales_volume) as price_sales_correlation
FROM products
GROUP BY product_name, category;
-- π
Date/Time Magic
SELECT
customer_id,
DATE_DIFF('day', first_order, last_order) as customer_lifespan_days,
DATE_PART('quarter', order_date) as quarter,
DATE_FORMAT(order_date, '%Y-%m') as year_month
FROM customer_orders;- Check file format (CSV/XLSX only)
- Verify file isn't corrupted
- Try converting to CSV if Excel file fails
- Check table names (t1, t2, t3...)
- Verify column names match your data
- Use
DESCRIBE t1to see table structure
- Use
LIMITfor large result sets - Add appropriate
WHEREclauses - Consider indexing for repeated queries
- DuckDB Documentation: duckdb.org/docs
- SQL Tutorial: W3Schools SQL
- Sample Datasets: Check the
demo/folder
- Fork the repository
- Create a feature branch
- Make your changes
- Run the tests:
just test - Update screenshots:
just screenshots - Submit a pull request
MIT License - feel free to use this in your own projects!
Made with β€οΈ for data analysts who value privacy and performance
No servers, no tracking, no limits - just pure SQL power in your browser!