Skip to content

jim-my/duck-tools

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

39 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

πŸ¦† Duck Tools - SQL Analytics for CSV/Excel Files

πŸ”’ 100% Private β€’ 🌐 Works Offline β€’ πŸ’ͺ Join Files Instantly
Powerful client-side data analysis tool that lets you join, query, and analyze CSV/Excel files using full SQL syntax - your data never leaves your computer!

Live Site Offline No Upload Privacy First No Server

➑️ Live Site: https://jim-my.github.io/duck-tools/

Duck Tools Interface

πŸ” Why Duck Tools? (Privacy + Power)

🚫 No Data Upload = Zero Security Risk

  • Your sensitive data never leaves your computer
  • No cloud servers, no data breaches, no privacy concerns
  • Perfect for confidential business data, personal finances, or sensitive research

🌐 Works Completely Offline

  • No internet connection required after initial page load
  • Perfect for air-gapped environments or restricted networks
  • Analyze data anywhere - plane, train, or secure facility

πŸ’ͺ Modern SQL Engine (DuckDB)

  • Latest SQL features: Window functions, CTEs, JSON processing, regex
  • PostgreSQL compatibility: Use familiar SQL syntax and functions
  • Analytics powerhouse: Built for data analysis, not just storage
  • Zero setup: Enterprise database features without the enterprise complexity

File Joining Demo ↗️ Drag two CSV files, write one SQL query, get instant insights - all without uploading anything!

✨ Features

πŸš€ Instant File Processing

  • Drag & drop CSV and Excel files (.xlsx, .csv)
  • Automatic table creation with smart aliasing (t1, t2, t3...)
  • Real-time file validation and error handling
  • Header detection with toggle option for CSV files

πŸ’ͺ Full SQL Power

  • Complete DuckDB SQL engine running in your browser
  • Syntax highlighting with CodeMirror editor
  • JOINs, aggregations, window functions - everything you need
  • Query examples to get you started quickly

πŸ“Š Smart Results Display

  • Responsive table with smooth scrolling
  • Export to CSV with one click
  • Performance optimized for large datasets
  • Real-time query statistics

πŸ”’ Privacy & Security

  • 100% client-side processing - your data never leaves your computer
  • No server required - works completely offline
  • No data collection - total privacy guaranteed

🎯 Quick Start - Join Files in 30 Seconds!

1. Open the App

Simply open index.html in any modern browser - no installation, no signup, no data upload!

2. Drop Multiple Files

Step 1: Empty App

Step 2: First File

Step 3: Two Files Ready

Drag and drop multiple CSV or Excel files. Each file instantly becomes a table (t1, t2, t3...) - all processing happens locally!

3. Write JOIN Queries

Step 4: JOIN Query

This is where the magic happens! Write SQL to join your files - something that normally requires a database setup:

-- πŸš€ JOIN TWO FILES INSTANTLY!
SELECT 
    p.name as product_name,
    p.category,
    s.sales as sales_amount,
    s.region
FROM t1 p
JOIN t2 s ON p.id = s.product_id
ORDER BY s.sales DESC;

4. Get Instant Results

Step 5: JOIN Results

Boom! Your files are joined and analyzed - all without uploading a single byte to any server!

5. Export & Share

Download your results as CSV. Share the insights, not the sensitive data.

πŸ“– Use Cases

πŸ” Data Analysis

-- Find top-selling products by category
SELECT category, name, SUM(sales) as total_sales
FROM products p
JOIN sales s ON p.id = s.product_id
GROUP BY category, name
ORDER BY total_sales DESC
LIMIT 10;

πŸ“ˆ Business Intelligence

-- Calculate customer lifetime value
SELECT 
    customer_id,
    COUNT(*) as orders,
    SUM(amount) as total_spent,
    AVG(amount) as avg_order_value,
    MAX(order_date) as last_order
FROM orders
GROUP BY customer_id
HAVING total_spent > 1000
ORDER BY total_spent DESC;

πŸ”— Data Joining

-- Combine data from multiple sources
SELECT 
    u.name,
    u.email,
    p.plan_name,
    COUNT(o.id) as order_count,
    SUM(o.amount) as revenue
FROM users u
LEFT JOIN plans p ON u.plan_id = p.id
LEFT JOIN orders o ON u.id = o.customer_id
GROUP BY u.name, u.email, p.plan_name
ORDER BY revenue DESC;

πŸ› οΈ Development

Running Locally

# Serve the files (any static server works)
python -m http.server 8000
# or
npx serve .

Taking Screenshots (Automated)

# Update all screenshots automatically (works anywhere!)
just screenshots

# Run tests with browser automation
just test

# Serve locally from project directory
just serve

🎯 Fully Portable: The automation uses {{justfile_directory()}} so it works immediately after cloning - no path configuration needed!

Project Structure

duck-tools/
β”œβ”€β”€ index.html              # Main application
β”œβ”€β”€ css/
β”‚   └── style.css           # Styling
β”œβ”€β”€ js/
β”‚   β”œβ”€β”€ main.js            # Application coordinator
β”‚   β”œβ”€β”€ duckdb-manager.js  # Database operations
β”‚   β”œβ”€β”€ file-handler.js    # File processing
β”‚   β”œβ”€β”€ sql-editor.js      # SQL editor component
β”‚   β”œβ”€β”€ results-table.js   # Results display
β”‚   └── csv-exporter.js    # Export functionality
β”œβ”€β”€ tests/
β”‚   β”œβ”€β”€ test.html          # Test runner
β”‚   └── *.js               # Test suites
β”œβ”€β”€ demo/                  # Sample data files
└── screenshots/           # Documentation images

πŸ§ͺ Testing

# Run all tests
just test

# Run specific test suite
open tests/test.html

πŸš€ Advanced Features

Excel Multi-Sheet Support

  • Automatically detects multiple sheets
  • Load specific sheets on demand
  • Switch between sheets seamlessly

Performance Optimization

  • Virtual scrolling for large datasets
  • Web Workers for file processing
  • Memory management for long sessions

Error Recovery

  • Intelligent error messages with suggestions
  • Automatic retry for transient failures
  • Graceful degradation when features unavailable

🎨 Customization

Themes

The CSS uses CSS custom properties for easy theming:

:root {
    --primary-color: #2563eb;
    --success-color: #10b981;
    --error-color: #ef4444;
    /* ... customize colors ... */
}

SQL Examples

Add your own example queries in js/sql-editor.js:

const customExamples = [
    {
        name: "My Custom Query",
        sql: "SELECT * FROM my_table WHERE condition = 'value'"
    }
];

πŸ›‘οΈ Privacy & Security Comparison

Feature Duck Tools Cloud Analytics Traditional Database
Data Upload ❌ Never βœ… Always βœ… Required
Internet Required ❌ No (after load) βœ… Always βœ… For setup
Data Breach Risk ❌ Zero ⚠️ High ⚠️ Medium
Compliance Issues ❌ None ⚠️ Many ⚠️ Some
Setup Time ⚑ 0 seconds ⏱️ Hours ⏱️ Days
Cost πŸ’° Free πŸ’°πŸ’° $/month πŸ’°πŸ’°πŸ’° $$
SQL Features βœ… Modern ⚠️ Limited βœ… Full
File Joining βœ… Instant βœ… Yes βœ… Yes

🎯 Perfect for:

  • Financial analysts working with sensitive data
  • Healthcare researchers with patient information
  • Government agencies with classified data
  • Consultants analyzing client confidential information
  • Anyone who values data privacy and security

πŸ”§ Browser Compatibility

Browser Support Notes
Chrome 90+ βœ… Full Recommended
Firefox 88+ βœ… Full WebAssembly support
Safari 14+ βœ… Full Modern WebKit
Edge 90+ βœ… Full Chromium-based

πŸ’‘ Tips & Tricks

File Naming

  • Use descriptive filenames - they become your table identifiers
  • sales_2024.csv β†’ accessible as sales_2024 table
  • Spaces in filenames are converted to underscores

Performance

  • For large files (>5MB), consider splitting into smaller chunks
  • Use LIMIT in your queries while exploring data
  • Export uses the full dataset regardless of display limits

Modern SQL Features

Duck Tools supports cutting-edge SQL features that rival expensive enterprise databases:

-- πŸͺŸ Window Functions
SELECT 
    customer_id,
    order_date,
    amount,
    SUM(amount) OVER (PARTITION BY customer_id ORDER BY order_date) as running_total,
    LAG(amount, 1) OVER (PARTITION BY customer_id ORDER BY order_date) as prev_order
FROM orders;

-- πŸ“Š Common Table Expressions (CTEs)
WITH monthly_sales AS (
    SELECT 
        DATE_TRUNC('month', order_date) as month,
        SUM(amount) as total
    FROM orders
    GROUP BY month
)
SELECT month, total, total - LAG(total) OVER (ORDER BY month) as growth
FROM monthly_sales;

-- πŸ” Advanced Analytics
SELECT 
    product_name,
    category,
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY price) as median_price,
    STDDEV(price) as price_volatility,
    CORR(price, sales_volume) as price_sales_correlation
FROM products 
GROUP BY product_name, category;

-- πŸ“… Date/Time Magic
SELECT 
    customer_id,
    DATE_DIFF('day', first_order, last_order) as customer_lifespan_days,
    DATE_PART('quarter', order_date) as quarter,
    DATE_FORMAT(order_date, '%Y-%m') as year_month
FROM customer_orders;

πŸ› Troubleshooting

File Won't Load

  1. Check file format (CSV/XLSX only)
  2. Verify file isn't corrupted
  3. Try converting to CSV if Excel file fails

Query Errors

  1. Check table names (t1, t2, t3...)
  2. Verify column names match your data
  3. Use DESCRIBE t1 to see table structure

Performance Issues

  1. Use LIMIT for large result sets
  2. Add appropriate WHERE clauses
  3. Consider indexing for repeated queries

πŸ“š Resources

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Run the tests: just test
  5. Update screenshots: just screenshots
  6. Submit a pull request

πŸ“„ License

MIT License - feel free to use this in your own projects!


Made with ❀️ for data analysts who value privacy and performance

No servers, no tracking, no limits - just pure SQL power in your browser!

About

Duck Tools - SQL Analytics for CSV/Excel Files using DuckDB

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors