Releases: nao1215/filesql

v0.10.0

18 Dec 13:24
188d16e

Added

  • Custom Logger Support: Flexible logging system with slog integration
    • Logger interface: Simple logging interface with Debug, Info, Warn, Error, and With methods
    • ContextLogger interface: Extended logging interface with context-aware methods (DebugContext, InfoContext, WarnContext, ErrorContext)
    • NewSlogAdapter(): Adapter to use standard library slog.Logger with filesql's Logger interface
    • NewSlogContextAdapter(): Adapter for context-aware logging with slog.Logger
    • WithLogger(): Builder method to inject custom logger into the build and open process
    • nopLogger: Near-zero-overhead no-op logger implementation used as the default (benchmarked at ~0.2 ns/op)
    • Logging throughout build, validation, and database opening operations
    • Comprehensive test coverage and benchmarks for all logger implementations
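
The logger design listed above can be sketched roughly as follows. This is a minimal illustration, not filesql's actual code: the method signatures on Logger, the lower-case names newSlogAdapter and nopLogger as written here, and the captureInfo helper are all assumptions made for the example. Only the method names (Debug, Info, Warn, Error, With) come from the release notes.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
)

// Logger is a hypothetical stand-in for the interface named in the notes.
// The real filesql signatures may differ.
type Logger interface {
	Debug(msg string, args ...any)
	Info(msg string, args ...any)
	Warn(msg string, args ...any)
	Error(msg string, args ...any)
	With(args ...any) Logger
}

// slogAdapter forwards every call to a standard library *slog.Logger.
type slogAdapter struct{ l *slog.Logger }

func newSlogAdapter(l *slog.Logger) Logger { return slogAdapter{l: l} }

func (a slogAdapter) Debug(msg string, args ...any) { a.l.Debug(msg, args...) }
func (a slogAdapter) Info(msg string, args ...any)  { a.l.Info(msg, args...) }
func (a slogAdapter) Warn(msg string, args ...any)  { a.l.Warn(msg, args...) }
func (a slogAdapter) Error(msg string, args ...any) { a.l.Error(msg, args...) }
func (a slogAdapter) With(args ...any) Logger {
	return slogAdapter{l: a.l.With(args...)}
}

// nopLogger discards everything; its methods are empty so calls are cheap.
type nopLogger struct{}

func (nopLogger) Debug(string, ...any)   {}
func (nopLogger) Info(string, ...any)    {}
func (nopLogger) Warn(string, ...any)    {}
func (nopLogger) Error(string, ...any)   {}
func (n nopLogger) With(...any) Logger   { return n }

// Compile-time check that both types satisfy the interface.
var (
	_ Logger = slogAdapter{}
	_ Logger = nopLogger{}
)

// captureInfo logs one Info message through the adapter into a buffer.
// The timestamp is dropped so the output is deterministic.
func captureInfo(msg string, args ...any) string {
	var buf bytes.Buffer
	h := slog.NewTextHandler(&buf, &slog.HandlerOptions{
		ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
			if a.Key == slog.TimeKey && len(groups) == 0 {
				return slog.Attr{} // a zero Attr is discarded by the handler
			}
			return a
		},
	})
	newSlogAdapter(slog.New(h)).With("component", "builder").Info(msg, args...)
	return buf.String()
}

func main() {
	fmt.Print(captureInfo("opening database", "files", 2))
}
```

In real use the adapter would be built by filesql's NewSlogAdapter() and passed to WithLogger(); check the package documentation for the exact signatures.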

Changed

  • Documentation Updates: Added Custom Logger section to all README files (7 languages: EN, ES, FR, JA, KO, RU, ZH-CN)
    • Usage examples with slog integration
    • Logger and ContextLogger interface definitions
    • Performance benchmark comparison table

v0.9.0

18 Dec 08:15
2c88c3a

Added

  • Read-Only Database Mode: New ReadOnlyDB wrapper for safe read-only access to databases
    • NewReadOnlyDB(db): Wraps existing *sql.DB to prevent write operations
    • ReadOnlyDB.Query(), QueryContext(), QueryRow(), QueryRowContext(): Read operations work normally
    • ReadOnlyDB.Exec(), ExecContext(): Return ErrReadOnly for write operations (INSERT, UPDATE, DELETE, DROP, ALTER, CREATE, TRUNCATE, REPLACE, UPSERT)
    • ReadOnlyDB.Prepare(), PrepareContext(): Rejects preparation of write statements
    • ReadOnlyDB.Begin(), BeginTx(): Returns ReadOnlyTx for read-only transactions
    • ReadOnlyDB.Ping(), PingContext(), Close(), DB(): Standard database operations
    • ReadOnlyStmt: Read-only prepared statement wrapper
    • ReadOnlyTx: Read-only transaction wrapper with same protections
    • DBBuilder.OpenReadOnly(ctx): Convenience method to open database in read-only mode
    • ErrReadOnly: Sentinel error for rejected write operations
    • Useful for audit scenarios where data viewing without modification risk is required
  • ACHTableInfo Struct: New struct for managing ACH table name information
    • ACHTableInfo.BaseName: The base table name derived from ACH filename
    • ACHTableInfo.FileHeaderTable(): Returns {baseName}_file_header
    • ACHTableInfo.BatchesTable(): Returns {baseName}_batches
    • ACHTableInfo.EntriesTable(): Returns {baseName}_entries
    • ACHTableInfo.AddendaTable(): Returns {baseName}_addenda
    • ACHTableInfo.IATBatchesTable(): Returns {baseName}_iat_batches
    • ACHTableInfo.IATEntriesTable(): Returns {baseName}_iat_entries
    • ACHTableInfo.IATAddendaTable(): Returns {baseName}_iat_addenda
    • ACHTableInfo.AllTableNames(): Returns all possible table names for the base name
    • GetACHTableInfos(): Returns []ACHTableInfo for all registered ACH files
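
To illustrate the read-only mode listed above, here is one way a write guard could work. It is a sketch under assumptions: checkReadOnly, errReadOnly, and the keyword-prefix approach are hypothetical and are not taken from filesql's source. Only the keyword list comes from the release notes.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errReadOnly plays the role of the ErrReadOnly sentinel (name is illustrative).
var errReadOnly = errors.New("write operation rejected: database is read-only")

// writeKeywords holds the statement types listed in the release notes.
var writeKeywords = map[string]bool{
	"INSERT": true, "UPDATE": true, "DELETE": true,
	"DROP": true, "ALTER": true, "CREATE": true,
	"TRUNCATE": true, "REPLACE": true, "UPSERT": true,
}

// checkReadOnly rejects a query whose first keyword is a write statement.
// Assumption: a prefix check is enough for the example; it would miss
// writes hidden behind a leading comment or a WITH clause.
func checkReadOnly(query string) error {
	fields := strings.Fields(query)
	if len(fields) == 0 {
		return nil
	}
	if writeKeywords[strings.ToUpper(fields[0])] {
		return errReadOnly
	}
	return nil
}

func main() {
	for _, q := range []string{
		"SELECT * FROM users",
		"  insert into users values (1)",
	} {
		fmt.Printf("%q -> %v\n", q, checkReadOnly(q))
	}
}
```

A wrapper in this style would call the guard before delegating Exec or Prepare to the underlying *sql.DB, and callers could test for the sentinel with errors.Is.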

Changed

  • Internal ACH Function: Made GetACHBaseTableNames private (getACHBaseTableNames) as it was only used internally
    • Use GetACHTableInfos() for public access to ACH table information
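
The ACH table naming scheme from this release can be sketched as below. The suffixes are taken from the release notes; the lower-case type name achTableInfo and the order returned by AllTableNames are assumptions for the example, not the library's definition.

```go
package main

import "fmt"

// achTableInfo mirrors the described ACHTableInfo struct (illustrative copy).
type achTableInfo struct {
	BaseName string
}

func (i achTableInfo) FileHeaderTable() string { return i.BaseName + "_file_header" }
func (i achTableInfo) BatchesTable() string    { return i.BaseName + "_batches" }
func (i achTableInfo) EntriesTable() string    { return i.BaseName + "_entries" }
func (i achTableInfo) AddendaTable() string    { return i.BaseName + "_addenda" }
func (i achTableInfo) IATBatchesTable() string { return i.BaseName + "_iat_batches" }
func (i achTableInfo) IATEntriesTable() string { return i.BaseName + "_iat_entries" }
func (i achTableInfo) IATAddendaTable() string { return i.BaseName + "_iat_addenda" }

// AllTableNames returns every table name derived from the base name.
// Assumption: the ordering here is arbitrary.
func (i achTableInfo) AllTableNames() []string {
	return []string{
		i.FileHeaderTable(), i.BatchesTable(), i.EntriesTable(), i.AddendaTable(),
		i.IATBatchesTable(), i.IATEntriesTable(), i.IATAddendaTable(),
	}
}

func main() {
	info := achTableInfo{BaseName: "payroll"}
	for _, name := range info.AllTableNames() {
		fmt.Println(name)
	}
}
```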

v0.8.0

11 Dec 11:47
f68bafc

Added

  • New Compression Formats: Added support for 4 new compression formats via fileparser v0.2.0
    • zlib (.z) - Standard DEFLATE compression
    • snappy (.snappy) - Google's high-speed compression
    • s2 (.s2) - Extension of Snappy with faster compression and decompression
    • lz4 (.lz4) - Extremely fast compression
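
As a rough illustration of how the new extensions might be recognized, the sketch below maps a file path to a compression name. The function compressionFromPath is hypothetical and is not part of filesql or fileparser; only the extension-to-format pairs come from the list above.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// compressionFromPath returns the compression format implied by the final
// extension, or "" when the file does not use one of the four new formats.
func compressionFromPath(path string) string {
	switch strings.ToLower(filepath.Ext(path)) {
	case ".z":
		return "zlib"
	case ".snappy":
		return "snappy"
	case ".s2":
		return "s2"
	case ".lz4":
		return "lz4"
	}
	return ""
}

func main() {
	for _, p := range []string{"sales.csv.lz4", "logs.ltsv.S2", "plain.csv"} {
		fmt.Printf("%s -> %q\n", p, compressionFromPath(p))
	}
}
```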

v0.7.0

11 Dec 08:08
6713704

Changed

  • Migrated from internal github.com/nao1215/filesql/parser to external github.com/nao1215/fileparser for file parsing
  • Updated all internal references from parser. to fileparser.

Removed

  • Internal parser package (now using github.com/nao1215/fileparser v0.1.0 as external dependency)

v0.6.0

09 Dec 11:48
19c85e8

Added

  • Public Parser Package (6271e5ef): Exposed the internal parser as a public API for use in external projects
    • New parser package: Standalone file parsing without SQLite dependency
      • parser.Parse(): Parse CSV, TSV, LTSV, XLSX, and Parquet files from io.Reader
      • parser.DetectFileType(): Automatic file type detection from file path
      • parser.BaseFileType(): Get base file type from potentially compressed file types
    • Type exports: TableData, ColumnType, FileType types for working with parsed data
    • Parquet support: Full Parquet parsing with parser/parquet.go
    • XLSX support: Excel file parsing with parser/xlsx.go
    • Comprehensive test coverage: 90%+ coverage for the parser package
  • ORM Integration Examples (281ede2): Added example code for popular Go ORMs and query builders
    • GORM: Full GORM integration example with model definitions
    • Bun: Bun ORM example with struct scanning
    • Ent: Facebook's Ent framework example with generated code
    • sqlx: sqlx example with struct tags
    • sqlc: sqlc example with generated type-safe queries
    • Squirrel: Squirrel query builder example
    • Basic: Standard library database/sql example
    • Multi-format: Example combining CSV, TSV, and LTSV files
  • FileType.String() Method: Added fmt.Stringer implementation for FileType enum
    • Human-readable format names for logging and debugging
    • Returns names like "CSV", "TSV", "LTSV", "XLSX", "Parquet", etc.
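
The fmt.Stringer pattern mentioned for FileType looks roughly like this. The enum below is a hypothetical stand-in: the constant names, their order, and the "Unknown" fallback are assumptions, and only the returned names ("CSV", "TSV", "LTSV", "XLSX", "Parquet") come from the release notes.

```go
package main

import "fmt"

// fileType is an illustrative enum, not the library's FileType.
type fileType int

const (
	fileTypeCSV fileType = iota
	fileTypeTSV
	fileTypeLTSV
	fileTypeXLSX
	fileTypeParquet
)

// String implements fmt.Stringer so values print as readable names.
func (t fileType) String() string {
	switch t {
	case fileTypeCSV:
		return "CSV"
	case fileTypeTSV:
		return "TSV"
	case fileTypeLTSV:
		return "LTSV"
	case fileTypeXLSX:
		return "XLSX"
	case fileTypeParquet:
		return "Parquet"
	}
	return "Unknown" // assumed fallback for values outside the enum
}

func main() {
	// fmt picks up the String method automatically.
	fmt.Println(fileTypeCSV, fileTypeParquet)
}
```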

Changed

  • Documentation Updates: Enhanced README files across all 7 languages
    • Added fileprep project reference (e3705a7)
    • Fixed project link formatting (dea615e)
    • Updated fileprep section (d7905b0)

Technical Details

  • Architecture: Parser package enables lightweight file parsing without database overhead
  • Compatibility: Parser package can be used independently of the main filesql package
  • Testing: Added comprehensive test suites for parser, types, and error handling

v0.5.0

05 Dec 15:58
e44d7bc

Added

  • Benchmark Tests (2852ea2): Added benchmark infrastructure for performance testing
    • New make benchmark target in Makefile for running benchmark tests
    • Benchmark tests isolated with //go:build benchmark tag to prevent execution during regular tests
    • BenchmarkOpenContext and BenchmarkOpenContextParallel for measuring CSV loading performance

Improved

  • Major Performance Optimization (d20b3c8, e95a5bf): Significantly improved file loading performance
    • 55% shorter load time: Reduced 100,000-row CSV loading time from ~960ms to ~430ms
    • 12% less memory: Reduced memory usage from ~161MB to ~141MB
    • Transaction batching: Wrapped all INSERT operations in a single transaction to reduce SQLite disk sync operations
    • Slice reuse: Pre-allocate and reuse value slices in insertChunkData() to reduce allocations
    • Pre-allocation in type inference: Optimized newColumnInfoList() and inferColumnsInfo() with pre-allocated column value slices

Fixed

  • Data Integrity in Chunk Insertion (b191d93): Fixed potential data corruption issues in insertChunkData()
    • Stale value prevention: Fixed issue where records with fewer columns than headers could retain stale values from previous rows
    • Extra column detection: Added validation to fail fast when records have more columns than headers, preventing silent data truncation
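
The two fixes above interact with the slice reuse introduced in this release: a reused value slice must be fully overwritten for every record. The sketch below shows the idea. fillRow is a hypothetical helper, not insertChunkData() itself, and filling missing columns with nil is an assumption about the intended behavior.

```go
package main

import "fmt"

// fillRow copies one record into a reused values slice whose length equals
// the number of header columns.
func fillRow(values []any, record []string) error {
	// Fail fast instead of silently truncating extra columns.
	if len(record) > len(values) {
		return fmt.Errorf("record has %d columns, header has %d",
			len(record), len(values))
	}
	for i := range values {
		if i < len(record) {
			values[i] = record[i]
		} else {
			values[i] = nil // clear whatever the previous row left here
		}
	}
	return nil
}

func main() {
	values := make([]any, 3) // allocated once, reused for every row

	_ = fillRow(values, []string{"1", "alice", "30"})
	fmt.Println(values...)

	// A short record must not inherit "alice" and "30" from the row before.
	_ = fillRow(values, []string{"2"})
	fmt.Println(values...)

	// A record with more columns than the header is rejected.
	fmt.Println(fillRow(values, []string{"3", "bob", "41", "extra"}))
}
```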

Changed

  • Documentation Updates (17a42fa): Added benchmark results to all README files (7 languages)
    • Performance metrics: ~430ms execution time, ~141MB memory for 100,000-row CSV

Dependencies

  • github.com/klauspost/compress: 1.18.1 → 1.18.2

v0.4.6

27 Nov 08:09
c44d4cf

Added

  • Header-Only File Support (PR #67, 5de8801): Files with headers but no data records are now supported
    • CSV, TSV, Parquet, and XLSX formats can now be loaded with only header rows
    • Creates empty SQLite tables with correct column names (all columns as TEXT type)
    • Useful for schema definition files and template files
    • Example: A CSV file containing only id,name,age will create a table with those columns but zero rows
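
The header-only behavior can be pictured with a small sketch that turns a CSV header into a CREATE TABLE statement with TEXT columns. createTableSQL and quoteIdent are hypothetical helpers written for this example; filesql's actual statement generation may differ.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// quoteIdent wraps an identifier in double quotes, doubling embedded quotes.
func quoteIdent(s string) string {
	return `"` + strings.ReplaceAll(s, `"`, `""`) + `"`
}

// createTableSQL reads only the first CSV row and declares every column as
// TEXT, since there is no data to infer types from.
func createTableSQL(table, csvText string) (string, error) {
	header, err := csv.NewReader(strings.NewReader(csvText)).Read()
	if err != nil {
		return "", fmt.Errorf("read header: %w", err)
	}
	cols := make([]string, len(header))
	for i, name := range header {
		cols[i] = quoteIdent(name) + " TEXT"
	}
	return fmt.Sprintf("CREATE TABLE %s (%s)",
		quoteIdent(table), strings.Join(cols, ", ")), nil
}

func main() {
	stmt, err := createTableSQL("users", "id,name,age\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(stmt)
}
```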

Fixed

  • LTSV Error Handling: Improved error messages for invalid LTSV data
    • Now returns a "no valid LTSV keys found" error instead of silently creating empty tables
    • LTSV format requires key:value pairs, so the header-only concept does not apply

Changed

  • Dependencies: Updated library dependencies
    • modernc.org/sqlite: 1.40.0 → 1.40.1
    • github.com/klauspost/compress: 1.18.0 → 1.18.1
    • github.com/xuri/excelize/v2: 2.9.1 → 2.10.0
    • golang.org/x/crypto: Security update
    • actions/checkout: 4 → 6

v0.4.5

17 Sep 11:33
adf18f1

Fixed

  • Table Name Sanitization: Fixed SQL syntax errors caused by special characters in file names
    • Applied sanitizeTableName() to all table name generation paths
    • Hyphens, spaces, and special characters are now automatically converted to underscores
    • Example: "user-data.csv" → table "user_data", "my file.csv" → table "my_file"
    • Updated test expectations to match sanitized table names
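
A sanitization rule of this kind might look like the sketch below. tableNameFromFile is a hypothetical helper and not the library's sanitizeTableName(); treating every rune that is not a letter, digit, or underscore as "special" is an assumption, and compressed double extensions are not handled.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
	"unicode"
)

// tableNameFromFile strips the directory and final extension, then replaces
// every character outside [letters, digits, _] with an underscore.
func tableNameFromFile(path string) string {
	base := filepath.Base(path)
	base = strings.TrimSuffix(base, filepath.Ext(base))

	var b strings.Builder
	for _, r := range base {
		if unicode.IsLetter(r) || unicode.IsDigit(r) || r == '_' {
			b.WriteRune(r)
		} else {
			b.WriteRune('_')
		}
	}
	return b.String()
}

func main() {
	for _, p := range []string{"user-data.csv", "my file.csv"} {
		fmt.Printf("%q -> %q\n", p, tableNameFromFile(p))
	}
}
```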

Improved

  • API Documentation: Enhanced documentation for public APIs to clarify table name sanitization
    • Updated Open(), OpenContext(), and DBBuilder.Open() method documentation
    • Added examples showing special character conversion in table names
    • Improved sanitizeTableName() function documentation with detailed transformation rules
  • Development Experience: Optimized test execution time for local development
    • Added GitHub Actions environment checks to skip slow tests locally
    • Reduced local test execution time by 63% (from ~55s to ~20s)
    • Maintained full test coverage in CI/CD while improving developer productivity

Technical Details

  • Breaking Change Prevention: Preserved existing tableFromFilePath() behavior for backward compatibility
  • Test Coverage: Maintained 80.7% test coverage with updated test expectations
  • Performance: No impact on runtime performance, only development-time improvements

v0.4.4

03 Sep 12:44
958d5ae

Added

  • Memory Management System (PR #49, d128a27): Comprehensive memory optimization for large file processing
    • Introduced MemoryPool for efficient reuse of byte slices, record slices, and string slices
    • Added MemoryLimit with configurable thresholds and graceful degradation
    • Implemented automatic memory monitoring with adaptive chunk size reduction
    • Enhanced XLSX processing with chunked streaming and memory-optimized operations
    • Added comprehensive test coverage (800+ lines) with benchmarks and concurrent access validation
  • Compression Handler (PR #48, ac04ae9): Factory pattern for file compression handling
    • Unified compression/decompression interface supporting gzip, bzip2, xz, and zstd formats
    • Clean resource management with automatic cleanup functions
    • Comprehensive test suite with end-to-end compression validation
    • Performance benchmarks for different compression algorithms
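
The factory-with-cleanup idea behind the compression handler can be sketched with the standard library alone. newDecompressor and gzipRoundTrip are hypothetical names, not filesql's API, and only gzip and bzip2 are covered here because xz and zstd require third-party packages.

```go
package main

import (
	"bytes"
	"compress/bzip2"
	"compress/gzip"
	"fmt"
	"io"
)

// newDecompressor returns a reader for the given format plus a cleanup
// function the caller must invoke when done.
func newDecompressor(format string, r io.Reader) (io.Reader, func() error, error) {
	noop := func() error { return nil }
	switch format {
	case "":
		return r, noop, nil // uncompressed input passes through
	case "gzip":
		zr, err := gzip.NewReader(r)
		if err != nil {
			return nil, nil, fmt.Errorf("open gzip stream: %w", err)
		}
		return zr, zr.Close, nil
	case "bzip2":
		return bzip2.NewReader(r), noop, nil // bzip2 readers need no Close
	}
	return nil, nil, fmt.Errorf("unsupported compression format %q", format)
}

// gzipRoundTrip compresses text in memory and reads it back via the factory.
func gzipRoundTrip(text string) (string, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write([]byte(text)); err != nil {
		return "", err
	}
	if err := zw.Close(); err != nil {
		return "", err
	}

	r, cleanup, err := newDecompressor("gzip", &buf)
	if err != nil {
		return "", err
	}
	defer cleanup()

	out, err := io.ReadAll(r)
	return string(out), err
}

func main() {
	out, err := gzipRoundTrip("id,name\n1,alice\n")
	fmt.Printf("%q %v\n", out, err)
}
```

Returning the cleanup function alongside the reader keeps resource handling uniform across formats, whether or not the underlying reader has a Close method.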

Changed

  • Architecture Refactoring (PR #47, c228ffd): Split DBBuilder into focused processors following Single Responsibility Principle
    • Created dedicated FileProcessor for file-specific operations
    • Introduced StreamProcessor for streaming data processing
    • Added Validator for centralized validation logic
    • Improved code maintainability and testability through separation of concerns
  • API Breaking Change: Exported Record type (was previously unexported record)
    • Fixed lint issues with exported methods returning unexported types
    • Added comprehensive documentation for migration guidance

Fixed

  • Memory Pool Resource Management: Fixed critical backing array tracking issue
    • Resolved potential memory corruption when slice capacity exceeded original allocation
    • Implemented proper resource cleanup with original slice tracking
  • Performance Optimization: Reduced runtime.ReadMemStats call frequency
    • Changed from every 100 records to every 1000 records (10x fewer ReadMemStats calls)
    • Added detailed comments explaining the performance trade-offs
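
The sampling change above amounts to checking memory on a counter rather than on every record. A minimal sketch, assuming a simple modulo check (processRecords is a hypothetical function, not filesql's code):

```go
package main

import (
	"fmt"
	"runtime"
)

// processRecords walks n records and samples memory statistics once every
// interval records. It returns how many samples were taken.
func processRecords(n, interval int) int {
	var m runtime.MemStats
	checks := 0
	for i := 1; i <= n; i++ {
		// ... per-record work would go here ...
		if i%interval == 0 {
			// ReadMemStats briefly stops the world, so it is sampled sparsely.
			runtime.ReadMemStats(&m)
			checks++
		}
	}
	return checks
}

func main() {
	fmt.Println("interval 100: ", processRecords(10000, 100), "checks")
	fmt.Println("interval 1000:", processRecords(10000, 1000), "checks")
}
```

The trade-off is reaction time: with a larger interval, a memory spike may go unnoticed for up to 1000 records before the chunk size is adjusted.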

Technical Improvements

  • Enhanced Documentation: Added comprehensive godoc comments for all new types
    • MemoryPool and MemoryLimit usage examples and thread safety guarantees
    • Performance notes and best practices for memory management
  • Code Quality: Replaced magic numbers with named constants throughout memory management
  • Integer Overflow Safety: Enhanced overflow protection with detailed documentation for edge cases
  • Test Coverage: Maintained 81.2% test coverage with extensive memory management test suite

v0.4.3

02 Sep 13:11
f0e7153

Fixed

  • DBBuilder Refactoring (PR #45, 6379425): Major architectural improvements for better maintainability
    • Refactored DBBuilder implementation for cleaner code structure
    • Improved error handling and validation in builder pattern
    • Enhanced code organization and readability

Technical Improvements

  • LLM Settings Enhancement (PR #44, 2575759): Updated LLM configuration for unit testing
    • Improved development workflow with better AI assistance configuration
    • Enhanced test environment setup for LLM-powered development tools
  • Integration Testing Expansion (PR #43, 48eadbe): Added comprehensive integration test coverage
    • Enhanced test coverage with real-world usage scenarios
    • Improved reliability and robustness validation
  • Sample Data Addition (PR #41, 0adba40): Added sample CSV files for testing and demonstration
    • Enhanced testing capabilities with realistic sample data
    • Improved documentation with practical examples