Releases · nao1215/filesql
v0.10.0
Added
- Custom Logger Support: Flexible logging system with slog integration
  - `Logger` interface: Simple logging interface with `Debug`, `Info`, `Warn`, `Error`, and `With` methods
  - `ContextLogger` interface: Extended logging interface with context-aware methods (`DebugContext`, `InfoContext`, `WarnContext`, `ErrorContext`)
  - `NewSlogAdapter()`: Adapter to use the standard library `slog.Logger` with filesql's `Logger` interface
  - `NewSlogContextAdapter()`: Adapter for context-aware logging with `slog.Logger`
  - `WithLogger()`: Builder method to inject a custom logger into the build and open process (see the sketch after this list)
  - `nopLogger`: Zero-overhead no-op logger implementation used as the default (benchmarked at ~0.2 ns/op)
- Logging throughout build, validation, and database opening operations
- Comprehensive test coverage and benchmarks for all logger implementations
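A minimal sketch of plugging `slog` in through these hooks. `NewSlogAdapter` and `WithLogger` are from this release; the surrounding builder chain (`NewBuilder`, `AddPath`, the shape of the `Open` call) is assumed for illustration:

```go
package main

import (
	"context"
	"log/slog"
	"os"

	"github.com/nao1215/filesql"
)

func main() {
	// NewSlogAdapter and WithLogger come from this release; the exact
	// builder construction and open sequence are assumptions here.
	logger := filesql.NewSlogAdapter(slog.New(slog.NewTextHandler(os.Stderr, nil)))

	db, err := filesql.NewBuilder().
		AddPath("users.csv").
		WithLogger(logger).
		Open(context.Background())
	if err != nil {
		slog.Error("open failed", "err", err)
		os.Exit(1)
	}
	defer db.Close()
}
```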
Changed
- Documentation Updates: Added Custom Logger section to all README files (7 languages: EN, ES, FR, JA, KO, RU, ZH-CN)
  - Usage examples with slog integration
  - `Logger` and `ContextLogger` interface definitions
  - Performance benchmark comparison table
v0.9.0
Added
- Read-Only Database Mode: New `ReadOnlyDB` wrapper for safe read-only access to databases
  - `NewReadOnlyDB(db)`: Wraps an existing `*sql.DB` to prevent write operations
  - `ReadOnlyDB.Query()`, `QueryContext()`, `QueryRow()`, `QueryRowContext()`: Read operations work normally
  - `ReadOnlyDB.Exec()`, `ExecContext()`: Return `ErrReadOnly` for write operations (INSERT, UPDATE, DELETE, DROP, ALTER, CREATE, TRUNCATE, REPLACE, UPSERT)
  - `ReadOnlyDB.Prepare()`, `PrepareContext()`: Reject preparation of write statements
  - `ReadOnlyDB.Begin()`, `BeginTx()`: Return a `ReadOnlyTx` for read-only transactions
  - `ReadOnlyDB.Ping()`, `PingContext()`, `Close()`, `DB()`: Standard database operations
  - `ReadOnlyStmt`: Read-only prepared statement wrapper
  - `ReadOnlyTx`: Read-only transaction wrapper with the same protections
  - `DBBuilder.OpenReadOnly(ctx)`: Convenience method to open a database in read-only mode
  - `ErrReadOnly`: Sentinel error for rejected write operations
- Useful for audit scenarios where data must be viewed without risk of modification
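A minimal sketch of the wrapper in use, assuming `Open` returns a `*sql.DB` as in earlier releases (file and table names are illustrative):

```go
package main

import (
	"errors"
	"log"

	"github.com/nao1215/filesql"
)

func main() {
	db, err := filesql.Open("users.csv") // illustrative input file
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	rdb := filesql.NewReadOnlyDB(db)

	rows, err := rdb.Query("SELECT id, name FROM users") // reads pass through
	if err != nil {
		log.Fatal(err)
	}
	rows.Close()

	if _, err := rdb.Exec("DELETE FROM users"); errors.Is(err, filesql.ErrReadOnly) {
		log.Println("write rejected:", err) // every write statement is refused
	}
}
```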
- ACHTableInfo Struct: New struct for managing ACH table name information
  - `ACHTableInfo.BaseName`: The base table name derived from the ACH filename
  - `ACHTableInfo.FileHeaderTable()`: Returns `{baseName}_file_header`
  - `ACHTableInfo.BatchesTable()`: Returns `{baseName}_batches`
  - `ACHTableInfo.EntriesTable()`: Returns `{baseName}_entries`
  - `ACHTableInfo.AddendaTable()`: Returns `{baseName}_addenda`
  - `ACHTableInfo.IATBatchesTable()`: Returns `{baseName}_iat_batches`
  - `ACHTableInfo.IATEntriesTable()`: Returns `{baseName}_iat_entries`
  - `ACHTableInfo.IATAddendaTable()`: Returns `{baseName}_iat_addenda`
  - `ACHTableInfo.AllTableNames()`: Returns all possible table names for the base name
  - `GetACHTableInfos()`: Returns `[]ACHTableInfo` for all registered ACH files
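An illustrative loop over the helpers above; `GetACHTableInfos()` is called without arguments exactly as listed, and how ACH files get registered beforehand is not covered by these notes:

```go
package main

import (
	"fmt"

	"github.com/nao1215/filesql"
)

func main() {
	// Assumes GetACHTableInfos is a package-level function, as listed above.
	for _, info := range filesql.GetACHTableInfos() {
		fmt.Println(info.BaseName)        // e.g. "payroll"
		fmt.Println(info.EntriesTable())  // "payroll_entries"
		fmt.Println(info.AllTableNames()) // every table name derived from the base
	}
}
```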
Changed
- Internal ACH Function: Made `GetACHBaseTableNames` private (`getACHBaseTableNames`) as it was only used internally
- Use `GetACHTableInfos()` for public access to ACH table information
v0.8.0
Added
- New Compression Formats: Added support for 4 new compression formats via fileparser v0.2.0
  - zlib (`.z`): Standard DEFLATE compression
  - snappy (`.snappy`): Google's high-speed compression
  - s2 (`.s2`): A faster, improved extension of Snappy
  - lz4 (`.lz4`): Extremely fast compression
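A sketch of loading one of the new formats; the `.csv.lz4` path is illustrative, and the same should hold for `.z`, `.snappy`, and `.s2` inputs:

```go
package main

import (
	"log"

	"github.com/nao1215/filesql"
)

func main() {
	// Compression is detected from the file extension, so a compressed
	// CSV loads like a plain one (assumes Open accepts compressed paths
	// as it does for gzip/bzip2/xz/zstd in earlier releases).
	db, err := filesql.Open("sales.csv.lz4")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
}
```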
v0.7.0
Changed
- Migrated from the internal `github.com/nao1215/filesql/parser` to the external `github.com/nao1215/fileparser` for file parsing
- Updated all internal references from `parser.` to `fileparser.`
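For callers of the old public parser (see v0.6.0 below), the migration is an import swap, with call sites changing from `parser.Parse(...)` to `fileparser.Parse(...)`:

```go
import (
	// before (<= v0.6.x):
	// "github.com/nao1215/filesql/parser"

	// after (v0.7.0+):
	"github.com/nao1215/fileparser"
)
```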
Removed
- Internal `parser` package (now using `github.com/nao1215/fileparser v0.1.0` as an external dependency)
v0.6.0
Added
- Public Parser Package (6271e5ef): Exposed the internal parser as a public API for use in external projects
  - New `parser` package: Standalone file parsing without the SQLite dependency
    - `parser.Parse()`: Parse CSV, TSV, LTSV, XLSX, and Parquet files from an `io.Reader`
    - `parser.DetectFileType()`: Automatic file type detection from a file path
    - `parser.BaseFileType()`: Get the base file type from potentially compressed file types
  - Type exports: `TableData`, `ColumnType`, and `FileType` types for working with parsed data
  - Parquet support: Full Parquet parsing in `parser/parquet.go`
  - XLSX support: Excel file parsing in `parser/xlsx.go`
  - Comprehensive test coverage: 90%+ coverage for the parser package
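A rough sketch of driving the standalone parser; the notes only say that `Parse` reads from an `io.Reader` and `DetectFileType` inspects a path, so the exact signatures and return shapes here are assumptions:

```go
package main

import (
	"fmt"
	"os"

	"github.com/nao1215/filesql/parser"
)

func main() {
	f, err := os.Open("users.csv")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	// Assumed signatures: argument order and return values are guesses,
	// not confirmed by these release notes.
	fileType := parser.DetectFileType("users.csv")
	table, err := parser.Parse(fileType, f)
	if err != nil {
		panic(err)
	}
	fmt.Printf("parsed: %+v\n", table) // TableData, no SQLite involved
}
```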
- ORM Integration Examples (281ede2): Added example code for popular Go ORMs and query builders
  - GORM: Full GORM integration example with model definitions
  - Bun: Bun ORM example with struct scanning
  - Ent: Facebook's Ent framework example with generated code
  - sqlx: sqlx example with struct tags
  - sqlc: sqlc example with generated type-safe queries
  - Squirrel: Squirrel query builder example
  - Basic: Standard library database/sql example
  - Multi-format: Example combining CSV, TSV, and LTSV files
- FileType.String() Method: Added a `fmt.Stringer` implementation for the `FileType` enum
  - Human-readable format names for logging and debugging
  - Returns names like "CSV", "TSV", "LTSV", "XLSX", "Parquet", etc.
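For example (`FileTypeCSV` is a guess at the enum's constant naming; the `String()` behavior is what this release adds):

```go
var t parser.FileType = parser.FileTypeCSV // constant name is assumed
fmt.Printf("detected %s input\n", t)       // prints "detected CSV input"
```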
Changed
- Documentation Updates: Enhanced README files across all 7 languages
Technical Details
- Architecture: Parser package enables lightweight file parsing without database overhead
- Compatibility: Parser package can be used independently of the main filesql package
- Testing: Added comprehensive test suites for parser, types, and error handling
v0.5.0
Added
- Benchmark Tests (2852ea2): Added benchmark infrastructure for performance testing
  - New `make benchmark` target in the Makefile for running benchmark tests
  - Benchmark tests isolated with the `//go:build benchmark` tag to prevent execution during regular tests
  - New `BenchmarkOpenContext` and `BenchmarkOpenContextParallel` for measuring CSV loading performance
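An illustrative shape for such a gated benchmark; the repository's actual benchmarks may differ. Run with `make benchmark` or `go test -tags=benchmark -bench=.`:

```go
//go:build benchmark

package filesql_test

import (
	"context"
	"testing"

	"github.com/nao1215/filesql"
)

// Sketch only: the testdata path is illustrative, and OpenContext is
// assumed to return (*sql.DB, error).
func BenchmarkOpenContext(b *testing.B) {
	for i := 0; i < b.N; i++ {
		db, err := filesql.OpenContext(context.Background(), "testdata/sample.csv")
		if err != nil {
			b.Fatal(err)
		}
		db.Close()
	}
}
```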
Improved
- Major Performance Optimization (d20b3c8, e95a5bf): Significantly improved file loading performance
  - 55% faster execution: Reduced 100,000-row CSV loading time from ~960ms to ~430ms
  - 12% less memory: Reduced memory usage from ~161MB to ~141MB
  - Transaction batching: Wrapped all INSERT operations in a single transaction to reduce SQLite disk sync operations (see the sketch after this list)
  - Slice reuse: Pre-allocate and reuse value slices in `insertChunkData()` to reduce allocations
  - Pre-allocation in type inference: Optimized `newColumnInfoList()` and `inferColumnsInfo()` with pre-allocated column value slices
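The transaction-batching idea in generic `database/sql` terms (names are illustrative, not filesql internals): one commit for the whole load instead of an implicit transaction per INSERT:

```go
package load

import "database/sql"

// bulkInsert wraps every INSERT in a single transaction, so SQLite syncs
// the journal once at Commit instead of once per row.
func bulkInsert(db *sql.DB, rows [][]any) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	stmt, err := tx.Prepare("INSERT INTO t (a, b) VALUES (?, ?)")
	if err != nil {
		tx.Rollback()
		return err
	}
	defer stmt.Close()

	for _, r := range rows {
		if _, err := stmt.Exec(r...); err != nil {
			tx.Rollback()
			return err
		}
	}
	return tx.Commit() // a single disk sync for the whole batch
}
```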
Fixed
- Data Integrity in Chunk Insertion (b191d93): Fixed potential data corruption issues in `insertChunkData()`
  - Stale value prevention: Fixed an issue where records with fewer columns than headers could retain stale values from previous rows
  - Extra column detection: Added validation to fail fast when records have more columns than headers, preventing silent data truncation
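A sketch of the two guards for a reused values slice (names are illustrative): reject rows wider than the header, and clear leftovers so short rows cannot inherit values from the previous row:

```go
package load

import (
	"database/sql"
	"fmt"
)

func insertChunk(stmt *sql.Stmt, headers []string, records [][]string) error {
	values := make([]any, len(headers)) // reused across rows
	for _, rec := range records {
		// Fail fast on extra columns instead of silently truncating.
		if len(rec) > len(headers) {
			return fmt.Errorf("record has %d columns, header defines %d", len(rec), len(headers))
		}
		// Clear the slice so a short row leaves no stale values behind.
		for i := range values {
			values[i] = nil
		}
		for i, v := range rec {
			values[i] = v
		}
		if _, err := stmt.Exec(values...); err != nil {
			return err
		}
	}
	return nil
}
```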
Changed
- Documentation Updates (17a42fa): Added benchmark results to all README files (7 languages)
  - Performance metrics: ~430ms execution time, ~141MB memory for a 100,000-row CSV
Dependencies
- `github.com/klauspost/compress`: 1.18.1 → 1.18.2
v0.4.6
Added
- Header-Only File Support (PR #67, 5de8801): Files with headers but no data records are now supported
  - CSV, TSV, Parquet, and XLSX formats can now be loaded with only header rows
  - Creates empty SQLite tables with correct column names (all columns as TEXT type)
  - Useful for schema definition files and template files
  - Example: A CSV file containing only `id,name,age` will create a table with those columns but zero rows
Fixed
- LTSV Error Handling: Improved error messages for invalid LTSV data
  - Now correctly returns a "no valid LTSV keys found" error instead of silently creating empty tables
  - LTSV format requires `key:value` pairs, so the header-only concept does not apply
Changed
- Dependencies: Updated library dependencies
  - `modernc.org/sqlite`: 1.40.0 → 1.40.1
  - `github.com/klauspost/compress`: 1.18.0 → 1.18.1
  - `github.com/xuri/excelize/v2`: 2.9.1 → 2.10.0
  - `golang.org/x/crypto`: Security update
  - `actions/checkout`: 4 → 6
v0.4.5
Fixed
- Table Name Sanitization: Fixed SQL syntax errors caused by special characters in file names
  - Applied `sanitizeTableName()` to all table name generation paths
  - Hyphens, spaces, and special characters are now automatically converted to underscores
  - Example: `"user-data.csv"` → table `"user_data"`; `"my file.csv"` → table `"my_file"`
  - Updated test expectations to match sanitized table names
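In practice (file name illustrative):

```go
package main

import (
	"log"

	"github.com/nao1215/filesql"
)

func main() {
	// The hyphen in the file name becomes an underscore, so the table
	// is queried as user_data.
	db, err := filesql.Open("user-data.csv")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	rows, err := db.Query("SELECT * FROM user_data")
	if err != nil {
		log.Fatal(err)
	}
	rows.Close()
}
```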
Improved
- API Documentation: Enhanced documentation for public APIs to clarify table name sanitization
  - Updated `Open()`, `OpenContext()`, and `DBBuilder.Open()` method documentation
  - Added examples showing special character conversion in table names
  - Improved `sanitizeTableName()` function documentation with detailed transformation rules
- Development Experience: Optimized test execution time for local development
  - Added GitHub Actions environment checks to skip slow tests locally (see the sketch after this list)
  - Reduced local test execution time by 63% (from ~55s to ~20s)
  - Maintained full test coverage in CI/CD while improving developer productivity
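The gate in test form (test name and body are illustrative); GitHub Actions sets `GITHUB_ACTIONS=true` in CI:

```go
package filesql_test

import (
	"os"
	"testing"
)

// Slow cases skip locally and run only in CI.
func TestLoadHugeCSV(t *testing.T) {
	if os.Getenv("GITHUB_ACTIONS") != "true" {
		t.Skip("slow test: runs only in GitHub Actions")
	}
	// ... exercise the large-file code path here
}
```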
Technical Details
- Breaking Change Prevention: Preserved existing `tableFromFilePath()` behavior for backward compatibility
- Test Coverage: Maintained 80.7% test coverage with updated test expectations
- Performance: No impact on runtime performance, only development-time improvements
v0.4.4
Added
- Memory Management System (PR #49, d128a27): Comprehensive memory optimization for large file processing
  - Introduced `MemoryPool` for efficient reuse of byte slices, record slices, and string slices (see the sketch after this list)
  - Added `MemoryLimit` with configurable thresholds and graceful degradation
  - Implemented automatic memory monitoring with adaptive chunk size reduction
  - Enhanced XLSX processing with chunked streaming and memory-optimized operations
  - Added comprehensive test coverage (800+ lines) with benchmarks and concurrent access validation
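A generic `sync.Pool` sketch of the slice-reuse idea behind `MemoryPool`; filesql's actual implementation differs in scope and detail:

```go
package pool

import "sync"

var recordPool = sync.Pool{
	New: func() any {
		s := make([]string, 0, 64) // starting capacity is an arbitrary guess
		return &s
	},
}

// borrowRecord hands out a reusable record slice.
func borrowRecord() *[]string { return recordPool.Get().(*[]string) }

// returnRecord resets the length but keeps the backing array for reuse.
func returnRecord(s *[]string) {
	*s = (*s)[:0]
	recordPool.Put(s)
}
```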
- Compression Handler (PR #48, ac04ae9): Factory pattern for file compression handling
  - Unified compression/decompression interface supporting gzip, bzip2, xz, and zstd formats
  - Clean resource management with automatic cleanup functions
  - Comprehensive test suite with end-to-end compression validation
  - Performance benchmarks for different compression algorithms
Changed
- Architecture Refactoring (PR #47, c228ffd): Split DBBuilder into focused processors following Single Responsibility Principle
  - Created a dedicated `FileProcessor` for file-specific operations
  - Introduced `StreamProcessor` for streaming data processing
  - Added `Validator` for centralized validation logic
  - Improved code maintainability and testability through separation of concerns
- API Breaking Change: Exported the `Record` type (previously the unexported `record`)
  - Fixed lint issues with exported methods returning unexported types
  - Added comprehensive documentation for migration guidance
Fixed
- Memory Pool Resource Management: Fixed a critical backing array tracking issue
  - Resolved potential memory corruption when slice capacity exceeded the original allocation
  - Implemented proper resource cleanup with original slice tracking
- Performance Optimization: Reduced `runtime.ReadMemStats` call frequency
  - Changed from every 100 records to every 1,000 records (a 10x reduction in monitoring overhead)
  - Added detailed comments explaining the performance trade-offs
Technical Improvements
- Enhanced Documentation: Added comprehensive godoc comments for all new types
  - `MemoryPool` and `MemoryLimit` usage examples and thread safety guarantees
  - Performance notes and best practices for memory management
- Code Quality: Replaced magic numbers with named constants throughout memory management
- Integer Overflow Safety: Enhanced overflow protection with detailed documentation for edge cases
- Test Coverage: Maintained 81.2% test coverage with extensive memory management test suite
v0.4.3
Fixed
- DBBuilder Refactoring (PR #45, 6379425): Major architectural improvements for better maintainability
  - Refactored the DBBuilder implementation for a cleaner code structure
  - Improved error handling and validation in the builder pattern
  - Enhanced code organization and readability
Technical Improvements
- LLM Settings Enhancement (PR #44, 2575759): Updated LLM configuration for unit testing
  - Improved development workflow with better AI assistance configuration
  - Enhanced test environment setup for LLM-powered development tools
- Integration Testing Expansion (PR #43, 48eadbe): Added comprehensive integration test coverage
  - Enhanced test coverage with real-world usage scenarios
  - Improved reliability and robustness validation
- Sample Data Addition (PR #41, 0adba40): Added sample CSV files for testing and demonstration
  - Enhanced testing capabilities with realistic sample data
  - Improved documentation with practical examples