Tags: twmb/avro
v1.7.1: audit fixes

Correctness
- timestamp-nanos encode errors on overflow instead of silent wrap
- json.Number exponent notation (1.5e3) accepted for int/long
- EncodeJSON float32 overflow errors instead of invalid '+Inf' literal

Symmetry
- decimal decode into float32/float64/string matches encode targets
- json.Number in arrays/maps handled like scalars

SchemaFor
- recursive Go types (linked lists, trees, mutual recursion)

Performance
- map[string]any record encode: zero-alloc fast path
v1.7.0: SchemaFor: type-alias tag, auto null default, fixed type naming

Add type-alias struct tag for named type aliases (record, enum, fixed) during schema evolution.

Add bracket syntax for multi-value alias and type-alias tags (alias=[a,b]). This is safe because brackets are not valid in Avro names per the spec.

Pointer types (*T) now automatically get "default": null, making nullable fields backward-compatible out of the box.

Fixed types ([N]byte) now use the Go type name when available (e.g. type MD5 [16]byte → "name": "MD5"), falling back to "fixed_N" for unnamed arrays.

Harden splitTag to error on unclosed delimiters and validate that type-alias targets are non-primitive named types.
v1.6.0: ocf.WithReaderSchemaFunc, better null semantics, numeric overflow guards on decode

New ocf.WithReaderSchemaFunc takes a callback invoked after the OCF header is parsed, so callers can inspect rd.Schema() and rd.Metadata() before choosing a reader schema. Returning nil skips resolution. This unblocks the alias-based schema-evolution pattern for formats like Iceberg manifest lists where the reader schema depends on metadata or writer-schema shape only available post-header. WithReaderSchema's doc clarifies writer/reader terminology and notes the two options are mutually exclusive.

Null branches now always decode to the target's Go zero value, replacing any prior content across all Kinds, matching encoding/json/v2.

Numeric decode paths now range-check against the target Go type instead of silently wrapping: Avro long(2^33) into Go int32 returns an error instead of -2^31; Avro int(-1) into uint32 errors instead of 2^32-1. Brings decode to parity with encode.

Float64 → float32 narrowing errors on overflow-to-Inf (binary decode, JSON decode, encode). NaN and ±Inf pass through unchanged. In-range precision-loss rounding stays silent, matching json/v2's "rounded or clamped" rule.
1.5.0: Decode decimal to *big.Rat, export RatFromBytes

Decode and DecodeJSON now return *big.Rat instead of json.Number when decoding decimal logical types into *any targets, matching the behavior of hamba/avro and linkedin/goavro.

This is a breaking change for code that type-asserts json.Number from decoded decimal values. In practice only redpanda-data/connect relied on this, and it is being updated to use a CustomType callback instead. No further breakage is expected. json.Number is still supported as a typed struct target.

RatFromBytes is now exported to support CustomType Decode callbacks that override decimal handling. Without it, users would need to reimplement two's complement byte decoding. DurationFromBytes serves the same role for the duration logical type and is now documented accordingly.

SchemaNode.Scale and SchemaNode.Precision are now correctly populated when a CustomType matches a primitive decimal schema.

EncodeJSON fixes: json.Number values with decimal points or scientific notation (e.g. "42.0", "1e2") are now correctly accepted for int/long schemas. Float precision overflow error messages now reference the correct type (float32/float64).
v1.4.1: Spec compliance fixes for canonical form.

- aobject.MarshalJSON always emits "fields":[] for record/error and "symbols":[] for enum, per Avro spec Complex Types > Records/Enums. The old struct-tag path dropped them via omitempty, producing JSON that strict readers (Java Avro) reject. Found via Spark reading an iceberg-go unpartitioned-table manifest.
- MarshalJSON now honors PCF [ORDER]: name, type, fields, symbols, items, values, size, then non-PCF attributes.

Fingerprint note: Rabin fingerprint changes only for schemas that were previously emitted without a "fields" key. Those were invalid Avro and no conforming reader accepted them.
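The omitempty problem described in v1.4.1 is easy to reproduce with encoding/json alone. The sketch below uses two simplified, hypothetical struct shapes (with []string standing in for real field definitions) and is not the library's schema type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// omitted mimics a struct-tag path that drops empty "fields".
type omitted struct {
	Name   string   `json:"name"`
	Type   string   `json:"type"`
	Fields []string `json:"fields,omitempty"`
}

// always keeps the "fields" key even when the slice is empty.
type always struct {
	Name   string   `json:"name"`
	Type   string   `json:"type"`
	Fields []string `json:"fields"`
}

// mustJSON marshals v and panics on error, to keep the example short.
func mustJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(mustJSON(omitted{Name: "empty", Type: "record", Fields: []string{}}))
	// {"name":"empty","type":"record"}
	fmt.Println(mustJSON(always{Name: "empty", Type: "record", Fields: []string{}}))
	// {"name":"empty","type":"record","fields":[]}
}
```

Note that without omitempty a nil slice marshals as null rather than [], so a writer that must emit "fields":[] also has to ensure the slice is non-nil or marshal the key by hand.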
v1.4.0: Avro spec fixes, fixed(16) UUID support, atype pkg, ocf schema opts

Improvements from an iceberg-go migration off hamba/avro.

- New github.com/twmb/avro/atype subpackage exporting untyped string constants for Avro primitive types, complex types, logical types, and field sort orders. Catches typos at compile time; groups all spec names under one import; Duration logical type constant doesn't collide with the top-level Duration struct.
- New ocf.WithSchemaOpts ReaderOpt forwarding avro.SchemaOpt values to the file-header Parse call. CustomType registration now works for OCF reads.

A schema that reuses a named type (e.g. a fixed UUID in two fields) used to fail with "duplicate named type". Schema() and SchemaFor now emit the definition once and a name reference thereafter.
- Identical redefinitions: silently deduped.
- Conflicting redefinitions (same name, different content): detected via JSON comparison, error "conflicting definitions for named type".

Per the spec, a union with "null" as the first branch defaults to null.
- Before: encoding a map[string]any with missing keys for nullable fields errored with "missing key".
- After: missing key → null branch.

fixed(16) with logicalType:"uuid" now respects the annotation.
- Decode into any: [16]byte (was raw bytes, logical type ignored).
- Decode into string: formatted hex-dash UUID.
- Encode accepts [16]byte, []byte of the right length, or a formatted UUID string.

typeFieldMapping resolved same-named fields by "first-seen wins". Depth-first traversal visits embedded struct fields before the enclosing struct's direct fields, so a shadowing direct field incorrectly lost to the deeper embedded one.
- Fix: shallower wins regardless of declaration order, matching encoding/json and the library's own godoc.

The library never used a Go 1.26 feature. Test-only new(literal) syntax replaced with a ptr[T](v T) *T helper.

New FuzzSchemaNode and FuzzEncodeMapMissingKeys fuzzers, plus recursive linked-list and 3-level-nested record schemas added to the existing fuzz corpus. Fuzzing uncovered three pre-existing issues, all fixed:
- Cyclic SchemaNode via *SchemaNode Items/Values now errors cleanly instead of stack-overflowing (deduper tracks visited pointers).
- Root().Schema() round-trip now succeeds for schemas with non-canonical key casing like "tYpe". Parse was already lenient via encoding/json's default case-insensitive struct matching; nodeFromJSONObject now uses the same leniency.
- Parse and Root now agree on which duplicate JSON key wins. Both UnmarshalJSON methods normalize through map[string]RawMessage before struct decode so the dedup is consistent.
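The "shallower wins" rule that the v1.4.0 field-mapping fix aligns with can be seen in encoding/json itself. The sketch below uses hypothetical types Base and Event and only demonstrates the standard library's behavior, not the library's typeFieldMapping.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Base is embedded in Event and declares a field with the JSON name "id".
type Base struct {
	ID string `json:"id"`
}

// Event embeds Base first, then declares its own "id" field. The direct
// field is shallower, so it shadows Base.ID despite being declared later.
type Event struct {
	Base
	ID string `json:"id"`
}

// marshalEvent returns the JSON form of an Event with both IDs set.
func marshalEvent() string {
	b, err := json.Marshal(Event{Base: Base{ID: "embedded"}, ID: "direct"})
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(marshalEvent()) // {"id":"direct"}
}
```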
v1.3.4: more ecosystem parity / enhancements / more internal consistency

- EncodeJSON type parity with Encode: json.Number, int→float, []byte→string, TextMarshaler, time.Time for time-millis, RFC3339 for timestamps, date strings, enum by index, *big.Rat/float64 for decimal, tagged union maps.
- Validation: reject fractional floats for int/long, int32/int64 overflow, float precision limits, fixed size mismatch, enum symbol validation.
- DecodeJSON: accept JSON numbers for decimal round-trip, validate enum symbols, reject leading-zero numbers, accept 1e999 as ±Infinity.
- Encode: json.Number uses Int64() for full precision, time.Time for time-millis, nil no longer panics.
- Schema: canonical form no longer emits "name":"" for arrays/maps.
- JSON: named escape sequences (\b, \t, \n, \f, \r) matching encoding/json.
v1.3.3

- Streaming JSON decoder: DecodeJSON rewritten with a single-pass byte scanner, eliminating the json.Unmarshal → Encode → Decode round-trip. Zero allocations for struct targets; 9 allocs for *any targets (down from 64). ~5x faster for structs, ~3x faster for *any.
- Logical type conversions centralized in logical.go; fixed missing .UTC() on unsafe fast path timestamp decode.
- Pre-computed fieldIdx, nameVal, defaultJSON on schema nodes.
- Slab string allocator and zero-copy field name lookups for JSON decode.
- Deleted fromAvroJSON dead code (178 lines).
- 4 new fuzz targets; 99% test coverage.
v1.3.2

- Fix nullable union encoding with nested pointers (#26)
- Encode accepts tagged union maps, enabling Decode(TaggedUnions) → Encode round-trips (#27)
- CustomType nil Decode bypasses built-in logical type handler with zero overhead, documented as stable contract (#27)
- CustomType nil Encode no longer suppresses logical type serializer (#27)