This not finished and mainly for learning purposes. Probably FlexBuffers from Google do similar thing.
A JSON-compatible format that is mmap friendly.
This structure can be dump into a volume and use it as-is, without a necessity of loading it all to memory. The particular use case are large JSON that are pickled. Since MMJSON is mmap-friendly, in can be pickled with out-of-band mechanism.
- Use int in serialization
- Store all the strings at the very beginning and then only use as refs.
- Implement proper map with binary search of keys
-
Add StringRef type that would refer to a string in a pre-filled string table StringRefTable, to spare duplicated strings. Such a thing should be a part of some Meta block probably.
-
Make dict more efficient by using a BTree. Alternatively, have a list of sorted keys, bisect on keys, then have a list of payloads. The keys could be StringRef.