Collate is a simple collation library for Go that compares strings in various languages. It was designed for the BuntDB project and is similar to the collation found in traditional database systems.
The idea is that you call a function with a collation name and it generates a Less(a, b string) bool function that can be used for sorting with the sort package or with B-Tree style databases.
go get -u github.com/tidwall/collate
// Create a case-insensitive collation for Spanish.
less := collate.IndexString("SPANISH_CI")
println(less("Hola", "hola"))
println(less("hola", "Hola"))
// Output:
// false
// false
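Because the generated function has the plain signature func(a, b string) bool, it can be passed to the standard sort package. The following is a minimal sketch of that wiring. To keep it runnable without the dependency, it uses foldLess, a hypothetical stand-in that only lowercases before comparing and is not locale-aware; sortWith is likewise a helper invented for this example. In real use you would presumably pass the function returned by collate.IndexString instead.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sortWith sorts names in place using any less function of the form
// func(a, b string) bool. It is a helper for this sketch, not part of collate.
func sortWith(names []string, less func(a, b string) bool) {
	// SliceStable keeps the original order of entries that compare as equal.
	sort.SliceStable(names, func(i, j int) bool {
		return less(names[i], names[j])
	})
}

// foldLess is a hypothetical stand-in for a case-insensitive collation.
// It lowercases both sides and compares bytes; it is not locale-aware.
func foldLess(a, b string) bool {
	return strings.ToLower(a) < strings.ToLower(b)
}

func main() {
	names := []string{"hola", "Adios", "Hola", "adios"}
	sortWith(names, foldLess)
	fmt.Println(names) // [Adios adios hola Hola]
}
```

A stable sort is used here so that strings the less function treats as equal, such as "hola" and "Hola", keep their input order.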
Add _CI to the collation name to specify case-insensitive comparison. Add _CS for case-sensitive comparison; this is the default.
collate.IndexString("SPANISH_CI") // Case-insensitive collation for Spanish
collate.IndexString("SPANISH_CS") // Case-sensitive collation for Spanish
Add _LOOSE to ignore diacritics, case, and weight. Add _NUM to specify that numbers should sort numerically ("2" < "12").
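To see why numeric ordering matters, it may help to compare it with plain string comparison, where "12" sorts before "2" because the byte '1' is smaller than '2'. The sketch below illustrates only that difference. numericLess is a hypothetical helper written for this example, handles only strings that are whole integers, and is not how the library implements _NUM.

```go
package main

import (
	"fmt"
	"strconv"
)

// numericLess is a hypothetical helper, not part of the collate API.
// When both strings parse as integers it compares their values;
// otherwise it falls back to plain byte-wise string comparison.
func numericLess(a, b string) bool {
	x, errA := strconv.Atoi(a)
	y, errB := strconv.Atoi(b)
	if errA == nil && errB == nil {
		return x < y
	}
	return a < b
}

func main() {
	fmt.Println("2" < "12")             // false: byte-wise, '2' > '1'
	fmt.Println(numericLess("2", "12")) // true: 2 < 12 as numbers
}
```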
You can also compare fields in JSON documents using the IndexJSON function. GJSON is used under the hood.
var jsonA = `{"name":{"last":"Miller"}}`
var jsonB = `{"name":{"last":"anderson"}}`
less := collate.IndexJSON("ENGLISH_CI", "name.last")
println(less(jsonA, jsonB))
println(less(jsonB, jsonA))
// Output:
// false
// true
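Conceptually, comparing by a JSON field means extracting the value at the given path from each document and then comparing those values. The sketch below shows that idea using only the standard library. It is an assumption-laden illustration, not the library's implementation: lastName and lastNameLess are hypothetical helpers, the path name.last is hard-coded through struct tags rather than resolved by GJSON, and lowercasing stands in for a real case-insensitive collation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// lastName extracts name.last from a JSON document. It is a hypothetical
// helper for this sketch; missing or malformed input yields "".
func lastName(doc string) string {
	var v struct {
		Name struct {
			Last string `json:"last"`
		} `json:"name"`
	}
	if err := json.Unmarshal([]byte(doc), &v); err != nil {
		return ""
	}
	return v.Name.Last
}

// lastNameLess compares two documents by name.last, ignoring case through
// simple lowercasing (a stand-in for a locale-aware collation).
func lastNameLess(a, b string) bool {
	return strings.ToLower(lastName(a)) < strings.ToLower(lastName(b))
}

func main() {
	jsonA := `{"name":{"last":"Miller"}}`
	jsonB := `{"name":{"last":"anderson"}}`
	fmt.Println(lastNameLess(jsonA, jsonB)) // false
	fmt.Println(lastNameLess(jsonB, jsonA)) // true
}
```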
Afrikaans Albanian AmericanEnglish Amharic Arabic Armenian Azerbaijani Bengali BrazilianPortuguese BritishEnglish Bulgarian Burmese CanadianFrench Catalan Chinese Croatian Czech Danish Dutch English Estonian EuropeanPortuguese EuropeanSpanish Filipino Finnish French Georgian German Greek Gujarati Hebrew Hindi Hungarian Icelandic Indonesian Italian Japanese Kannada Kazakh Khmer Kirghiz Korean Lao LatinAmericanSpanish Latvian Lithuanian Macedonian Malay Malayalam Marathi ModernStandardArabic Mongolian Nepali Norwegian Persian Polish Portuguese Punjabi Romanian Russian Serbian SerbianLatin SimplifiedChinese Sinhala Slovak Slovenian Spanish Swahili Swedish Tamil Telugu Thai TraditionalChinese Turkish Ukrainian Urdu Uzbek Vietnamese Zulu
Josh Baker @tidwall
Collate source code is available under the MIT License.