This is an Elixir dictionary library that provides information about words, often referred to as "headwords" in linguistics.
This library compiles dictionary data into code for more accurate results. Some libraries compute these results algorithmically, but so far their output is not as good as dictionary data. That may change in the future, and hopefully we'll switch to a better approach then.
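
To illustrate what compiling dictionary data into code can look like in Elixir, here is a minimal sketch. It is not the library's actual implementation: the module name `CompiledLookupSketch`, the inline `@entries` list, and the `nil` fallback for unknown words are all assumptions made for the example. The idea is that each dictionary entry becomes its own function clause at compile time, so a lookup is plain pattern matching.

```elixir
defmodule CompiledLookupSketch do
  # Hypothetical inline data. A real library would more likely read its
  # dictionary files at compile time instead of listing entries here.
  @entries [
    {"mix", ["M", "IH1", "K", "S"]},
    {"word", ["W", "ER1", "D"]}
  ]

  # Generate one function clause per entry while the module compiles.
  for {word, phones} <- @entries do
    def arpabet(unquote(word)), do: unquote(phones)
  end

  # Assumed fallback for words missing from the data.
  def arpabet(_unknown), do: nil
end

IO.inspect(CompiledLookupSketch.arpabet("mix"))
```

Because the data lives in generated clauses, nothing has to be loaded or parsed at runtime. The trade-off is a longer compile and a larger compiled module.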
iex> WordInfo.frequency("word")
340

A result of 340 means this word ranks 340th by usage frequency among the roughly 33,000 ranked words. Quite popular!

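To put that rank in perspective, the sketch below converts a rank into a share of the ranked vocabulary. `RankSketch.top_percent/2` is a hypothetical helper, not part of WordInfo, and the total of 33,000 is the approximate figure quoted above rather than an exact count.

```elixir
defmodule RankSketch do
  # Hypothetical helper, not part of WordInfo: returns the percentage of
  # the ranked vocabulary that sits at or above the given rank.
  def top_percent(rank, total)
      when is_integer(rank) and is_integer(total) and rank > 0 and total >= rank do
    Float.round(rank / total * 100, 2)
  end
end

# Assumes about 33,000 ranked words in total.
IO.inspect(RankSketch.top_percent(340, 33_000))
# prints 1.03
```

So a rank of 340 places the word in roughly the top 1% of the ranked words.
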
iex> WordInfo.arpabet("mix")
["M", "IH1", "K", "S"]

iex> WordInfo.ipa("existing")
["ɪgˈzɪstɪŋ"]

iex> WordInfo.syllables("syllable")
["syl", "la", "ble"]

Please refer to the online documentation for more information.

Here are the data sources of this library:
- syllables - 43,000 words from Gary Darby's DFF project
- IPA style pronunciation - 125,000 word pronunciations from the cmudict-ipa project
- ARPABET style pronunciation - 130,000 word pronunciations from CMU Dict
- frequency - usage frequency ranking of 33,000+ words from the Brown Corpus of American English and cmudict-ipa

Without this open data, this library would not be possible.