Skip to content

karan/language.dart

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

25 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

This project is now deprecated. If you would like to complete it, feel free to send PR's.

language

General natural language processing utilities for Dart. It provides a simple API for getting started with natural language processing (NLP), Artificial Intelligence (AI) and Natural Language Generation (NLG) tasks.

This package will initially support English. In future, it may support other major languages like Spanish, Russian, Chinese (maybe).

Features Overview

  • Tokenization
  • String distance
  • n-grams
  • Markov chain
  • Classifiers
  • Phonetics
  • Language identification
  • Summarization
  • Part-of-speech tagging (POS)
  • Sentiment Analysis
  • TF-IDF
  • Words Inflection and Lemmatization

Tokenization

Space Tokenizer

SpaceTokenizer tokenizer = new SpaceTokenizer();
tokenizer.tokenize('brown fox jumps');
===> ['brown', 'fox', 'jumps']

tokenizer.tokenize('Stand   on   your   head!');
===> ['Stand', '', '', 'on', '', '', 'your', '', '', 'head!']

Tab Tokenizer

TabTokenizer tokenizer = new TabTokenizer();
tokenizer.tokenize('brown\tfox\tjumps');
===> ['brown', 'fox', 'jumps']

Regexp Tokenizer

Word Tokenizer

Word-Punctuation Tokenizer

Treebank Tokenizer

String distance

Jaro–Winkler algorithm

Levenshtein algorithm

Dice's Coefficient

n-grams

Markov chain

http://blog.codinghorror.com/markov-and-you/

Classifiers

Naive Bates

Logistic regression

Phonetics

SoundEx

Metaphone

Double Metaphone

Language identification

Summarization

Part-of-speech tagging (POS)

TnT (?)

Sentiment Analysis

TF-IDF

Words Inflection and Lemmatization

Noun inflection

Number inflection

Present verb inflector

Testing

$ chmod u+x tool/run_tests.sh
$ ./tool/run_tests.sh

About

🌺 Natural language processing utilities for Dart

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors