tokenizer

A grammar describes the syntax of a programming language and is often written in Backus-Naur form (BNF). A lexer (also called a tokenizer) performs lexical analysis, turning text into tokens. A parser takes those tokens and builds a data structure such as an abstract syntax tree (AST). The parser is concerned with context: does the sequence of tokens fit the grammar? The front end of a compiler combines a lexer and a parser built for a specific grammar.
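
The split between lexer and parser described above can be illustrated with a minimal sketch. It assumes a small, hypothetical arithmetic grammar that is not taken from any repository listed here; the names `tokenize`, `parse`, and `TOKEN_RE` are illustrative only.

```python
import re

# Assumed toy grammar (hypothetical, for illustration only):
#   expr   ::= term   (("+" | "-") term)*
#   term   ::= factor (("*" | "/") factor)*
#   factor ::= NUMBER | "(" expr ")"

TOKEN_RE = re.compile(r"(?P<NUMBER>\d+)|(?P<OP>[-+*/()])|(?P<SKIP>\s+)")


def tokenize(text):
    """Lexer: turn text into a list of (kind, value) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":  # whitespace produces no token
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def parse(tokens):
    """Parser: check the tokens against the grammar and build a tuple-based AST."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():
        node = term()
        while peek() in (("OP", "+"), ("OP", "-")):
            op = advance()[1]
            node = (op, node, term())
        return node

    def term():
        node = factor()
        while peek() in (("OP", "*"), ("OP", "/")):
            op = advance()[1]
            node = (op, node, factor())
        return node

    def factor():
        kind, value = advance()
        if kind == "NUMBER":
            return int(value)
        if (kind, value) == ("OP", "("):
            node = expr()
            if advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError(f"unexpected token {peek()[1]!r}")
    return tree


ast = parse(tokenize("1 + 2 * 3"))
```

In this sketch the lexer only decides what each piece of text is (a number or an operator), while the parser decides whether the sequence is valid: `tokenize("1 + + 2")` succeeds, but passing its result to `parse` raises `SyntaxError`.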

Here are 88 public repositories matching this topic...

risesoft-y9 / Data-Labeling

smoothnlp / SmoothNLP

jflex-de / jflex

CogComp / cogcomp-nlp

joliciel-informatique / talismane

youthlin / SNL-Compiler

uds-se / lFuzzer

CameraForensics / elasticsearch-plugins

niesfisch / tokenreplacer

melchisedech333 / antlr4-experiments

candowu / jieba-lucene-analiysis

Maha-J-Althobaiti / AraNLP

tuan-nng / solr-vn-tokenizer

zhangsoledad / solr-ik

abzif / opennlp-model-generator

brilacasck / full-compiler

vinhkhuc / Twitter-Tokenizer

audunhalland / parceq

edydfang / UW-Madison-CS536

KarsDev / Clarity
