Vietnamese tokenizer (Maximum Matching and CRF)
Updated Mar 1, 2017 · Python
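Maximum matching is the simpler of the two approaches named above: greedily take the longest dictionary word starting at the current position. A minimal sketch, assuming a toy dictionary of multi-syllable words and a small maximum word length (both illustrative, not the project's actual data):

```python
def max_match(text, dictionary, max_len=4):
    """Greedy longest-match-first segmentation over whitespace-split syllables."""
    syllables = text.split()
    tokens = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate window first, shrinking until a match
        # is found; a single syllable always matches as a fallback.
        for j in range(min(len(syllables), i + max_len), i, -1):
            candidate = " ".join(syllables[i:j])
            if j == i + 1 or candidate in dictionary:
                tokens.append(candidate)
                i = j
                break
    return tokens

# Toy dictionary of Vietnamese compound words (illustrative only).
words = {"học sinh", "sinh học"}
print(max_match("học sinh học sinh học", words))
# → ['học sinh', 'học sinh', 'học']
```

Greedy matching is fast but can segment ambiguous sequences wrongly, which is why the project pairs it with a CRF model.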
A grammar describes the syntax of a programming language and might be defined in Backus-Naur form (BNF). A lexer performs lexical analysis, turning text into tokens. A parser takes tokens and builds a data structure such as an abstract syntax tree (AST); the parser is concerned with context: does the sequence of tokens fit the grammar? A compiler goes further, translating source code into another language; its front end typically combines a lexer and a parser built for a specific grammar.
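The lexer-then-parser pipeline can be sketched for a toy grammar `expr := NUM ('+' NUM)*` (the token names and grammar are illustrative):

```python
import re

# Lexer: turn text into (kind, value) tokens.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|(.))")

def lex(text):
    tokens = []
    for number, op in TOKEN_RE.findall(text):
        if number:
            tokens.append(("NUM", int(number)))
        elif op.strip():
            tokens.append(("OP", op))
    return tokens

# Parser: recursive descent for  expr := NUM ('+' NUM)*,
# building a left-leaning AST of nested tuples.
def parse(tokens):
    def num(i):
        kind, value = tokens[i]
        assert kind == "NUM", f"expected a number at token {i}"
        return value, i + 1
    node, i = num(0)
    while i < len(tokens) and tokens[i] == ("OP", "+"):
        rhs, i = num(i + 1)
        node = ("+", node, rhs)
    return node

print(lex("1 + 2 + 3"))         # → [('NUM', 1), ('OP', '+'), ('NUM', 2), ('OP', '+'), ('NUM', 3)]
print(parse(lex("1 + 2 + 3")))  # → ('+', ('+', 1, 2), 3)
```

The lexer knows nothing about structure; the parser knows nothing about characters. That separation is what lets the same token stream feed different grammars.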
Benchmark for tokenizers: a utility to compare the performance of different tokenizers across different datasets.
This Python project for a data structures and algorithms class converts infix to postfix expressions, evaluates postfix expressions, and computes infix expressions without using `eval`. It supports complex tokenization, custom exception handling, and avoids built-in stack classes.
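One standard way to do this conversion (not necessarily this project's exact method) is Dijkstra's shunting-yard algorithm; postfix evaluation then needs only a plain list as a stack:

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}
PREC = {"+": 1, "-": 1, "*": 2, "/": 2}

def infix_to_postfix(tokens):
    """Shunting-yard: operators wait on a stack until precedence lets them out."""
    output, stack = [], []
    for tok in tokens:
        if tok.isdigit():
            output.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()  # drop the matching "("
        else:
            # Pop operators of higher or equal precedence (left-associative).
            while stack and stack[-1] != "(" and PREC[stack[-1]] >= PREC[tok]:
                output.append(stack.pop())
            stack.append(tok)
    while stack:
        output.append(stack.pop())
    return output

def eval_postfix(tokens):
    stack = []
    for tok in tokens:
        if tok.isdigit():
            stack.append(int(tok))
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(OPS[tok](a, b))
    return stack[0]

print(infix_to_postfix("3 + 4 * 2".split()))    # → ['3', '4', '2', '*', '+']
print(eval_postfix(["3", "4", "2", "*", "+"]))  # → 11
```

Chaining the two functions evaluates an infix expression without `eval`, which is the combination the project describes.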
Byte Pair Encoding tokenizer supporting Arabic text with full diacritical marks (تشكيل). Train, save, and deploy custom tokenizers.
Tokenizer for Bodo language
Chatbot is an application with a graphical user interface that uses various natural language processing (NLP) techniques to tokenize, stem, find stop words, and apply regular expressions to user-input text. The interface is built using Tkinter.
Data-driven text adventure game in Python using small generative text models
Impact of morphologically aware tokenization on Polish LLM performance (Master's thesis)
Tokker is a fast local-first CLI tool for tokenizing text with all the best models in one place.
In this project I used TensorFlow 2 for natural language processing, specifically predicting labels from tweets. I used Kaggle's free GPUs and datasets in this competition, and a tokenizer to prepare the text data for the TensorFlow model.
🍺 Python implementation of vgram tokenization
A Mediocre JSON parser
TF-IDF Calculation
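The computation behind that name can be sketched in a few lines of pure Python, here using relative term frequency and a log(N/df) inverse document frequency (one of several common weighting variants; the corpus is illustrative):

```python
import math
from collections import Counter

def tf_idf(docs):
    """Score each term in each document: tf(t, d) * log(N / df(t))."""
    n = len(docs)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n / df[term]) for term in df}
    # Relative term frequency times IDF, per document.
    return [{term: count / len(doc) * idf[term]
             for term, count in Counter(doc).items()}
            for doc in docs]

docs = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]
scores = tf_idf(docs)
# "the" occurs in every document, so its IDF (and hence its score) is 0.
```

Terms that appear everywhere are down-weighted to zero, while terms unique to one document get the highest weight; that is the whole point of the IDF factor.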
A personal project where I'm experimenting with building a basic Transformer-based language model from scratch.
A pure Python implementation of Byte Pair Encoding (BPE) tokenizer. Train on any text, encode/decode with saved models, and explore BPE tokenization fundamentals.
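The training loop at the heart of BPE repeatedly merges the corpus's most frequent adjacent symbol pair. A minimal sketch, assuming words arrive with precomputed frequencies (the tiny corpus here is illustrative, not this repo's API):

```python
from collections import Counter

def merge_pair(symbols, pair, merged):
    """Replace every adjacent occurrence of `pair` in `symbols` with `merged`."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(merged)
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out

def train_bpe(word_freqs, num_merges):
    """Learn BPE merge rules; each word is a tuple of symbols (initially chars)."""
    vocab = {tuple(word): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        merged = best[0] + best[1]
        vocab = {tuple(merge_pair(symbols, best, merged)): freq
                 for symbols, freq in vocab.items()}
    return merges

merges = train_bpe({"lower": 2, "lowest": 3, "newer": 1}, 3)
print(merges[0])  # → ('w', 'e') — the most frequent pair in this toy corpus
```

Encoding new text then replays the learned merges in order; saving a model amounts to persisting that merge list.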
Applying NLP techniques to analyze the corpus acquired from Wikitext.