tokenizer

A grammar describes the syntax of a programming language and is often written in Backus-Naur form (BNF). A lexer performs lexical analysis, turning text into tokens. A parser takes those tokens and builds a data structure such as an abstract syntax tree (AST). The parser is concerned with context: does the sequence of tokens fit the grammar? The front end of a compiler typically combines a lexer and a parser built for a specific grammar; the compiler as a whole goes on to analyze and translate the resulting tree.
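As a rough illustration of the lexing step described above, here is a minimal sketch of a regex-based tokenizer in Python. The token kinds (`NUMBER`, `IDENT`, `OP`), the `Token` type, and the `tokenize` function are hypothetical names chosen for this example and are not taken from any repository listed under this topic.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str  # token category, e.g. "NUMBER"
    text: str  # the exact source text that matched


# Hypothetical token kinds for a tiny arithmetic language (an assumption
# made for this sketch). Order matters: earlier patterns win.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),   # integers and simple decimals
    ("IDENT", r"[A-Za-z_]\w*"),     # identifiers
    ("OP", r"[+\-*/=()]"),          # single-character operators and parens
    ("SKIP", r"\s+"),               # whitespace, discarded
    ("MISMATCH", r"."),             # any other character is an error
]

# Combine all patterns into one regex with a named group per token kind.
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(source: str) -> Iterator[Token]:
    """Yield tokens from source, skipping whitespace."""
    for match in MASTER.finditer(source):
        kind = match.lastgroup  # name of the group that matched
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(
                f"unexpected character {match.group()!r} "
                f"at offset {match.start()}"
            )
        yield Token(kind, match.group())
```

Calling `list(tokenize("x = 3.5 * (y + 2)"))` produces a flat sequence of `Token` values. Deciding whether that sequence fits a grammar is the job of a parser, which this sketch deliberately leaves out.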

Here are 398 public repositories matching this topic...

DethRaid / Roy_VnTokenizer

iFanpSGTS / YAL

Code-Yay-Mal / mon_tokenizer

art-test-stack / tokenizer

ChandradithyaJ / InterpretingRust

KonstantinosBarmpas / NeuroRVQ

sooonas / SocialTextTokenizer

MatinHosseinianFard / TesLang-compiler

Systemcluster / tokenizer-bench

Jimil1407 / research_bot

paunovicbojana / infix-postfix-calculator

shivendrra / biosaic

AhmedDawoud3 / Tokenizer

DolbyUUU / byte_pair_encoding_BPE_subword_tokenization_implementation_python

DzmitryPihulski / Encoder-transformer-from-scratch

Mike014 / Chatbot_App

kgruiz / PyTokenCounter

shivlloyd / custom-english-tokenizer

igoakulov / tokker

Japiahh / Chakaria-Tokenizer