Mastering NLP step by step
Updated Aug 11, 2019 - Jupyter Notebook
A grammar describes the syntax of a programming language and is often written in Backus-Naur form (BNF). A lexer performs lexical analysis, turning text into tokens. A parser takes those tokens and builds a data structure such as an abstract syntax tree (AST). The parser is concerned with context: does the sequence of tokens fit the grammar? A compiler front end combines a lexer and a parser built for a specific grammar; a full compiler adds later stages such as semantic analysis and code generation.
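The lexer/parser split described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the repository: the arithmetic grammar, the `lex` function, and the `Parser` class are all hypothetical names chosen for the example, and the AST is represented as nested tuples.

```python
import re

# Hypothetical example grammar (in BNF-like notation):
#   expr   ::= term (("+" | "-") term)*
#   term   ::= factor (("*" | "/") factor)*
#   factor ::= NUMBER | "(" expr ")"

def lex(text):
    """Lexical analysis: turn raw text into a list of (kind, value) tokens."""
    tokens = []
    for match in re.finditer(r"\d+|\S", text):
        piece = match.group()
        if piece.isdigit():
            tokens.append(("NUMBER", int(piece)))
        elif piece in "+-*/()":
            tokens.append(("OP", piece))
        else:
            raise SyntaxError(f"unexpected character {piece!r}")
    return tokens

class Parser:
    """Recursive-descent parser: checks tokens against the grammar, builds an AST."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("EOF", None)

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse(self):
        tree = self.expr()
        if self.peek()[0] != "EOF":
            raise SyntaxError(f"unexpected token {self.peek()[1]!r}")
        return tree

    def expr(self):
        node = self.term()
        while self.peek() in (("OP", "+"), ("OP", "-")):
            op = self.advance()[1]
            node = (op, node, self.term())  # left-associative
        return node

    def term(self):
        node = self.factor()
        while self.peek() in (("OP", "*"), ("OP", "/")):
            op = self.advance()[1]
            node = (op, node, self.factor())
        return node

    def factor(self):
        kind, value = self.advance()
        if kind == "NUMBER":
            return value
        if (kind, value) == ("OP", "("):
            node = self.expr()
            if self.advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

ast = Parser(lex("1 + 2 * 3")).parse()
```

Here `lex` knows nothing about the grammar beyond which characters are legal, while `Parser` rejects token sequences that do not fit it (for example `"1 +"` raises `SyntaxError`).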
Extract text content from an HTML page, process it, and extract the unique words from the processed text. This notebook uses several text processing techniques: cleaning, normalization, tokenization, lemmatization or stemming, and stop-word removal.
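A pipeline of this shape might look roughly like the sketch below. It uses only the Python standard library and is not taken from the notebook: the tiny `STOP_WORDS` set, the `TextExtractor` class, and the crude `naive_stem` function are hypothetical stand-ins for the fuller stop-word list and the real stemmer or lemmatizer a notebook would likely use.

```python
import re
from html.parser import HTMLParser

# Tiny illustrative stop-word list (assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "are", "is", "in", "it", "of", "on", "the", "to"}

class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)

def naive_stem(word):
    """Crude suffix stripping, standing in for a real stemmer or lemmatizer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def unique_words(html):
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    text = " ".join(extractor.parts).lower()            # normalization
    tokens = re.findall(r"[a-z]+", text)                 # cleaning + tokenization
    kept = [t for t in tokens if t not in STOP_WORDS]    # stop-word removal
    return sorted({naive_stem(t) for t in kept})         # stemming + deduplication

words = unique_words("<p>The cats and the cat</p><script>var hidden = 1;</script>")
```

The order of the steps matters: stop words are removed before stemming so that the stop-word list can hold ordinary word forms rather than stems.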
This repository contains a collection of exploratory notebooks written in pure Python and in plain language that humans can read. It tries to compile all the lectures from Andrej Karpathy's 💎 playlist on neural networks, which ends with building GPT.
This Jupyter Notebook implements a tool to check whether two sentences are paraphrases by analyzing their semantic similarity using NLP techniques. It provides a similarity score and a binary decision to indicate if the sentences are paraphrases.
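The "similarity score plus binary decision" interface can be sketched as follows. Note the swapped-in technique: this example uses bag-of-words cosine similarity, which is purely lexical, as a simple stand-in for the semantic similarity model the notebook describes. The function names and the `threshold=0.7` default are assumptions made for illustration.

```python
import math
import re
from collections import Counter

def bag_of_words(sentence):
    """Lowercase the sentence and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9']+", sentence.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def is_paraphrase(s1, s2, threshold=0.7):
    """Return (similarity score, binary decision); the threshold is a hypothetical default."""
    score = cosine_similarity(bag_of_words(s1), bag_of_words(s2))
    return score, score >= threshold

score, decision = is_paraphrase("The cat sat on the mat", "On the mat the cat sat")
```

Because word counts ignore word order and meaning, this sketch treats reordered sentences as paraphrases and misses paraphrases that use different vocabulary; a sentence-embedding model would replace `bag_of_words` to address that.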
Modular pipeline for building RAG and LLM workflows in Colab, including tokenizer/chunk/embed notebooks and CoreML model exporters for iOS. Part of the NoesisNoema project.
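As a rough illustration of the tokenize and chunk stages of such a pipeline, the sketch below splits text on whitespace and groups the tokens into overlapping windows. It is not the project's code: `chunk_tokens`, `chunk_size`, and `overlap` are hypothetical names, and a real pipeline would likely use a model-specific tokenizer before embedding each chunk.

```python
def chunk_tokens(tokens, chunk_size=200, overlap=50):
    """Split a token list into windows of chunk_size that overlap by `overlap` tokens."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the input
        start += step
    return chunks

# Whitespace splitting is a placeholder for a real tokenizer.
tokens = "one two three four five six seven eight nine ten".split()
chunks = chunk_tokens(tokens, chunk_size=4, overlap=2)
```

The overlap keeps text that straddles a chunk boundary available in both neighboring chunks, at the cost of embedding some tokens twice.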