Stars
Simple, Pythonic text processing: sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.
Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
A wrapper of LLMs that biases its behaviour using prompts and contexts in a transparent manner to the end-users
Hipster4j is a lightweight and powerful heuristic search library for Java and Android. It contains common, fully customizable algorithms such as Dijkstra, A* (A-Star), DFS, BFS, Bellman-Ford and more.
A framework to learn cross-lingual word embedding mappings
IXA pipes Part of Speech tagger and Lemmatizer (http://ixa2.si.ehu.es/ixa-pipes)
Resources for the propor2022 and cc_net corpus and models
BigBWA is a new tool that uses the Big Data technology Hadoop to boost the performance of the Burrows–Wheeler aligner (BWA).
Comparing BERT-like models in word embedding contextualization.
Python wrapper for Linguakit's Perl implementation
SparkBWA is a new tool that exploits the capabilities of Apache Spark, a Big Data technology, to boost the performance of one of the most widely adopted sequence aligners, the Burrows-Wheeler Align…
Generator of Markov chains. Demo at @MarkovUnchained.
Files and other materials regarding linguistic variation and manuscript transmission
DBpedia Spotlight is a tool for automatically annotating mentions of DBpedia resources in text.
Tool for observing how the meaning of words changes over time. More information is available at: http://tec.citius.usc.es/explorador-diacronico/ or at: http://explorador-diacr…
Efficient Execution of Perl Scripts on Hadoop Clusters