- Budapest, Hungary
- https://gyorgy.orosz.link
- in/oroszgy
NLP tools
Hierarchy-Aware Global Model for Hierarchical Text Classification
Code for Analyzing Redundancy in Pretrained Transformer Models accepted at EMNLP 2020
Fast Python port of arc90's readability tool, updated to match the latest readability.js!
This repo contains a set of neural transducers, e.g. sequence-to-sequence models, focusing on character-level tasks.
Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
A sklearn wrapper for Google's BERT model
ChatRWKV is like ChatGPT but powered by the RWKV (100% RNN) language model, and open source.
multi_task_NLP is a utility toolkit enabling NLP developers to easily train and infer a single model for multiple tasks.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
Port of OpenAI's Whisper model in C/C++
Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input"
The BiLSTM-CRF model implementation in TensorFlow, for sequence labeling tasks.
NER with BERT-BiLSTM+CRF (pytorch_lightning version)
A library to synthesize text datasets using Large Language Models (LLM)
TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/
Official Implementation of Web-based Visual Corpus Builder (Webvicob), ICDAR 2023
🦙 Integrating LLMs into structured NLP pipelines
Language model fine-tuning on NER with an easy interface and cross-domain evaluation. "T-NER: An All-Round Python Library for Transformer-based Named Entity Recognition, EACL 2021"
Multi-task model for named-entity recognition, relation extraction, entity mention detection and coreference resolution.
PyTorch code for SpERT: Span-based Entity and Relation Transformer
A blazingly fast and lightweight language detection library for Rust
QLoRA: Efficient Finetuning of Quantized LLMs
https://arxiv.org/pdf/1909.04054
AdaSeq: An All-in-One Library for Developing State-of-the-Art Sequence Understanding Models
Probabilistically split concatenated words using NLP based on English Wikipedia unigram frequencies.