naoa

Organizations

@groonga @mroonga @ipnexus @cleanhearing @patentfield

Training LLMs with QLoRA + FSDP

Jupyter Notebook 1,539 201 Updated Nov 9, 2024

Build LLM-powered applications in Ruby

Ruby 1,974 260 Updated Mar 8, 2026

Language-Agnostic SEntence Representations

Jupyter Notebook 3,662 461 Updated May 2, 2024

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 22,067 2,695 Updated Jan 23, 2026

Incremental Skip-gram Model with Negative Sampling

Shell 69 8 Updated Jun 30, 2019

Word2Vec naïve version from scratch vs Word2Vec parallelized version.

Jupyter Notebook 1 Updated Aug 4, 2022

Package for evaluating word embeddings

Python 441 109 Updated Jan 4, 2021

RiverText is a framework that standardizes the Incremental Word Embeddings proposed in the state-of-art. Please feel welcome to open an issue in case you have any questions or a pull request if you…

Python 23 1 Updated Feb 26, 2025

Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

C++ 14,190 1,221 Updated Oct 29, 2025

🍇 GRAPE is a Rust/Python Graph Representation Learning library for Predictions and Evaluations

Jupyter Notebook 623 39 Updated Feb 24, 2024

A collection of ORM-style clients to public patent data

Python 125 46 Updated Jan 6, 2026

Painterro - JavaScript painting plugin

JavaScript 657 86 Updated Mar 20, 2026

🔥 Use pre-trained models in PyTorch to extract vector embeddings for any image

Python 621 98 Updated May 13, 2025

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…

Python 36,559 5,142 Updated Mar 23, 2026

Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

Python 3,457 262 Updated Oct 18, 2024

Header-only C++/python library for fast approximate nearest neighbors

C++ 5,142 801 Updated Mar 25, 2026

Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.

C++ 3,575 463 Updated Jan 12, 2026

FAst Lookups of Cosine and Other Nearest Neighbors (based on fast locality-sensitive hashing)

C 1,161 194 Updated Jun 1, 2024

Hash function quality and speed tests

C++ 2,134 190 Updated Dec 2, 2025

SIMD (SSE) population count --- http://0x80.pl/articles/sse-popcount.html

C++ 355 51 Updated Apr 1, 2024

Javascript Canvas Library, SVG-to-Canvas (& canvas-to-SVG) Parser

TypeScript 31,034 3,629 Updated Mar 26, 2026

Zest is a compression-based text classifier using Meta's Zstandard compression algorithm. Zest is language-agnostic and this approach simplifies configuration, avoids careful feature extraction and…

Python 6 Updated Jan 15, 2022

Datasets, SOTA results of every field of Chinese NLP

HTML 1,812 264 Updated Apr 7, 2022

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

Python 3,647 534 Updated Oct 16, 2024

PyTorch version of BERT-whitening

Python 307 43 Updated Oct 9, 2021

PISA: Performant Indexes and Search for Academia

C++ 1,046 72 Updated Mar 22, 2026

BERT models for Japanese text.

Python 544 55 Updated Mar 23, 2024

PyTorch code for SpERT: Span-based Entity and Relation Transformer

Python 713 149 Updated Feb 1, 2024