🌏
Navigating the (Data) World

Organizations

@VACLab @sidataplus

Starred repositories

Kosmos: An AI Scientist for Autonomous Discovery - An implementation and adaptation to be driven by Claude Code or API - Based on the Kosmos AI Paper - https://arxiv.org/abs/2511.02824

Python 328 65 Updated Dec 12, 2025

Data validation toolkit for assessing and monitoring data quality.

Python 311 22 Updated Dec 23, 2025

📊 Cube Core is an open-source semantic layer and LookML alternative for AI, BI and embedded analytics

Rust 19,225 1,931 Updated Dec 22, 2025

SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, xDC replica…

Go 29,030 2,624 Updated Dec 23, 2025

Transformers for Clinical NLP

Python 26 18 Updated Aug 6, 2025

Declarative text based tool for data analysts and engineers to extract, load, transform and orchestrate their data pipelines.

Scala 177 26 Updated Dec 21, 2025

An extensible, state of the art columnar file format. Formerly at @spiraldb, now an Incubation Stage project at LFAI&Data, part of the Linux Foundation.

Rust 2,395 107 Updated Dec 23, 2025

AIStore: scalable storage for AI applications

Go 1,708 231 Updated Dec 23, 2025

versity s3 gateway

Go 900 133 Updated Dec 22, 2025

Ask the oracle when you're stuck. Invoke GPT-5 Pro with a custom context and files.

TypeScript 526 42 Updated Dec 23, 2025

Python Implementation of MUVERA (Multi-Vector Retrieval via Fixed Dimensional Encodings)

Python 380 22 Updated Dec 10, 2025

RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.

Python 5,374 541 Updated Dec 23, 2025

Give your CLI an extra life

TypeScript 370 15 Updated Dec 20, 2025

Send files and folders anywhere in the world without storing in cloud - any size, any format, no accounts, no restrictions.

TypeScript 4,501 240 Updated Dec 8, 2025

Language modeling with linear-cost context

Jupyter Notebook 114 13 Updated Sep 25, 2025

Open source DocuSign alternative. Create, fill, and sign digital documents ✍️

Ruby 11,042 896 Updated Dec 22, 2025

The Open Source DocuSign Alternative.

TypeScript 12,086 2,200 Updated Dec 23, 2025

Building a modern alternative to Salesforce, powered by the community.

TypeScript 37,644 4,788 Updated Dec 23, 2025

Toolkit for linearizing PDFs for LLM datasets/training

Python 16,342 1,260 Updated Dec 20, 2025

Rapid fuzzy string matching in Python using various string metrics

Python 3,611 146 Updated Dec 15, 2025

🐎 A fast implementation of the Aho-Corasick algorithm using the compact double-array data structure. (Python wrapper for daachorse)

Rust 20 1 Updated Mar 15, 2025

🐎 A fast implementation of the Aho-Corasick algorithm using the compact double-array data structure in Rust.

Rust 237 21 Updated Jun 11, 2025

Python tool for converting files and office documents to Markdown.

Python 84,526 4,868 Updated Dec 1, 2025

Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.

Python 3,801 262 Updated May 17, 2025

Python module (C extension and plain python) implementing Aho-Corasick algorithm

C 1,063 140 Updated Dec 17, 2025

Check for multiple patterns in a single string at the same time: a fast Aho-Corasick algorithm for Python

Python 218 16 Updated Dec 19, 2025

Migrate a project from Poetry/Pipenv/pip-tools/pip to uv package manager

Rust 1,010 11 Updated Dec 23, 2025

Official PyTorch Implementation of "DiffusionPen: Towards Controlling the Style of Handwritten Text Generation" - ECCV 2024

Python 81 13 Updated Oct 24, 2024

Official Code for ECCV 2024 paper - One-Shot Diffusion Mimicker for Handwritten Text Generation

Python 499 54 Updated Oct 15, 2025