Stars
KoViDoRe: Korean Vision Document Retrieval Benchmark
🎒 Token-Oriented Object Notation (TOON) – Compact, human-readable, schema-aware JSON for LLM prompts. Spec, benchmarks, TypeScript SDK.
Korean MTEB Retrieval Evaluators for SPLADE, Dense, and Reranking models
A massively multilingual modern encoder language model
State-of-the-art paired encoder and decoder models (17M-1B params)
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
A project that compares and visualizes the performance of Korean sentence embedding models. This project was implemented with Claude Opus 4.
A simple, modern and secure encryption tool (and Go library) with small explicit keys, no config options, and UNIX-style composability.
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning
Get your documents ready for gen AI
LEAKED SYSTEM PROMPTS FOR CHATGPT, GEMINI, GROK, CLAUDE, PERPLEXITY, CURSOR, DEVIN, REPLIT, AND MORE! - AI SYSTEMS TRANSPARENCY FOR ALL! 👐
This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025
Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
🛰️ Official repository of paper "RemoteCLIP: A Vision Language Foundation Model for Remote Sensing" (IEEE TGRS)
🔥 Official Firecrawl MCP Server - Adds powerful web scraping and search to Cursor, Claude and any other LLM clients.
Model Context Protocol Servers
OpenAI web search tool as an MCP server
A Model Context Protocol server for searching and analyzing arXiv papers
【Star-crossed coders unite!⭐️】Model Context Protocol (MCP) server implementation providing Google News search capabilities via SerpAPI, with automatic news categorization and multi-language support.
Faster Whisper transcription with CTranslate2
The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
Meta-Chunking: Learning Efficient Text Segmentation via Logical Perception
Fast Multimodal Semantic Deduplication & Filtering