Stars
A model-agnostic layered cognitive framework for LLMs. Improves coherence, emotional clarity, structural reasoning, and creative depth across GPT, Claude, Gemini, Grok, Mistral, and others—while re…
The Deep Research Assistant is meticulously crafted on Mastra's modular, scalable architecture, designed for intelligent orchestration and seamless human-AI interaction. It's built to tackle comple…
An example Nuxt 4 app using Mastra AI agent framework
A2A Mastra Demo - Multi-Agent System with Amazon Bedrock
Ship Agent2Agent in one line of code.
An AI agent that searches the web and creates research reports
Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web. Make your own persistent autonomous agent on top!
[NeurIPS 2025] VeriThinker: Learning to Verify Makes Reasoning Model Efficient
Real-time, full-duplex AI voice bot integrating NVIDIA's PersonaPlex with Twilio Media Streams for natural speech-to-speech conversations.
Voice bridge connecting PersonaPlex (System 1 fast response) with Letta/Claude (System 2 reasoning). Talker-Reasoner coordination service.
Kronos: A Foundation Model for the Language of Financial Markets
Memory library for building stateful agents
"DeepTutor: Agent-Native Personalized Learning Assistant"
A gallery that showcases on-device ML/GenAI use cases and allows people to try and use models locally.
Code Implementation of SPA: A Simple but Tough-to-Beat Baseline for Knowledge Injection
Gemma-4-E4B-it running locally in a browser with WebGPU.
Local AI anywhere, for everyone — LLM inference, chat UI, voice, agents, workflows, RAG, and image generation. No cloud, no subscriptions.
Run an AI ✨ assistant locally, with a simple API for Node.js 🚀
Minimalist & responsive AI-powered portfolio template that creates an interactive AMA (ask me anything) experience for your visitors
LLM inference with 7x longer context. Pure C, zero dependencies. Lossless KV cache compression + single-header library.
LLM inference in C/C++ with changes from Prism-ML to support 1Bit models
Turbo1Bit: Combining 1-bit LLM weights (Bonsai) with TurboQuant KV cache compression for maximum inference efficiency. 4.2x KV cache compression + 16x weight compression = ~10x total memory reduction.
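The "~10x total" figure above depends on how memory is split between weights and KV cache: the combined reduction is the reciprocal of the weighted sum of each component's compressed share. A minimal sketch of that arithmetic, assuming an illustrative 80/20 weight-to-KV-cache split (the function name and the split are assumptions, not from the repo):

```python
def combined_reduction(weight_frac: float,
                       weight_x: float = 16.0,
                       kv_x: float = 4.2) -> float:
    """Overall memory reduction when weights (fraction `weight_frac` of
    total memory) are compressed by `weight_x` and the KV cache (the
    remainder) by `kv_x`. Illustrative helper, not from the repo."""
    kv_frac = 1.0 - weight_frac
    # Compressed memory as a fraction of original, then invert.
    return 1.0 / (weight_frac / weight_x + kv_frac / kv_x)

# If weights dominate (~80% of memory), the combined factor lands near 10x;
# with an even 50/50 split it would be closer to 7x.
print(round(combined_reduction(0.8), 2))
print(round(combined_reduction(0.5), 2))
```

This shows why the quoted ~10x is plausible for weight-heavy workloads (short contexts) but would shrink as the KV cache grows relative to the weights.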
Multimodal orchestration for LLM APIs. Source patterns, context caching, and structured output for text, PDFs, images, video, and YouTube - so you don't manage the complexity yourself.
AI agent for web scraping using LLM, RAG and Vector DB
AI pipeline built with the HONC stack and Workers AI: vector embeddings, web scraping, and processing with Cloudflare Workflows (beta)
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN