Stars
Code for the ICASSP 2022 paper "Transformer-Based Multi-Aspect Multi-Granularity Non-native English Speaker Pronunciation Assessment".
A small JS library for beautiful drawing and handwriting on the HTML Canvas.
Modern CLI tool for scraping & analyzing Facebook groups using Playwright & Gemini AI. Features self-healing selectors, session security, and local offline analysis.
Memory layer for AI Agents. Replace complex RAG pipelines with a serverless, single-file memory layer. Give your agents instant retrieval and long-term memory.
MuseTalk: Real-Time High Quality Lip Synchorization with Latent Space Inpainting
MCP server to provide Figma layout information to AI coding agents like Cursor
A TTS model capable of generating ultra-realistic dialogue in one pass.
High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models.
Fast drag and drop for any experience on any tech stack
Evaluation and Tracking for LLM Experiments and AI Agents
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
C++ examples for the Vulkan graphics API
high-performance rendering backend (C++, Python, C#, Java)
stackgl / headless-gl
Forked from mikeseven/node-webgl🎃 Windowless WebGL for node.js
A generative world for general-purpose robotics & embodied AI learning.
Generate 3D objects conditioned on text or images
Cross-platform, graphics API agnostic, "Bring Your Own Engine/Framework" style rendering library.
Open 3D Engine (O3DE) is an Apache 2.0-licensed multi-platform 3D engine that enables developers and content creators to build AAA games, cinema-quality 3D worlds, and high-fidelity simulations wit…
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
Moveable! Draggable! Resizable! Scalable! Rotatable! Warpable! Pinchable! Groupable! Snappable!
💅 Beautiful and accessible drag and drop for lists with React. ⭐️ Star to support our work!
The modern toolkit for building drag and drop interfaces