Lists (18)
About Transformer & LLM
AI AGENT
Audio LLM
Avatar (digital human)
Document intelligence
Graph vis
to visualize graphs in the frontend
Image edit
image/video gen
Invoice Gen
Language learning assistant
LLM Reasoning
Low code
N2SQL/Data Analytics/Tabular
Non-LLM
Object detection/Computer Vision
OCR
python runtime
RAG
Including RAG, GraphRAG, function calling
Stars
open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming audio output conversational capabilities.
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
A system for agentic LLM-powered data processing and ETL
Superfast AI decision making and intelligent processing of multi-modal data.
Fast and accurate automatic speech recognition (ASR) for edge devices
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
PyPika is a python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially…
Unleash Next-Level AI! 🚀 💻 Code Generation: DeepSeek r1 + Claude 3.7 Sonnet - Unparalleled Performance! 📝 Content Creation: DeepSeek r1 + Gemini 2.5 Pro - Superior Quality! 🔌 OpenAI-Compatible. 🌊 S…
Analytics, Versioning and ETL for multimodal data: video, audio, PDFs, images
HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance
Run Mixtral-8x7B models in Colab or consumer desktops
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
AI conversations that actually remember. Never re-explain your project to your AI again. Join our Discord: https://discord.gg/tyvKNccgqN
Empowering RAG with a memory-based data interface for all-purpose applications!
Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models (CVPR 2024 Highlight)
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
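Li Bai 👤 As an outstanding poet of the Tang dynasty, Li Bai holds an important place in the history of Chinese literature. In recent years, with the rapid development of digital technology and artificial intelligence, the ways traditional culture is popularized and promoted have also faced innovation and change. Although research on Li Bai's poetry in China and abroad is already quite deep, its digital and intelligent popularization still falls short. This project therefore aims to build a Li Bai knowledge graph and combine it with large-model training to produce a specialized AI agent that promotes Li Bai's culture through a generative dialogue application.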
[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling
first base model for full-duplex conversational audio
Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
A ComfyUI custom node designed for advanced image background removal and object, face, clothes, and fashion segmentation, utilizing multiple models including RMBG-2.0, INSPYRENET, BEN, BEN2, BiRefN…
A lightweight OCR system rebuilt from PaddleOCR and decoupled from the PaddlePaddle deep learning training framework, with ultra-fast inference speed.
AI-Powered Data Processing: Use LOTUS to process all of your datasets with LLMs and embeddings. Enjoy up to 1000x speedups with fast, accurate query processing, that's as simple as writing Pandas code
[ICLR 2025] Automated Design of Agentic Systems
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.
✨ Hotshot-XL: State-of-the-art AI text-to-GIF model trained to work alongside Stable Diffusion XL
CogView4, CogView3-Plus and CogView3 (ECCV 2024)