116 stars written in Python

Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python 3,497 302 Updated Nov 5, 2024

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Python 3,328 279 Updated May 4, 2024

A system for agentic LLM-powered data processing and ETL

Python 3,259 349 Updated Nov 29, 2025

Superfast AI decision making and intelligent processing of multi-modal data.

Python 3,086 300 Updated Nov 18, 2025

Fast and accurate automatic speech recognition (ASR) for edge devices

Python 3,034 156 Updated Nov 20, 2025

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Python 2,909 177 Updated May 26, 2025

PyPika is a python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially…

Python 2,856 321 Updated Nov 2, 2025

Unleash Next-Level AI! 🚀 💻 Code Generation: DeepSeek r1 + Claude 3.7 Sonnet - Unparalleled Performance! 📝 Content Creation: DeepSeek r1 + Gemini 2.5 Pro - Superior Quality! 🔌 OpenAI-Compatible. 🌊 S…

Python 2,771 504 Updated Sep 24, 2025

Analytics, Versioning and ETL for multimodal data: video, audio, PDFs, images

Python 2,716 132 Updated Dec 17, 2025

Reproductions of algorithms related to large models, along with some study notes

Python 2,698 371 Updated Dec 15, 2025

HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance

Python 2,455 180 Updated Nov 24, 2025

Run Mixtral-8x7B models in Colab or consumer desktops

Python 2,325 230 Updated Apr 8, 2024

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Python 2,263 132 Updated May 30, 2025

AI conversations that actually remember. Never re-explain your project to your AI again. Join our Discord: https://discord.gg/tyvKNccgqN

Python 2,207 137 Updated Dec 17, 2025

Empowering RAG with a memory-based data interface for all-purpose applications!

Python 2,187 154 Updated Sep 11, 2025

Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models (CVPR 2024 Highlight)

Python 1,938 139 Updated Oct 23, 2025

Fast stable diffusion on CPU and AI PC

Python 1,913 170 Updated Nov 24, 2025

DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception

Python 1,870 144 Updated Apr 14, 2025

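Li Bai 👤, an outstanding poet of the Tang dynasty, holds an important place in the history of Chinese literature. In recent years, with the rapid development of digital technology and artificial intelligence, the ways in which traditional culture is popularized and promoted have also faced innovation and change. Although research on Li Bai's poetry in China and abroad is already quite deep, its digital and intelligent popularization still falls short. This project therefore aims to build a Li Bai knowledge graph, train a specialized AI agent with a large model, and promote Li Bai's culture through a generative conversational application.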

Python 1,847 231 Updated Jul 12, 2025

[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling

Python 1,803 124 Updated Jul 10, 2024

first base model for full-duplex conversational audio

Python 1,770 112 Updated Jan 5, 2025

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'

Python 1,627 135 Updated Dec 4, 2025

A ComfyUI custom node designed for advanced image background removal and object, face, clothes, and fashion segmentation, utilizing multiple models including RMBG-2.0, INSPYRENET, BEN, BEN2, BiRefN…

Python 1,581 75 Updated Dec 16, 2025

A lightweight OCR system rebuilt from PaddleOCR and decoupled from the PaddlePaddle deep learning training framework, with ultra-fast inference speed.

Python 1,569 169 Updated Nov 1, 2025

AI-Powered Data Processing: Use LOTUS to process all of your datasets with LLMs and embeddings. Enjoy up to 1000x speedups with fast, accurate query processing that is as simple as writing Pandas code.

Python 1,501 132 Updated Dec 11, 2025

[ICLR 2025] Automated Design of Agentic Systems

Python 1,474 226 Updated Jan 28, 2025

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.

Python 1,401 85 Updated Apr 21, 2025

✨ Hotshot-XL: State-of-the-art AI text-to-GIF model trained to work alongside Stable Diffusion XL

Python 1,111 94 Updated Jan 23, 2024

CogView4, CogView3-Plus and CogView3 (ECCV 2024)

Python 1,100 79 Updated Mar 29, 2025