lieding

116 stars written in Python

gpt-omni / mini-omni

open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming audio output conversational capabilities.

Python 3,497 302 Updated Nov 5, 2024

dvlab-research / MGM

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Python 3,328 279 Updated May 4, 2024

ucbepic / docetl

A system for agentic LLM-powered data processing and ETL

Python 3,259 349 Updated Nov 29, 2025

aurelio-labs / semantic-router

Superfast AI decision making and intelligent processing of multi-modal data.

Python 3,086 300 Updated Nov 18, 2025

moonshine-ai / moonshine

Fast and accurate automatic speech recognition (ASR) for edge devices

Python 3,034 156 Updated Nov 20, 2025

InternLM / InternLM-XComposer

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Python 2,909 177 Updated May 26, 2025

HumanAIGC-Engineering / OpenAvatarChat

Python 2,906 467 Updated Dec 2, 2025

kayak / pypika

PyPika is a python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially…

Python 2,856 321 Updated Nov 2, 2025

ErlichLiu / DeepClaude

Unleash Next-Level AI! 🚀 💻 Code Generation: DeepSeek r1 + Claude 3.7 Sonnet - Unparalleled Performance! 📝 Content Creation: DeepSeek r1 + Gemini 2.5 Pro - Superior Quality! 🔌 OpenAI-Compatible. 🌊 S…

Python 2,771 504 Updated Sep 24, 2025

datachain-ai / datachain

Analytics, Versioning and ETL for multimodal data: video, audio, PDFs, images

Python 2,716 132 Updated Dec 17, 2025

wyf3 / llm_related

Reproductions of algorithms related to large models, plus some study notes

Python 2,698 371 Updated Dec 15, 2025

InternLM / HuixiangDou

HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance

Python 2,455 180 Updated Nov 24, 2025

dvmazur / mixtral-offloading

Run Mixtral-8x7B models in Colab or consumer desktops

Python 2,325 230 Updated Apr 8, 2024

X-PLUG / mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Python 2,263 132 Updated May 30, 2025

basicmachines-co / basic-memory

AI conversations that actually remember. Never re-explain your project to your AI again. Join our Discord: https://discord.gg/tyvKNccgqN

Python 2,207 137 Updated Dec 17, 2025

qhjqhj00 / MemoRAG

Empowering RAG with a memory-based data interface for all-purpose applications!

Python 2,187 154 Updated Sep 11, 2025

Yuliang-Liu / Monkey

Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models (CVPR 2024 Highlight)

Python 1,938 139 Updated Oct 23, 2025

rupeshs / fastsdcpu

Fast stable diffusion on CPU and AI PC

Python 1,913 170 Updated Nov 24, 2025

opendatalab / DocLayout-YOLO

DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception

Python 1,870 144 Updated Apr 14, 2025

BinNong / meet-libai

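Li Bai 👤, an outstanding poet of the Tang dynasty, holds an important place in the history of Chinese literature. In recent years, with the rapid development of digital technology and artificial intelligence, the ways in which traditional culture is popularized and promoted have also faced innovation and change. Although research on Li Bai's poetry in China and abroad is already quite thorough, its digital and intelligent popularization still falls short. This project therefore aims to build a Li Bai knowledge graph and combine it with large-model training to produce a specialized AI agent that, in the form of a generative conversational application, promotes the popularization of Li Bai's culture.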

Python 1,847 231 Updated Jul 12, 2025

SqueezeAILab / LLMCompiler

[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling

Python 1,803 124 Updated Jul 10, 2024

Standard-Intelligence / hertz-dev

first base model for full-duplex conversational audio

Python 1,770 112 Updated Jan 5, 2025

McGill-NLP / llm2vec

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'

Python 1,627 135 Updated Dec 4, 2025

1038lab / ComfyUI-RMBG

A ComfyUI custom node designed for advanced image background removal and object, face, clothes, and fashion segmentation, utilizing multiple models including RMBG-2.0, INSPYRENET, BEN, BEN2, BiRefN…

Python 1,581 75 Updated Dec 16, 2025

jingsongliujing / OnnxOCR

A lightweight OCR system based on PaddleOCR, decoupled from the PaddlePaddle deep learning training framework, with ultra-fast inference speed.

Python 1,569 169 Updated Nov 1, 2025

AI-Powered Data Processing: Use LOTUS to process all of your datasets with LLMs and embeddings. Enjoy up to 1000x speedups with fast, accurate query processing, that's as simple as writing Pandas code

Python 1,501 132 Updated Dec 11, 2025

ShengranHu / ADAS

[ICLR 2025] Automated Design of Agentic Systems

Python 1,474 226 Updated Jan 28, 2025

facebookresearch / MobileLLM

MobileLLM Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.

Python 1,401 85 Updated Apr 21, 2025

hotshotco / Hotshot-XL

✨ Hotshot-XL: State-of-the-art AI text-to-GIF model trained to work alongside Stable Diffusion XL

Python 1,111 94 Updated Jan 23, 2024

zai-org / CogView4

CogView4, CogView3-Plus and CogView3(ECCV 2024)

Python 1,100 79 Updated Mar 29, 2025

Lists (18)

About Transformer & LLM

AI AGENT

Audio LLM

Avatar (digital humans)

Document intelligence

Graph vis

Image edit

image/video gen

Invoice Gen

Language learning assistant

LLM Reasoning

Low code

N2SQL/Data Analytics/Tabular

Non-LLM

Object detection/Computer Vision

OCR

python runtime

RAG
