kanglue

Starred repositories

GoPlusSecurity / agentguard

Security guard for AI agents — blocks malicious skills, prevents data leaks, protects secrets. 24 detection rules, runtime action evaluation, trust registry.

TypeScript 375 56 Updated Mar 26, 2026

jd-opensource / joyagent-jdgenie

An open-source, end-to-end, product-grade general-purpose agent

Java 11,586 1,577 Updated Feb 12, 2026

bytedance / deer-flow

An open-source long-horizon SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skill, subagents and message gateway, it handles different levels of…

Python 48,105 5,729 Updated Mar 26, 2026

ByteDance-Seed / Depth-Anything-3

Depth Anything 3

Python 4,804 495 Updated Mar 21, 2026

QwenLM / Qwen3-Omni

Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 3,557 239 Updated Jan 8, 2026

AIDC-AI / Awesome-Unified-Multimodal-Models

Awesome Unified Multimodal Models

1,163 37 Updated Mar 24, 2026

yunlong10 / Awesome-LLMs-for-Video-Understanding

🔥🔥🔥 [IEEE TCSVT] Latest Papers, Codes and Datasets on Vid-LLMs.

3,127 140 Updated Mar 19, 2026

NoizAI / skills

Allow your 🦞 bot to Shout, Speak, with "human" vibe

Python 415 57 Updated Mar 18, 2026

MoonshotAI / Attention-Residuals

2,753 127 Updated Mar 17, 2026

Vincent-ZHQ / Comprehensive-Long-Video-Understanding-Survey

A survey on MM-LLMs for long video understanding: From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long Video Understanding

19 1 Updated Sep 12, 2025

kahnchana / mvu

🤖 [ICLR'25] Multimodal Video Understanding Framework (MVU)

Python 57 5 Updated Jan 31, 2025

openai / whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Python 96,678 11,928 Updated Dec 15, 2025

marktext / marktext

📝A simple and elegant markdown editor, available for Linux, macOS and Windows.

JavaScript 54,735 4,060 Updated Mar 4, 2026

liguodongiot / llm-action

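This project shares the technical principles behind large models and hands-on experience with them (large-model engineering and deploying large-model applications).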

HTML 23,740 2,740 Updated Mar 12, 2026

jlygit / MediaNN

The project has implemented AI image/video processing based on neural networks, including but not limited to tasks such as denoising, restoration, enhancement, super-resolution.

2 Updated Apr 30, 2025

jlygit / AI-video-enhance

This repository collects the state-of-the-art algorithms for video/image enhancement using deep learning (AI) in recent years, including super resolution, compression artifact reduction, deblocking…

188 36 Updated May 15, 2020

amusi / Deep-Learning-Interview-Book

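Deep Learning Interview Handbook (covering mathematics, machine learning, deep learning, computer vision, natural language processing, SLAM, and related areas)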

8,771 1,377 Updated Apr 24, 2024

skindhu / Build-A-Large-Language-Model-CN

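"Build a Large Language Model (From Scratch)" is an e-book that explores the principles and implementation of large language models in depth, suited to learners who want a thorough understanding of the architecture, training process, and application development of GPT and similar large models. To make this valuable textbook accessible to more Chinese readers, I decided to translate it into Chinese and share it as open source on GitHub.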

HTML 3,450 581 Updated Sep 7, 2025

woshidandan / Rethinking-Personalized-Aesthetics-Assessment

🔥[CVPR 2025 Highlight, Official Code] for paper "Rethinking Personalized Aesthetics Assessment: Employing Physique Aesthetics Assessment as An Exemplification". Official Weights and Demos provided.…

Python 44 Updated Nov 20, 2025

AndyLone22 / MirrorMetrics

MirrorMetrics: How to evaluate Stable Diffusion LoRAs. A visual diagnostic tool to detect overfitting, check dataset quality, and fix training settings using InsightFace biometrics.

Python 49 6 Updated Feb 21, 2026