Repositories starred by azuredsky

The most powerful AI agent and AI chat software on Android. Operit is currently the most capable AI agent on Android.

Kotlin 2,717 207 Updated Dec 23, 2025
Python 6 Updated Dec 6, 2025

An Open Phone Agent Model & Framework. Unlocking the AI Phone for Everyone

Python 18,961 3,005 Updated Dec 22, 2025

GELab: GUI Exploration Lab. One of the best GUI agent solutions in the galaxy, built by the StepFun-GELab team and powered by Step’s research capabilities.

Python 1,696 141 Updated Dec 19, 2025
C++ 28 3 Updated Dec 14, 2025

[TCSVT] DAOcc: 3D Object Detection Assisted Multi-Sensor Fusion for 3D Occupancy Prediction

Python 93 5 Updated Oct 28, 2025

MM-ACT: Learn from Multimodal Parallel Generation to Act

Python 83 4 Updated Dec 19, 2025

Benchmarking Knowledge Transfer in Lifelong Robot Learning

Jupyter Notebook 1,302 261 Updated Mar 15, 2025

LLaVA_OpenVLA part 2, Generate MLLM general training data

Python 11 1 Updated Dec 27, 2024

A step-by-step Chinese-language guide to reproducing OpenVLA

Python 63 6 Updated Dec 10, 2025

Reproduction of OpenVLA, a large multimodal embodied-intelligence model, with fine-tuning improvements on the LIBERO dataset

Python 312 27 Updated Aug 8, 2025

YOLO multi-threaded and hardware-accelerated inference framework based on RKNN

C 34 7 Updated Nov 9, 2025

[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding

Python 1 Updated Nov 25, 2025
Python 31 7 Updated Dec 16, 2025

An open source implementation of CLIP.

Python 13,150 1,221 Updated Nov 4, 2025

OCR model that handles complex tables, forms, handwriting with full layout.

Python 3,956 445 Updated Dec 19, 2025

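Deploys Chinese CLIP with OpenCV and onnxruntime for text-to-image search: given a one-sentence description of the desired picture, it retrieves matching images from a gallery. Includes both C++ and Python versions.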

C++ 84 18 Updated Jan 15, 2024

Real-time Vision Language Model interaction via webcam - WebRTC-based web interface

Python 155 19 Updated Dec 17, 2025

Semantic segmentation models with 500+ pretrained convolutional and transformer-based backbones.

Python 11,203 1,810 Updated Dec 23, 2025

C++ implementations of PP-OCRv3 and PP-OCRv5 using ncnn for inference.

C++ 50 1 Updated Nov 12, 2025

Official implementation of "I2VWM: Robust Watermarking for Image to Video Generation"

Python 10 Updated Sep 28, 2025
C++ 2 12 Updated Jun 15, 2021

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 64,384 7,805 Updated Dec 23, 2025

[⭐️ WACV 2025 Oral ⭐️] PETALface: Parameter Efficient Transfer Learning for Low-resolution Face Recognition

Python 26 2 Updated Jun 9, 2025

🔥🔥High-Performance Face Recognition Library on PaddlePaddle & PyTorch🔥🔥

Python 3,573 761 Updated Mar 20, 2025

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …

Python 11,800 1,082 Updated Dec 23, 2025

A tool to modify ONNX models in a visualization fashion, based on Netron and Flask.

JavaScript 1,590 195 Updated Nov 19, 2025

Wan 2.5 AI Video Generator - Transform text & images into HD videos with synchronized audio

72 8 Updated Sep 25, 2025

[ICCV 2023] TransFace: Calibrating Transformer Training for Face Recognition from a Data-Centric Perspective

Python 110 16 Updated May 19, 2025