Embedding model geared toward multimodal RAG; ranked first both overall and on VisDoc in the MMEB benchmark

Python 29 Updated Nov 6, 2025

An Open Phone Agent Model & Framework. Unlocking the AI Phone for Everyone

Python 17,953 2,806 Updated Dec 19, 2025

Diffusion model (SD, Flux, Wan, Qwen Image, Z-Image, ...) inference in pure C/C++

C++ 4,909 475 Updated Dec 19, 2025
Python 5 Updated Nov 26, 2025

Generate text line images for training deep learning OCR models

Python 889 175 Updated Nov 4, 2025

Object365 dataset downloader

Shell 9 Updated Aug 6, 2025

Detects and recognizes text in images using onnxruntime and the models provided by PaddleOCR.

Python 86 18 Updated Jan 10, 2023

Detection and rectification of ID cards and documents

Python 78 21 Updated Sep 18, 2024

RapidOcr onnxruntime inference for Android

C++ 96 13 Updated Apr 17, 2025

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

Python 66,537 9,521 Updated Dec 16, 2025

Implementation of layer diffuse inference using refiners

Python 25 1 Updated Apr 25, 2024

Diffusers implementation of LayerDiffuse

Python 5 Updated May 29, 2024

Unofficial implementation of Layer Diffuse in diffusers

Python 27 3 Updated Apr 3, 2024

[WIP] Layer Diffusion for WebUI (via Forge)

Python 4,107 350 Updated Aug 30, 2024

Tensorflow-JS-playground

TypeScript 2 Updated Jul 5, 2024

An OpenAI API compatible API for chat with image input and questions about the images. aka Multimodal.

Python 267 22 Updated Mar 6, 2025

Play ChatGPT and other LLMs with Xiaomi AI Speaker

Python 6,705 925 Updated Dec 10, 2025

A TensorFlow.js Graph Model Converter

Python 140 21 Updated Jan 22, 2023

PALLAIDIUM — a generative AI movie studio, seamlessly integrated into the Blender Video Editor (VSE), enabling end-to-end production from script to screen and back.

Python 1,301 116 Updated Dec 13, 2025

A Diffusers-based GUI for Stable Diffusion, modeled on the AUTOMATIC1111 web UI (image generation only)

Jupyter Notebook 3 Updated Sep 29, 2024

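💯 Exam-preparation resource library for the 2025 Information Systems Project Manager certification (senior level of China's software qualification exam, "Ruankao").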

Rich Text Format 1,063 264 Updated Dec 14, 2025

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 6,433 479 Updated Aug 7, 2024

Inference Llama 2 in one file of pure C

C 19,035 2,430 Updated Aug 6, 2024

tf-keras code of Face Ear Landmark Detection System (with Multi-Task Learning).

Jupyter Notebook 20 3 Updated Aug 19, 2022

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance

Python 9,627 746 Updated Sep 22, 2025

A Next-Generation Training Engine Built for Ultra-Large MoE Models

Python 5,027 394 Updated Dec 19, 2025

An Open-source Toolkit for LLM Development

Python 2,794 176 Updated Jan 13, 2025

FaceChain is a deep-learning toolchain for generating your Digital-Twin.

Jupyter Notebook 9,495 892 Updated Jun 6, 2025