Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

Python 66,537 9,521 Updated Dec 16, 2025

chloedia / layerdiffuse

Implementation of layer diffuse inference using refiners

Python 25 1 Updated Apr 25, 2024

kartoon-ai / layer_diffusers

Diffusers implementation of LayerDiffuse

Python 5 Updated May 29, 2024

KaustubhPatange / Diffuser-layerdiffuse

Unofficial implementation of Layer Diffuse in diffusers

Python 27 3 Updated Apr 3, 2024

lllyasviel / sd-forge-layerdiffuse

[WIP] Layer Diffusion for WebUI (via Forge)

Python 4,107 350 Updated Aug 30, 2024

yoavain / tensorflow-js-playground

Tensorflow-JS-playground

TypeScript 2 Updated Jul 5, 2024

matatonic / openedai-vision

An OpenAI API compatible API for chat with image input and questions about the images. aka Multimodal.

Python 267 22 Updated Mar 6, 2025

yihong0618 / xiaogpt

Play ChatGPT and other LLMs with Xiaomi AI Speaker

Python 6,705 925 Updated Dec 10, 2025

patlevin / tfjs-to-tf

A TensorFlow.js Graph Model Converter

Python 140 21 Updated Jan 22, 2023

tin2tin / Pallaidium

PALLAIDIUM — a generative AI movie studio, seamlessly integrated into the Blender Video Editor (VSE), enabling end-to-end production from script to screen and back.

Python 1,301 116 Updated Dec 13, 2025

calm-ixia / SDManualGUI

A Diffusers-based GUI for Stable Diffusion, modeled on the AUTOMATIC1111 web UI (image generation only).

Jupyter Notebook 3 Updated Sep 29, 2024

xiaomabenten / ruankao_itpm

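💯 Exam-preparation resource library for the 2025 Information Systems Project Manager certification (advanced level of China's Computer Technology and Software Professional Qualification exam, "Ruankao").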

Rich Text Format 1,063 264 Updated Dec 14, 2025

QwenLM / Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 6,433 479 Updated Aug 7, 2024

karpathy / llama2.c

Inference Llama 2 in one file of pure C

C 19,035 2,430 Updated Aug 6, 2024

Dryjelly / Face_Ear_Landmark_Detection

tf-keras code of Face Ear Landmark Detection System (with Multi-Task Learning).

Jupyter Notebook 20 3 Updated Aug 19, 2022

OpenGVLab / InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model with performance approaching GPT-4o.

Python 9,627 746 Updated Sep 22, 2025

InternLM / xtuner

A Next-Generation Training Engine Built for Ultra-Large MoE Models

Python 5,027 394 Updated Dec 19, 2025

Alpha-VLLM / LLaMA2-Accessory

An Open-source Toolkit for LLM Development

Python 2,794 176 Updated Jan 13, 2025

modelscope / facechain

FaceChain is a deep-learning toolchain for generating your Digital-Twin.

Jupyter Notebook 9,495 892 Updated Jun 6, 2025

HastyJenny

Stars

aws-samples / sample-demo-of-nova-mme

360CVGroup / RzenEmbed

zai-org / Open-AutoGLM

leejet / stable-diffusion.cpp

SUFE-AIFLM-Lab / VisFinEval

oh-my-ocr / text_renderer

rendicahya / objects365-downloader

triwinds / ppocr-onnx

1079863482 / paddle2torch_PPOCRv3

BADBADBADBOY / CardDetectRotate

RapidAI / RapidOcrAndroidOnnx

PaddlePaddle / PaddleOCR