[CVPR 2025] HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation
Updated Jul 8, 2025 - Python
Easy text classification for everyone: BERT-based models via Hugging Face Transformers (KR / EN)
Upload & Merge CSV or JSON Data with Images to Notion Database
ScaleDP is an Open-Source extension of Apache Spark for Document Processing
A simple generative-AI Streamlit web application that converts speech to images.
Multimodal Document Processing RAG with LangChain
A 5-way embedding model for text, audio, image, video, and 3D point clouds.
Neuromorphic Bird Classifier Desktop App (NeuroBCDA) bundled with Live Event Camera Simulator
This repository contains code for generating blog content using the Llama 2 language model. It integrates with Streamlit for easy user interaction. Simply input your blog topic, desired word count, and writing style to generate engaging blog content.
A command-line tool for quickly downloading models and datasets from Hugging Face mirror sites.
🤗 A Python script for efficiently downloading and reconstructing large Hugging Face model files by splitting them into manageable chunks
A URL summarizer that summarizes the content of a URL with proper formatting. It uses 'sshleifer/distilbart-cnn-12-6', a distilled version of the BART model fine-tuned for text summarization on the CNN/Daily Mail news dataset.
Multimodal-OCR3 is an advanced Optical Character Recognition (OCR) application that leverages multiple state-of-the-art multimodal models to extract text from images.
The UniversalLLMAdapter class initializes the appropriate client based on the specified provider.
An AI-powered CLI tool that helps you organize and create projects as quickly as possible.
Ara AI is an AI-powered financial analysis platform developed by MeridianAlgo, designed for stock volatility forecasting, market predictions, and portfolio optimization using ensemble machine learning models and real-time data.
VLM-Parsing is a Gradio-based web application for parsing documents and images into structured HTML and Markdown formats using advanced Vision Language Models (VLMs).
A fine-tuned version of SmolLM2-360M-Instruct-bnb-4bit specialized for parsing unstructured calendar event requests into structured JSON data.
A FastAPI-powered backend that manages structured debates, analyzes arguments, and generates AI-driven summaries and insights.
Age-Classification-SigLIP2 is an image classification vision-language encoder model fine-tuned from google/siglip2-base-patch16-224 for a single-label classification task. It is designed to predict the age group of a person from an image using the SiglipForImageClassification architecture.