# siglip2

Here are 57 public repositories matching this topic...

Sanyi54 / Clipart-126-DomainNet

Clipart-126-DomainNet is an image classification vision-language encoder model fine-tuned from google/siglip2-base-patch16-224 for a single-label classification task. It is designed to classify clipart images into 126 domain categories using the SiglipForImageClassification architecture.

art classification image-classification llama demo-app gradio torchvision huggingface-transformers vision-transformer huggingface-spaces siglip2

Updated Oct 10, 2025
Python

MrAlonso9 / Hand-Gesture-2-Robot

Hand-Gesture-2-Robot is an image classification vision-language encoder model fine-tuned from google/siglip2-base-patch16-224 for a single-label classification task. It is designed to recognize hand gestures and map them to specific robot commands using the SiglipForImageClassification architecture.

png robot jpeg pillow pil image-classification visionprocessing gesture-recognition huggingface-transformers vision-transformer vision-language-model siglip2

Updated Oct 10, 2025
Python

PRITHIVSAKTHIUR / Deepfake-vs-Real-8000

Deepfake vs Real is a dataset designed for image classification, distinguishing between deepfake and real images.

detection vit deepfake vision-transformer siglip2

Updated Mar 27, 2025
Python

PRITHIVSAKTHIUR / open-deepfake-detection

open-deepfake-detection is a vision-language encoder model fine-tuned from siglip2-base-patch16-512 for binary image classification. It is trained to detect whether an image is fake or real using the OpenDeepfake-Preview dataset. The model uses the SiglipForImageClassification architecture.

google image-classification image-recognition gradio deepfake-detection huggingface-transformers siglip2

Updated May 22, 2025
Python

PRITHIVSAKTHIUR / Food-101-93M

Food-101-93M is a fine-tuned image classification model built on top of google/siglip2-base-patch16-224 using the SiglipForImageClassification architecture. It is trained to classify food images into one of 101 popular dishes, derived from the Food-101 dataset.

food image-classification huggingface-transformers vision-transformer siglip2

Updated Apr 7, 2025
Python

PRITHIVSAKTHIUR / Multilabel-GeoSceneNet

Multilabel-GeoSceneNet is a vision-language encoder model fine-tuned from google/siglip2-base-patch16-224 for multi-label image classification. It is designed to recognize and label multiple geographic or environmental elements in a single image using the SiglipForImageClassification architecture.

map geospatial landscape spaces gradio huggingface-transformers hugging-face siglip vision-encoder siglip2 geoscenenet

Updated Apr 23, 2025
Python

PRITHIVSAKTHIUR / Face-Mask-Detection

Face-Mask-Detection is a binary image classification model based on google/siglip2-base-patch16-224, trained to detect whether a person is wearing a face mask or not. This model can be used in public health monitoring, access control systems, and workplace compliance enforcement.

gradio face-mask-detection facemask-detection huggingface-transformers face-mask-classification vision-transformer siglip2

Updated May 12, 2025
Python

PRITHIVSAKTHIUR / Multilabel-Portrait-SigLIP2

Multilabel-Portrait-SigLIP2 is a vision-language model fine-tuned from google/siglip2-base-patch16-224 using the SiglipForImageClassification architecture. It classifies portrait-style images into one of the following visual portrait categories:

python google autoencoder image-classification gradio multilabel-classification portraits huggingface-transformers vision-transformer vision-encoder siglip2

Updated Apr 16, 2025
Python

PRITHIVSAKTHIUR / Multisource-121-DomainNet

Multisource-121-DomainNet is an image classification vision-language encoder model fine-tuned from google/siglip2-base-patch16-224 for a single-label classification task. It is designed to classify images into 121 domain categories using the SiglipForImageClassification architecture.

image-classification gradio f32 huggingface-transformers vision-transformer domainnet siglip2

Updated Mar 25, 2025
Python

PRITHIVSAKTHIUR / facial-age-detection

facial-age-detection is a vision-language encoder model fine-tuned from google/siglip2-base-patch16-512 for multi-class image classification. It is trained to detect and classify human faces into age groups ranging from early childhood to elderly adults. The model uses the SiglipForImageClassification architecture.

google image-processing image-classification vit gradio age-estimation huggingface-transformers siglip2

Updated May 30, 2025
Python

PRITHIVSAKTHIUR / Fire-Detection-Siglip2

Fire-Detection-Siglip2 is an image classification vision-language encoder model fine-tuned from google/siglip2-base-patch16-224 for a single-label classification task. It is designed to detect fire, smoke, or normal conditions using the SiglipForImageClassification architecture.

google smoke image-classification llama vit normal fire-detection huggingface huggingface-transformers siglip siglip2

Updated Mar 31, 2025
Python

PRITHIVSAKTHIUR / Hand-Gesture-2-Robot

Hand-Gesture-2-Robot is an image classification vision-language encoder model fine-tuned from google/siglip2-base-patch16-224 for a single-label classification task. It is designed to recognize hand gestures and map them to specific robot commands using the SiglipForImageClassification architecture.

png robot jpeg pillow pil image-classification visionprocessing gesture-recognition huggingface-transformers vision-transformer vision-language-model siglip2

Updated Apr 2, 2025
Python

PRITHIVSAKTHIUR / Clipart-126-DomainNet

Clipart-126-DomainNet is an image classification vision-language encoder model fine-tuned from google/siglip2-base-patch16-224 for a single-label classification task. It is designed to classify clipart images into 126 domain categories using the SiglipForImageClassification architecture.

art classification image-classification llama demo-app gradio torchvision huggingface-transformers vision-transformer huggingface-spaces siglip2

Updated Mar 26, 2025
Python

PRITHIVSAKTHIUR / open-age-detection

open-age-detection is a vision-language encoder model fine-tuned from google/siglip2-base-patch16-512 for multi-class image classification. It is trained to classify the estimated age group of a person from an image. The model uses the SiglipForImageClassification architecture.

google image-classification gradio age-estimation age-detection huggingface-transformers siglip2

Updated May 23, 2025
Python

PRITHIVSAKTHIUR / Document-Type-Detection

Document-Type-Detection is a multi-class image classification model based on google/siglip2-base-patch16-224, trained to detect and classify types of documents from scanned or photographed images. This model is helpful for automated document sorting, OCR pipelines, and digital archiving systems.

detection image-processing torch image-classification gradio colab-notebook document-type huggingface-transformers siglip2

Updated May 14, 2025
Python

PRITHIVSAKTHIUR / Gender-Classifier-Mini

Gender-Classifier-Mini is an image classification vision-language encoder model fine-tuned from google/siglip2-base-patch16-224 for a single-label classification task. It is designed to classify images based on gender using the SiglipForImageClassification architecture.

gender-recognition vit gradio gender-classification gender-detection huggingface-transformers vision-transformer vision-language-model siglip siglip2

Updated Mar 30, 2025
Python

PRITHIVSAKTHIUR / Bird-Species-Classifier-526

Bird-Species-Classifier-526 is an image classification vision-language encoder model fine-tuned from google/siglip2-base-patch16-224.

bird image-classification vit species-identification huggingface-transformers siglip2

Updated Mar 28, 2025
Python

PRITHIVSAKTHIUR / Traffic-Density-Classification

Traffic-Density-Classification is an image classification vision-language encoder model fine-tuned from google/siglip2-base-patch16-224 for a single-label classification task. It is designed to classify images into traffic density categories using the SiglipForImageClassification architecture.

google analysis transformers traffic torch density vit gradio torchvision huggingface-transformers siglip2

Updated Mar 22, 2025
Python

PRITHIVSAKTHIUR / IndoorOutdoorNet

IndoorOutdoorNet is an image classification vision-language encoder model fine-tuned from google/siglip2-base-patch16-224 for a single-label classification task. It is designed to classify images as either Indoor or Outdoor using the SiglipForImageClassification architecture.

image-classification gradio indoor outdoors huggingface-transformers vision-transformer siglip2

Updated Apr 25, 2025
Python

meangrinch / LocalLens

Local image search engine powered by CLIP/SigLIP

search-engine ai clip gradio retreival siglip2

Updated Aug 30, 2025
Python
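
Most entries in this list describe single-label classifiers, while Multilabel-GeoSceneNet describes multi-label classification. The sketch below shows one common way those two modes differ when turning a classifier head's raw scores (logits) into predictions: softmax plus argmax for single-label, and a per-class sigmoid with a threshold for multi-label. It does not load any of the listed models. The function names, the logits, the multi-label names, and the 0.5 threshold are all illustrative assumptions, not taken from any of these repositories.

```python
import math


def softmax(logits):
    """Convert logits to probabilities that sum to 1 (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Map a single logit to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def single_label(logits, labels):
    """Single-label: classes compete, so return only the most probable one."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


def multi_label(logits, labels, threshold=0.5):
    """Multi-label: each class is scored on its own; keep all above threshold.

    The 0.5 default is an assumption; in practice it is usually tuned.
    """
    scored = [(label, sigmoid(x)) for label, x in zip(labels, logits)]
    return [(label, p) for label, p in scored if p >= threshold]


# Made-up logits for the three conditions named in the Fire-Detection-Siglip2 entry.
print(single_label([2.0, 0.5, -1.0], ["fire", "smoke", "normal"]))

# Made-up logits and hypothetical scene element names for a multi-label case.
print(multi_label([1.2, -0.3, 0.1], ["water", "road", "forest"]))
```

With a Hugging Face image classification model, the `logits` list would typically come from the model's output for one image, and `labels` from the model configuration's label mapping.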
