jacobmarks
🏗️ Building

77 results for source starred repositories written in Python

Let us control diffusion models!

Python 33,791 3,007 Updated Feb 25, 2024

An open source implementation of CLIP.

Python 13,664 1,275 Updated Apr 6, 2026

Structured Outputs

Python 13,652 679 Updated Mar 26, 2026

InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥

Python 11,939 880 Updated Jul 18, 2024

Refine high-quality datasets and visual AI models

Python 10,564 737 Updated Apr 10, 2026

🧙 Build, run, and manage data pipelines for integrating and transforming data.

Python 8,695 957 Updated Apr 2, 2026

OCR model that handles complex tables, forms, handwriting with full layout.

Python 8,510 872 Updated Apr 9, 2026

TripoSR: Fast 3D Object Reconstruction from a Single Image

Python 6,351 809 Updated Aug 16, 2024

OpenPCDet Toolbox for LiDAR-based 3D Object Detection.

Python 5,530 1,438 Updated Oct 8, 2025

Image to prompt with BLIP and CLIP

Python 2,947 434 Updated May 15, 2024

[ICCV2023 Best Paper Finalist] PyTorch implementation of DiffusionDet (https://arxiv.org/abs/2211.09788)

Python 2,250 181 Updated Dec 22, 2022

Repository hosting code for "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152).

Python 1,838 371 Updated Apr 10, 2026

Official Repo For OMG-LLaVA and OMG-Seg codebase [CVPR-24 and NeurIPS-24]

Python 1,345 54 Updated Oct 15, 2025

Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️

Python 1,229 78 Updated Oct 30, 2025

❄️🔥 Visual Prompt Tuning [ECCV 2022] https://arxiv.org/abs/2203.12119

Python 1,219 103 Updated Sep 2, 2023

Lite & Super-fast re-ranking for your search & retrieval pipelines. Supports SoTA Listwise and Pairwise reranking based on LLMs and cross-encoders and more. Created by Prithivi Da, open for PRs & C…

Python 961 69 Updated Jan 1, 2026

A Library for Differentiable Logic Gate Networks

Python 775 91 Updated Mar 19, 2024

DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data (NeurIPS 2023 Spotlight) / When Does Perceptual Alignment Benefit Vision Representations? (NeurIPS 2024)

Python 596 32 Updated Nov 24, 2025

code for CVPR2024 paper: DiffMOT: A Real-time Diffusion-based Multiple Object Tracker with Non-linear Prediction

Python 446 54 Updated Jun 13, 2024

Liquid Audio - Speech-to-Speech audio models by Liquid AI

Python 430 72 Updated Apr 1, 2026

Code for ICML 2023 paper, "PFGM++: Unlocking the Potential of Physics-Inspired Generative Models"

Python 382 38 Updated Sep 11, 2023

Search docs.voxel51.com with an LLM!

Python 374 61 Updated Mar 23, 2026

[NeurIPS2023] DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models

Python 328 15 Updated Nov 3, 2023

Computer Vision dataset analysis

Python 314 40 Updated Aug 6, 2024

AlignProp uses direct reward backpropagation for the alignment of large-scale text-to-image diffusion models. Our method is 25x more sample and compute efficient than reinforcement learning methods…

Python 314 11 Updated Nov 1, 2024

AI assistant that can query visual datasets, search the FiftyOne docs, and answer general computer vision questions

Python 253 18 Updated Mar 23, 2026

Official repository for the ICCV 2023 paper "Global Features are All You Need for Image Retrieval and Reranking"

Python 247 19 Updated Sep 14, 2023

PyTorch implementation of CLIP Maximum Mean Discrepancy (CMMD) for evaluating image generation models.

Python 164 11 Updated Apr 5, 2024

ACL 2025: Synthetic data generation pipelines for text-rich images.

Python 158 28 Updated Mar 1, 2025

Open source AI/ML capabilities for the FiftyOne ecosystem

Python 158 13 Updated Apr 8, 2026