Stars
21 Lessons, Get Started Building with Generative AI
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
A game theoretic approach to explain the output of any machine learning model.
A simple screen parsing tool towards pure vision based GUI agent
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
LAVIS - A One-stop Library for Language-Vision Intelligence
A collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-edge models like YOLO11, RT-DETR, SAM …
A unified framework for 3D content generation.
NVIDIA Isaac GR00T N1.6 - A Foundation Model for Generalist Robots.
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
A Bulletproof Way to Generate Structured JSON from Language Models
Neo4j graph construction from unstructured data using LLMs
The Open Source Memory Layer For Autonomous Agents
Notebooks & Example Apps for Search & AI Applications with Elasticsearch
Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning
Using advances in generative modeling to learn reward functions from unlabeled videos.
Stable Diffusion with Text-to-Image and Image-to-Text
wassname / prob_jsonformer
Forked from 1rgs/jsonformer
Generate Structured JSON with probs from Language Models
Testing DeepSpeed integration in 🤗 Accelerate