llava

Here are 147 public repositories matching this topic...

haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

chatbot llama multimodal multi-modality gpt-4 foundation-models visual-language-learning chatgpt instruction-tuning vision-language-model llava llama2 llama-2

Updated Aug 12, 2024
Python

sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.

Updated Oct 9, 2025
Python

Fanghua-Yu / SUPIR

SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild. Our new online demo is also released at suppixel.ai.

deep-learning pytorch super-resolution restoration diffusion-models pytorch-lightning stable-diffusion llava sdxl

Updated May 12, 2025
Python

open-compass / VLMEvalKit

Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks

computer-vision evaluation pytorch gemini openai vqa vit gpt multi-modal clip claude openai-api gpt4 large-language-models llm chatgpt llava qwen gpt-4v

Updated Oct 9, 2025
Python

om-ai-lab / OmAgent

Build multimodal language agents for fast prototyping and production

Updated Mar 19, 2025
Python

Blaizzy / mlx-vlm

MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.

mlx vision-framework apple-silicon vision-transformer llm vision-language-model llava local-ai idefics florence2 paligemma pixtral molmo

Updated Oct 9, 2025
Python

mbzuai-oryx / Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.

chatbot llama clip mulit-modal vision-language vicuna gpt-4 vision-language-pretraining llava video-chatboat video-conversation

Updated Aug 5, 2025
Python

unum-cloud / uform

Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️

Updated Sep 3, 2025
Python

jhc13 / taggui

Tag manager and captioner for image datasets

image-captioning image-tagging tag-manager pyside6 stable-diffusion llava cogvlm florence-2

Updated May 21, 2025
Python

TinyLLaVA / TinyLLaVA_Factory

A Framework of Small-scale Large Multimodal Models

nlp transformers llama vision-language llava large-multimodal-models tinyllama

Updated Apr 26, 2025
Python

NVlabs / EAGLE

Eagle: Frontier Vision-Language Models with Data-Centric Strategies

demo eagle llama lmm nvdia huggingface gpt4 large-language-models llm mllm llava lvlm llama3

Updated Aug 8, 2025
Python

mbzuai-oryx / LLaVA-pp

🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)

conversation lmms vision-language llm llava llama3 phi3 llava-llama3 llava-phi3 llama3-llava phi3-llava llama-3-vision phi3-vision llama-3-llava phi-3-llava llama3-vision phi-3-vision

Updated Aug 5, 2025
Python

PsyChip / machina

OpenCV+YOLO+LLAVA powered video surveillance system

python opencv camera rtsp yolo llava ollama-api

Updated Sep 24, 2025
Python

PaddlePaddle / PaddleMIX

Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high performance and flexibility.

Updated Sep 3, 2025
Python

SkalskiP / awesome-foundation-and-multimodal-models

👁️ + 💬 + 🎧 = 🤖 Curated list of top foundation and multimodal models! [Paper + Code + Examples + Tutorials]

nlp computer-vision image-captioning clip blip multimodal zero-shot-detection foundational-models llava segment-anything open-vocabulary-detection open-vocabulary-segmentation grounding-dino

Updated Feb 29, 2024
Python

gokayfem / ComfyUI_VLM_nodes

Custom ComfyUI nodes for Vision Language Models, Large Language Models, Image to Music, Text to Music, Consistent and Random Creative Prompt Generation

image-captioning nodes vlm custom-nodes img2text llm mllm llava comfyui siglip phi15 joytag img2sfx

Updated Feb 13, 2025
Python

ictnlp / LLaVA-Mini

LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.

video efficient vision llama multimodal large-language-models vision-language-model llava visual-instruction-tuning multimodal-large-language-models gpt4v large-multimodal-models gpt4o

Updated Jun 29, 2025
Python

nrl-ai / llama-assistant

AI-powered assistant to help you with your daily tasks, powered by Llama 3, DeepSeek R1, and many more models on HuggingFace.

personal-assistant llama owen llava moondream private-gpt llama3 llama-3-2 deepseek-r1

Updated Mar 2, 2025
Python

EvolvingLMMs-Lab / LLaVA-OneVision-1.5

Fully Open Framework for Democratized Multimodal Training

llm mllm vision-language-model llava qwen3

Updated Sep 30, 2025
Python

apocas / restai

RESTai is an AIaaS (AI as a Service) open-source platform. Built on top of LlamaIndex & Langchain. Supports any public LLM supported by LlamaIndex and any local LLM supported by Ollama/vLLM/etc. Precise embeddings usage and tuning. Built-in image generation (Dall-E, SD, Flux) and dynamic loading generators.

python transformers embeddings openai llama rag fastapi llm stable-diffusion langchain openaiapi llava llamaindex ollama

Updated Sep 21, 2025
Python
