vision-language-pretraining

Star

Here are 42 public repositories matching this topic...

Sense-GVT / DeCLIP

Star

Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm

multi-model clip big-model zero-shot self-supervised image-text vision-language-pretraining

Updated Sep 19, 2022
Python

vgthengane / Continual-CLIP

Star

Official repository for "CLIP model is an Efficient Continual Learner".

baseline clip continual-learning vision-language-pretraining foundational-models

Updated Dec 13, 2022
Python

TencentARC / FLM

Star

Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)

language-modeling vision-language-pretraining

Updated May 15, 2023
Python

alinlab / b2t

Star

Bias-to-Text: Debiasing Unknown Visual Biases through Language Interpretation

explainable-ai vision-language-pretraining bias-and-fairness

Updated May 21, 2023
Python

sail-sg / ptp

Star

[CVPR2023] The code for 《Position-guided Text Prompt for Vision-Language Pre-training》

cross-modality vlp vision-language-pretraining

Updated Jun 7, 2023
Python

ChenDelong1999 / ITRA

Star

A codebase for flexible and efficient Image Text Representation Alignment

computer-vision deep-learning pytorch multimodal-learning vision-language-pretraining

Updated Jun 20, 2023
Python

ArrowLuo / SegCLIP

Star

PyTorch implementation of ICML 2023 paper "SegCLIP: Patch Aggregation with Learnable Centers for Open-Vocabulary Semantic Segmentation"

transfer-learning semantic-segmentation contrastive-learning zero-shot-semantic-segmentation vision-language-pretraining open-vocabulary open-vocabulary-semantic-segmentation

Updated Jun 28, 2023
Python

adarobustness / adaptation_robustness

Star

Evaluate robustness of adaptation methods on large vision-language models

robustness adaptation parameter-efficient-tuning vision-language-pretraining

Updated Aug 23, 2023
Shell

Zoky-2020 / SGA

Star

Set-level Guidance Attack: Boosting Adversarial Transferability of Vision-Language Pre-training Models. [ICCV 2023 Oral]

adversarial-attack vision-language-pretraining

Updated Sep 6, 2023
Python

ahmdtaha / distributed_sigmoid_loss

Star

Unofficial implementation for Sigmoid Loss for Language Image Pre-Training

python3 pytorch unsupervised-learning vision-and-language multimodal-deep-learning self-supervised-learning vision-language contrastive-learning distributed-data-parallel vision-transformer vision-language-pretraining

Updated Sep 26, 2023
Python

megvii-research / protoclip

Star

📍 Official pytorch implementation of paper "ProtoCLIP: Prototypical Contrastive Language Image Pretraining" (IEEE TNNLS)

self-supervised-learning contrastive-learning vision-language-pretraining

Updated Nov 8, 2023
Python

yiren-jian / BLIText

Star

[NeurIPS 2023] Bootstrapping Vision-Language Learning with Decoupled Language Pre-training

multimodal-deep-learning vision-language-transformer vision-language-pretraining

Updated Dec 5, 2023
Python

omipan / svl_adapter

Star

SVL-Adapter: Self-Supervised Adapter for Vision-Language Pretrained Models

self-supervised-learning vision-language-pretraining

Updated Jan 11, 2024
Python

deepseek-ai / DeepSeek-VL

Star

DeepSeek-VL: Towards Real-World Vision-Language Understanding

foundation-models vision-language-pretraining vision-language-model

Updated Apr 24, 2024
Python

unitaryai / VTC-dataset

Star

dataset video-understanding video-text-retrieval vision-language-pretraining vision-language-dataset

Updated May 1, 2024
Python

DAMO-NLP-SG / Video-LLaMA

Star

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

llama large-language-models video-language-pretraining vision-language-pretraining cross-modal-pretraining blip2 minigpt4 multi-modal-chatgpt

Updated Jun 4, 2024
Python

jaisidhsingh / LoRA-CLIP

Star

Easy wrapper for inserting LoRA layers in CLIP.

lora multimodal multimodal-deep-learning image-text-matching parameter-efficient-tuning vision-language-pretraining low-rank-adaptation

Updated Jun 16, 2024
Python

HieuPhan33 / CVPR2024_MAVL

Star

Multi-Aspect Vision Language Pretraining - CVPR2024

zero-shot-classification vision-language-pretraining vision-language-model zero-shot-segmentation medical-vision-and-language-pretraining

Updated Aug 20, 2024
Python

salesforce / LAVIS

Star

LAVIS - A One-stop Library for Language-Vision Intelligence

deep-learning salesforce image-captioning deep-learning-library vision-framework vision-and-language multimodal-deep-learning multimodal-datasets vision-language-transformer vision-language-pretraining visual-question-anwsering

Updated Nov 18, 2024
Jupyter Notebook

Surrey-UP-Lab / RegionSpot

Star

Recognize Any Regions

open-world object-detection zero-shot instance-segmentation auto-labeling vision-language-pretraining open-vocabulary vision-language-model multimodal-representation-learning vision-foundation-model vision-language-foundation-model

Updated Dec 18, 2024
Python

Improve this page

Add a description, image, and links to the vision-language-pretraining topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the vision-language-pretraining topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

vision-language-pretraining

Here are 42 public repositories matching this topic...

Sense-GVT / DeCLIP

vgthengane / Continual-CLIP

TencentARC / FLM

alinlab / b2t

sail-sg / ptp

ChenDelong1999 / ITRA

ArrowLuo / SegCLIP

adarobustness / adaptation_robustness

Zoky-2020 / SGA

ahmdtaha / distributed_sigmoid_loss

megvii-research / protoclip

yiren-jian / BLIText

omipan / svl_adapter

deepseek-ai / DeepSeek-VL

unitaryai / VTC-dataset

DAMO-NLP-SG / Video-LLaMA

jaisidhsingh / LoRA-CLIP

HieuPhan33 / CVPR2024_MAVL

salesforce / LAVIS

Surrey-UP-Lab / RegionSpot

Improve this page

Add this topic to your repo