AIN - The First Arabic Inclusive Large Multimodal Model. It is a versatile bilingual LMM excelling in visual and contextual understanding across diverse domains.
Updated Mar 13, 2025 - HTML
[CVPR 2026] ReAG: Reasoning-Augmented Generation for Knowledge-based Visual Question Answering
A toolkit for vision-language processing, supporting the growing family of multi-modal transformer-based models
Part of our final-year project on complex NLP tasks, with experiments across various datasets and LLMs
Explain how Transformer AI models work with an interactive, beginner-friendly guide covering key concepts from tokenization to image generation.