mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
[CVPR 2025] Official PyTorch Implementation of MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Implementation of "YOLOv13: Real-Time Object Detection with Hypergraph-Enhanced Adaptive Visual Perception".
[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with Hierarchical Attention
[ICML 2023] Official PyTorch implementation of Global Context Vision Transformers
Improved Residual Networks (https://arxiv.org/pdf/2004.04989.pdf)
Official PyTorch implementation of Fully Attentional Networks
Improving Generalization via Scalable Neighborhood Component Analysis
PyTorch reimplementation of the paper "Involution: Inverting the Inherence of Convolution for Visual Recognition" (2D and 3D Involution) [CVPR 2021].
Deep Isometric Learning for Visual Recognition (ICML 2020)
GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition?
Multimodal Prompting with Missing Modalities for Visual Recognition, CVPR'23
Deep Understanding of Traffic Scenes for Autonomous Driving
[ICCV W] Contextual Convolutional Neural Networks (https://arxiv.org/pdf/2108.07387.pdf)
[ICLR 2023 Spotlight] GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation
Build Change - Post-Disaster Rapid Response Retrofit. Following Build Change's main premise to Build Disaster Resistant Buildings and Change Construction Practices Permanently, PD3R Team's main objective is to improve the safety conditions of buildings and reduce human and economic loss after the occurrence of a natural disaster.
Implementation of "Orthogonal Over-Parameterized Training" (CVPR'21).
[TMLR] "Adversarial Feature Augmentation and Normalization for Visual Recognition", Tianlong Chen, Yu Cheng, Zhe Gan, Jianfeng Wang, Lijuan Wang, Zhangyang Wang, Jingjing Liu
This repository contains the ViewFool and ImageNet-V proposed by the paper “ViewFool: Evaluating the Robustness of Visual Recognition to Adversarial Viewpoints” (NeurIPS2022).
An open source computer vision project for recognizing book covers, implemented in TensorFlow.