Stars
Latest Advances on (RL based) Multimodal Reasoning and Generation in Multimodal LLMs
📖 This is a repository for organizing papers, code, and other resources related to Visual Reinforcement Learning.
Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities. ACM Computing Surveys, 2026.
[ICCV 2025] VisualCloze: A universal image generation framework that can support a wide range of in-domain tasks and generalize to unseen ones. (🔥 🔥 🔥 Merged into official pipelines of diffusers.)
A curated list of Awesome Personalized Large Multimodal Models resources
TensorFlow Implementation of the paper "End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures" and "Classifying Relations via Long Short Term Memory Networks along Shortest De…
Dynamic Memory Management for Serving LLMs without PagedAttention
Datasets and Code for Socio-Culturally Aware Evaluation Framework for LLM-Based Content Moderation published in COLING 2025
Code for our method MbLS (Margin-based Label Smoothing) for network calibration. To appear at CVPR 2022. Paper: https://arxiv.org/abs/2111.15430
[ACL 2021] LM-BFF: Better Few-shot Fine-tuning of Language Models https://arxiv.org/abs/2012.15723
Code for "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks"
IPyPlot is a small Python package offering fast and efficient plotting of images inside Python Notebooks. It uses IPython with HTML for a faster, richer and more interactive way of displaying big …
A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of languages, domains and entity types.
[CVPR 2016] Unsupervised Feature Learning by Image Inpainting using GANs
PyTorch code for EMNLP 2020 Paper "Vokenization: Improving Language Understanding with Visual Supervision"
Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs
Content and Style Disentanglement for Artistic Style Transfer [ICCV19]
PyTorch implementation of MixNMatch
Doodle to Search: Practical Zero Shot Sketch Based Image Retrieval
Natural Language Processing Best Practices & Examples
Image augmentation for machine learning experiments.