TensorFlow implementation of 'Robust Image Watermarking based on Cross-Attention and Invariant Domain Learning'
A complete implementation of the "Attention Is All You Need" Transformer model from scratch using PyTorch. This project focuses on building and training a Transformer for neural machine translation (English-to-Italian) on the OpusBooks dataset.
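For readers new to the topic, the sketch below shows the encoder-decoder (cross-) attention at the heart of the "Attention Is All You Need" Transformer: decoder states form the queries, encoder states supply keys and values. It is a minimal, self-contained illustration with assumed dimensions, not code taken from the repository above.

```python
# Minimal cross-attention sketch (decoder queries attend over encoder outputs).
# Illustrative only; not taken from any repository listed on this page.
import math
import torch
import torch.nn as nn

class CrossAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)

    def forward(self, decoder_states, encoder_states):
        # decoder_states: (batch, tgt_len, d_model) -> queries
        # encoder_states: (batch, src_len, d_model) -> keys and values
        b, tgt_len, _ = decoder_states.shape
        src_len = encoder_states.shape[1]
        q = self.w_q(decoder_states).view(b, tgt_len, self.n_heads, self.d_head).transpose(1, 2)
        k = self.w_k(encoder_states).view(b, src_len, self.n_heads, self.d_head).transpose(1, 2)
        v = self.w_v(encoder_states).view(b, src_len, self.n_heads, self.d_head).transpose(1, 2)
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_head)  # (b, heads, tgt, src)
        attn = scores.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, tgt_len, -1)
        return self.w_o(out)

# Example: 2 sequences, 5 target tokens attending over 7 source tokens.
layer = CrossAttention(d_model=64, n_heads=8)
out = layer(torch.randn(2, 5, 64), torch.randn(2, 7, 64))
print(out.shape)  # torch.Size([2, 5, 64])
```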
[ICIP 2025] Official implementation of RT-X Net: RGB-Thermal cross attention network for Low-Light Image Enhancement
TCR Epitope Generation Model with Top-K Prediction
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
Multimodal transformer for financial time-series prediction with dual configuration systems (YAML/programmatic), sophisticated data processing pipelines, file caching, and advanced numerical data augmentation
Official repository for "The Strawberry Problem 🍓: Emergence of Character-level Understanding in Tokenized Language Models"
[IV 2025, Oral] Official code of "6Img-to-3D: Few-Image Large-Scale Outdoor Novel View Synthesis"
🔥 [TAI 2025] Exploring Mutual Cross-Modal Attention for Context-Aware Human Affordance Generation (official code).
Investigating how text-to-image diffusion models internally represent artistic concepts like content and style when generating artworks.
PyTorch Implementation of SD-VSum and S-VideoXum Dataset Distribution from "SD-VSum: A Method and Dataset for Script-Driven Video Summarization" (ACM Multimedia 2025)
This is the project for the paper "Low-Light Video Enhancement via Spatial-Temporal Consistent Decomposition" (IJCAI 2025)
Detecting word-level stress in English speech using wav2vec 2.0, with extensions to multimodal speech+text models via cross-attention fusion with BERT.
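A hedged sketch of what such speech+text cross-attention fusion could look like, with acoustic frame features as queries over BERT-style token embeddings. The module name, 768-dimensional features, and two-class stress head are assumptions for illustration, not this repository's actual code.

```python
# Sketch of speech-text fusion via cross-attention: wav2vec 2.0-style frame
# features query BERT-style token embeddings. Illustrative only; the fusion
# in the repository above may be structured differently.
import torch
import torch.nn as nn

class SpeechTextFusion(nn.Module):
    def __init__(self, d_model: int = 768, n_heads: int = 8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)
        self.classifier = nn.Linear(d_model, 2)  # stressed vs. unstressed (assumed labels)

    def forward(self, speech_feats, text_feats, text_padding_mask=None):
        # speech_feats: (batch, n_frames, d_model)  e.g. wav2vec 2.0 hidden states
        # text_feats:   (batch, n_tokens, d_model)  e.g. BERT hidden states
        fused, _ = self.cross_attn(query=speech_feats, key=text_feats, value=text_feats,
                                   key_padding_mask=text_padding_mask)
        fused = self.norm(speech_feats + fused)   # residual connection
        return self.classifier(fused)             # per-frame stress logits

fusion = SpeechTextFusion()
logits = fusion(torch.randn(2, 200, 768), torch.randn(2, 16, 768))
print(logits.shape)  # torch.Size([2, 200, 2])
```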
This is the implementation of the paper "Enhanced Photovoltaic Power Forecasting: An iTransformer and LSTM-Based Model Integrating Temporal and Covariate Interactions"
SOVL System (Self-Organizing Virtual Lifeform): A complex, purpose-agnostic autonomous agent with continuous, asynchronous learning capabilities via a dynamic scaffolded LLM and a frozen base LLM
Conditional Diffuser from scratch, applied to CelebA-HQ, CIFAR-10, and MNIST.
Photometry Guided Cross Attention Transformers for Astronomical Image Processing
3D Human-Object Interaction in Video: A New Approach to Object Tracking via Cross-Modal Attention
Anime sketch colorization using diffusion models and photo-sketch correspondence — a lightweight architecture combining semantic feature extraction, deformation flow, and cross-attention guidance.
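As a generic illustration of cross-attention guidance in a diffusion denoiser, the sketch below flattens a spatial feature map into tokens that query reference embeddings and injects the result residually. The layer name, channel count, and token count are assumptions; the cited repository's architecture may differ.

```python
# Sketch of cross-attention guidance in a diffusion denoiser block: spatial
# features of the noisy latent act as queries over reference-image embeddings.
# Generic conditioning pattern only; not the cited repository's code.
import torch
import torch.nn as nn

class CrossAttnGuidance(nn.Module):
    def __init__(self, channels: int = 256, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, feat_map, ref_tokens):
        # feat_map:   (batch, channels, H, W)   noisy-latent features from the denoiser
        # ref_tokens: (batch, n_ref, channels)  reference/correspondence embeddings
        b, c, h, w = feat_map.shape
        tokens = feat_map.flatten(2).transpose(1, 2)               # (b, H*W, c)
        guided, _ = self.attn(query=self.norm(tokens), key=ref_tokens, value=ref_tokens)
        tokens = tokens + guided                                   # residual guidance injection
        return tokens.transpose(1, 2).reshape(b, c, h, w)

block = CrossAttnGuidance()
out = block(torch.randn(2, 256, 16, 16), torch.randn(2, 77, 256))
print(out.shape)  # torch.Size([2, 256, 16, 16])
```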