Original Authors of the project:
https://github.com/dynamical-inference/patchsae
https://github.com/Prisma-Multimodal/ViT-Prisma
Set up your environment with these simple steps:
# Create and activate environment
conda create --name patchsae python=3.12
conda activate patchsae
# Install dependencies
pip install -r requirements.txt
# Always set PYTHONPATH before running any scripts
cd patchsae
# Run the demo (from the repository root)
PYTHONPATH=./ python src/demo/app.py

First, download the necessary files:
You can download the files using gdown as follows:
# Activate environment first (see Getting Started)
# Download necessary files (35MB + 513MB)
gdown --id 1NJzF8PriKz_mopBY4l8_44R0FVi2uw2g # out.zip
gdown --id 1reuDjXsiMkntf1JJPLC5a3CcWuJ6Ji3Z # data.zip
# Extract files
unzip data.zip
unzip out.zip

💡 Need gdown? Install it with: `conda install conda-forge::gdown`
Your folder structure should look like:
patchsae/
├── configs/
├── data/              # From data.zip
├── out/               # From out.zip
├── src/
│   └── demo/
│       └── app.py
├── tasks/
├── requirements.txt
└── ... (other files)
- The first run downloads datasets from Hugging Face automatically (about 30 GB in total)
- Demo runs on CPU by default
- Access the interface at http://127.0.0.1:7860 (or the URL shown in terminal)
- Training Instructions: See tasks/README.md
- Analysis Notebooks:
This section implements test-time adaptation using Sparse Autoencoders (SAEs) for neuron amplification on Vision Transformers. A minimal sketch of the amplification step follows the list below.

- Run `run_tta.py` to evaluate Neuron Amplification
- Implementation wrapper at `vit_tta.py` using SAE-Tester
- Simple evaluation logic at `evalate.py`
- Additional experiments coming up...
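The core edit is: encode a layer's residual-stream activations with the SAE, boost the top-K most active latent features, decode, and add the scaled difference back to the residual stream. The snippet below is a hedged sketch of that step, not the repository's actual code: `amplify_top_k` is a hypothetical helper, and an SAE object exposing `encode`/`decode` is assumed.

```python
import torch

def amplify_top_k(sae, resid, k=1, gamma=2.0, eta=1.0):
    """Sketch of SAE neuron amplification (hypothetical helper, not the repo API).

    resid: ViT residual-stream activations, shape (batch, tokens, d_model).
    sae:   a sparse autoencoder exposing encode()/decode() (assumed interface).
    """
    latents = sae.encode(resid)                        # (batch, tokens, d_sae)
    topk = torch.topk(latents, k=k, dim=-1)            # top-K most active features
    boosted = latents.scatter(-1, topk.indices,        # replace selected activations
                              topk.values * gamma)     # with gamma-amplified values
    delta = sae.decode(boosted) - sae.decode(latents)  # induced change in model space
    return resid + eta * delta                         # eta-scaled residual-stream edit
```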
The implementation uses ViT-Prisma for SAE-based interventions.
# Navigate to ViT-Prisma directory
cd ViT-Prisma
# Install according to their documentation
pip install -e .
# Or see: ViT-Prisma/docs for detailed installation instructions

# Download ImageNet-Sketch (sketch domain for evaluation)
# Place it in ./data/imagenet_sketch/.
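For reference, the sketch-domain data can be loaded with standard `torchvision` tooling once it sits under `./data/imagenet_sketch/` in the usual class-per-subfolder layout. This is a hedged example with CLIP-style preprocessing, not the repository's `tools/data.py`:

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# CLIP-style preprocessing (224x224 crops with CLIP normalization constants)
preprocess = transforms.Compose([
    transforms.Resize(224, interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.48145466, 0.4578275, 0.40821073),
                         std=(0.26862954, 0.26130258, 0.27577711)),
])

# Assumes ImageFolder layout: ./data/imagenet_sketch/<class_name>/*.jpg
dataset = datasets.ImageFolder("./data/imagenet_sketch", transform=preprocess)
loader = DataLoader(dataset, batch_size=64, shuffle=False, num_workers=4)
```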
The project is organized as follows:

├── prisma_tta.py          # Main evaluation script
├── tools/                 # Core implementation modules
│   ├── config.py          # Configuration settings
│   ├── models.py          # Model and SAE loading
│   ├── data.py            # Dataset handling
│   ├── hooks.py           # Feature amplification hooks
│   ├── evaluation.py      # Evaluation logic
│   └── utils.py           # Utility functions
└── ViT-Prisma/            # SAE implementation (submodule)
    └── src/
        └── vit_prisma/
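`tools/hooks.py` provides the feature-amplification hooks. The general pattern is to register a forward hook on each target transformer block and rewrite that block's output. The sketch below illustrates the pattern only; `model.blocks` as the module path and the `edit_fn` callback are assumptions, not the repository's actual interfaces.

```python
import torch
from typing import Callable, Iterable, List

def register_amplification_hooks(model: torch.nn.Module,
                                 layer_indices: Iterable[int],
                                 edit_fn: Callable[[torch.Tensor], torch.Tensor]) -> List:
    """Attach forward hooks that rewrite the chosen blocks' outputs.

    `edit_fn` is whatever latent-space edit should be applied (e.g. the top-K
    amplification sketched earlier); `model.blocks` is an assumed layout.
    """
    handles = []
    for i in layer_indices:
        def hook(module, inputs, output, _edit=edit_fn):
            if isinstance(output, tuple):          # some blocks return tuples
                return (_edit(output[0]), *output[1:])
            return _edit(output)                   # returning a value replaces the output
        handles.append(model.blocks[i].register_forward_hook(hook))
    return handles

# Usage: keep the hooks active for the evaluation pass, then detach them.
# handles = register_amplification_hooks(model, [9, 10, 11], edit_fn)
# ... evaluate ...
# for h in handles:
#     h.remove()
```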
# Basic run
python prisma_tta.py --data_path ./data/imagenet_sketch

# Full set of options
python prisma_tta.py --data_path ./data/imagenet_sketch \
    --layers 9 10 11 \
    --k 1 --gamma 2.0 --eta 1.0 \
    --batch_size 64 \
    --save_results

For systems with limited GPU memory, use separate passes:

python prisma_tta.py --data_path ./data/imagenet_sketch \
    --separate_passes \
    --batch_size 128

# Quick test on a smaller subset
python prisma_tta.py --data_path ./data/imagenet_sketch \
    --subset_size 1000 \
    --batch_size 32

- `--layers`: Transformer layers to apply amplification (default: `[9, 10, 11]`)
- `--k`: Number of top-K features to amplify (default: `1`)
- `--gamma`: Amplification coefficient (default: `1.5`)
- `--eta`: Delta scaling coefficient (default: `1.0`)
- `--selection_method`: Patch selection method (`topk`, `threshold`, `adaptive`)
- `--top_k_percent`: Percentage of patches to select (default: `0.4`)
- `--separate_passes`: Enable memory-efficient evaluation mode
- `--save_results`: Save results to the `./results/` directory
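For reference, the documented CLI surface could be declared with `argparse` roughly as below. Defaults for flags in the list above follow that list; the others (`--batch_size`, `--subset_size`, the `--selection_method` default) are guesses taken from the example commands and are not necessarily what `prisma_tta.py` uses.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Sketch of the documented flags; see the option list above for defaults.
    p = argparse.ArgumentParser(description="SAE-based test-time amplification (sketch)")
    p.add_argument("--data_path", type=str, required=True, help="Path to ImageNet-Sketch")
    p.add_argument("--layers", type=int, nargs="+", default=[9, 10, 11],
                   help="Transformer layers to apply amplification")
    p.add_argument("--k", type=int, default=1, help="Number of top-K features to amplify")
    p.add_argument("--gamma", type=float, default=1.5, help="Amplification coefficient")
    p.add_argument("--eta", type=float, default=1.0, help="Delta scaling coefficient")
    p.add_argument("--selection_method", choices=["topk", "threshold", "adaptive"],
                   default="topk", help="Patch selection method (default is a guess)")
    p.add_argument("--top_k_percent", type=float, default=0.4,
                   help="Percentage of patches to select")
    p.add_argument("--batch_size", type=int, default=64)
    p.add_argument("--subset_size", type=int, default=None,
                   help="Evaluate on a subset of the dataset")
    p.add_argument("--separate_passes", action="store_true",
                   help="Memory-efficient evaluation mode")
    p.add_argument("--save_results", action="store_true",
                   help="Save results to ./results/")
    return p
```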
- SAE Integration: Uses pre-trained SAEs from Prisma-Multimodal
- Neuron Amplification: Selectively amplifies top-K activated features in SAE latent space
- Spatial Selection: Optional spatial masking for patch-wise feature control
- CLIP-based Evaluation: Zero-shot classification using CLIP text embeddings (a minimal sketch appears below)
- Colab notebook: Prisma-TTA.ipynb (if available)
- Additional experiments coming up...
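The CLIP-based evaluation reduces to comparing image embeddings from the (hooked) vision tower against per-class text embeddings. Below is a minimal, hedged sketch that assumes both sets of embeddings have already been computed; the actual prompt templates and model loading live in `tools/models.py` and `tools/evaluation.py`.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def zero_shot_accuracy(image_features: torch.Tensor,
                       text_features: torch.Tensor,
                       labels: torch.Tensor) -> float:
    """Zero-shot classification by cosine similarity to class text embeddings.

    image_features: (N, d) image embeddings from the (hooked) CLIP vision tower.
    text_features:  (C, d) one embedding per class (e.g. averaged over prompts).
    labels:         (N,) ground-truth class indices.
    """
    image_features = F.normalize(image_features, dim=-1)
    text_features = F.normalize(text_features, dim=-1)
    logits = image_features @ text_features.T   # (N, C) cosine similarities
    preds = logits.argmax(dim=-1)               # predicted class = closest text embedding
    return (preds == labels).float().mean().item()
```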
- SAE for ViT
- SAELens
- Differentiable and Fast Geometric Median in NumPy and PyTorch
- Self-regulating Prompts: Foundational Model Adaptation without Forgetting [ICCV 2023]
  - Used in: `configs/` and `src/models/`
- MaPLe: Multi-modal Prompt Learning [CVPR 2023]
  - Used in: `configs/models/maple/...yaml` and `data/clip/maple/imagenet/model.pth.tar-2`
Our code is distributed under the MIT license; please see the LICENSE file for details. The NOTICE file lists the licenses for all third-party code included in this repository. Please include the contents of the LICENSE and NOTICE files in all redistributions of this code.
If you find our code or models useful in your work, please cite our paper:
@inproceedings{
lim2025patchsae,
title={Sparse autoencoders reveal selective remapping of visual concepts during adaptation},
author={Hyesu Lim and Jinho Choi and Jaegul Choo and Steffen Schneider},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=imT03YXlG2}
}