Eva

Overview

Eva (Encoding of visual atlas) is a foundation model for tissue imaging data that learns complex spatial representations of tissues at the molecular, cellular, and patient levels. Eva uses a novel vision transformer architecture and is pre-trained on masked image reconstruction of spatial proteomics and matched histopathology.

Model Architecture

Installation

git clone https://github.com/YAndrewL/Eva.git
cd Eva

conda env create -f env.yaml
conda activate Eva

pip install -e .

Getting Started

Loading the Model

Eva model weights are open-sourced on HuggingFace Hub.

from Eva.utils import load_from_hf, extract_features, create_model
from omegaconf import OmegaConf
import torch

# Load configuration
conf = OmegaConf.load("config.yaml")

# Load model from HuggingFace Hub
device = "cuda" if torch.cuda.is_available() else "cpu"
model = load_from_hf(
    repo_id="yandrewl/Eva",
    conf=conf,
    device=device
)

Downloading Marker Embeddings

Download the GenePT marker embeddings from the Zenodo record. Use the file GenePT_gene_protein_embedding_model_3_text.pickle and save it locally as marker_embeddings/marker_embedding.pkl.
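
Once the file is in place, a quick check that every marker you plan to use has an embedding can save a failed run later. The sketch below assumes the pickle deserializes to a dict-like object keyed by marker name, which this README does not confirm. `load_marker_embeddings` and `missing_markers` are hypothetical helpers, not part of the Eva package.

```python
import pickle
from pathlib import Path


def load_marker_embeddings(path):
    """Load a pickled marker-embedding file and return the stored object."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)


def missing_markers(embeddings, markers):
    """Return the markers that have no entry in the embedding table.

    Assumption: `embeddings` supports `in` lookups by marker name
    (for example, a dict mapping name -> vector).
    """
    return [name for name in markers if name not in embeddings]
```

For example, `missing_markers(load_marker_embeddings("marker_embeddings/marker_embedding.pkl"), biomarkers)` would list any names to rename or drop. The lookup here is exact and case-sensitive, so names may need to match the spelling used in the embedding file. Only unpickle files from a source you trust, since loading a pickle can execute arbitrary code.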

Extracting Embeddings

# Build a dummy input patch and the matching biomarker names
patch = torch.randn(1, 224, 224, 6)  # Shape: [B, H, W, C]
biomarkers = ["DAPI", "CD3e", "CD20", "CD4", "CD8", "PanCK"]  # One name per channel

features = extract_features(
    patch=patch,
    bms=[biomarkers],  # List of biomarker lists (one per batch item)
    model=model,
    device=device,
    cls=False,  # Use CLS token (True) or average patches (False)
    channel_mode="full"  # Options: "full", "HE", "MIF"
)

# Alternatively, call the equivalent method on the model
features = model.extract_features(
    patch=patch,
    bms=[biomarkers],
    device=device,
    cls=False,
    channel_mode="full"  # Options: "full", "HE", "MIF"
)
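
Because the patch is channel-last and `bms` holds one biomarker list per batch item, shape mismatches are an easy mistake. The following is a minimal sketch of a pre-flight check. It assumes one biomarker name per channel, as in the example above, and `check_patch_inputs` is a hypothetical helper rather than an Eva function.

```python
from typing import Sequence


def check_patch_inputs(patch_shape: Sequence[int], bms: Sequence[Sequence[str]]) -> None:
    """Raise ValueError if a [B, H, W, C] shape and the biomarker lists disagree."""
    if len(patch_shape) != 4:
        raise ValueError(f"expected a 4-D [B, H, W, C] shape, got {tuple(patch_shape)}")
    batch, _, _, channels = patch_shape
    if len(bms) != batch:
        raise ValueError(f"{len(bms)} biomarker lists for a batch of {batch}")
    for i, names in enumerate(bms):
        # Assumption: one biomarker name per channel, as in the example above.
        if len(names) != channels:
            raise ValueError(f"batch item {i}: {len(names)} names for {channels} channels")
```

Calling `check_patch_inputs(patch.shape, [biomarkers])` before `extract_features` surfaces a mismatch with a readable message.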

Multi-modality Inputs

When data includes H&E (Hematoxylin and Eosin) channels, H&E should be added as the last three channels:

mif_patch = torch.randn(1, 224, 224, 6)
he_patch = torch.randn(1, 224, 224, 3)
patch = torch.cat([mif_patch, he_patch], dim=-1)
biomarkers = ["DAPI", "CD3e", "CD20", "CD4", "CD8", "PanCK", "HECHA1", "HECHA2", "HECHA3"]  # Last 3 are HE channels

# Extract features from a selected modality
features = extract_features(
    patch=patch,
    bms=[biomarkers],
    model=model,
    device=device,
    cls=False,
    channel_mode="MIF",  # MIF channels only; set to "HE" for HE channels only, or "full" for all channels
)
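
Keeping the biomarker list in step with the concatenated tensor is mostly bookkeeping. The sketch below appends the three H&E channel names used in the example above and validates `channel_mode` against the three options listed in the comments. The helper names are hypothetical, and whether Eva accepts other H&E channel names is not stated here.

```python
HE_CHANNEL_NAMES = ("HECHA1", "HECHA2", "HECHA3")  # names taken from the example above
CHANNEL_MODES = ("full", "HE", "MIF")  # options listed in the comments above


def with_he_channels(mif_biomarkers):
    """Return the MIF biomarker names followed by the three H&E channel names."""
    return list(mif_biomarkers) + list(HE_CHANNEL_NAMES)


def validate_channel_mode(channel_mode):
    """Return channel_mode unchanged, or raise ValueError if it is not a listed option."""
    if channel_mode not in CHANNEL_MODES:
        raise ValueError(f"channel_mode must be one of {CHANNEL_MODES}, got {channel_mode!r}")
    return channel_mode
```

With these, `biomarkers = with_he_channels(["DAPI", "CD3e", "CD20", "CD4", "CD8", "PanCK"])` yields the nine-name list shown above, in the same order as the channels produced by `torch.cat`.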

Configuration

The model requires a configuration file (YAML format) that specifies:

Dataset parameters (patch_size, token_size, marker_dim, etc.)
Channel mixer parameters (dim, n_layers, n_heads, etc.)
Patch mixer parameters (dim, n_layers, n_heads, etc.)
Decoder parameters (dim, n_layers, n_heads, etc.)

See config.yaml for an example configuration.
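
To see which sections and parameters a given configuration actually defines, it can help to print a compact summary. The sketch below works on a plain mapping. A config loaded with OmegaConf can be converted first with `OmegaConf.to_container(conf)`. `summarize_config` is a hypothetical helper, and the section names in the usage note are illustrative only; the real key names are whatever config.yaml uses.

```python
from typing import Any, Dict, List, Mapping


def summarize_config(conf: Mapping[str, Any]) -> Dict[str, List[str]]:
    """Map each top-level section name to the sorted names of its parameters."""
    summary: Dict[str, List[str]] = {}
    for section, params in conf.items():
        if isinstance(params, Mapping):
            summary[str(section)] = sorted(str(key) for key in params)
        else:
            # Top-level scalars have no sub-parameters to list.
            summary[str(section)] = []
    return summary
```

For instance, a mapping such as `{"dataset": {"patch_size": 224, "token_size": 8}}` (hypothetical section name and values) summarizes to `{"dataset": ["patch_size", "token_size"]}`.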
