
SparK ResNet and global feature interaction #80

@csvance


Hello, thanks for the great paper.

With the ResNet version of SparK using sparse convolution and sparse batch normalization together, the flow and mixing of global semantic information is heavily restricted: the sparse operations effectively mask the receptive field, and batch norm provides no global interaction across channels. It seems this information will struggle to propagate, especially in shallower networks with smaller receptive fields like ResNet50. In the paper it was shown empirically that ResNet50 benefited the least from SparK, failing to match the performance of supervised ResNet101.

I was wondering whether the authors or anyone else have tried sparse group normalization with ResNet, so that there would be some global interaction between feature channels to better support learning high-level features. Masked autoencoder pretraining has shown a lot of promise for data-limited tasks in medical imaging, and ResNet50 is commonly used by practitioners, so understanding how to use SparK pretraining most effectively has big implications for many in the field.
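To make the question concrete, here is a rough sketch of what I mean by sparse group normalization. It is written in plain NumPy rather than against the SparK code, and everything in it is an assumption for illustration: the function name `masked_group_norm` is hypothetical, the mask layout `(N, 1, H, W)` with 1 marking visible positions is my own choice, and the learnable scale and shift are left out. The idea is only that statistics are pooled per sample over all channels in a group, using visible positions alone, instead of per channel as in batch norm.

```python
import numpy as np


def masked_group_norm(x, mask, num_groups, eps=1e-5):
    """Group normalization computed over visible positions only.

    x:    features, shape (N, C, H, W)
    mask: shape (N, 1, H, W), 1 = visible, 0 = masked (assumed layout)
    Hypothetical helper for illustration; not part of SparK.
    """
    n, c, h, w = x.shape
    assert c % num_groups == 0, "channels must divide evenly into groups"
    per_group = c // num_groups

    # Split channels into groups: (N, G, C/G, H, W).
    xg = x.reshape(n, num_groups, per_group, h, w)
    mg = mask.reshape(n, 1, 1, h, w).astype(x.dtype)

    # Number of visible elements per (sample, group); guard against zero.
    count = mg.sum(axis=(2, 3, 4), keepdims=True) * per_group
    count = np.maximum(count, 1.0)

    # Mean and variance over visible positions and all channels in the group.
    mean = (xg * mg).sum(axis=(2, 3, 4), keepdims=True) / count
    var = (((xg - mean) ** 2) * mg).sum(axis=(2, 3, 4), keepdims=True) / count

    # Normalize, then zero the masked positions again so they stay empty.
    out = (xg - mean) / np.sqrt(var + eps) * mg
    return out.reshape(n, c, h, w)
```

Because the mean and variance are shared by every channel in a group, each channel's output depends on the other channels in its group, which is the cross-channel interaction I am asking about. Whether that actually helps ResNet50 pretraining is the open question.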
