
Subsampling while maintaining an identity block #79

Open
addisonklinke opened this issue Jul 14, 2022 · 0 comments

Comments

@addisonklinke

addisonklinke commented Jul 14, 2022

In the paper, I noticed a lot of emphasis on allowing the Non-Local Block to be initialized as an identity mapping, so that it can be inserted into pre-trained architectures without adverse effects (see the sketch after these quotes). For instance:

  • Section 4.1: "The scale parameter of this BN layer is initialized as zero, following [17]. This ensures that the initial state of the entire non-local block is an identity mapping, so it can be inserted into any pre-trained networks while maintaining its initial behavior"
  • Section 3.3: "The residual connection allows us to insert a new non-local block into any pre-trained model, without breaking its initial behavior (e.g., if W_z is initialized as zero)"
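
To make the identity initialization concrete, here is a minimal standalone sketch (variable names are mine, not this repo's code) of the W_z zero-initialization variant from Section 3.3:

```python
import torch
import torch.nn as nn

# Hypothetical sketch: zero-initialize the final 1x1 conv W_z so that
# z = W_z(y) + x collapses to the identity mapping z = x at init.
w_z = nn.Conv2d(512, 512, kernel_size=1)
nn.init.zeros_(w_z.weight)
nn.init.zeros_(w_z.bias)

x = torch.randn(2, 512, 14, 14)  # block input
y = torch.randn(2, 512, 14, 14)  # output of the attention aggregation
z = w_z(y) + x                   # residual connection
assert torch.equal(z, x)         # holds exactly at initialization
```

The Section 4.1 variant achieves the same identity behavior by zeroing the scale of the BN layer instead of the conv weights.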

Separately, Section 3.3 describes a subsampling trick that inserts pooling layers after φ and g from Figure 2. I am struggling to see how pooling can be used while maintaining the identity initialization of a Non-Local Block described above. If pooled, the output of f(i, j) · g(x) goes from shape THW × 512 to TH'W' × 512, where H' < H and W' < W. With smaller spatial dimensions, this output can no longer be element-wise summed with the input x for the residual connection.
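
To make the shape concern concrete, here is a toy trace (dimensions and variable names are purely illustrative) of why an output with TH'W' positions could not be summed with the input:

```python
import torch

T, H, W, C = 4, 14, 14, 512
Hp, Wp = H // 2, W // 2  # H', W' after a 2x2 pooling layer

x = torch.randn(T * H * W, C)    # block input flattened to THW x 512
y = torch.randn(T * Hp * Wp, C)  # f(i, j) @ g(x) if it were TH'W' x 512

try:
    z = y + x  # attempted residual connection
except RuntimeError as e:
    print(e)   # size mismatch: the element-wise sum cannot broadcast
```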

What additional operations does your implementation use to enable the element-wise sum when subsampling? Do you upsample the f(i, j) · g(x) output before applying the 1×1 convolution W_z?
