Hi,

Following up on issue #191, I am still questioning how a 1xK/Kx1 depth-wise convolution can be directly translated into pure matrix multiplication, or in what sense Wave-MLP is an MLP model.

I understand that you need to limit the window size to handle dense prediction tasks with varying input image sizes, but I still don't see how a 1xK/Kx1 depth-wise convolution translates directly into pure matrix multiplication. As far as I know, MLP models such as MLP-Mixer, ResMLP, etc. do not share weights across pixels/patches; instead, they share weights across channels.

In other words, in MLP-based models, and even in Swin Transformers, each pixel/patch has its own filter weights, while those weights are shared across the channel dimension.
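To make the weight-sharing contrast concrete, here is a minimal PyTorch sketch; the shapes, variable names, and kernel size `K` are assumed for illustration and are not taken from the Wave-MLP code. It contrasts a 1xK depth-wise convolution, which learns only K weights per channel and slides them across every spatial position, with an MLP-Mixer-style token-mixing layer, which learns a dense position-by-position weight matrix shared across channels.

```python
# Minimal sketch (hypothetical shapes) contrasting the two operations discussed above.
import torch
import torch.nn.functional as F

B, C, H, W = 1, 8, 14, 14   # batch, channels, height, width (assumed)
K = 7                        # window / kernel size along one axis (assumed)
x = torch.randn(B, C, H, W)

# (1) 1xK depth-wise convolution: one kernel of K weights per channel,
#     slid over (and therefore shared across) every spatial position.
dw_weight = torch.randn(C, 1, 1, K)              # groups=C -> depth-wise
y_dw = F.conv2d(x, dw_weight, padding=(0, K // 2), groups=C)

# (2) MLP-Mixer-style token mixing along the width axis: a dense W x W
#     weight matrix, i.e. a separate weight for every pair of positions,
#     with that same matrix shared across all channels.
mix_weight = torch.randn(W, W)
y_mlp = torch.einsum('bchw,wv->bchv', x, mix_weight)

# Parameter counts make the weight-sharing difference explicit:
#   depth-wise conv : C * K  weights, shared over all H*W positions
#   token-mixing MLP: W * W  weights, one per position pair, shared over channels
print(dw_weight.numel(), mix_weight.numel())
```

Comparing the parameter counts (C*K vs. W*W) is the crux of the question: the depth-wise kernel is reused at every spatial position, whereas the token-mixing matrix assigns a distinct weight to every pair of positions.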
Phuoc-Hoan-Le changed the title from "Wave-MLP looks like it uses depth-wise conv" to "Wave-MLP looks like it uses depth-wise conv (continued)" on Mar 30, 2023.