Toy implementation of An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale on MNIST.
tiles = einops.rearrange(images, 'b c (h t1) (w t2) -> b (h w) c t1 t2', t1=tile_size, t2=tile_size)
Toy implementation of An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale on MNIST.
tiles = einops.rearrange(images, 'b c (h t1) (w t2) -> b (h w) c t1 t2', t1=tile_size, t2=tile_size)