
'POST' data size, is there a limit? #11

Open
emanokaro opened this issue Oct 4, 2021 · 2 comments

Comments

@emanokaro

How is it possible to increase the input data size? Would it be via the batch size?

@emanokaro
Author

@daniel-ziegler Thanks for your great work. I noticed your work on summarising books and the hierarchical approach.
My question is: what is the maximum input length for the model such that it is still able to attend to the very first words/sentences? And how do I change that limit? I already tried batch_size and d_model in transformer.py. Thanks.

@UntotaufUrlaub

Hi @emanokaro,
I don't know for sure, but my educated guess is that you can't change the input size. Most standard Transformer architectures only allow one fixed input size without retraining. The reason is that each position is associated with a learned position embedding, so if you add words beyond the trained length, the corresponding position embeddings simply don't exist. In principle this could be avoided, but that is not the standard at the moment, as far as I know.
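To make this concrete, here is a minimal PyTorch sketch (not this repo's actual code; `MAX_LEN` and `d_model` are illustrative assumptions) of why a fixed table of learned position embeddings caps the input length:

```python
# Minimal sketch, assuming a learned position-embedding table as in the paper.
# MAX_LEN and d_model are illustrative, not taken from this repo.
import torch
import torch.nn as nn

MAX_LEN = 2048   # number of learned position embeddings (per the paper)
d_model = 512    # illustrative model width

pos_emb = nn.Embedding(MAX_LEN, d_model)  # one learned vector per position

def add_positions(token_emb: torch.Tensor) -> torch.Tensor:
    # token_emb: (batch, seq_len, d_model)
    seq_len = token_emb.size(1)
    # Positions >= MAX_LEN have no trained embedding, so longer inputs
    # cannot be encoded without retraining the embedding table.
    assert seq_len <= MAX_LEN, f"sequence length {seq_len} exceeds {MAX_LEN}"
    positions = torch.arange(seq_len)
    return token_emb + pos_emb(positions)
```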
Hints from the paper are:
"All models follow the standard Transformer architecture, with 2048 learned position embeddings." (page 17)
"The batch size ramped up throughout training to some maximum, with each input having 2048 tokens." (page 17)
"Our model always receives a byte-pair encoded string of a fixed size. When the input is too small, we
pad from the beginning of the input with a padding token, and if the input is too long we truncate the
post/article field at newlines to stay under the limit." (page 18)
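A rough sketch of the preprocessing that last quote describes, i.e. front-padding short inputs and truncating long ones at a newline boundary (the `PAD` and `newline_id` token ids are assumptions on my side, not the repo's real BPE setup):

```python
# Sketch only: front-pad to a fixed size, truncate at newlines if too long.
PAD = 0
MAX_TOKENS = 2048

def pad_or_truncate(tokens: list[int], newline_id: int) -> list[int]:
    if len(tokens) > MAX_TOKENS:
        # Truncate at the last newline token that keeps us under the limit.
        cut = MAX_TOKENS
        while cut > 0 and tokens[cut - 1] != newline_id:
            cut -= 1
        tokens = tokens[:cut] if cut > 0 else tokens[:MAX_TOKENS]
    # Pad from the *beginning*, so the text ends right where generation starts.
    return [PAD] * (MAX_TOKENS - len(tokens)) + tokens
```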

(Batch size doesn't sound promising, by the way. It is the number of texts processed in parallel in one training forward pass. It is connected to the input size (the length of each individual text) only through memory: the memory requirement scales with both input length and batch size. So the need for a bigger batch size during training is a constraint on the input size of the model, not a way to increase it.)
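For intuition on that memory coupling, a back-of-the-envelope estimate (all numbers are illustrative assumptions, and this ignores optimizer state and per-head bookkeeping):

```python
# Rough activation-memory estimate: hidden states grow with batch * seq_len,
# and attention scores grow with batch * seq_len^2, so a larger batch leaves
# less room for longer inputs on the same hardware.
def activation_bytes(batch_size, seq_len, d_model, n_layers, bytes_per_val=2):
    hidden = batch_size * seq_len * d_model    # hidden states per layer
    attn = batch_size * seq_len * seq_len      # attention score matrix
    return n_layers * (hidden + attn) * bytes_per_val

print(activation_bytes(8, 2048, 512, 12) / 2**30, "GiB")   # ~0.94 GiB
print(activation_bytes(16, 2048, 512, 12) / 2**30, "GiB")  # doubles with batch
```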
