Use different positional embeddings #225

Open
aflah02 opened this issue Aug 28, 2024 · 2 comments

aflah02 commented Aug 28, 2024

Hi,
Really cool repo! Is there a way to use different positional encodings during training? I essentially want to train several versions of a tiny LLaMA with, say, RoPE, absolute, and no positional encodings.

Collaborator

zzhhjjj commented Sep 3, 2024

I think this will require you to modify the code a little bit.

Author

aflah02 commented Sep 4, 2024

I think this will require you to modify the code a little bit.

Can you point me to where these changes would need to be made? Should I make a new copy of https://github.com/huggingface/nanotron/blob/main/src/nanotron/models/llama.py with different encodings, or should I be doing this elsewhere?
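
For context, a rough sketch of the kind of switch being discussed: the rotary embedding is applied inside the attention code in llama.py, so one option is to gate the positional encoding on a config flag. Everything below is hypothetical (the `position_embedding_type` flag, the `apply_rope` stand-in, and the class name are not nanotron's actual API); the real edit would go wherever RoPE is currently applied.

```python
# Hypothetical illustration only -- not nanotron's actual classes or config.
import torch
import torch.nn as nn


def apply_rope(q, k, position_ids):
    """Stand-in for the existing rotary-embedding application in llama.py."""
    # The real RoPE rotation of q/k would happen here.
    return q, k


class AttentionWithSwitchablePositions(nn.Module):
    """Sketch of gating the positional encoding on a config flag."""

    def __init__(self, hidden_size, max_positions, position_embedding_type="rope"):
        super().__init__()
        # Assumed flag: one of "rope", "absolute", or "none".
        self.position_embedding_type = position_embedding_type
        if position_embedding_type == "absolute":
            # Learned absolute positions; in practice these are usually added
            # once to the token embeddings at the bottom of the model rather
            # than inside every attention layer.
            self.pos_emb = nn.Embedding(max_positions, hidden_size)

    def add_positions(self, q, k, hidden_states, position_ids):
        if self.position_embedding_type == "rope":
            q, k = apply_rope(q, k, position_ids)  # existing RoPE path
        elif self.position_embedding_type == "absolute":
            hidden_states = hidden_states + self.pos_emb(position_ids)
        # "none": leave q/k and hidden states untouched (NoPE baseline).
        return q, k, hidden_states
```

If something like this were wired through the model config, the three variants could be trained from the same code path by changing a single flag per run.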
