# gpt-playground

## About

This is a replication, with custom modifications, of Karpathy's tutorial on language models: [Let's build GPT: from scratch, in code, spelled out](https://www.youtube.com/watch?v=kCc8FmEb1nY). Basically a character-level, decoder-only GPT-like model.

Author of original content: [@karpathy](https://github.com/karpathy)

Example of model output:

As the result of this little fun project, two models were trained:

1. `tiny-peare`: basically a single-layer transformer decoder
2. `clank-peare`: a reasonably big transformer decoder (as big as my GPU could afford)

Both models are trained, as in the original lecture, on tiny_shakespeare. Model hyperparameters are listed in the [Hyperparameters](#hyperparameters) section below.
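"Character-level" here means the vocabulary is simply the set of distinct characters in the corpus. A minimal sketch of the encode/decode step from the lecture (the `input.txt` file name is an assumption, not necessarily this repo's layout):

```python
# Character-level tokenization as done in the lecture: each unique character
# in the corpus gets an integer id. input.txt is assumed to hold tiny_shakespeare.
with open("input.txt", "r", encoding="utf-8") as f:
    text = f.read()

chars = sorted(set(text))                     # vocabulary: every distinct character
stoi = {ch: i for i, ch in enumerate(chars)}  # char -> id
itos = {i: ch for ch, i in stoi.items()}      # id -> char

encode = lambda s: [stoi[c] for c in s]
decode = lambda ids: "".join(itos[i] for i in ids)

assert decode(encode("ROMEO:")) == "ROMEO:"
```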

## How to run

To run this, the only external library you need installed is `torch`. Assuming it is, run:

```bash
git clone https://github.com/shredder67/gpt-playground
cd gpt-playground
```

To test inference of a pretrained model located in the `models/` folder, follow this 2-step procedure:

1. Copy the hyperparameters listed for the model in the [Hyperparameters](#hyperparameters) section below into the section highlighted in `main.py` as *Hyperparameters* (yes, I'm too lazy for that argparse routine 🙂)
2. Run the following command, with `model_name.pt` being the file with the model weights located in the `models/` folder:

```bash
python main.py --test_inference model_name.pt
```
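For orientation, loading a checkpoint for inference in torch generally follows the pattern sketched below; the actual model class lives in `main.py`, so a stand-in module and an illustrative path are used here:

```python
import torch
import torch.nn as nn

# Sketch of the usual torch checkpoint-loading flow; nn.Linear is a
# hypothetical stand-in for the GPT model class defined in main.py.
model = nn.Linear(4, 4)
state = torch.load("models/model_name.pt", map_location="cpu")  # illustrative path
model.load_state_dict(state)
model.eval()  # disables dropout for inference
```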

If you want to train and save your model, run the following (the model name doesn't need an extension); the weights will be saved to `models/`:

```bash
python main.py --save_model_as model_name
```
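Under the hood this presumably boils down to `torch.save` on the model's state dict (a sketch, not the repo's exact code):

```python
import torch
import torch.nn as nn

# Likely shape of the save step; the stand-in module and name replace
# the real model and CLI argument from main.py.
model = nn.Linear(4, 4)
model_name = "model_name"
torch.save(model.state_dict(), f"models/{model_name}.pt")
```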

To simply train and see sample output, run without any arguments:
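```bash
python main.py
```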

## Hyperparameters

### tiny.pt

```python
BATCH_SIZE = 32
TRAIN_EPOCH_NUM_STEPS = 1000
TEST_EPOCH_NUM_STEPS = 200
LEARNING_RATE = 3e-3
EMBEDDING_SIZE = 128
BLOCK_SIZE = 128
N_LAYER = 1
NUM_HEADS = 4
DROP_PROB = 0.2
```

### clank.pt

```python
BATCH_SIZE = 64
TRAIN_EPOCH_NUM_STEPS = 5000
TEST_EPOCH_NUM_STEPS = 200
LEARNING_RATE = 3e-4
EMBEDDING_SIZE = 300
BLOCK_SIZE = 400
N_LAYER = 6
NUM_HEADS = 6
DROP_PROB = 0.2
```
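For a rough sense of scale, here is a back-of-the-envelope parameter count, assuming the lecture's block layout (4·d² for the attention projections plus 8·d² for the 4x MLP) and its 65-character tiny_shakespeare vocabulary; these assumptions are not verified against this repo's exact code:

```python
# Rough parameter count per transformer block, assuming the lecture's layout:
# attention (4 * d^2 for Q, K, V, output projections) + MLP (8 * d^2, 4x expansion).
def approx_params(d, n_layer, vocab_size=65, block_size=400):
    per_block = 4 * d * d + 8 * d * d
    embeddings = vocab_size * d + block_size * d  # token + positional embeddings
    return n_layer * per_block + embeddings

print(approx_params(128, 1, block_size=128))  # tiny-peare:  ~0.2M parameters
print(approx_params(300, 6))                  # clank-peare: ~6.6M parameters
```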