This repository contains the code for the 2nd place solution to the Criteo Ad Placement Challenge hosted by crowdAI.
- cd prepare_dataset
- Set paths in config.py (warning: around 130 GB of free disk space is needed)
- Run build.sh (this takes about 2h on my setup)
- After dataset processing, go back: cd ..
- Start jupyter notebook
- Open and execute every cell in make_batches.ipynb
- Open and execute every cell in learn_model.ipynb (in parallel with learning you can start tensorboard to see progress)
- The file {TMP_DIR}/submit_scaled.gz will produce IPS: 54.556 on the leaderboard.
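
Since dataset preparation needs around 130 GB of free disk space, it may be worth checking the target drive before running build.sh. The following is a minimal sketch, not part of this repository: the helper names `free_gb` and `has_enough_space` are hypothetical, and the path `"."` is a placeholder for whichever directory you set in config.py.

```python
import shutil

# Figure taken from the warning in the steps above.
REQUIRED_GB = 130


def free_gb(path):
    """Return free space on the filesystem containing `path`, in GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_space(path, required_gb=REQUIRED_GB):
    """Return True if at least `required_gb` GB are free at `path`."""
    return free_gb(path) >= required_gb


if __name__ == "__main__":
    # Assumption: replace "." with the data directory configured in config.py.
    target = "."
    print("free: %.1f GB, enough: %s" % (free_gb(target), has_enough_space(target)))
```

Run it from the drive that will hold the processed dataset; if it reports that space is insufficient, point the paths in config.py at a larger disk first.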
Some details about the approach can be found in the docs folder.
- pypy (5.1.2)
- python (3.5.2)
- notebook (5.2.1)
- numpy (1.13.3)
- scipy (1.0.0)
- tensorflow-gpu (1.4.0)
- tensorflow-tensorboard (0.4.0rc2)
- tqdm (4.19.4)
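
To see how an existing environment compares with the versions listed above, a small check along these lines may help. It is only a sketch: `compare_versions` is a hypothetical helper, and it assumes a Python 3.8+ interpreter for `importlib.metadata` (on the listed Python 3.5.2, `pkg_resources.get_distribution(name).version` from setuptools plays the same role). The pypy and python interpreters themselves are not covered, since they are not pip packages.

```python
from importlib.metadata import PackageNotFoundError, version

# Package versions copied from the list above.
PINNED = {
    "notebook": "5.2.1",
    "numpy": "1.13.3",
    "scipy": "1.0.0",
    "tensorflow-gpu": "1.4.0",
    "tensorflow-tensorboard": "0.4.0rc2",
    "tqdm": "4.19.4",
}


def compare_versions(pinned, lookup=version):
    """Map each package name to (wanted, found, matches); found is None if missing."""
    report = {}
    for name, wanted in pinned.items():
        try:
            found = lookup(name)
        except PackageNotFoundError:
            found = None
        report[name] = (wanted, found, found == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, found, ok) in compare_versions(PINNED).items():
        status = "ok" if ok else "differs"
        print("%-24s wanted %-10s found %-10s %s" % (name, wanted, found, status))
```

A mismatch does not necessarily mean the code will fail; the list only records the versions the solution was run with.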
This code was run on a machine with 64 GB of RAM, an Intel Core i7-6800K CPU, and an NVIDIA GTX 1080 GPU, running Linux Mint 18.1. It should also run without a GPU and with less RAM, but this has not been tested.
If you want to learn useful ML techniques and competition-specific tricks, check out this course from experienced Kagglers like me and KazAnova.