The financial markets generally are unpredictable… The idea that you can actually predict what's going to happen contradicts my way of looking at the market. - George Soros
One of my mentors had a sequence classification project coming up with a client in a few weeks. He wanted me to demonstrate that I could do time series/sequence classification in TensorFlow/Keras. I had never built such a model before, nor had I worked on a time series problem. Since we both have an interest in cryptocurrency, we thought it would be fun to build a model to predict the price of Bitcoin.
Disclaimer: since people dedicate their lives to building financial trading models, I thought there was a close to zero percent chance I would build a profitable model. So, I treated this as a project to build my ML skills.
I completed the project much to my mentor's satisfaction (and then we completed the client's sequence classification project) and he left the following ⭐⭐⭐⭐⭐ review:
Note: I spent more than 8 hours on this but asked to bill it hourly to increase the number of hours billed on my Upwork profile.
The best results came from an LSTM with 5 layers, each smaller than the last. I ran multiple experiments on wandb; the lowest loss on the validation set was an RMSE of 0.01816.
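To make the architecture concrete, here is a minimal sketch of a 5-layer stacked LSTM with decreasing layer sizes in Keras. The unit counts, sequence length, and output head are assumptions for illustration, not the exact model from the repo.

```python
# Sketch of a 5-layer stacked LSTM with each layer smaller than the last.
# Unit counts and sequence length are placeholders, not the tuned values.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_stacked_lstm(seq_len: int, n_features: int = 1) -> tf.keras.Model:
    model = models.Sequential([
        layers.Input(shape=(seq_len, n_features)),
        # return_sequences=True so each LSTM passes full sequences to the next layer
        layers.LSTM(128, return_sequences=True),
        layers.LSTM(64, return_sequences=True),
        layers.LSTM(32, return_sequences=True),
        layers.LSTM(16, return_sequences=True),
        layers.LSTM(8),      # final LSTM returns only its last hidden state
        layers.Dense(1),     # single scaled-price prediction
    ])
    model.compile(
        optimizer="adam",
        loss="mse",
        metrics=[tf.keras.metrics.RootMeanSquaredError()],
    )
    return model
```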
X_val predictions vs. actuals for runs 421-423 (blue = actual, red = predictions)
Due to the stochastic nature of DL models, I ran each experiment at least 10 times. The best results were obtained on runs 413-424 (which you can find by searching the wandb project page with the regex `41[3-9]|42[0-4]`).
The best model was pretty-vortex-422.
I manually tuned the learning rate and implemented a custom learning rate scheduler. The optimal batch size was 168; the best scaling was to first apply a log transformation and then min/max scale to (0, 1); and the Adam optimizer outperformed the others.
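Below is a rough sketch of that preprocessing and optimizer setup: log-transform the prices, fit a (0, 1) min/max scaler on the training split only, and train with Adam plus a learning rate scheduler. The schedule function and initial learning rate are placeholders, not the actual tuned values.

```python
# Sketch of the scaling + optimizer setup described above.
# The LR schedule and starting learning rate are assumptions.
import numpy as np
import tensorflow as tf
from sklearn.preprocessing import MinMaxScaler

def scale_prices(train_prices: np.ndarray, val_prices: np.ndarray):
    """Log-transform, then fit a (0, 1) MinMaxScaler on the training data only."""
    log_train = np.log(train_prices).reshape(-1, 1)
    log_val = np.log(val_prices).reshape(-1, 1)
    scaler = MinMaxScaler(feature_range=(0, 1))
    return scaler.fit_transform(log_train), scaler.transform(log_val), scaler

def lr_schedule(epoch: int, lr: float) -> float:
    """Placeholder schedule: keep the LR for 10 epochs, then decay 5% per epoch."""
    return lr if epoch < 10 else lr * 0.95

callbacks = [tf.keras.callbacks.LearningRateScheduler(lr_schedule)]
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)  # starting LR is an assumption

# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           batch_size=168, callbacks=callbacks)
```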
Pretty-vortex-422 X_train results - actuals vs. predictions
Pretty-vortex-422 X_val results - actuals vs. predictions - note the low RMSE of 0.01816 and how the red line tightly hugs the blue.

The vast majority of the code I used is in `price_predictor/helpers.py`. There are 29 functions that I have split into sections, plus one called `train_and_validate` that performs all the training and validation steps for each experiment. The functions that make up `train_and_validate` should make it clear what is happening at each step.
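For a sense of the structure, here is a hypothetical outline of a `train_and_validate`-style function. The helper names (other than `train_and_validate` itself) are placeholders, not the actual functions in `price_predictor/helpers.py`.

```python
# Hypothetical outline only: prepare_data, build_model, and plot_preds_vs_actuals
# are placeholder names standing in for the real helpers.
def train_and_validate(config: dict):
    # 1. Load, scale, and window the price series into (X, y) sequences
    X_train, y_train, X_val, y_val, scaler = prepare_data(config)

    # 2. Build and compile the model from the experiment config
    model = build_model(config)

    # 3. Train with the configured batch size and number of epochs
    model.fit(
        X_train, y_train,
        validation_data=(X_val, y_val),
        batch_size=config["batch_size"],
        epochs=config["epochs"],
    )

    # 4. Predict on the validation set and plot predictions vs. actuals
    val_preds = model.predict(X_val)
    plot_preds_vs_actuals(y_val, val_preds)
    return model, val_preds
```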
All model tuning for this project took place with Weights & Biases (wandb). You can see the results of all 540+ runs on the bitcoin_price_predictor wandb page. As such, the notebooks themselves are not that interesting - I just used them to run wandb experiments and saved everything to the cloud.
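The tracking pattern itself is the standard wandb one: initialise a run with the experiment config, stream training metrics via the Keras callback, and log the final validation score. The project name below matches the writeup; the config values are placeholders.

```python
# Minimal wandb tracking example; config values are placeholders.
import wandb
from wandb.keras import WandbCallback

config = {"batch_size": 168, "learning_rate": 1e-3, "epochs": 100}
run = wandb.init(project="bitcoin_price_predictor", config=config)

# model.fit(..., callbacks=[WandbCallback()])  # streams losses/metrics to the run page

wandb.log({"val_rmse": 0.01816})  # e.g. the best validation RMSE reported above
run.finish()
```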
This was my first time building such a model with TensorFlow/Keras. Since then I have used PyTorch Lightning and love the flexibility of its DataModules for encapsulating all the data processing code. I would like to encapsulate more of this code into easy-to-transport classes instead of the (rather large) collection of functions I wrote.
Since Bitcoin trades around the clock, new price data is always available, so it's easy to re-train the model and see how it performs on brand-new data. Because we only used the price of Bitcoin itself to make predictions, I doubt the model will perform well. But it would be great to get a measure of how it performs in production.
Some related courses whose material would not take long to implement here are:
- Train and Deploy a Serverless API to predict crypto prices by Pau Labarta Bajo
- The Real-World ML Tutorial by Pau Labarta Bajo
I used Python and the following libraries:
- TensorFlow (and Keras) 2.4
- NumPy
- pandas
- scikit-learn
- wandb
- Matplotlib
- seaborn
- tqdm
I finished this project in June 2021 and am in the process of tidying everything up so it can be presented to the world in a nice manner. You are one of the lucky souls who gets to see the repo in its raw form. But this means that not everything is as clean or orderly as it should be.