This repository contains my Master's end-of-year project, completed during my studies at ENSAI. It aims to reproduce experiments from the article "Gaussian Processes on Distributions based on Regularized Optimal Transport" and to run experiments on a publicly available dataset.
The article is available at: Gaussian Processes on Distributions based on Regularized Optimal Transport
The first part of the project was methodological. The goal was to become familiar with the paper and with optimal transport theory, and to reproduce one of the paper's experiments. Our written report, slides and Jupyter notebook can be found here.
The second, experimental part of the project applies the findings of the paper to a real-life dataset and compares the proposed method against established ones such as the Wasserstein distance. We explored the Rotor37 dataset; our notebooks and code are here.
Figure panels: a blade | disk as reference measure | uniform sphere as reference measure
Our report presents the performance of the Sinkhorn kernel introduced in the paper on the Rotor37 dataset.
We found that the Sinkhorn kernel performs very well at predicting the efficiency, the massflow and the compression ratio of the blade.
Results by target: efficiency | massflow | compression ratio
We also explored several hyperparameters. Here are our main findings:
- There is a threshold value of epsilon below which the regression performance is good; decreasing epsilon further does not seem to affect the model.
- The reference measure does not need many points: a few hundred are enough. The computational cost therefore stays moderate even for large datasets.
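
To make the roles of epsilon and of the reference measure concrete, here is a minimal NumPy sketch of a Sinkhorn-type kernel. It is not the code from our notebooks: the function names (`sinkhorn_potential`, `sinkhorn_kernel`), the squared-exponential form of the kernel, the centering of the potentials and all parameter values are assumptions chosen for illustration. Each point cloud is represented by its regularized optimal transport dual potential, evaluated on a small reference sample (here a few hundred points drawn uniformly on a disk), and two clouds are compared through the distance between their potentials.

```python
import numpy as np


def _logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def sinkhorn_potential(ref, cloud, eps=0.1, n_iter=200):
    """Dual potential on the reference points (hypothetical helper).

    Uniform weights are assumed on both the reference sample and the cloud.
    The updates are the standard log-domain Sinkhorn iterations.
    """
    n, m = len(ref), len(cloud)
    # Pairwise cost: half squared Euclidean distance, shape (n, m).
    cost = 0.5 * np.sum((ref[:, None, :] - cloud[None, :, :]) ** 2, axis=-1)
    log_a = np.full(n, -np.log(n))
    log_b = np.full(m, -np.log(m))
    f = np.zeros(n)
    g = np.zeros(m)
    for _ in range(n_iter):
        g = -eps * _logsumexp(log_a[:, None] + (f[:, None] - cost) / eps, axis=0)
        f = -eps * _logsumexp(log_b[None, :] + (g[None, :] - cost) / eps, axis=1)
    # Potentials are defined up to an additive constant; centering on the
    # reference sample is an assumed normalization.
    return f - f.mean()


def sinkhorn_kernel(pot_p, pot_q, length_scale=1.0):
    # Squared-exponential kernel on the empirical L2 distance between
    # potentials (assumed form, uniform reference weights).
    sq_dist = np.mean((pot_p - pot_q) ** 2)
    return np.exp(-sq_dist / (2.0 * length_scale ** 2))


rng = np.random.default_rng(0)

# Reference measure: 200 points sampled uniformly on the unit disk.
radius = np.sqrt(rng.uniform(size=200))
angle = rng.uniform(0.0, 2.0 * np.pi, size=200)
ref = np.column_stack([radius * np.cos(angle), radius * np.sin(angle)])

# Two synthetic 2D point clouds standing in for two input distributions.
cloud_p = rng.normal(0.0, 0.3, size=(500, 2))
cloud_q = rng.normal(0.5, 0.3, size=(500, 2))

pot_p = sinkhorn_potential(ref, cloud_p)
pot_q = sinkhorn_potential(ref, cloud_q)

k_pp = sinkhorn_kernel(pot_p, pot_p)
k_pq = sinkhorn_kernel(pot_p, pot_q)
print(k_pp, k_pq)
```

In this sketch every cloud is reduced to a vector with one entry per reference point, so the cost of building a kernel matrix grows with the size of the reference sample rather than with the size of each cloud once the potentials are computed. The resulting kernel matrix could then be passed to any Gaussian process regression routine that accepts precomputed kernels.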