Releases: leriomaggio/ppml-tutorial
PPML: Machine Learning on data you cannot see
Privacy guarantees are the most crucial requirement when it comes to analysing sensitive data. These requirements can sometimes be so stringent that they become a real barrier for the entire pipeline. The reasons are manifold, and include the fact that data often cannot be shared or moved from the silos where they reside, let alone analysed in their raw form. As a result, data anonymisation techniques are sometimes used to generate a sanitised version of the original data. However, these techniques alone are not enough to guarantee that privacy will be completely preserved. Moreover, the memorisation effect of Deep learning models could be maliciously exploited to attack the models and reconstruct sensitive information about samples used in training, even if this information was not originally provided.
Privacy-preserving machine learning (PPML) methods hold the promise of overcoming all those issues, making it possible to train machine learning models with full privacy guarantees.
This workshop will be organised in three main parts. In the first part, we will introduce the main concepts of differential privacy: what it is, and how this method differs from more classical anonymisation techniques (e.g. k-anonymity). In the second part, we will focus on Machine learning experiments. We will start by demonstrating how DL models could be exploited (i.e. inference attack) to reconstruct original data solely by analysing model predictions; we will then explore how differential privacy can help us protect the privacy of our model, with minimum disruption to the original pipeline. Finally, we will conclude the tutorial by considering more complex ML scenarios in which Deep learning networks are trained on encrypted data, with specialised distributed federated learning strategies.
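To make the idea of differential privacy mentioned above more concrete, here is a minimal sketch of the Laplace mechanism applied to a counting query. It is not taken from the workshop material: the function names (`laplace_noise`, `private_count`), the toy `ages` list and the chosen `epsilon` are all illustrative assumptions. The sketch relies on two standard facts: a counting query has sensitivity 1, and adding Laplace noise with scale `1 / epsilon` to it gives epsilon-differential privacy.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace distribution centred at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    # Hypothetical helper: returns a noisy count of records matching `predicate`.
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    # Adding or removing one record changes a count by at most 1 (sensitivity 1),
    # so the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Toy data, assumed for illustration only.
ages = [23, 35, 41, 29, 62, 57, 38]
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=random.Random(0))
print(f"noisy count of people aged 40 or over: {noisy:.2f}")
```

A smaller `epsilon` means more noise and stronger privacy; a larger `epsilon` gives answers closer to the true count of 3 at the cost of weaker protection.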
Privacy-Preserving Machine Learning: Machine learning on data you cannot see
Privacy guarantees are one of the most crucial requirements when it comes to analysing sensitive information. However, data anonymisation techniques alone do not always provide complete privacy protection; moreover, Machine Learning (ML) models could also be exploited to leak sensitive data when they are attacked and no counter-measure is in place.
Privacy-preserving machine learning (PPML) methods hold the promise of overcoming all those issues, making it possible to train machine learning models with full privacy guarantees.
This workshop will be organised in two main parts. In the first part, we will explore one example of ML model exploitation (i.e. inference attack) to reconstruct original data from a trained model, and we will then see how differential privacy can help us protect the privacy of our model, with minimum disruption to the original pipeline. In the second part of the workshop, we will examine a more complicated ML scenario in which Deep learning networks are trained on encrypted data, with specialised distributed federated learning strategies.
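As a rough illustration of the federated learning idea mentioned above, the following sketch shows the aggregation step of federated averaging: each client trains locally and only its model weights, never its raw data, are combined on the server. This is an assumption about one common strategy, not necessarily the one used in the workshop, and it leaves out the encryption layer entirely. The function name `federated_average` and the toy values are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    # Weighted average of per-client parameter vectors, where each client
    # is weighted by the number of local training samples it holds.
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical clients, each sending a two-parameter model update.
updates = [[0.0, 1.0], [1.0, 3.0]]
sizes = [100, 300]
global_weights = federated_average(updates, sizes)
print(global_weights)  # [0.75, 2.5]
```

The client with more data pulls the global model closer to its own weights, which is why the result sits nearer to the second update.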
PPML Tutorial @ JGI Data Week 2022
Workshop material as presented at the JGI Data Week 2022, organised by the Jean Golding Institute of Data Science at the University of Bristol
PPML Tutorial @ PyCon DE 2022
Tutorial on Privacy-Preserving Machine Learning as presented at PyCon DE 2022 (https://2022.pycon.de/program/QHJ7SX/)
Full Changelog: https://github.com/leriomaggio/ppml-tutorial/commits/pyconde