This repository provides an implementation of the paper A Geometric Method for Improved Uncertainty Estimation in Real-time. All results presented in our work were produced with this code.
Inside a Python (>=3.9) virtual environment, run:
pip install -e .
pip install -r ./Experiments/requirements.txt
Model calibration can be achieved easily with the following commands:
GeoCalibrator = GeometricCalibrator(model, X_train, y_train)
GeoCalibrator.fit(X_val, y_val)
Whenever we want calibrated probabilities for an input 'x_test', we calibrate it with our method:
calibrated_prob = GeoCalibrator.calibrate(x_test)
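A minimal end-to-end sketch of this workflow is shown below; the import path, the scikit-learn model, and the toy dataset are illustrative assumptions rather than the exact setup used in the paper:

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

from Geo_cal_utils import GeometricCalibrator  # import path is an assumption; see the repo for the exact module

# Toy data split into train/val/test (illustrative only)
X, y = load_digits(return_X_y=True)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# Any fitted classifier with predict_proba can be wrapped
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

GeoCalibrator = GeometricCalibrator(model, X_train, y_train)  # wrap the trained model
GeoCalibrator.fit(X_val, y_val)                               # fit the calibrator on held-out validation data
calibrated_prob = GeoCalibrator.calibrate(X_test)             # calibrated probabilities for the test set
```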
For real-time systems we advise using our compressed version of the geometric calibrator; you only need to add a few parameters (documentation of the parameters can be found in Geo_cal_utils.py):
GeoCalibrator_compressed = GeometricCalibrator(model, X_train, y_train, comprasion_mode='Maxpool', comprassion_param=2)
calibrated_prob = GeoCalibrator_compressed.calibrate(x_test)
You can also compute the ECE (Expected Calibration Error):
ECE_calc(calibrated_prob, y_pred_test, y_test)
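For reference, a generic equal-width-binning ECE computation looks roughly like the sketch below; this is the standard metric, not necessarily the exact ECE_calc implementation in this repository:

```python
import numpy as np

def expected_calibration_error(probs, y_pred, y_true, n_bins=15):
    """Standard ECE with equal-width confidence bins (illustrative sketch)."""
    conf = np.max(probs, axis=1)                 # confidence = max predicted probability
    correct = (y_pred == y_true).astype(float)   # 1 if the prediction is correct
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - conf[mask].mean())
            ece += mask.mean() * gap             # weight the gap by the bin frequency
    return ece
```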
A complete code example can be found in Run_Example.ipynb.
Note: in the code, fast separation is referred to as "stability".
The notation {} represents a dynamic string; for example:
{DatasetName} could be "MNIST" or "CIFAR_RGB".
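For instance, a placeholder expands into a concrete path as in this illustrative snippet:

```python
# The placeholders expand like Python format fields, e.g.:
path = "/{DatasetName}/{DatasetName}_divide-ALL.ipynb".format(DatasetName="MNIST")
# -> "/MNIST/MNIST_divide-ALL.ipynb"
```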
All the datasets can be downloaded from the provided links:
- CIFAR10
- MNIST
- GTSRB
- FashionMNIST
- SignLanguageMNIST
- AIRLINE
- WINE
- other_calibrators/ - Folder containing utility files for the other calibration methods.
- calibrators.py - All the different calibrators used in our evaluation.
- Data.py - Class for loading the train/test/val splits of the data.
- ModelLoader.py - Class for loading the different attributes of a specific model.
- utils.py - Utility functions.
- /{DatasetName}/{DatasetName}_divide-ALL.ipynb - Pre-processing and splitting of the data into train/test/val across 100 different shuffle folders (a sketch of this split appears after this list).
- /{DatasetName}/{DatasetName}_paramTuning.ipynb - Parameter tuning to extract the best model hyperparameters.
- /SLURM/{sklearn/pytorch}_config.py - Configuration of models (pytorch=CNN / SKlearn=RF,GB).
- /SLURM/VARS.json - Configuration of dataset batch size, epochs, and number of classes.
- /SLURM/{sklearn/pytorch}Shuffle.py - Trains the models and calculates the geometric separation. We use a SLURM cluster for this stage.
- /SLURM/{sklearn/pytorch}script.sh - Script that runs the computation on a SLURM compute node.
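As mentioned for the _divide-ALL notebooks above, the per-shuffle split can be sketched roughly as follows; the split ratios and output file names here are assumptions, not the exact ones used in the notebooks:

```python
import os
import numpy as np
from sklearn.model_selection import train_test_split

def make_shuffles(X, y, out_dir, n_shuffles=100):
    """Split the data into train/val/test for each shuffle and save to disk (illustrative sketch)."""
    for shuffle_num in range(n_shuffles):
        shuffle_dir = os.path.join(out_dir, str(shuffle_num))
        os.makedirs(shuffle_dir, exist_ok=True)
        # Each shuffle uses a different random seed for the split
        X_train, X_rest, y_train, y_rest = train_test_split(
            X, y, test_size=0.4, random_state=shuffle_num)
        X_val, X_test, y_val, y_test = train_test_split(
            X_rest, y_rest, test_size=0.5, random_state=shuffle_num)
        for name, arr in [("X_train", X_train), ("y_train", y_train),
                          ("X_val", X_val), ("y_val", y_val),
                          ("X_test", X_test), ("y_test", y_test)]:
            np.save(os.path.join(shuffle_dir, f"{name}.npy"), arr)
```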
The data is saved in the following structure:
├── {dataset}
│   └── {shuffle_num}
│       ├── model
│       │   ├── model{dataset}{model} - the main model in 'sav' format
│       │   ├── model....
│       │   ├── m....
│       │   └── ...
│       └── {model}
│           └── {model}
│               ├── y_pred{val|test|train}.npy - predicted values on val|test|train.
│               ├── {fast_separation|separation}{val|test|train}{L1/L2/Linf}.npy - geometric separation calculations.
│               └── all_predictions_{val|test|train}.npy - the 'predict_proba' outputs on a specific shuffle of the dataset.
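A hypothetical snippet for loading the saved arrays of a given (dataset, shuffle, model); the exact file names are inferred from the layout above and may differ slightly:

```python
import numpy as np

# Illustrative paths only: dataset, shuffle, and model names follow the layout above.
dataset, shuffle_num, model_name = "MNIST", 0, "RF"
base = f"{dataset}/{shuffle_num}/{model_name}/{model_name}"

y_pred_test = np.load(f"{base}/y_predtest.npy")               # predicted labels on the test split
all_pred_test = np.load(f"{base}/all_predictions_test.npy")   # 'predict_proba' outputs on the test split
fast_sep_test = np.load(f"{base}/fast_separationtestL2.npy")  # fast separation ("stability") values, L2 norm
```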
- /Slurm/ECE_per_dataset.py - ECE calculation for each (dataset, model, calibration_method) tuple.
- /Slurm/ECE_per_dataset_script.sh - Script that runs the ECE calculations and saves them in the "saved_calculations" folder.
- results.ipynb - Main results notebook.
Average accuracy of each model on each dataset:
Model-Dataset | Accuracy |
---|---|
CNN-MNIST | 0.990157 |
RF-MNIST | 0.965964 |
GB-MNIST | 0.968300 |
CNN-GTSRB_RGB | 0.966850 |
RF-GTSRB_RGB | 0.975357 |
GB-GTSRB_RGB | 0.841127 |
CNN-SignLanguage | 0.998527 |
RF-SignLanguage | 0.994903 |
GB-SignLanguage | 0.978862 |
CNN-Fashion | 0.897221 |
RF-Fashion | 0.877793 |
GB-Fashion | 0.885064 |
CNN-CIFAR_RGB | 0.669542 |
RF-CIFAR_RGB | 0.467625 |
GB-CIFAR_RGB | 0.447675 |