Jacob Louis Hoover and Jonathan Palucci. This code was part of a project for Université de Montréal's Probabilistic Graphical Models course (IFT6269) with Simon Lacoste-Julien, 2020. It is a heavily modified version of glivan/tensor_networks_for_probabilistic_modeling.
For more details, see our writeup.
`/tensornetworks_pytorch` contains a bottom-up PyTorch reimplementation of the tensor-train models from Glasser et al., to make use of autodifferentiation.
- Python v3.8 (3.7 should work too)
- PyTorch v1.8 (nightly build), so that autodifferentiation works with complex numbers (the latest stable build, 1.7, has only partial support, lacking `bmm` in `einsum`, for instance).
Set up a conda environment, say `torch_nightly`, with:
```
conda create --name torch_nightly python=3.8
conda activate torch_nightly
conda install pytorch torchvision torchaudio -c pytorch-nightly
```
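As a quick sanity check that the installed build supports autodifferentiation through complex tensor contractions, you can run a snippet like the following (a minimal sketch, not part of the repository):

```python
import torch

# Minimal check of complex autograd; this fails on the 1.7 stable release,
# which lacks complex support for ops such as bmm inside einsum.
a = torch.randn(3, 4, 5, dtype=torch.cfloat, requires_grad=True)
b = torch.randn(3, 5, 6, dtype=torch.cfloat)

out = torch.einsum('bij,bjk->bik', a, b)  # batched matrix multiplication via einsum
loss = out.abs().sum()                    # real-valued scalar, so backward() is well defined
loss.backward()

print(a.grad.shape)                       # torch.Size([3, 4, 5]) on a working build
```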
Original readme from glivan/tensor_networks_for_probabilistic_modeling:
This is the code accompanying the paper "Expressive power of tensor-network factorizations for probabilistic modeling" (Advances in Neural Information Processing Systems 32, proceedings of the NeurIPS 2019 Conference) which allows for reproduction of its numerical results. If you use this code or these results please cite [1].
A working Python 2.7, 3.4, or more recent installation with the following Python libraries (all included in Anaconda): numpy, scikit-learn, os, sys, pickle, time
For training a hidden Markov model the python library pomegranate [2] and its dependencies must also be installed. This is not necessary for running the tensor-network algorithms.
The preprocessed datasets are included in the `datasets` folder.
Preprocessing transformed the categorical data into a numpy array of integers. Each row corresponds to a training example and each column is an integer feature between 0 and d-1, where d is the number of different categories. As this work is only concerned with the expressivity of different functions, only training sets are used.
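For example, a dataset can be loaded and inspected roughly as follows (a sketch only; the exact file names and serialization used in `datasets/` may differ):

```python
import numpy as np

# Hypothetical loading sketch: assumes the preprocessed dataset is stored as a
# pickled numpy array of integers, one row per training example.
X = np.load('datasets/lymphography', allow_pickle=True)

n_samples, n_features = X.shape
d = int(X.max()) + 1  # categories are coded as integers 0..d-1
print(n_samples, n_features, d)
```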
Included datasets:

From the R package TraMineR:
- Family life states from the Swiss Household Panel biographical survey: `biofam` [3]

From the UCI Machine Learning Repository [4]:
- SPECT Heart Data Set: `spect`
- Lymphography Data Set: `lymphography` [5]
- Primary Tumor Data Set: `tumor` [6]
- Congressional Voting Records Data Set: `votes`
- Solar Flare Data Set: `flare`
The `tensornetworks` folder includes a generic tensor network class, `MPSClass.py`, as well as classes for positive MPS, Born machine MPS, and locally purified states (LPS) with real or complex tensor elements. These classes include simple methods for performing maximum likelihood estimation on a dataset using batch gradient descent. Training is done by computing the gradients of the log-likelihood over all tensors for each minibatch of training examples and then updating all tensors at once in a gradient descent optimization scheme. This differs from a DMRG-like algorithm, where only one (or two) tensors are updated at a time. For this reason, canonical forms (which would be different for each class of tensor network) are not used, though they might be required for numerical stability on much larger datasets. The bond dimension/rank is fixed to the same value for all tensors. Code for approximating a given non-negative tensor representing a probability mass function is also available. Note that the code is not optimized for speed and performance; rather, it is a tool demonstrating how the algorithms work.
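To make the training scheme concrete, here is a minimal PyTorch sketch of one gradient step on the negative log-likelihood of a positive MPS (an illustration of the idea only, not the repository's implementation; all names, shapes, and the parametrization are assumptions):

```python
import torch

# n variables, each taking d values; bond dimension D fixed for all tensors.
n, d, D = 6, 3, 2
# One (d, D, D) core per variable; exp() keeps tensor elements positive.
params = [torch.randn(d, D, D, requires_grad=True) for _ in range(n)]
w_l = torch.randn(D, requires_grad=True)  # left boundary vector
w_r = torch.randn(D, requires_grad=True)  # right boundary vector

def log_prob(X):
    """Log-probability of a batch X of shape (batch, n) with entries in 0..d-1."""
    cores = [p.exp() for p in params]
    # Unnormalized probability: contract the matrices selected by each sample.
    v = w_l.exp().expand(X.shape[0], D)
    for i in range(n):
        A = cores[i][X[:, i]]                 # (batch, D, D)
        v = torch.einsum('bi,bij->bj', v, A)
    p = torch.einsum('bj,j->b', v, w_r.exp())
    # Normalization: the same contraction with each core summed over its physical index.
    z = w_l.exp()
    for i in range(n):
        z = z @ cores[i].sum(dim=0)
    Z = z @ w_r.exp()
    return p.log() - Z.log()

X = torch.randint(0, d, (20, n))              # one minibatch of 20 samples
opt = torch.optim.SGD(params + [w_l, w_r], lr=0.1)

loss = -log_prob(X).mean()                    # negative log-likelihood of the minibatch
loss.backward()                               # gradients for all tensors at once
opt.step()                                    # update all tensors simultaneously
print(float(loss))
```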
We include a simple script to define a hidden Markov model corresponding to an MPS of a certain bond dimension in the `hmm` folder. This script requires the pomegranate library.
We provide a Jupyter notebook, `fitdataset.ipynb`, that explains how to create a model, load a dataset, and train the model on it. We also provide a Jupyter notebook, `fittensor.ipynb`, that explains how to train the model to approximate a given non-negative tensor.
Input parameters of a tensor network (all parameters are optional):
- D : int [default: 2], bond dimension/rank of the tensor networks.
- learning_rate : float [default: 1.0], learning rate for gradient descent.
- batch_size : int [default: 20], number of training examples per minibatch.
- n_iter : int [default: 10], number of epochs over the training dataset to perform, or number of iterations of the optimization for approximating a given tensor.
- random_state : int or numpy.RandomState [default: None], a random number generator instance to define the state of the random permutations generator. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.
- verbose : int [default: 0], the verbosity level. Zero means silent mode.
- mu : int [default: 2], only for real and complex LPS: the dimension of the purification index.
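Hypothetical usage following the parameter list above (the class name and import path are assumptions; see `fitdataset.ipynb` for the actual interface):

```python
import numpy as np
from tensornetworks.PositiveMPS import PositiveMPS  # assumed module layout

# Integer-coded training set (see the datasets section above).
X = np.load('datasets/lymphography', allow_pickle=True)

# Positive MPS with the parameters described above, fit by minibatch
# gradient descent on the negative log-likelihood.
model = PositiveMPS(D=8, learning_rate=0.1, batch_size=20,
                    n_iter=100, random_state=0, verbose=1)
model.fit(X)
```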
Experiments in the paper used the following parameters:
- batch_size was set to 20
- learning_rate was chosen using a grid search over powers of 10 from 10^-5 to 10^5
- n_iter was set to a maximum of 20000
Each data point indicated in the paper is the lowest negative log-likelihood obtained from 10 trials with different initial tensors.
For approximating a given non-negative tensor, the optimization is performed by a limited-memory BFGS algorithm. Batch size and learning rate parameters are not used, and experiments in the paper used a maximum number of iterations n_iter of 10000.
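As a rough, self-contained illustration of this kind of fit (not the repository's code), the sketch below uses `torch.optim.LBFGS` to minimize KL(target || approximation) between a small normalized target tensor and a simple non-negative rank-1 factorization; the repository's classes perform the analogous optimization for MPS, Born machine, and LPS models of a chosen bond dimension:

```python
import torch

# Small normalized non-negative target tensor standing in for a probability mass function.
target = torch.rand(3, 3, 3)
target = target / target.sum()

# Rank-1 non-negative factorization; exp() keeps the factors positive.
factors = [torch.randn(3, requires_grad=True) for _ in range(3)]

def kl_to_target():
    approx = torch.einsum('i,j,k->ijk', *[f.exp() for f in factors])
    approx = approx / approx.sum()
    return (target * (target.log() - approx.log())).sum()  # KL(target || approx)

opt = torch.optim.LBFGS(factors, max_iter=200)

def closure():
    opt.zero_grad()
    loss = kl_to_target()
    loss.backward()
    return loss

opt.step(closure)
print(float(kl_to_target()))  # final KL divergence after optimization
```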
We also include code to train a hidden Markov model corresponding to an MPS with positive tensors. The training is performed using the Baum-Welch algorithm by running
```
python runHMM.py lymphography 2 100
```
Input parameters (all parameters are optional):
- datasetload : str [default: lymphography], Name of the dataset file which should be located in the datasets/ folder
- bond_dimension : int [default: 2], bond dimension/rank (here number of hidden states per variable)
- n_iter : int [default: 100], number of epochs over the training dataset to perform
[1] Glasser, I., Sweke, R., Pancotti, N., Eisert, J., Cirac, J. I. (2019) Expressive power of tensor-network factorizations for probabilistic modeling. Advances in Neural Information Processing Systems 32 (Proceedings of the NeurIPS 2019 Conference). [https://papers.nips.cc/paper/8429-expressive-power-of-tensor-network-factorizations-for-probabilistic-modeling]. See also extended version at arxiv:1907.03741.
[2] Schreiber, J. (2018). Pomegranate: fast and flexible probabilistic modeling in python. Journal of Machine Learning Research, 18(164), 1-6.
[3] Müller, N. S., M. Studer, G. Ritschard (2007). Classification de parcours de vie à l'aide de l'optimal matching. In XIVe Rencontre de la Société francophone de classification (SFC 2007), Paris, 5 - 7 septembre 2007, pp. 157–160.
[4] Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.
[5] This lymphography domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Slovenia. Thanks go to M. Zwitter and M. Soklic for providing the data.
[6] This primary tumor domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Slovenia. Thanks go to M. Zwitter and M. Soklic for providing the data.