A curated paper list for information-theoretic representation learning.
All papers are selected and sorted by topic and year. Please send a pull request if you would like to add a paper.

- Information theory and statistical mechanics.
Edwin T. Jaynes.
Physical Review, 1957
[paper]
- Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy.
J. E. Shore, R. W. Johnson.
IEEE Trans. Inf. Theory, 1980
[paper]
- On the rationale of maximum-entropy methods.
Edwin T. Jaynes.
Proceedings of the IEEE, 1982
[paper]
- PAC-Bayes analysis of maximum entropy classification.
John Shawe-Taylor, David R. Hardoon.
AISTATS, 2009
[paper]
- The role of entropy and reconstruction in multi-view self-supervised learning.
Borja Rodríguez Gálvez, Arno Blaas, Pau Rodriguez, Adam Golinski, Xavier Suau, Jason Ramapuram, Dan Busbridge, Luca Zappella.
ICML, 2023
[paper]

- Self-Organization in a Perceptual Network.
Ralph Linsker.
Computer, 1988
[paper]
- On Mutual Information Maximization for Representation Learning.
Michael Tschannen, Josip Djolonga, Paul K. Rubenstein, Sylvain Gelly, Mario Lucic.
ICLR, 2020
[paper]
- A Mutual Information Maximization Perspective of Language Representation Learning.
Lingpeng Kong, Cyprien de Masson d'Autume, Lei Yu, Wang Ling, Zihang Dai, Dani Yogatama.
ICLR, 2020
[paper]
- Which Mutual-Information Representation Learning Objectives are Sufficient for Control?
Kate Rakelly, Abhishek Gupta, Carlos Florensa, Sergey Levine.
NeurIPS, 2021
[paper]

- The information bottleneck method.
Naftali Tishby, Fernando C. Pereira, William Bialek.
The 37th Annual Allerton Conference on Communication, Control, and Computing, 1999
[paper]
- Information bottleneck for Gaussian variables.
Gal Chechik, Amir Globerson, Naftali Tishby, Yair Weiss.
NeurIPS, 2003
[paper]
- Deep learning and the information bottleneck principle.
Naftali Tishby, Noga Zaslavsky.
ITW, 2015
[paper]
- Opening the black box of deep neural networks via information.
Ravid Shwartz-Ziv, Naftali Tishby.
CoRR, 2017
[paper]
- The role of the information bottleneck in representation learning.
Matías Vera, Pablo Piantanida, Leonardo Rey Vega.
ISIT, 2018
[paper]
- On the information bottleneck theory of deep learning.
Andrew M. Saxe, Yamini Bansal, Joel Dapello, Madhu Advani, Artemy Kolchinsky, Brendan D. Tracey, David D. Cox.
ICLR, 2018
[paper]
- Learning representations for neural network-based classification using the information bottleneck principle.
Rana Ali Amjad, Bernhard C. Geiger.
IEEE Trans. Pattern Anal. Mach. Intell., 2020
[paper]
- Learnability for the information bottleneck.
Tailin Wu, Ian S. Fischer, Isaac L. Chuang, Max Tegmark.
UAI, 2020
[paper]
- Perturbation theory for the information bottleneck.
Vudtiwat Ngampruetikorn, David J. Schwab.
NeurIPS, 2021
[paper]
- How does information bottleneck help deep learning?
Kenji Kawaguchi, Zhun Deng, Xu Ji, Jiaoyang Huang.
ICML, 2023
[paper]

- Information-theoretic analysis of generalization capability of learning algorithms.
Aolin Xu, Maxim Raginsky.
NeurIPS, 2017
[paper]
- Emergence of invariance and disentanglement in deep representations.
Alessandro Achille, Stefano Soatto.
ITA, 2018
[paper]
- Understanding the Limitations of Variational Mutual Information Estimators.
Jiaming Song, Stefano Ermon.
ICLR, 2020
[paper]
- Reasoning about generalization via conditional mutual information.
Thomas Steinke, Lydia Zakynthinou.
COLT, 2020
[paper]
- A Unifying Mutual Information View of Metric Learning: Cross-Entropy vs. Pairwise Losses.
Malik Boudiaf, Jérôme Rony, Imtiaz Masud Ziko, Eric Granger, Marco Pedersoli, Pablo Piantanida, Ismail Ben Ayed.
ECCV, 2020
[paper]

- Deterministic annealing for clustering, compression, classification, regression, and related optimization problems.
Kenneth Rose.
Proceedings of the IEEE, 1998
[paper]
- Unsupervised Learning of Finite Mixture Models.
Mário A. T. Figueiredo, Anil K. Jain.
IEEE Trans. Pattern Anal. Mach. Intell., 2002
[paper]
- Semi-supervised learning by entropy minimization.
Yves Grandvalet, Yoshua Bengio.
NeurIPS, 2004
[paper]
- Nonparametric Supervised Learning by Linear Interpolation with Maximum Entropy.
Maya R. Gupta, Robert M. Gray, Richard A. Olshen.
IEEE Trans. Pattern Anal. Mach. Intell., 2006
[paper]
- Similarity-based Classification: Concepts and Algorithms.
Yihua Chen, Eric K. Garcia, Maya R. Gupta, Ali Rahimi, Luca Cazzanti.
J. Mach. Learn. Res., 2009
[paper]
- Maximum Entropy Discrimination Markov Networks.
Jun Zhu, Eric P. Xing.
J. Mach. Learn. Res., 2009
[paper]
- Regularizing neural networks by penalizing confident output distributions.
Gabriel Pereyra, George Tucker, Jan Chorowski, Lukasz Kaiser, Geoffrey E. Hinton.
ICLR Workshop, 2017
[paper]
- Compressing images by encoding their latent representations with relative entropy coding.
Gergely Flamich, Marton Havasi, José Miguel Hernández-Lobato.
NeurIPS, 2020
[paper]
- Self-supervised learning via maximum entropy coding.
Xin Liu, Zhongdao Wang, Yali Li, Shengjin Wang.
NeurIPS, 2022
[paper]

- An information-maximization approach to blind separation and blind deconvolution.
Anthony J. Bell, Terrence J. Sejnowski.
Neural Comput., 1995
[paper]
- Alignment by maximization of mutual information.
Paul A. Viola, William M. Wells III.
ICCV, 1995
[paper]
- Feature extraction by non-parametric mutual information maximization.
Kari Torkkola.
J. Mach. Learn. Res., 2003
[paper]
- An information-theoretic framework for fast and robust unsupervised learning via neural population infomax.
Wentao Huang, Kechen Zhang.
ICLR, 2017
[paper]
- Representation learning with contrastive predictive coding.
Aäron van den Oord, Yazhe Li, Oriol Vinyals.
CoRR, 2018
[paper]
- Learning deep representations by mutual information estimation and maximization.
R. Devon Hjelm, Alex Fedorov, Samuel Lavoie-Marchildon, Karan Grewal, Philip Bachman, Adam Trischler, Yoshua Bengio.
ICLR, 2019
[paper]
- Learning representations by maximizing mutual information across views.
Philip Bachman, R. Devon Hjelm, William Buchwalter.
NeurIPS, 2019
[paper]
- On mutual information in contrastive learning for visual representations.
Mike Wu, Chengxu Zhuang, Milan Mosse, Daniel Yamins, Noah D. Goodman.
CoRR, 2019
[paper]
- Learning adversarially robust representations via worst-case mutual information maximization.
Sicheng Zhu, Xiao Zhang, David Evans.
ICML, 2020
[paper]
- Learning disentangled representations via mutual information estimation.
Eduardo Hugo Sanchez, Mathieu Serrurier, Mathias Ortner.
ECCV, 2020
[paper]
- Rethinking Minimal Sufficient Representation in Contrastive Learning.
Haoqing Wang, Xun Guo, Zhi-Hong Deng, Yan Lu.
CVPR, 2022
[paper]
- Representation Learning with Conditional Information Flow Maximization.
Dou Hu, Lingwei Wei, Wei Zhou, Songlin Hu.
ACL, 2024
[paper]

- The deterministic information bottleneck.
DJ Strouse, David J. Schwab.
UAI, 2016
[paper]
- Deep Variational Information Bottleneck.
Alexander A. Alemi, Ian Fischer, Joshua V. Dillon, Kevin Murphy.
ICLR, 2017
[paper]
- Information dropout: Learning optimal representations through noisy computation.
Alessandro Achille, Stefano Soatto.
IEEE Trans. Pattern Anal. Mach. Intell., 2018
[paper]
- Mutual information neural estimation.
Mohamed Ishmael Belghazi, Aristide Baratin, Sai Rajeswar, Sherjil Ozair, Yoshua Bengio, R. Devon Hjelm, Aaron C. Courville.
ICML, 2018
[paper]
- Nonlinear Information Bottleneck.
Artemy Kolchinsky, Brendan D. Tracey, David H. Wolpert.
Entropy, 2019
[paper]
- Variational discriminator bottleneck: Improving imitation learning, inverse RL, and GANs by constraining information flow.
Xue Bin Peng, Angjoo Kanazawa, Sam Toyer, Pieter Abbeel, Sergey Levine.
ICLR, 2019
[paper]
- The conditional entropy bottleneck.
Ian S. Fischer.
Entropy, 2020
[paper]
- The HSIC bottleneck: Deep learning without back-propagation.
Kurt Wan-Duo Ma, J. P. Lewis, W. Bastiaan Kleijn.
AAAI, 2020
[paper]
- Learning Robust Representations via Multi-View Information Bottleneck.
Marco Federici, Anjan Dutta, Patrick Forré, Nate Kushman, Zeynep Akata.
ICLR, 2020
[paper]
- Learning Optimal Representations with the Decodable Information Bottleneck.
Yann Dubois, Douwe Kiela, David J. Schwab, Ramakrishna Vedantam.
NeurIPS, 2020
[paper]
- The Dual Information Bottleneck.
Zoe Piran, Ravid Shwartz-Ziv, Naftali Tishby.
CoRR, 2020
[paper]
- Revisiting Hilbert-Schmidt Information Bottleneck for Adversarial Robustness.
Zifeng Wang, Tong Jian, Aria Masoomi, Stratis Ioannidis, Jennifer G. Dy.
NeurIPS, 2021
[paper]
- Multi-View Information-Bottleneck Representation Learning.
Zhibin Wan, Changqing Zhang, Pengfei Zhu, Qinghua Hu.
AAAI, 2021
[paper]
- PAC-Bayes Information Bottleneck.
Zifeng Wang, Shao-Lun Huang, Ercan Engin Kuruoglu, Jimeng Sun, Xi Chen, Yefeng Zheng.
ICLR, 2022
[paper]
- Maximum Entropy Information Bottleneck for Uncertainty-aware Stochastic Embedding.
Sungtae An, Nataraj Jammalamadaka, Eunji Chong.
CVPR Workshop, 2023
[paper]
- Structured Probabilistic Coding.
Dou Hu, Lingwei Wei, Yaxin Liu, Wei Zhou, Songlin Hu.
AAAI, 2024
[paper]

- Estimation of the information by an adaptive partitioning of the observation space.
Georges A. Darbellay, Igor Vajda.
IEEE Trans. Inf. Theory, 1999
[paper]
- Estimation of entropy and mutual information.
Liam Paninski.
Neural Comput., 2003
[paper]
- Estimating mutual information.
Alexander Kraskov, Harald Stögbauer, Peter Grassberger.
Physical Review E, 2004
[paper]
- Estimating divergence functionals and the likelihood ratio by penalized convex risk minimization.
XuanLong Nguyen, Martin J. Wainwright, Michael I. Jordan.
NeurIPS, 2007
[paper]
- Density functional estimators with k-nearest neighbor bandwidths.
Weihao Gao, Sewoong Oh, Pramod Viswanath.
ISIT, 2017
[paper]
- f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization.
Sebastian Nowozin, Botond Cseke, Ryota Tomioka.
NeurIPS, 2016
[paper]
- Estimating mutual information for discrete-continuous mixtures.
Weihao Gao, Sreeram Kannan, Sewoong Oh, Pramod Viswanath.
NeurIPS, 2017
[paper]
- Mutual information neural estimation.
Mohamed Ishmael Belghazi, Aristide Baratin, Sai Rajeswar, Sherjil Ozair, Yoshua Bengio, R. Devon Hjelm, Aaron C. Courville.
ICML, 2018
[paper]
- Representation learning with contrastive predictive coding.
Aäron van den Oord, Yazhe Li, Oriol Vinyals.
CoRR, 2018
[paper]
- On variational bounds of mutual information.
Ben Poole, Sherjil Ozair, Aäron van den Oord, Alexander A. Alemi, George Tucker.
ICML, 2019
[paper]
- CLUB: A contrastive log-ratio upper bound of mutual information.
Pengyu Cheng, Weituo Hao, Shuyang Dai, Jiachang Liu, Zhe Gan, Lawrence Carin.
ICML, 2020
[paper]
- Conditional mutual information estimation for mixed, discrete and continuous data.
Octavio César Mesner, Cosma Rohilla Shalizi.
IEEE Trans. Inf. Theory, 2021
[paper]
- Beyond normal: On the evaluation of mutual information estimators.
Pawel Czyz, Frederic Grabowski, Julia E. Vogt, Niko Beerenwinkel, Alexander Marx.
NeurIPS, 2023
[paper]

- Maximum-Entropy Fine Grained Classification.
Abhimanyu Dubey, Otkrist Gupta, Ramesh Raskar, Nikhil Naik.
NeurIPS, 2018
- Maximum Entropy-Regularized Multi-Goal Reinforcement Learning.
Rui Zhao, Xudong Sun, Volker Tresp.
ICML, 2019
- Mitigating Information Leakage in Image Representations: A Maximum Entropy Approach.
Proteek Chandan Roy, Vishnu Naresh Boddeti.
CVPR, 2019
- Semi-Supervised Domain Adaptation via Minimax Entropy.
Kuniaki Saito, Donghyun Kim, Stan Sclaroff, Trevor Darrell, Kate Saenko.
ICCV, 2019
- Improving Neural Response Diversity with Frequency-Aware Cross-Entropy Loss.
Shaojie Jiang, Pengjie Ren, Christof Monz, Maarten de Rijke.
WWW, 2019
- Generalized Entropy Regularization or: There's Nothing Special about Label Smoothing.
Clara Meister, Elizabeth Salesky, Ryan Cotterell.
ACL, 2020
- Distributional Policy Evaluation: a Maximum Entropy approach to Representation Learning.
Riccardo Zamboni, Alberto Maria Metelli, Marcello Restelli.
NeurIPS, 2023
- MaxEnt Loss: Constrained Maximum Entropy for Calibration under Out-of-Distribution Shift.
Dexter Neo, Stefan Winkler, Tsuhan Chen.
AAAI, 2024
- InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets.
Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, Pieter Abbeel.
NeurIPS, 2016
- Deep graph infomax.
Petar Velickovic, William Fedus, William L. Hamilton, Pietro Liò, Yoshua Bengio, R. Devon Hjelm.
ICLR, 2019
- Jointly Learning Semantic Parser and Natural Language Generator via Dual Information Maximization.
Hai Ye, Wenjie Li, Lu Wang.
ACL, 2019
- InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization.
Fan-Yun Sun, Jordan Hoffmann, Vikas Verma, Jian Tang.
ICLR, 2020
- An Unsupervised Sentence Embedding Method by Mutual Information Maximization.
Yan Zhang, Ruidan He, Zuozhu Liu, Kwan Hui Lim, Lidong Bing.
EMNLP, 2020
- Graph Representation Learning via Graphical Mutual Information Maximization.
Zhen Peng, Wenbing Huang, Minnan Luo, Qinghua Zheng, Yu Rong, Tingyang Xu, Junzhou Huang.
WWW, 2020
- Info3D: Representation Learning on 3D Objects Using Mutual Information Maximization and Contrastive Learning.
Aditya Sanghi.
ECCV, 2020
- A Mutual Information Maximization Approach for the Spurious Solution Problem in Weakly Supervised Question Answering.
Zhihong Shao, Lifeng Shang, Qun Liu, Minlie Huang.
ACL/IJCNLP, 2021
- Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis.
Wei Han, Hui Chen, Soujanya Poria.
EMNLP, 2021
- Clustering by Maximizing Mutual Information Across Views.
Kien Do, Truyen Tran, Svetha Venkatesh.
ICCV, 2021
- Online Continual Learning through Mutual Information Maximization.
Yiduo Guo, Bing Liu, Dongyan Zhao.
ICML, 2022
- InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models.
Yingheng Wang, Yair Schiff, Aaron Gokaslan, Weishen Pan, Fei Wang, Christopher De Sa, Volodymyr Kuleshov.
ICML, 2023
- DualCL: Principled Supervised Contrastive Learning as Mutual Information Maximization for Text Classification.
Junfan Chen, Richong Zhang, Yaowei Zheng, Qianben Chen, Chunming Hu, Yongyi Mao.
WWW, 2024
- Learning to Maximize Mutual Information for Chain-of-Thought Distillation.
Xin Chen, Hanxian Huang, Yanjun Gao, Yi Wang, Jishen Zhao, Ke Ding.
Findings of ACL, 2024
- Compressing Neural Networks using the Variational Information Bottleneck.
Bin Dai, Chen Zhu, Baining Guo, David P. Wipf.
ICML, 2018
- InfoBot: Transfer and Exploration via the Information Bottleneck.
Anirudh Goyal, Riashat Islam, Daniel Strouse, Zafarali Ahmed, Hugo Larochelle, Matthew M. Botvinick, Yoshua Bengio, Sergey Levine.
ICLR, 2019
- Specializing Word Embeddings (for Parsing) by Information Bottleneck.
Xiang Lisa Li, Jason Eisner.
EMNLP/IJCNLP, 2019
- BottleSum: Unsupervised and Self-supervised Sentence Summarization using the Information Bottleneck Principle.
Peter West, Ari Holtzman, Jan Buys, Yejin Choi.
EMNLP/IJCNLP, 2019
- Restricting the Flow: Information Bottlenecks for Attribution.
Karl Schulz, Leon Sixt, Federico Tombari, Tim Landgraf.
ICLR, 2020
- Graph Information Bottleneck.
Tailin Wu, Hongyu Ren, Pan Li, Jure Leskovec.
NeurIPS, 2020
- Multi-Task Variational Information Bottleneck.
Weizhu Qian, Bowei Chen, Franck Gechter.
CoRR, 2020
- DICE: Diversity in Deep Ensembles via Conditional Redundancy Adversarial Estimation.
Alexandre Ramé, Matthieu Cord.
ICLR, 2021
- Invariance Principle Meets Information Bottleneck for Out-of-Distribution Generalization.
Kartik Ahuja, Ethan Caballero, Dinghuai Zhang, Jean-Christophe Gagnon-Audet, Yoshua Bengio, Ioannis Mitliagkas, Irina Rish.
NeurIPS, 2021
- Variational information bottleneck for effective low-resource fine-tuning.
Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson.
ICLR, 2021
- InfoBERT: Improving robustness of language models from an information theoretic perspective.
Boxin Wang, Shuohang Wang, Yu Cheng, Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu.
ICLR, 2021
- Learning unbiased representations via mutual information backpropagation.
Ruggero Ragonesi, Riccardo Volpi, Jacopo Cavazza, Vittorio Murino.
CVPR Workshop, 2021
- IB-GAN: Disentangled Representation Learning with Information Bottleneck Generative Adversarial Networks.
Insu Jeon, Wonkwang Lee, Myeongjang Pyeon, Gunhee Kim.
AAAI, 2021
- Invariant Information Bottleneck for Domain Generalization.
Bo Li, Yifei Shen, Yezhen Wang, Wenzhen Zhu, Colorado Reed, Dongsheng Li, Kurt Keutzer, Han Zhao.
AAAI, 2022
- Self-Supervised Information Bottleneck for Deep Multi-View Subspace Clustering.
Shiye Wang, Changsheng Li, Yanming Li, Ye Yuan, Guoren Wang.
IEEE Trans. Image Process., 2023