A curated paper list for information-theoretic representation learning.
All papers are selected and sorted by topic and year. Please send a pull request if you would like to add a paper.

- Information theory and statistical mechanics.
Edwin T. Jaynes.
Physical Review, 1957
[paper]
- Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy.
J. E. Shore, R. W. Johnson.
IEEE Trans. Inf. Theory, 1980
[paper]
- On the rationale of maximum-entropy methods.
Edwin T. Jaynes.
Proceedings of the IEEE, 1982
[paper]
- PAC-Bayes analysis of maximum entropy classification.
John Shawe-Taylor, David R. Hardoon.
AISTATS, 2009
[paper]
- The role of entropy and reconstruction in multi-view self-supervised learning.
Borja Rodríguez Gálvez, Arno Blaas, Pau Rodriguez, Adam Golinski, Xavier Suau, Jason Ramapuram, Dan Busbridge, Luca Zappella.
ICML, 2023
[paper]

- Self-Organization in a Perceptual Network.
Ralph Linsker.
Computer, 1988
[paper]
- On Mutual Information Maximization for Representation Learning.
Michael Tschannen, Josip Djolonga, Paul K. Rubenstein, Sylvain Gelly, Mario Lucic.
ICLR, 2020
[paper]
- A Mutual Information Maximization Perspective of Language Representation Learning.
Lingpeng Kong, Cyprien de Masson d'Autume, Lei Yu, Wang Ling, Zihang Dai, Dani Yogatama.
ICLR, 2020
[paper]
- Which Mutual-Information Representation Learning Objectives are Sufficient for Control?
Kate Rakelly, Abhishek Gupta, Carlos Florensa, Sergey Levine.
NeurIPS, 2021
[paper]

- The information bottleneck method.
Naftali Tishby, Fernando C. Pereira, William Bialek.
The 37th Annual Allerton Conference on Communication, Control, and Computing, 1999
[paper]
- Information bottleneck for Gaussian variables.
Gal Chechik, Amir Globerson, Naftali Tishby, Yair Weiss.
NeurIPS, 2003
[paper]
- Deep learning and the information bottleneck principle.
Naftali Tishby, Noga Zaslavsky.
ITW, 2015
[paper]
- Opening the black box of deep neural networks via information.
Ravid Shwartz-Ziv, Naftali Tishby.
CoRR, 2017
[paper]
- The role of the information bottleneck in representation learning.
Matías Vera, Pablo Piantanida, Leonardo Rey Vega.
ISIT, 2018
[paper]
- On the information bottleneck theory of deep learning.
Andrew M. Saxe, Yamini Bansal, Joel Dapello, Madhu Advani, Artemy Kolchinsky, Brendan D. Tracey, David D. Cox.
ICLR, 2018
[paper]
- Learning representations for neural network-based classification using the information bottleneck principle.
Rana Ali Amjad, Bernhard C. Geiger.
IEEE Trans. Pattern Anal. Mach. Intell., 2020
[paper]
- Learnability for the information bottleneck.
Tailin Wu, Ian S. Fischer, Isaac L. Chuang, Max Tegmark.
UAI, 2020
[paper]
- Perturbation theory for the information bottleneck.
Vudtiwat Ngampruetikorn, David J. Schwab.
NeurIPS, 2021
[paper]
- How does information bottleneck help deep learning?
Kenji Kawaguchi, Zhun Deng, Xu Ji, Jiaoyang Huang.
ICML, 2023
[paper]

- Information-theoretic analysis of generalization capability of learning algorithms.
Aolin Xu, Maxim Raginsky.
NeurIPS, 2017
[paper]
- Emergence of invariance and disentanglement in deep representations.
Alessandro Achille, Stefano Soatto.
ITA, 2018
[paper]
- Understanding the Limitations of Variational Mutual Information Estimators.
Jiaming Song, Stefano Ermon.
ICLR, 2020
[paper]
- Reasoning about generalization via conditional mutual information.
Thomas Steinke, Lydia Zakynthinou.
COLT, 2020
[paper]
- A Unifying Mutual Information View of Metric Learning: Cross-Entropy vs. Pairwise Losses.
Malik Boudiaf, Jérôme Rony, Imtiaz Masud Ziko, Eric Granger, Marco Pedersoli, Pablo Piantanida, Ismail Ben Ayed.
ECCV, 2020
[paper]

- Deterministic annealing for clustering, compression, classification, regression, and related optimization problems.
Kenneth Rose.
Proceedings of the IEEE, 1998
[paper]
- Unsupervised Learning of Finite Mixture Models.
Mário A. T. Figueiredo, Anil K. Jain.
IEEE Trans. Pattern Anal. Mach. Intell., 2002
[paper]
- Semi-supervised learning by entropy minimization.
Yves Grandvalet, Yoshua Bengio.
NeurIPS, 2004
[paper]
- Nonparametric Supervised Learning by Linear Interpolation with Maximum Entropy.
Maya R. Gupta, Robert M. Gray, Richard A. Olshen.
IEEE Trans. Pattern Anal. Mach. Intell., 2006
[paper]
- Similarity-based Classification: Concepts and Algorithms.
Yihua Chen, Eric K. Garcia, Maya R. Gupta, Ali Rahimi, Luca Cazzanti.
J. Mach. Learn. Res., 2009
[paper]
- Maximum Entropy Discrimination Markov Networks.
Jun Zhu, Eric P. Xing.
J. Mach. Learn. Res., 2009
[paper]
- Regularizing neural networks by penalizing confident output distributions.
Gabriel Pereyra, George Tucker, Jan Chorowski, Lukasz Kaiser, Geoffrey E. Hinton.
ICLR Workshop, 2017
[paper]
- Compressing images by encoding their latent representations with relative entropy coding.
Gergely Flamich, Marton Havasi, José Miguel Hernández-Lobato.
NeurIPS, 2020
[paper]
- Self-supervised learning via maximum entropy coding.
Xin Liu, Zhongdao Wang, Yali Li, Shengjin Wang.
NeurIPS, 2022
[paper]

- An information-maximization approach to blind separation and blind deconvolution.
Anthony J. Bell, Terrence J. Sejnowski.
Neural Comput., 1995
[paper]
- Alignment by maximization of mutual information.
Paul A. Viola, William M. Wells III.
ICCV, 1995
[paper]
- Feature extraction by non-parametric mutual information maximization.
Kari Torkkola.
J. Mach. Learn. Res., 2003
[paper]
- An information-theoretic framework for fast and robust unsupervised learning via neural population infomax.
Wentao Huang, Kechen Zhang.
ICLR, 2017
[paper]
- Representation learning with contrastive predictive coding.
Aäron van den Oord, Yazhe Li, Oriol Vinyals.
CoRR, 2018
[paper]
- Learning deep representations by mutual information estimation and maximization.
R. Devon Hjelm, Alex Fedorov, Samuel Lavoie-Marchildon, Karan Grewal, Philip Bachman, Adam Trischler, Yoshua Bengio.
ICLR, 2019
[paper]
- Learning representations by maximizing mutual information across views.
Philip Bachman, R. Devon Hjelm, William Buchwalter.
NeurIPS, 2019
[paper]
- On mutual information in contrastive learning for visual representations.
Mike Wu, Chengxu Zhuang, Milan Mosse, Daniel Yamins, Noah D. Goodman.
CoRR, 2019
[paper]
- Learning adversarially robust representations via worst-case mutual information maximization.
Sicheng Zhu, Xiao Zhang, David Evans.
ICML, 2020
[paper]
- Learning disentangled representations via mutual information estimation.
Eduardo Hugo Sanchez, Mathieu Serrurier, Mathias Ortner.
ECCV, 2020
[paper]
- Rethinking Minimal Sufficient Representation in Contrastive Learning.
Haoqing Wang, Xun Guo, Zhi-Hong Deng, Yan Lu.
CVPR, 2022
[paper]
- Representation Learning with Conditional Information Flow Maximization.
Dou Hu, Lingwei Wei, Wei Zhou, Songlin Hu.
ACL, 2024
[paper]

- The deterministic information bottleneck.
DJ Strouse, David J. Schwab.
UAI, 2016
[paper]
- Deep Variational Information Bottleneck.
Alexander A. Alemi, Ian Fischer, Joshua V. Dillon, Kevin Murphy.
ICLR, 2017
[paper]
- Information dropout: Learning optimal representations through noisy computation.
Alessandro Achille, Stefano Soatto.
IEEE Trans. Pattern Anal. Mach. Intell., 2018
[paper]
- Mutual information neural estimation.
Mohamed Ishmael Belghazi, Aristide Baratin, Sai Rajeswar, Sherjil Ozair, Yoshua Bengio, R. Devon Hjelm, Aaron C. Courville.
ICML, 2018
[paper]
- Nonlinear Information Bottleneck.
Artemy Kolchinsky, Brendan D. Tracey, David H. Wolpert.
Entropy, 2019
[paper]
- Variational discriminator bottleneck: Improving imitation learning, inverse RL, and GANs by constraining information flow.
Xue Bin Peng, Angjoo Kanazawa, Sam Toyer, Pieter Abbeel, Sergey Levine.
ICLR, 2019
[paper]
- The conditional entropy bottleneck.
Ian S. Fischer.
Entropy, 2020
[paper]
- The HSIC bottleneck: Deep learning without back-propagation.
Kurt Wan-Duo Ma, J. P. Lewis, W. Bastiaan Kleijn.
AAAI, 2020
[paper]
- Learning Robust Representations via Multi-View Information Bottleneck.
Marco Federici, Anjan Dutta, Patrick Forré, Nate Kushman, Zeynep Akata.
ICLR, 2020
[paper]
- Learning Optimal Representations with the Decodable Information Bottleneck.
Yann Dubois, Douwe Kiela, David J. Schwab, Ramakrishna Vedantam.
NeurIPS, 2020
[paper]
- The Dual Information Bottleneck.
Zoe Piran, Ravid Shwartz-Ziv, Naftali Tishby.
CoRR, 2020
[paper]
- Revisiting Hilbert-Schmidt Information Bottleneck for Adversarial Robustness.
Zifeng Wang, Tong Jian, Aria Masoomi, Stratis Ioannidis, Jennifer G. Dy.
NeurIPS, 2021
[paper]
- Multi-View Information-Bottleneck Representation Learning.
Zhibin Wan, Changqing Zhang, Pengfei Zhu, Qinghua Hu.
AAAI, 2021
[paper]
- PAC-Bayes Information Bottleneck.
Zifeng Wang, Shao-Lun Huang, Ercan Engin Kuruoglu, Jimeng Sun, Xi Chen, Yefeng Zheng.
ICLR, 2022
[paper]
- Maximum Entropy Information Bottleneck for Uncertainty-aware Stochastic Embedding.
Sungtae An, Nataraj Jammalamadaka, Eunji Chong.
CVPR Workshop, 2023
[paper]
- Structured Probabilistic Coding.
Dou Hu, Lingwei Wei, Yaxin Liu, Wei Zhou, Songlin Hu.
AAAI, 2024
[paper]

- Estimation of the information by an adaptive partitioning of the observation space.
Georges A. Darbellay, Igor Vajda.
IEEE Trans. Inf. Theory, 1999
[paper]
- Estimation of entropy and mutual information.
Liam Paninski.
Neural Comput., 2003
[paper]
- Estimating mutual information.
Alexander Kraskov, Harald Stögbauer, Peter Grassberger.
Physical Review E, 2004
[paper]
- Estimating divergence functionals and the likelihood ratio by penalized convex risk minimization.
XuanLong Nguyen, Martin J. Wainwright, Michael I. Jordan.
NeurIPS, 2007
[paper]
- Density functional estimators with k-nearest neighbor bandwidths.
Weihao Gao, Sewoong Oh, Pramod Viswanath.
ISIT, 2017
[paper]
- f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization.
Sebastian Nowozin, Botond Cseke, Ryota Tomioka.
NeurIPS, 2016
[paper]
- Estimating mutual information for discrete-continuous mixtures.
Weihao Gao, Sreeram Kannan, Sewoong Oh, Pramod Viswanath.
NeurIPS, 2017
[paper]
- Mutual information neural estimation.
Mohamed Ishmael Belghazi, Aristide Baratin, Sai Rajeswar, Sherjil Ozair, Yoshua Bengio, R. Devon Hjelm, Aaron C. Courville.
ICML, 2018
[paper]
- Representation learning with contrastive predictive coding.
Aäron van den Oord, Yazhe Li, Oriol Vinyals.
CoRR, 2018
[paper]
- On variational bounds of mutual information.
Ben Poole, Sherjil Ozair, Aäron van den Oord, Alexander A. Alemi, George Tucker.
ICML, 2019
[paper]
- CLUB: A contrastive log-ratio upper bound of mutual information.
Pengyu Cheng, Weituo Hao, Shuyang Dai, Jiachang Liu, Zhe Gan, Lawrence Carin.
ICML, 2020
[paper]
- Conditional mutual information estimation for mixed, discrete and continuous data.
Octavio César Mesner, Cosma Rohilla Shalizi.
IEEE Trans. Inf. Theory, 2021
[paper]
- Beyond normal: On the evaluation of mutual information estimators.
Pawel Czyz, Frederic Grabowski, Julia E. Vogt, Niko Beerenwinkel, Alexander Marx.
NeurIPS, 2023
[paper]

- Maximum-Entropy Fine Grained Classification.
Abhimanyu Dubey, Otkrist Gupta, Ramesh Raskar, Nikhil Naik.
NeurIPS, 2018
- Maximum Entropy-Regularized Multi-Goal Reinforcement Learning.
Rui Zhao, Xudong Sun, Volker Tresp.
ICML, 2019
- Mitigating Information Leakage in Image Representations: A Maximum Entropy Approach.
Proteek Chandan Roy, Vishnu Naresh Boddeti.
CVPR, 2019
- Semi-Supervised Domain Adaptation via Minimax Entropy.
Kuniaki Saito, Donghyun Kim, Stan Sclaroff, Trevor Darrell, Kate Saenko.
ICCV, 2019
- Improving Neural Response Diversity with Frequency-Aware Cross-Entropy Loss.
Shaojie Jiang, Pengjie Ren, Christof Monz, Maarten de Rijke.
WWW, 2019
- Generalized Entropy Regularization or: There's Nothing Special about Label Smoothing.
Clara Meister, Elizabeth Salesky, Ryan Cotterell.
ACL, 2020
- Distributional Policy Evaluation: a Maximum Entropy approach to Representation Learning.
Riccardo Zamboni, Alberto Maria Metelli, Marcello Restelli.
NeurIPS, 2023
- MaxEnt Loss: Constrained Maximum Entropy for Calibration under Out-of-Distribution Shift.
Dexter Neo, Stefan Winkler, Tsuhan Chen.
AAAI, 2024
- InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets.
Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, Pieter Abbeel.
NeurIPS, 2016
- Deep graph infomax.
Petar Velickovic, William Fedus, William L. Hamilton, Pietro Liò, Yoshua Bengio, R. Devon Hjelm.
ICLR, 2019
- Jointly Learning Semantic Parser and Natural Language Generator via Dual Information Maximization.
Hai Ye, Wenjie Li, Lu Wang.
ACL, 2019
- InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization.
Fan-Yun Sun, Jordan Hoffmann, Vikas Verma, Jian Tang.
ICLR, 2020
- An Unsupervised Sentence Embedding Method by Mutual Information Maximization.
Yan Zhang, Ruidan He, Zuozhu Liu, Kwan Hui Lim, Lidong Bing.
EMNLP, 2020
- Graph Representation Learning via Graphical Mutual Information Maximization.
Zhen Peng, Wenbing Huang, Minnan Luo, Qinghua Zheng, Yu Rong, Tingyang Xu, Junzhou Huang.
WWW, 2020
- Info3D: Representation Learning on 3D Objects Using Mutual Information Maximization and Contrastive Learning.
Aditya Sanghi.
ECCV, 2020
- A Mutual Information Maximization Approach for the Spurious Solution Problem in Weakly Supervised Question Answering.
Zhihong Shao, Lifeng Shang, Qun Liu, Minlie Huang.
ACL/IJCNLP, 2021
- Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis.
Wei Han, Hui Chen, Soujanya Poria.
EMNLP, 2021
- Clustering by Maximizing Mutual Information Across Views.
Kien Do, Truyen Tran, Svetha Venkatesh.
ICCV, 2021
- Online Continual Learning through Mutual Information Maximization.
Yiduo Guo, Bing Liu, Dongyan Zhao.
ICML, 2022
- InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models.
Yingheng Wang, Yair Schiff, Aaron Gokaslan, Weishen Pan, Fei Wang, Christopher De Sa, Volodymyr Kuleshov.
ICML, 2023
- DualCL: Principled Supervised Contrastive Learning as Mutual Information Maximization for Text Classification.
Junfan Chen, Richong Zhang, Yaowei Zheng, Qianben Chen, Chunming Hu, Yongyi Mao.
WWW, 2024
- Learning to Maximize Mutual Information for Chain-of-Thought Distillation.
Xin Chen, Hanxian Huang, Yanjun Gao, Yi Wang, Jishen Zhao, Ke Ding.
Findings of ACL, 2024
- Compressing Neural Networks using the Variational Information Bottleneck.
Bin Dai, Chen Zhu, Baining Guo, David P. Wipf.
ICML, 2018
- InfoBot: Transfer and Exploration via the Information Bottleneck.
Anirudh Goyal, Riashat Islam, Daniel Strouse, Zafarali Ahmed, Hugo Larochelle, Matthew M. Botvinick, Yoshua Bengio, Sergey Levine.
ICLR, 2019
- Specializing Word Embeddings (for Parsing) by Information Bottleneck.
Xiang Lisa Li, Jason Eisner.
EMNLP/IJCNLP, 2019
- BottleSum: Unsupervised and Self-supervised Sentence Summarization using the Information Bottleneck Principle.
Peter West, Ari Holtzman, Jan Buys, Yejin Choi.
EMNLP/IJCNLP, 2019
- Restricting the Flow: Information Bottlenecks for Attribution.
Karl Schulz, Leon Sixt, Federico Tombari, Tim Landgraf.
ICLR, 2020
- Graph Information Bottleneck.
Tailin Wu, Hongyu Ren, Pan Li, Jure Leskovec.
NeurIPS, 2020
- Multi-Task Variational Information Bottleneck.
Weizhu Qian, Bowei Chen, Franck Gechter.
CoRR, 2020
- DICE: Diversity in Deep Ensembles via Conditional Redundancy Adversarial Estimation.
Alexandre Ramé, Matthieu Cord.
ICLR, 2021
- Invariance Principle Meets Information Bottleneck for Out-of-Distribution Generalization.
Kartik Ahuja, Ethan Caballero, Dinghuai Zhang, Jean-Christophe Gagnon-Audet, Yoshua Bengio, Ioannis Mitliagkas, Irina Rish.
NeurIPS, 2021
- Variational information bottleneck for effective low-resource fine-tuning.
Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson.
ICLR, 2021
- InfoBERT: Improving robustness of language models from an information theoretic perspective.
Boxin Wang, Shuohang Wang, Yu Cheng, Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu.
ICLR, 2021
- Learning unbiased representations via mutual information backpropagation.
Ruggero Ragonesi, Riccardo Volpi, Jacopo Cavazza, Vittorio Murino.
CVPR Workshop, 2021
- IB-GAN: Disentangled Representation Learning with Information Bottleneck Generative Adversarial Networks.
Insu Jeon, Wonkwang Lee, Myeongjang Pyeon, Gunhee Kim.
AAAI, 2021
- Invariant Information Bottleneck for Domain Generalization.
Bo Li, Yifei Shen, Yezhen Wang, Wenzhen Zhu, Colorado Reed, Dongsheng Li, Kurt Keutzer, Han Zhao.
AAAI, 2022
- Self-Supervised Information Bottleneck for Deep Multi-View Subspace Clustering.
Shiye Wang, Changsheng Li, Yanming Li, Ye Yuan, Guoren Wang.
IEEE Trans. Image Process., 2023