Figure 1: Placeholder for a particular trick in information theory.
Williams and Beer (2010) introduced partial information decomposition (PID) as a way to split the mutual information that a set of sources has about a target into non‑negative “atoms” corresponding to redundant, unique, and synergistic information. Their framework has become a standard reference point, but their specific redundancy measure has been heavily critiqued. There are now many alternative proposals and generalizations.
Williams & Beer’s original PID
Williams & Beer consider a target \(Y\) and sources \(X_1, X_2, \dots, X_n\), and aim to decompose the joint mutual information \(I(X_1, X_2, \dots, X_n ; Y)\) into atoms corresponding to:
Redundant information (shared by multiple sources about \(Y\))
Unique information (available only from one source)
Synergistic information (available only from the sources in combination); for two sources, the resulting bookkeeping is written out just below.
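For the two-source case, a convenient way to see what these atoms are supposed to do is the following bookkeeping (a standard presentation rather than a quote from the paper; \(R\), \(U_1\), \(U_2\), and \(S\) are shorthand for the redundant, unique, and synergistic atoms):

```latex
% Two-source PID bookkeeping: the four atoms account for the joint and
% the single-source mutual informations about the target Y.
\begin{aligned}
I(X_1, X_2 ; Y) &= R + U_1 + U_2 + S, \\
I(X_1 ; Y)      &= R + U_1, \\
I(X_2 ; Y)      &= R + U_2.
\end{aligned}
```

Given any redundancy measure that fixes \(R\), these three equations determine \(U_1\), \(U_2\), and \(S\); the substance of the framework lies in choosing \(R\) sensibly.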
They formalize this by:
Introducing a redundancy (intersection information) function \(I_{\cap}(A_1, \dots, A_k ; Y)\), defined on collections of source subsets \(A_i \subseteq \{X_1, \dots, X_n\}\).
Requiring this redundancy measure to satisfy three axioms: symmetry (invariance under permutation of the sources), self-redundancy (the redundancy of a single source equals its mutual information with \(Y\)), and monotonicity (redundancy does not increase as sources are added).
Using the lattice of antichains of source subsets (the "redundancy lattice"), they show that any redundancy measure satisfying these axioms defines partial-information atoms via Möbius inversion over this lattice.
As a concrete proposal, they define redundancy via the "minimum information" measure \(I_{\min}\), which for each outcome of \(Y\) takes the minimum specific information any single source provides about that outcome, averaged over outcomes; with this choice, they show the resulting atoms are non-negative.
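As a concrete illustration of \(I_{\min}\) and the two-source decomposition it induces, here is a minimal Python sketch. It assumes the joint distribution is given as a dict mapping (x1, x2, y) tuples to probabilities; all names (specific_information, i_min, pid_two_sources) are illustrative helpers, not functions from any existing PID library.

```python
import math
from collections import defaultdict


def _marginal(p_joint, idx):
    """Marginalize a joint distribution (dict: outcome tuple -> prob) onto the given indices."""
    out = defaultdict(float)
    for outcome, p in p_joint.items():
        out[tuple(outcome[i] for i in idx)] += p
    return dict(out)


def specific_information(p_joint, src, y_idx, y):
    """I(A; Y=y): what the source variables at indices `src` say about the outcome Y=y, in bits."""
    p_y = _marginal(p_joint, (y_idx,))[(y,)]
    p_a = _marginal(p_joint, src)
    p_ay = _marginal(p_joint, src + (y_idx,))
    info = 0.0
    for a, pa in p_a.items():
        pay = p_ay.get(a + (y,), 0.0)
        if pay > 0.0:
            # p(a|y) * log2( p(y|a) / p(y) )
            info += (pay / p_y) * math.log2((pay / pa) / p_y)
    return info


def i_min(p_joint, sources, y_idx):
    """Williams-Beer redundancy: the expected minimum specific information over the sources."""
    p_y = _marginal(p_joint, (y_idx,))
    return sum(
        py * min(specific_information(p_joint, src, y_idx, y) for src in sources)
        for (y,), py in p_y.items()
    )


def pid_two_sources(p_joint):
    """Redundant, unique, and synergistic atoms for a joint dict keyed by (x1, x2, y)."""
    rdn = i_min(p_joint, [(0,), (1,)], 2)
    i1 = i_min(p_joint, [(0,)], 2)      # equals I(X1; Y) by self-redundancy
    i2 = i_min(p_joint, [(1,)], 2)      # equals I(X2; Y)
    i12 = i_min(p_joint, [(0, 1)], 2)   # equals I(X1, X2; Y)
    unq1, unq2 = i1 - rdn, i2 - rdn
    return {"redundancy": rdn, "unique_1": unq1, "unique_2": unq2,
            "synergy": i12 - rdn - unq1 - unq2}


# XOR example: Y = X1 xor X2 with independent uniform bits -> 1 bit of pure synergy.
xor = {(x1, x2, x1 ^ x2): 0.25 for x1 in (0, 1) for x2 in (0, 1)}
print(pid_two_sources(xor))
```

On the XOR distribution this gives zero redundancy and zero unique information, with the full bit appearing as synergy, matching the canonical motivating example for synergistic information.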
Subsequent theoretical developments
While the framework (axioms plus lattice) was widely accepted, the specific Williams–Beer redundancy \(I_{\min}\) quickly attracted criticism: it compares only the amounts of specific information the sources provide, not whether they inform about the same aspect of the target, so it can report redundancy between sources that actually carry information about entirely different parts of the target.
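The standard counterexample is the "two-bit copy" distribution, \(Y = (X_1, X_2)\) with independent uniform bits: intuitively each source contributes one bit of unique information and nothing is redundant, yet \(I_{\min}\) reports one bit of redundancy (and, by the bookkeeping above, one bit of synergy). Using the illustrative pid_two_sources sketch from the previous section:

```python
# Two-bit copy: Y = (X1, X2) with independent uniform bits.  I_min compares only
# the *amount* of specific information each source carries (1 bit each, for every
# outcome of Y), so it declares that bit "redundant" even though X1 and X2
# inform about different components of Y.
copy = {(x1, x2, (x1, x2)): 0.25 for x1 in (0, 1) for x2 in (0, 1)}
print(pid_two_sources(copy))
# -> {'redundancy': 1.0, 'unique_1': 0.0, 'unique_2': 0.0, 'synergy': 1.0}
```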
Much subsequent work has tried to preserve the Williams–Beer axioms and lattice structure while replacing \(I_{\min}\) with better-behaved redundancy or intersection-information measures.