Most imaging systems produce measurements that no human ever sees directly. A smartphone camera processes raw sensor data before generating the final JPEG. An MRI scanner collects frequency-space data that requires reconstruction before a clinician can interpret it. A self-driving car passes camera and LiDAR data straight into neural networks without any human-legible intermediate representation. In all these cases, the relevant question is not what measurements look like — it is how much useful information they contain.

A blog post from Berkeley AI Research describes a framework, published as a NeurIPS 2025 paper (arXiv:2405.20559), that makes this question tractable. The core claim is that mutual information between the object and its measurement serves as a single metric that predicts system performance, enables comparison across hardware configurations that trade off quality factors differently, and can be optimized directly to design new hardware — all without training a task-specific decoder.

Why traditional metrics fall short

The post identifies two problems with conventional hardware evaluation. First, metrics like resolution and signal-to-noise ratio are assessed independently. Two systems might score differently on each metric, making comparison difficult when they trade off in different directions. Second, training a neural network to reconstruct or classify images entangles the quality of the hardware with the quality of the algorithm. If a system performs poorly, you cannot tell whether the hardware is the bottleneck or the network is.

Mutual information avoids both problems. The post explains: “Two systems with the same mutual information are equivalent in their ability to distinguish objects, even if their measurements look completely different.” It captures resolution, noise, sampling, and all other factors simultaneously. A blurry, noisy image that preserves the features needed for a downstream task can contain more information than a sharp, clean image that discards those features.
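As a toy illustration of how a single number can rank systems that degrade images in different ways, consider two scalar Gaussian channels $Y = aX + N$ (this example is mine, not from the post): one that strongly attenuates the signal but adds little noise, and one that passes the signal untouched but adds a lot of noise. The closed-form expression $I = \tfrac{1}{2}\log_2\!\left(1 + a^2\sigma_X^2/\sigma_N^2\right)$ puts both on the same scale.

```python
import numpy as np

def gaussian_channel_info(gain, signal_var, noise_var):
    """Mutual information (bits) of a scalar channel Y = gain*X + N,
    with X ~ N(0, signal_var) and N ~ N(0, noise_var)."""
    return 0.5 * np.log2(1.0 + gain**2 * signal_var / noise_var)

signal_var = 1.0
# System A: strong attenuation ("blur") but very low read noise.
info_a = gaussian_channel_info(gain=0.3, signal_var=signal_var, noise_var=0.01)
# System B: no attenuation but a much noisier sensor.
info_b = gaussian_channel_info(gain=1.0, signal_var=signal_var, noise_var=1.0)

print(f"System A: {info_a:.2f} bits, System B: {info_b:.2f} bits")
```

Here the attenuated-but-clean system carries about 1.7 bits per measurement versus 0.5 bits for the sharp-but-noisy one, despite looking worse to the eye.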

Previous attempts to apply information theory to imaging failed in one of two ways: treating systems as unconstrained communication channels (ignoring physical sensor constraints, producing inaccurate estimates) or requiring explicit models of the objects being imaged (limiting generality). The new approach estimates information directly from measurements.

Estimating information from measurements

The mathematical decomposition is $I(X; Y) = H(Y) - H(Y \mid X)$, where $X$ is the object and $Y$ is the measurement. The second term, $H(Y \mid X)$, is the entropy of measurement variation due to noise alone. The post notes that imaging systems have well-characterized noise physics — photon shot noise follows a Poisson distribution, electronic readout noise is Gaussian — so $H(Y \mid X)$ can be computed directly from a known noise model without fitting anything.
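For concreteness, here is a minimal sketch (my own, not code from the post) of computing $H(Y \mid X)$ under an i.i.d. additive Gaussian readout-noise model, where the per-pixel differential entropy $\tfrac{1}{2}\log(2\pi e \sigma^2)$ simply sums over pixels; a Poisson shot-noise model would instead make the per-pixel term depend on the expected photon count.

```python
import numpy as np

def gaussian_noise_entropy(sigma, n_pixels):
    """H(Y|X) in nats for i.i.d. additive Gaussian readout noise of
    standard deviation `sigma` on each of `n_pixels` measurements.
    The differential entropy of N(0, sigma^2) is 0.5*log(2*pi*e*sigma^2),
    and independence across pixels means the per-pixel entropies add."""
    return n_pixels * 0.5 * np.log(2.0 * np.pi * np.e * sigma**2)

# Example: a 64x64 sensor with readout noise of 5 photoelectrons RMS.
h_y_given_x = gaussian_noise_entropy(sigma=5.0, n_pixels=64 * 64)
```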

The first term, $H(Y)$, is the total entropy of the measurements, reflecting variation from both object differences and noise. This must be learned from data. The researchers fit a probabilistic model to a dataset of measurements and use its learned distribution. The post describes three model options spanning efficiency-accuracy tradeoffs: a stationary Gaussian process (fastest), a full Gaussian (intermediate), and an autoregressive PixelCNN (most accurate). Because a mismatched model can only assign the measurements lower average log-likelihood than the true distribution does, any modeling error inflates the estimate of $H(Y)$; the approach therefore yields an upper bound on the true information, never an underestimate.
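Putting the two terms together, the simplest of the three model options can be sketched as follows: fit a full multivariate Gaussian to vectorized measurements, use its differential entropy $\tfrac{1}{2}\log\det(2\pi e \Sigma)$ as the estimate of $H(Y)$, and subtract the closed-form noise entropy. This is my own illustrative code under a plain Gaussian assumption, not the researchers' implementation.

```python
import numpy as np

def estimate_information(measurements, noise_sigma):
    """Upper-bound estimate of I(X; Y) in nats from a stack of noisy
    measurements of shape (n_samples, n_pixels), using a full-Gaussian
    model for H(Y) and an i.i.d. Gaussian noise model for H(Y|X)."""
    n_samples, n_pixels = measurements.shape

    # H(Y): differential entropy of the fitted Gaussian, 0.5*logdet(2*pi*e*Sigma).
    cov = np.cov(measurements, rowvar=False)
    cov += 1e-6 * np.eye(n_pixels)                 # regularize for stability
    _, logdet = np.linalg.slogdet(2.0 * np.pi * np.e * cov)
    h_y = 0.5 * logdet

    # H(Y|X): known noise physics, computed in closed form (no fitting).
    h_y_given_x = n_pixels * 0.5 * np.log(2.0 * np.pi * np.e * noise_sigma**2)

    return h_y - h_y_given_x

# Toy usage: 2000 simulated 8x8 measurements with correlated structure.
rng = np.random.default_rng(0)
clean = rng.normal(size=(2000, 64)) @ rng.normal(size=(64, 64)) * 0.1
noisy = clean + rng.normal(scale=0.5, size=clean.shape)
print(estimate_information(noisy, noise_sigma=0.5))  # estimate in nats
```

Since a Gaussian has the maximum entropy among distributions with a given covariance, this particular model choice also makes the overestimation of $H(Y)$, and hence the upper-bound behavior, easy to see.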

Validation across four imaging domains

The framework is tested in four separate imaging applications, each chosen to represent a different kind of hardware design problem.

For color photography, different camera filter array designs were evaluated: the traditional Bayer pattern, a random arrangement, and a learned arrangement. The post reports that information estimates “correctly ranked which designs would produce better color reconstructions, matching the rankings from neural network demosaicing without requiring any reconstruction algorithm.”

For radio astronomy, telescope arrays achieve resolution by combining signals from geographically distributed sites. Selecting optimal site locations is computationally intractable because each site’s value depends on all other selected sites. The post states that information estimates predicted reconstruction quality across configurations, making site selection feasible without expensive image reconstruction for each candidate set.
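To illustrate how a cheap information estimate makes site selection practical, a natural pattern is a greedy loop that adds one site at a time, keeping whichever candidate raises the estimated information the most. This sketch is my own illustration rather than the paper's procedure; `simulate_measurements` is a hypothetical stand-in for a forward model of the array, and `estimate_information` is the estimator sketched above.

```python
def greedy_site_selection(candidate_sites, n_select, objects, noise_sigma):
    """Greedily choose `n_select` telescope sites by repeatedly adding the
    candidate whose inclusion yields the largest estimated mutual information.
    Hypothetical helpers assumed here:
      simulate_measurements(sites, objects) -> array of shape (n_samples, n_pixels)
      estimate_information(measurements, noise_sigma) -> scalar estimate
    """
    selected = []
    for _ in range(n_select):
        best_site, best_info = None, float("-inf")
        for site in candidate_sites:
            if site in selected:
                continue
            measurements = simulate_measurements(selected + [site], objects)
            info = estimate_information(measurements, noise_sigma)
            if info > best_info:
                best_site, best_info = site, info
        selected.append(best_site)
    return selected
```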

For lensless imaging — cameras that replace traditional optics with light-modulating masks — measurements bear no visual resemblance to scenes. Information estimates predicted reconstruction accuracy across three designs (a lens, a microlens array, and a diffuser) at varying noise levels.

For microscopy, LED array microscopes using programmable illumination generate different contrast modes. The post reports that information estimates correlated with neural network accuracy at predicting protein expression from cell images, enabling evaluation without expensive protein labeling experiments.

The consistent finding: “In all cases, higher information meant better downstream performance.”

Hardware design via IDEAL

Beyond evaluation, the post describes a design framework called IDEAL that optimizes hardware configurations by maximizing the information metric. The post states that optimizing via IDEAL “produces designs that match state-of-the-art end-to-end methods while requiring less memory, less compute, and no task-specific decoder design.”
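The post gives no implementation details for IDEAL, but the basic loop it implies can be sketched: propose a hardware configuration, simulate its measurements on representative objects, score them with the information estimator, and keep the best-scoring design. Everything below is an assumption for illustration: `sample_design` and `simulate_measurements` stand in for a real parameterized forward model, the random search is a placeholder for whatever optimizer the actual method uses, and `estimate_information` is the estimator sketched earlier.

```python
import numpy as np

def design_by_information(sample_design, simulate_measurements,
                          objects, noise_sigma, n_trials=100, seed=0):
    """Toy information-driven design loop: draw candidate hardware designs,
    simulate their measurements, and keep the design with the highest
    estimated mutual information. `sample_design(rng)` and
    `simulate_measurements(design, objects)` are hypothetical stand-ins
    for a real parameterized hardware model."""
    rng = np.random.default_rng(seed)
    best_design, best_info = None, float("-inf")
    for _ in range(n_trials):
        design = sample_design(rng)
        measurements = simulate_measurements(design, objects)
        info = estimate_information(measurements, noise_sigma)
        if info > best_info:
            best_design, best_info = design, info
    return best_design, best_info
```

The point of the sketch is only that the objective being maximized is the information estimate itself, so no reconstruction network or task-specific decoder appears anywhere in the loop.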

The advantage over end-to-end trained systems is that IDEAL does not require a fixed downstream task to be specified at design time. Hardware designed to maximize information content is task-agnostic: it preserves as much useful signal as possible, leaving the specific interpretation to downstream models. For hardware that must serve multiple applications — or applications not yet defined at design time — this flexibility is practically significant.

The framework represents a shift in how imaging hardware can be evaluated: moving from “train a network and see how it performs” to “measure the information content and predict performance directly.” For domains where hardware design cycles are long and dataset collection is expensive, the ability to evaluate and optimize based on information content alone reduces the cost of iteration substantially.