nifty7.minimization.kl_energies module
- GeoMetricKL(mean, hamiltonian, n_samples, minimizer_samp, mirror_samples, start_from_lin=True, constants=[], point_estimates=[], napprox=0, comm=None, nanisinf=True)
Provides the sampled Kullback-Leibler divergence used in geometric Variational Inference (geoVI).
In geoVI a probability distribution is approximated with a standard normal distribution in the canonical coordinate system of the Riemannian manifold associated with the metric of the other distribution. The coordinate transformation is approximated by expanding around a point. In order to infer the expansion point, a stochastic estimate of the Kullback-Leibler divergence is minimized. This estimate is obtained by sampling from the approximation using the current expansion point. During minimization these samples are kept constant; only the expansion point is updated. Due to the typically nonlinear structure of the true distribution these samples have to be updated eventually by instantiating GeoMetricKL again. For the true probability distribution the standard parametrization is assumed. The samples of this class can be distributed among MPI tasks.
- Parameters
mean (Field) – Expansion point of the coordinate transformation.
hamiltonian (StandardHamiltonian) – Hamiltonian of the approximated probability distribution.
n_samples (integer) – Number of samples used to stochastically estimate the KL.
minimizer_samp (DescentMinimizer) – Minimizer used to draw samples.
mirror_samples (boolean) – Whether the mirrored versions of the drawn samples are also used. If true, the number of used samples doubles. Mirroring samples stabilizes the KL estimate as extreme sample variation is counterbalanced.
start_from_lin (boolean) – Whether the non-linear sampling should start using the inverse linearized transformation (i.e. the corresponding MGVI sample). If False, the minimization starts from the prior sample. Default is True.
constants (list) – List of parameter keys that are kept constant during optimization. Default is no constants.
point_estimates (list) – List of parameter keys for which no samples are drawn, but that are (possibly) optimized for, corresponding to point estimates of these. Default is to draw samples for the complete domain.
napprox (int) – Number of samples for computing the preconditioner for linear sampling. No preconditioning is done by default.
comm (MPI communicator or None) – If not None, samples will be distributed as evenly as possible across this communicator. If mirror_samples is set, a sample and its mirror image will preferably reside on the same task whenever they cannot each be given a task of their own.
nanisinf (bool) – If true, nan energies, which can occur due to overflows in the forward model, are interpreted as inf. The code then does not crash on these occasions; instead, the minimizer is told that the position it has tried is not sensible.
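The effect of nanisinf can be pictured with a small stand-alone sketch. It does not use NIFTy; `toy_energy` and `energy_value` are hypothetical helpers invented for this illustration, and the overflow is emulated by hand.

```python
import math


def toy_energy(x):
    # Stand-in for a forward model that overflows: for large x both terms
    # are treated as inf, and inf - inf evaluates to nan.
    a = math.exp(x) if x < 700 else math.inf
    b = math.exp(x - 1.0) if x < 700 else math.inf
    return a - b


def energy_value(raw_energy, nanisinf=True):
    # Hypothetical helper: report nan as inf so that a minimizer sees the
    # position as "very bad" instead of receiving an unusable value.
    if nanisinf and math.isnan(raw_energy):
        return math.inf
    return raw_energy


safe = energy_value(toy_energy(1000.0))                 # inf
raw = energy_value(toy_energy(1000.0), nanisinf=False)  # nan
```

A descent minimizer comparing `safe` with the energy at its previous position can simply reject the step, whereas comparisons against `raw` are always false and give it nothing to act on.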
Note
The two lists constants and point_estimates are independent from each other. It is possible to sample along domains which are kept constant during minimization and vice versa. DomainTuples should never be created using the constructor, but rather via the factory function make!
Note
As in MGVI, mirroring samples can help to stabilize the latent mean as it reduces sampling noise. But unlike in MGVI, a mirrored sample involves an additional solve of the non-linear transformation. Therefore, when using MPI, the mirrored samples also get distributed if enough tasks are available. If there are more total samples than tasks, the mirrored counterparts try to reside on the same task as their non-mirrored partners. This ensures that at least the starting position can be re-used.
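One way to picture the placement policy described in this note is the following sketch. It is not NIFTy's implementation: `split_evenly` and `assign_samples` are hypothetical names, no MPI is involved, and the rule "spread everything out if there are enough tasks, otherwise keep pairs together" is a simplifying assumption.

```python
def split_evenly(n_items, n_tasks):
    # Chunk sizes for n_tasks tasks; sizes differ by at most one.
    base, extra = divmod(n_items, n_tasks)
    return [base + (1 if rank < extra else 0) for rank in range(n_tasks)]


def assign_samples(n_samples, n_tasks, mirror_samples):
    # Each work unit is a list of (sample_index, is_mirrored) entries that
    # must end up on the same task.
    if mirror_samples and 2 * n_samples <= n_tasks:
        # Enough tasks: samples and mirror images are distributed separately.
        units = [[(i, False)] for i in range(n_samples)]
        units += [[(i, True)] for i in range(n_samples)]
    elif mirror_samples:
        # Too few tasks: keep each sample together with its mirror image.
        units = [[(i, False), (i, True)] for i in range(n_samples)]
    else:
        units = [[(i, False)] for i in range(n_samples)]

    tasks, start = [], 0
    for size in split_evenly(len(units), n_tasks):
        chunk = units[start:start + size]
        tasks.append([entry for unit in chunk for entry in unit])
        start += size
    return tasks


layout = assign_samples(n_samples=3, n_tasks=2, mirror_samples=True)
```

In `layout`, the first task holds samples 0 and 1 together with their mirror images and the second task holds sample 2 with its mirror image, so each mirrored solve can start from a position already available on its task.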
See also
Geometric Variational Inference, Philipp Frank, Reimar Leike, Torsten A. Enßlin, https://arxiv.org/abs/2105.10470 https://doi.org/10.3390/e23070853
- MetricGaussianKL(mean, hamiltonian, n_samples, mirror_samples, constants=[], point_estimates=[], napprox=0, comm=None, nanisinf=False)
Provides the sampled Kullback-Leibler divergence between a distribution and a Metric Gaussian.
A Metric Gaussian is used to approximate another probability distribution. It is a Gaussian distribution that uses the Fisher information metric of the other distribution at the location of its mean to approximate the variance. In order to infer the mean, a stochastic estimate of the Kullback-Leibler divergence is minimized. This estimate is obtained by sampling the Metric Gaussian at the current mean. During minimization these samples are kept constant; only the mean is updated. Due to the typically nonlinear structure of the true distribution these samples have to be updated eventually by instantiating MetricGaussianKL again. For the true probability distribution the standard parametrization is assumed. The samples of this class can be distributed among MPI tasks.
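The scheme described above can be sketched for a one-dimensional toy problem without NIFTy. Everything here is an assumption made for illustration: the model (standard normal prior, data = exp(x) + Gaussian noise), the function names, and the crude grid search that stands in for a real minimizer. The estimate omits terms that do not depend on the mean while the samples are fixed.

```python
import math
import random


def hamiltonian(x, data=1.5, noise_var=0.1):
    # Toy negative log-probability in standard parametrization:
    # N(0, 1) prior plus Gaussian likelihood for data = exp(x) + noise.
    return 0.5 * x**2 + 0.5 * (data - math.exp(x))**2 / noise_var


def fisher_metric(x, noise_var=0.1):
    # Prior metric (1) plus likelihood Fisher metric J^2 / noise_var,
    # where J = d exp(x) / dx = exp(x).
    return 1.0 + math.exp(x)**2 / noise_var


def draw_residuals(mean, n_samples, rng):
    # Zero-centered samples of the Metric Gaussian at `mean`:
    # the variance is the inverse metric evaluated at the mean.
    std = 1.0 / math.sqrt(fisher_metric(mean))
    return [rng.gauss(0.0, std) for _ in range(n_samples)]


def sampled_kl(mean, residuals):
    # Stochastic KL estimate: average Hamiltonian over mean + residual.
    return sum(hamiltonian(mean + r) for r in residuals) / len(residuals)


rng = random.Random(0)
mean = 0.0
for _ in range(3):
    # Outer loop: redraw the samples at the current mean.
    residuals = draw_residuals(mean, 4, rng)
    # Inner step: samples stay fixed, only the mean is varied.
    candidates = [mean + 0.01 * k for k in range(-100, 101)]
    mean = min(candidates, key=lambda m: sampled_kl(m, residuals))
```

The outer loop plays the role of instantiating MetricGaussianKL again: once the mean has moved, the metric and hence the sample spread change, so the old samples no longer represent the approximation.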
- Parameters
mean (Field) – Mean of the Gaussian probability distribution.
hamiltonian (StandardHamiltonian) – Hamiltonian of the approximated probability distribution.
n_samples (integer) – Number of samples used to stochastically estimate the KL.
mirror_samples (boolean) – Whether the negatives of the drawn samples are also used, as they are equally legitimate samples. If true, the number of used samples doubles. Mirroring samples stabilizes the KL estimate as extreme sample variation is counterbalanced. Since it improves stability in many cases, it is recommended to set mirror_samples to True.
constants (list) – List of parameter keys that are kept constant during optimization. Default is no constants.
point_estimates (list) – List of parameter keys for which no samples are drawn, but that are (possibly) optimized for, corresponding to point estimates of these. Default is to draw samples for the complete domain.
napprox (int) – Number of samples for computing the preconditioner for sampling. No preconditioning is done by default.
comm (MPI communicator or None) – If not None, samples will be distributed as evenly as possible across this communicator. If mirror_samples is set, then a sample and its mirror image will always reside on the same task.
nanisinf (bool) – If true, nan energies, which can occur due to overflows in the forward model, are interpreted as inf. The code then does not crash on these occasions; instead, the minimizer is told that the position it has tried is not sensible.
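The stabilizing effect attributed to mirror_samples in the parameter list above can be seen in a small sketch, again independent of NIFTy. The helper `average` and the test function are hypothetical; the point is only that adding the negative of every sample cancels all contributions that are odd in the sample.

```python
import random


def average(func, samples, mirror_samples):
    # Sample average of func; with mirroring, -s is used alongside every s,
    # which doubles the number of samples.
    if mirror_samples:
        samples = samples + [-s for s in samples]
    return sum(func(s) for s in samples) / len(samples)


def linear(s):
    # Function whose exact expectation under a zero-centered Gaussian is 2.0.
    return 2.0 + 5.0 * s


rng = random.Random(42)
samples = [rng.gauss(0.0, 1.0) for _ in range(3)]

plain = average(linear, samples, mirror_samples=False)
mirrored = average(linear, samples, mirror_samples=True)
```

With only three samples, `plain` is affected by the sampling noise of the linear term, while `mirrored` agrees with the exact expectation 2.0 up to floating-point rounding, because each `5.0 * s` is cancelled by its mirror image.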
Note
The two lists constants and point_estimates are independent from each other. It is possible to sample along domains which are kept constant during minimization and vice versa.
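The independence of the two lists can be made concrete with a short sketch. The key names and the helper `split_keys` are hypothetical and serve only to show which parameters end up sampled and which end up optimized.

```python
def split_keys(all_keys, constants=(), point_estimates=()):
    # Keys for which samples are drawn: everything except point estimates.
    sampled = [k for k in all_keys if k not in point_estimates]
    # Keys that are updated during minimization: everything except constants.
    optimized = [k for k in all_keys if k not in constants]
    return sampled, optimized


keys = ["signal", "amplitude", "offset"]  # hypothetical parameter keys
sampled, optimized = split_keys(
    keys, constants=["offset"], point_estimates=["amplitude"]
)
# sampled   == ["signal", "offset"]
# optimized == ["signal", "amplitude"]
```

Here "offset" is sampled although it is kept constant during minimization, and "amplitude" is optimized although no samples are drawn for it.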
See also
Metric Gaussian Variational Inference, Jakob Knollmüller, Torsten A. Enßlin, https://arxiv.org/abs/1901.11033