Mean field annealing provides another approach for solving combinatorial optimization problems using real computation. Peterson and Soderberg (1989) derive an alternative iteration equation which is consistent with the continuous RL representation.
Let be an unambiguous labeling assignment in defined in (8.24); each takes the value of one of the vectors , , and . Denote the energy with the RL representation still by . The Gibbs distribution of the energy under temperature T>0 is given as
where the partition function
sums over . The statistical mean of the distribution is defined as
As the temperature approaches zero, the mean of the Gibbs distribution approaches its mode or the global energy minimum
This suggests that instead of minimizing directly, we could try to evaluate the mean field at a sufficiently high temperature and then track it down using the continuation method [Wasserstrom 1973] as the temperature is lowered towards zero.
The analysis of the partition function is central in mean field theory because once it is calculated, all statistic information about the system can be deduced from it. The mean field trick for calculating the partition function is to (1) rewrite the sum over a discrete space as an integral over a pair of continuous variables p and q, and having the same dimensionality as f, and then (2) evaluate its integrand at the saddle point (saddle point approximation). As can be seen later, p and q corresponds to a labeling assignment as defined in (8.20) and a gradient as defined in (8.29), respectively. In the following, f, p and q are considered as matrices. To do step (1), we begin by rewriting any function as the following integral
where is the Dirac delta function and is the dimensional real space. The following definition of the Dirac delta function is used in the above
where denotes transpose, is the dimensional imaginary space and C is a normalizing constant. Therefore,
Using the above formula, we can rewrite the partition function, which is the sum of the function as
The summation inside the integrand can be written as
is called the effective energy. .
The above integral expression of Z is exact but too complicated for precise calculation. The calculation may be approximated based on the following heuristic argument: The double integral is dominated by saddle points in the integration intervals and therefore the integral is approximated by
where is a saddle point of . The saddle points are among the roots of the equations
where is the gradient with respect to p. This yields the mean field theory equations
The p matrix represents a labeling assignment in the feasibility space defined by (8.19). The mean field theory equations for combinatorial minimization have been derived in many places e.g. in [Yuille 1990,Simic 1990,Elfadel 1993,Haykin 1994,Yang and Kittler 1994].
Note that in terms of the RL representation, the energy can be written as and so q can be computed as
( cf. 8.29). Now we obtain the following fixed-point iteration
In iterative computation, the temperature T in q is decreased towards as iteration goes. The unambiguity of p is achieved when . In [Peterson and Soderberg 1989], the convergence is judged by a quantity defined as
also called the ``saturation'' of f. The iteration (9.22) is considered converged if . Analog network implementation schemes for mean field algorithms are investigated in [Elfadel 1993]
It is worth pointing out that the idea of mean field annealing for global optimization has been proposed earlier by [Pinkus 1968] as the integral limit method. There, the global minimum is expressed in closed form as a limit of an integral. Assume that is the closure of a bounded domain in . Suppose that is continuous on and has a unique global minimum . Pinkus shows that the following integral
approaches to the global minimum as . Obviously, the above integral is the continuous version of the mean field defined in (9.9) and the idea of the integral limit is similar to that of the mean field annealing. Theoretical analysis shows that this approximation has rather good asymptotic properties but its realization may need some heuristic trials. Computationally, the values can be obtained analytically only in exceptional cases and a numerical method has to be used in general.