Metropolis and Gibbs sampling

Metropolis algorithm

In 1953 Metropolis et al proposed one of the first Markov chain Monte Carlo sampling algorithm which is very general and surprisingly efficient for generating an arbitrary given distribution P(X).  In Metropolis sampling, we first propose a move, X -> X'.  In our previous example, we may consider chosen a particular pair (ij) and change nij to the other possible value (0 to 1, or 0 to 1).   We then compute the acceptance rate

a = min(1, P(X')/P(X)).

the move is accepted with probability a, and rejected with probability 1 - a.   Operationally, we draw a random number 0 < r < 1, and compare it with a.  If r < a, the move is accepted, the new state is X'.  Otherwise, the move is rejected, and the current state is still X.

If we use canonical distribution, P(X) = exp(-E(X))/Z,  the acceptance rate is then

min(1, exp[ - (E(X') - E(X))/(kT)]) = min(1, exp( - DE/kT)).                (1)

A general method to find the correct acceptance rate or flip rate is consider the so-called detailed balance equation

P(X) W(X - X') = P(X') PW(X -X').

Justification of the above takes us to the theory of Markov chains and stochastic processes, for which we'll not discuss them here.  Roughly speaking, you can identity W with a.  The acceptance rate is not unique, there are many possible choices.  Another choice that also satisfy the above detailed balance equation is

b = 1/(1 + exp(DE/kT)),                                                                (2)

this is known as heat-bath algorithm.  Using the expression for E(n), the energy increment is given by

DE = ( dij - g + g S(k=/=j) nik + g S(k=/=i) njk ) Dnij,

where Dnij is the difference between the new value and old value of nij.  We leave as an exercise to verify this.

Gibbs Sampling

Gibbs sampling is another class of sampling algorithm with the same aim as the Metropolis or heat bath algorithm.   The idea is the following.  Pick a particular variable nij of fixed pair (ij) [at random].  Let's erase it's value so we do not know whether nij is 0 or 1.  But since we know the values of all other variables, we can ask the question of what is the probability for nij = 0 (or 1) given the states of all other variables.  This is the conditional probability of nij given all other variables.   Using the basic fact about conditional probability

P(A|B) = P(A and B)/P(B)

we find (for nij = 0)

P(nij=0) = exp( - E(n, nij=0) )/ Snij exp(-E(n))                                (3)

where E(n) is the energy function.  For the matching problem, it is given in the previous section.  E(n, nij=0) is the energy of the system when the given pair (ij) take nij = 0 and the rest of variables takes arbitrary values.  The denominator is the Gibbs weight exp(-E(n)) summed up the two possible values that nij can take, namely nij = 0 and 1.   Note that P(nij=0) + P(nij=1) = 1.

Problem: Using the expression for E(n) from the previous section for the matching problem, derive an concrete expression for P(nij=0) and P(nij=1).   Explain exactly how you would use Gibbs sampling on a computer.

To do a simulated annealing for the matching problem, we can use either Metropolis algorithm (1), or the heat-bath algorithm (2), or the Gibbs sampling (3).  They are more or less equivalent in terms of efficiency.