Metropolis and Gibbs sampling

Metropolis algorithm

In 1953 Metropolis et al proposed one of the first Markov chain Monte Carlo sampling algorithm which is very general and surprisingly efficient for generating an arbitrary given distribution P(X). In Metropolis sampling, we first propose a move, X -> X'. In our previous example, we may consider chosen a particular pair (ij) and change n_ij to the other possible value (0 to 1, or 0 to 1). We then compute the acceptance rate

a = min(1, P(X')/P(X)).

the move is accepted with probability a, and rejected with probability 1 - a. Operationally, we draw a random number 0 < r < 1, and compare it with a. If r < a, the move is accepted, the new state is X'. Otherwise, the move is rejected, and the current state is still X.

If we use canonical distribution, P(X) = exp(-E(X))/Z, the acceptance rate is then

min(1, exp[ - (E(X') - E(X))/(kT)]) = min(1, exp( - DE/kT)). (1)

A general method to find the correct acceptance rate or flip rate is consider the so-called detailed balance equation

P(X) W(X - X') = P(X') PW(X -X').

Justification of the above takes us to the theory of Markov chains and stochastic processes, for which we'll not discuss them here. Roughly speaking, you can identity W with a. The acceptance rate is not unique, there are many possible choices. Another choice that also satisfy the above detailed balance equation is

b = 1/(1 + exp(DE/kT)), (2)

this is known as heat-bath algorithm. Using the expression for E(n), the energy increment is given by

DE = ( d_ij - g + g S_(k=/=j) n_ik+ g S_(k=/=i) n_jk ) Dn_ij,

where Dn_ij is the difference between the new value and old value of n_ij. We leave as an exercise to verify this.

Gibbs Sampling

Gibbs sampling is another class of sampling algorithm with the same aim as the Metropolis or heat bath algorithm. The idea is the following. Pick a particular variable n_ij of fixed pair (ij) [at random]. Let's erase it's value so we do not know whether n_ij is 0 or 1. But since we know the values of all other variables, we can ask the question of what is the probability for n_ij = 0 (or 1) given the states of all other variables. This is the conditional probability of n_ij given all other variables. Using the basic fact about conditional probability

P(A|B) = P(A and B)/P(B)

we find (for n_ij = 0)

P(n_ij=0) = exp( - E(n, n_ij=0) )/ Sn_ij exp(-E(n)) (3)

where E(n) is the energy function. For the matching problem, it is given in the previous section. E(n, n_ij=0) is the energy of the system when the given pair (ij) take n_ij = 0 and the rest of variables takes arbitrary values. The denominator is the Gibbs weight exp(-E(n)) summed up the two possible values that n_ij can take, namely n_ij = 0 and 1. Note that P(n_ij=0) + P(n_ij=1) = 1.

Problem: Using the expression for E(n) from the previous section for the matching problem, derive an concrete expression for P(n_ij=0) and P(n_ij=1). Explain exactly how you would use Gibbs sampling on a computer.

To do a simulated annealing for the matching problem, we can use either Metropolis algorithm (1), or the heat-bath algorithm (2), or the Gibbs sampling (3). They are more or less equivalent in terms of efficiency.