# Part 2: Markov chain Monte Carlo (MCMC)


We now aim to obtain the posterior distribution of $d$ and $k$ by MCMC sampling. The draft MCMC algorithm is:

1. Set initial states for $d$ and $k$.
2. Propose a new state $d^*$ (from an appropriate proposal density).
3. Accept the proposal with probability $\alpha=\min\Big(1, \frac{p(d^*)p(k)p(D|d^*)}{p(d)p(k)p(D|d)}\Big)$. If the proposal is accepted, set $d=d^*$; otherwise keep the current value of $d$.
4. Repeat 2-3 for $k$.
5. Save current values of $d$ and $k$.
6. Go to step 2.
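Step 3 can be sketched in isolation before writing the full sampler. The helper below is a minimal illustration (not the tutorial's final code): with `lnalpha` equal to the log-posterior difference between the proposed and current states, accepting whenever `lnalpha >= 0`, and otherwise with probability `exp(lnalpha)`, implements $\alpha=\min(1,\exp(\texttt{lnalpha}))$.

```r
# Minimal sketch of the Metropolis accept/reject rule in log space.
# lnalpha is log f(d*) - log f(d); the proposal is accepted with
# probability min(1, exp(lnalpha)).
accept <- function(lnalpha) {
  lnalpha >= 0 || runif(1) < exp(lnalpha)
}

accept(0.7)   # always TRUE: exp(0.7) > 1, so alpha = 1
accept(-0.7)  # TRUE with probability exp(-0.7) (about 0.5)
```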

To avoid numerical problems (i.e. underflows or overflows) it is best to work with the logarithm of the unnormalised posterior. This is defined as the sum of the log-prior and the log-likelihood. The R code for the log-unnormalised posterior is

```r
# Function returning the logarithm of the unscaled posterior:
# f(d) * f(k) * f(D|d,k)
# By default we set the priors as:
# f(d) = Gamma(d | 2, 20) and f(k) = Gamma(k | 2, .1)
ulnPf <- function(d, k, n=948, ns=84, nv=6, a.d=2, b.d=20,
                  a.k=2, b.k=.1) {
  # The normalising constant in the prior densities can
  # be ignored:
  lnpriord <- (a.d - 1)*log(d) - b.d * d
  lnpriork <- (a.k - 1)*log(k) - b.k * k

  # Log-likelihood (K80 model):
  expd1 <- exp(-4*d/(k+2))
  expd2 <- exp(-2*d*(k+1)/(k+2))
  p0 <- .25 + .25 * expd1 + .5 * expd2
  p1 <- .25 + .25 * expd1 - .5 * expd2
  p2 <- .25 - .25 * expd1
  lnL <- ((n - ns - nv) * log(p0/4) +
          ns * log(p1/4) + nv * log(p2/4))

  # Return the log unnormalised posterior:
  return (lnpriord + lnpriork + lnL)
}
```


Note that the joint prior on $d$ and $k$ is the product of two independent gamma densities. Recall that the gamma density is

$$\mathrm{Gamma}(x|a,b) = \frac{b^a}{\Gamma(a)}x^{a-1}e^{-bx},$$

where $\Gamma$ is the complete gamma function. The factor $b^a/\Gamma(a)$ is a constant (i.e. it does not depend on $d$ or $k$), and so it cancels out in the posterior ratio in step 3 of the MCMC; it can therefore be ignored in the calculation of the log-prior in `ulnPf`. Skipping the constant speeds up the code. The calculation of the log-likelihood in `ulnPf` involves only two calls to the exp function, with the two exponentials reused across all three site probabilities, and so it is slightly faster than the code in Part 1 (for the K80 log-likelihood equation, please see Part 1).
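As a quick sanity check (an aside, not part of the tutorial code), we can verify that the unnormalised log-prior differs from R's built-in `dgamma(..., log=TRUE)` by the same constant, $\log(b^a/\Gamma(a))$, at every value of $d$:

```r
# The unnormalised log-prior used in ulnPf, for the d parameter:
a.d <- 2; b.d <- 20
lnpriord <- function(d) (a.d - 1)*log(d) - b.d * d

# Compare against the full (normalised) log-density at several points:
d.values <- c(0.05, 0.1, 0.2, 0.5)
diffs <- lnpriord(d.values) - dgamma(d.values, a.d, b.d, log=TRUE)

# Every difference equals -log(b^a / Gamma(a)), the dropped constant:
all(abs(diffs - (lgamma(a.d) - a.d * log(b.d))) < 1e-12)  # TRUE
```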

We will now write the R code implementing the MCMC algorithm above. We use `d` and `k` for the current values ($d$ and $k$) and `dprop` and `kprop` for the proposed values ($d^*$ and $k^*$), with `ulnP` and `ulnPprop` holding the corresponding logarithms of the unnormalised posterior. Note that if `lnalpha > 0` we know $\alpha = 1$, so the proposal is accepted; the code calculates the acceptance probability $\alpha$ only when `lnalpha < 0`. The corresponding R function for the MCMC algorithm is:

```r
mcmcf <- function(init.d, init.k, N, w.d, w.k) {
  # init.d and init.k are the initial states.
  # N is the number of MCMC iterations.
  # w.d is the width of the sliding-window proposal for d.
  # w.k is the width of the sliding-window proposal for k.

  # We keep the visited states (d, k) in sample.d and sample.k
  # for easy plotting. In practical MCMC applications, these
  # are usually written into a file.
  sample.d <- sample.k <- numeric(N+1)

  # STEP 1: Initialise the MCMC chain
  d <- init.d;  sample.d[1] <- init.d
  k <- init.k;  sample.k[1] <- init.k
  ulnP <- ulnPf(d, k)
  acc.d <- 0;  acc.k <- 0 # number of acceptances

  for (i in 1:N) {
    # STEP 2: Propose new state d*
    # We use a uniform sliding window of width w.d with
    # reflection to propose the new value d*
    dprop <- d + runif(1, -w.d/2, w.d/2)
    # reflect if dprop is negative
    if (dprop < 0) dprop <- -dprop

    ulnPprop <- ulnPf(dprop, k)
    lnalpha <- ulnPprop - ulnP

    # STEP 3: Accept or reject the proposal
    # if ru < alpha accept proposed d*:
    if (lnalpha > 0 || runif(1) < exp(lnalpha)) {
      d <- dprop;  ulnP <- ulnPprop
      acc.d <- acc.d + 1
    }
    # else reject and stay where we are (so that nothing
    # needs to be done).

    # STEP 4: Repeat steps 2-3 for k
    # propose k* and accept/reject the proposal
    kprop <- k + runif(1, -w.k/2, w.k/2)
    # reflect if kprop is negative
    if (kprop < 0) kprop <- -kprop

    ulnPprop <- ulnPf(d, kprop)
    lnalpha <- ulnPprop - ulnP
    # if ru < alpha accept proposed k*:
    if (lnalpha > 0 || runif(1) < exp(lnalpha)) {
      k <- kprop;  ulnP <- ulnPprop
      acc.k <- acc.k + 1
    }
    # else reject and stay where we are.

    # STEP 5: Save the current state of the chain
    sample.d[i+1] <- d;  sample.k[i+1] <- k
  }

  # Print the proportion of times the proposals were accepted:
  print("Acceptance proportions (d, k):")
  print(c(acc.d/N, acc.k/N))

  # Return the values of d and k visited during the MCMC:
  return (list(d=sample.d, k=sample.k))
}
```


Note that because we use `ulnP` to store the log unnormalised posterior of the current state, we make only one call to the `ulnPf` function (to calculate the log unnormalised posterior of the newly proposed state) in each of the two proposals. This halves the amount of computation compared with the alternative strategy of calculating the log unnormalised posterior for both the current and the proposed states, which would require two calls to `ulnPf` per proposal. Note also that in the syntax “if (a || b)” used in the two acceptance tests, the expression “a || b” evaluates to TRUE as soon as a is TRUE, in which case b is not evaluated, saving one call to the exp function. Note that the algorithm is stochastic, so repeated runs will give different results. The function `system.time` in R is useful to measure how long an R expression takes to evaluate. For example, running the MCMC algorithm above for 10,000 iterations takes about 0.3 seconds on a 2.2 GHz MacBook Air.
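The short-circuit behaviour of `||` can be checked directly (a small aside, not part of the tutorial code):

```r
# Count how often the right-hand side of || is actually evaluated:
rhs.calls <- 0
rhs <- function() { rhs.calls <<- rhs.calls + 1; TRUE }

invisible(TRUE || rhs())   # left side TRUE: rhs() is skipped
invisible(FALSE || rhs())  # left side FALSE: rhs() must be evaluated
rhs.calls  # 1
```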

```r
# Test run-time:
system.time(mcmcf(0.2, 20, 1e4, .12, 180)) # about 0.3s
# Run again and save the MCMC output:
dk.mcmc <- mcmcf(0.2, 20, 1e4, .12, 180)
```


We now plot the ‘traces’ of the parameters. A trace plot is a plot of parameter state vs. MCMC iteration. Trace plots allow us to easily visualise the MCMC chain. They can also show us if there are convergence or mixing problems with the chain (we’ll see this in Part 3):

```r
par(mfrow=c(1,3))

# Trace plot of d:
plot(dk.mcmc$d, ty='l', xlab="Iteration", ylab="d",
     main="Trace of d")

# Trace plot of k:
plot(dk.mcmc$k, ty='l', xlab="Iteration", ylab="k",
     main="Trace of k")

# We can also plot the joint sample of d and k
# (points sampled from the posterior surface):
plot(dk.mcmc$d, dk.mcmc$k, pch='.', xlab="d", ylab="k",
     main="Joint of d and k")
```


*Trace plots of the distance, $d$, and kappa, $k$, and the joint sample from the posterior surface.*

The trace plots show that the chain has mixed well (i.e., it has explored the parameter space efficiently). The joint sample of $d$ and $k$ matches the plot of the posterior surface from Part 1. In Part 3 we will see examples of chains with poor mixing. Poor mixing may be the product of poorly chosen proposal densities or badly chosen proposal step sizes. For example, try running the MCMC chain again after modifying the step sizes (the widths) of the proposal densities, `w.d` and `w.k`.
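The effect of the step size on the acceptance proportion can be seen with a self-contained toy example (an illustration under assumed settings, not part of the tutorial code): a sliding-window Metropolis sampler on a standard normal target, run with three different window widths.

```r
# Acceptance proportion of a sliding-window Metropolis sampler
# on a N(0,1) target, as a function of the window width w:
acc.prop <- function(w, N=1e4) {
  x <- 0; acc <- 0
  for (i in 1:N) {
    xprop <- x + runif(1, -w/2, w/2)
    lnalpha <- dnorm(xprop, log=TRUE) - dnorm(x, log=TRUE)
    if (lnalpha > 0 || runif(1) < exp(lnalpha)) {
      x <- xprop; acc <- acc + 1
    }
  }
  acc / N
}

set.seed(1)
# The acceptance proportion falls as the width grows: tiny steps
# are almost always accepted, huge steps are mostly rejected.
round(c(acc.prop(0.1), acc.prop(2.5), acc.prop(50)), 2)
```

Neither extreme mixes well: tiny accepted steps barely move the chain, while a huge window leaves it stuck at one value for long stretches. The same trade-off applies to `w.d` and `w.k` above.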

Part 3: Efficiency of the MCMC chain
