We now aim to obtain the posterior distribution of d and k by MCMC sampling. The draft MCMC algorithm is:

1. Set initial states for d and k.
2. Propose a new state d* (from an appropriate proposal density).
3. Accept the proposal with probability

   $$\alpha = \min\left(1, \frac{f(d^*)\, f(D \mid d^*, k)}{f(d)\, f(D \mid d, k)}\right).$$

   If the proposal is accepted set d = d*, otherwise d remains unchanged.
4. Repeat steps 2-3 for k.
5. Save the current values of d and k.
6. Go to step 2.

To avoid numerical problems (i.e. underflows or overflows) it is best to work with the logarithm of the unnormalised posterior. This is defined as the sum of the log-prior and the log-likelihood. The R code for the log-unnormalised posterior is

```r
# function returning the logarithm of the unscaled posterior:
# f(d) * f(k) * f(D|d,k)
# by default we set the priors as:
# f(d) = Gamma(d | 2, 20) and f(k) = Gamma(k | 2, .1)
ulnPf <- function(d, k, n=948, ns=84, nv=6, a.d=2, b.d=20, a.k=2, b.k=.1) {
  # The normalizing constant in the prior densities can be ignored:
  lnpriord <- (a.d - 1)*log(d) - b.d * d
  lnpriork <- (a.k - 1)*log(k) - b.k * k

  # log-likelihood (K80 model):
  expd1 <- exp(-4*d/(k+2))
  expd2 <- exp(-2*d*(k+1)/(k+2))
  p0 <- .25 + .25 * expd1 + .5 * expd2
  p1 <- .25 + .25 * expd1 - .5 * expd2
  p2 <- .25 - .25 * expd1
  lnL <- (n - ns - nv) * log(p0/4) + ns * log(p1/4) + nv * log(p2/4)

  # return unnormalised posterior:
  return (lnpriord + lnpriork + lnL)
}
```
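To see why working in log space matters, consider the following small illustration (not part of the tutorial code): the likelihood of n = 948 independent sites is a product of 948 probabilities, which underflows to zero in double precision, while the corresponding log-likelihood is a perfectly ordinary number.

```r
# Illustration of underflow: a product of 948 site probabilities
# underflows in double precision, but its logarithm is finite.
0.25^948          # underflows to 0
948 * log(0.25)   # about -1314.2, a finite number
```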

Note that the joint prior on d and k is the product of two independent gamma densities. Recall that the gamma density is

$$f(x \mid a, b) = \frac{b^a}{\Gamma(a)}\, x^{a-1} e^{-b x},$$

where Γ(a) is the complete gamma function. The term b^a / Γ(a) is a constant factor (i.e. it does not depend on d or k), and so it cancels out in the calculation of the posterior ratio in step 3 of the MCMC; thus it can be ignored in the calculation of the log-prior in `ulnPf`. Ignoring the constant speeds up the code. The calculation of the log-likelihood in `ulnPf` involves only two calls to the exp function, and is slightly faster than the code in Part 1, where the exponentials are recomputed for each probability (for the K80 log-likelihood equation please see Part 1).
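As a quick sanity check (an illustrative sketch, not part of the original tutorial code), we can verify in R that the unnormalised log-prior used in `ulnPf` differs from the full `dgamma` log-density only by this constant:

```r
# Illustrative check, using the default prior f(d) = Gamma(d | 2, 20):
# the unnormalised log-prior differs from dgamma's log-density only by
# the log normalising constant a*log(b) - lgamma(a), which does not
# depend on d and so cancels in the posterior ratio.
a.d <- 2; b.d <- 20
d <- c(0.05, 0.1, 0.2)
full   <- dgamma(d, shape=a.d, rate=b.d, log=TRUE)  # normalised log-density
unnorm <- (a.d - 1)*log(d) - b.d * d                # as used in ulnPf
full - unnorm  # the same constant for every value of d
```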

We will now write the R code implementing the MCMC algorithm above. We will use `d` and `k` for the current values (d and k), and `dprop` and `kprop` for the proposed values (d* and k*), with the corresponding logarithms of the unnormalised posterior stored in `ulnP` and `ulnPprop`. Note that if `lnalpha > 0`, the posterior ratio is greater than one, so α = 1 and the proposal is accepted. The code calculates the probability of acceptance only when `lnalpha < 0`. The corresponding R function for the MCMC algorithm is:

```r
mcmcf <- function(init.d, init.k, N, w.d, w.k) {
  # init.d and init.k are the initial states
  # N is the number of MCMC iterations.
  # w.d is the width of the sliding-window proposal for d.
  # w.k is the width of the sliding-window proposal for k.
  # We keep the visited states (d, k) in sample.d and sample.k
  # for easy plotting. In practical MCMC applications, these
  # are usually written into a file.
  sample.d <- sample.k <- numeric(N+1)

  # STEP 1: Initialise the MCMC chain
  d <- init.d; sample.d[1] <- init.d
  k <- init.k; sample.k[1] <- init.k
  ulnP <- ulnPf(d, k)
  acc.d <- 0; acc.k <- 0  # number of acceptances

  for (i in 1:N) {
    # STEP 2: Propose new state d*
    # we use a uniform sliding window of width w with reflection
    # to propose new values d* and k*
    dprop <- d + runif(1, -w.d/2, w.d/2)
    # reflect if dprop is negative
    if (dprop < 0) dprop <- -dprop
    ulnPprop <- ulnPf(dprop, k)
    lnalpha <- ulnPprop - ulnP

    # STEP 3: Accept or reject the proposal
    # if ru < alpha accept proposed d*:
    if (lnalpha > 0 || runif(1) < exp(lnalpha)) {
      d <- dprop; ulnP <- ulnPprop; acc.d <- acc.d + 1
    }
    # else reject and stay where we are (so that nothing needs
    # to be done).

    # STEP 4: Repeat steps 2-3 for k
    # propose k* and accept/reject the proposal
    kprop <- k + runif(1, -w.k/2, w.k/2)
    # reflect if kprop is negative
    if (kprop < 0) kprop <- -kprop
    ulnPprop <- ulnPf(d, kprop)
    lnalpha <- ulnPprop - ulnP
    # if ru < alpha accept proposed k*:
    if (lnalpha > 0 || runif(1) < exp(lnalpha)) {
      k <- kprop; ulnP <- ulnPprop; acc.k <- acc.k + 1
    }
    # else reject and stay where we are.

    # STEP 5: Save chain state
    sample.d[i+1] <- d; sample.k[i+1] <- k
  }

  # print out the proportion of times the proposals were accepted
  print("Acceptance proportions (d, k):")
  print(c(acc.d/N, acc.k/N))

  # return vector of d and k visited during MCMC
  return (list(d=sample.d, k=sample.k))
}
```
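The sliding-window proposal with reflection used inside the function can be isolated as a small helper for clarity (a sketch only; `propose` is a hypothetical name, and `mcmcf` simply inlines this logic):

```r
# Sketch of the uniform sliding-window proposal with reflection at zero.
# 'propose' is a hypothetical helper; mcmcf inlines the same two steps.
propose <- function(x, w) {
  xnew <- x + runif(1, -w/2, w/2)  # slide uniformly within the window
  if (xnew < 0) xnew <- -xnew      # reflect to keep the parameter positive
  xnew
}
propose(0.1, 0.12)  # a proposed value of d* near 0.1
```

The reflection keeps the proposal symmetric while guaranteeing the proposed value stays in the parameter's valid range (d > 0, k > 0), so no Hastings ratio correction is needed.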

Note that because we use `ulnP` to store the log-posterior for the current state, we make only one call to the `ulnPf` function (to calculate the log unnormalised posterior for the newly proposed state) in each of the two proposals. This halves the amount of computation compared with the alternative strategy of calculating the log unnormalised posterior for both the current and proposed states, which would require two calls to `ulnPf`. Note also that in the syntax `if (a || b)`, used in the two acceptance tests, the expression `a || b` evaluates to TRUE as soon as `a` is TRUE, in which case `b` is not evaluated, saving one call to the exp function. Note that the algorithm is stochastic, so repeated runs will give different results. The function `system.time` in R is useful for measuring the amount of time an R expression takes to evaluate. For example, running the MCMC algorithm above with 10,000 iterations takes about 0.3 seconds on a 2.2 GHz MacBook Air.

```r
# Test run-time:
system.time(mcmcf(0.2, 20, 1e4, .12, 180))  # about 0.3s

# Run again and save MCMC output:
dk.mcmc <- mcmcf(0.2, 20, 1e4, .12, 180)
```

We now plot the ‘traces’ of the parameters. A trace plot is a plot of parameter state vs. MCMC iteration. Trace plots allow us to easily visualise the MCMC chain. They can also show us if there are convergence or mixing problems with the chain (we’ll see this in Part 3):

```r
par(mfrow=c(1,3))

# trace plot of d
plot(dk.mcmc$d, ty='l', xlab="Iteration", ylab="d", main="Trace of d")

# trace plot of k
plot(dk.mcmc$k, ty='l', xlab="Iteration", ylab="k", main="Trace of k")

# We can also plot the joint sample of d and k
# (points sampled from posterior surface)
plot(dk.mcmc$d, dk.mcmc$k, pch='.', xlab="d", ylab="k",
     main="Joint of d and k")
```

Trace plots for the distance, d, and kappa, k, and the plot of the joint sample of d and k from the posterior surface.

The trace plots show that the chain has mixed well (i.e., it has explored the parameter space well). The joint sample of d and k matches the plot of the posterior surface from Part 1. In Part 3 we will see examples of chains with poor mixing. Chains with poor mixing may be the product of poorly chosen proposal densities or bad proposal step sizes. For example, try running the MCMC chain again after modifying the step sizes (the widths) of the proposal densities, `w.d` and `w.k`.
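For instance (an illustrative sketch, assuming `ulnPf` and `mcmcf` as defined above are in the workspace; the variable names and width values here are only examples), shrinking or inflating the window widths changes the acceptance proportions printed by `mcmcf`:

```r
# Illustrative re-runs with modified proposal widths (assumes ulnPf and
# mcmcf above are loaded). Very small widths accept nearly every proposal
# but take tiny steps; very large widths reject most proposals.
# Both extremes produce poorly mixing chains.
dk.mcmc.small <- mcmcf(0.2, 20, 1e4, w.d=0.01, w.k=5)   # acceptance too high
dk.mcmc.large <- mcmcf(0.2, 20, 1e4, w.d=3, w.k=1000)   # acceptance too low

# Compare the traces with the well-tuned run:
plot(dk.mcmc.small$d, ty='l', xlab="Iteration", ylab="d",
     main="Trace of d, small w.d")
```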
