I Multidimensional integrals, Monte Carlo method
The Monte Carlo method is a very powerful numerical technique used to evaluate multidimensional integrals in statistical mechanics and other branches of physics and chemistry. It is also used when initial conditions are chosen in classical reaction dynamics calculations, as we have discussed in Chapter 4. It will therefore be appropriate here to give a brief introduction to the method and the ideas behind it.
From Appendix A on statistical mechanics we have seen that the thermodynamic average value of an observable A is given by the expression
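The expression referred to as Eq. (I.1) is not reproduced in this extract; in standard canonical-ensemble notation it would read (a reconstruction based on the surrounding text):

```latex
\langle A \rangle =
\frac{\int\!\!\int A(p,q)\, e^{-H(p,q)/k_{\mathrm B}T}\, \mathrm{d}p\, \mathrm{d}q}
     {\int\!\!\int e^{-H(p,q)/k_{\mathrm B}T}\, \mathrm{d}p\, \mathrm{d}q}
\tag{I.1}
```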
In a system with N atoms there are 3N momenta p and position coordinates q, so the integrals are 6N-dimensional. A(p, q) is an observable depending on the coordinates and momenta of the atoms and H(p, q) is the Hamiltonian of the system. Since the kinetic energy term in H is quadratic in the momenta, the integration over momenta can be carried out analytically. Hence, averages of functions that depend on momenta only are usually easy to evaluate. The very difficult problem is the computation of averages of functions depending on the positions q. Except for a few special cases, it is impossible to compute the 3N-dimensional configurational integral analytically, and in all other cases numerical techniques must be used.
The most straightforward approach may appear to be an evaluation of the integrals by numerical quadrature, for instance using Simpson's rule. It is, however, easy to see that such a method quickly becomes hopeless to use. Suppose, for example, that we consider a system with N atoms. Let us assume that we take m equidistant points along each of the 3N Cartesian axes. The total number of points at which the integrand must be evaluated is then equal to m^(3N). For all but the smallest systems this number becomes very large, even for small values of m. For instance, if we take m = 5 in a system with 100 atoms, then we need to evaluate the integrand at 5^300 ≈ 10^210 points! This clearly demonstrates that better numerical techniques are needed to compute thermal averages. One such technique is the Monte Carlo method or, more precisely, the Monte Carlo importance sampling algorithm introduced by Metropolis et al. in 1953, when simulating the neutron flux in the core of nuclear reactors.
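The size of this number is easily checked with a few lines of Python (an illustration, not part of the original text):

```python
import math

m = 5          # quadrature points per Cartesian axis
N = 100        # number of atoms
dims = 3 * N   # 3N coordinates, so the grid has m**dims points

# 5**300 = 10**(300 * log10(5)); the exponent is about 210
exponent = dims * math.log10(m)
print(f"m**(3N) = 5**300 is about 10**{exponent:.0f}")
```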
I.1 Random sampling and importance sampling
The Monte Carlo method includes both a random sampling scheme and an importance sampling scheme. Both sampling schemes have been used in Section 4.1 on classical trajectory calculations.
Let us first look at the simple random sampling scheme. Suppose we want to evaluate the one-dimensional integral
A simple quadrature scheme for evaluating this integral may look like
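Neither the integral nor the quadrature formula, Eqs. (I.2) and (I.3), appears in this extract; in standard notation they would read (a reconstruction, with x_i denoting L equidistant points in [a, b]):

```latex
I = \int_a^b f(x)\,\mathrm{d}x \tag{I.2}

I \approx \frac{b-a}{L}\sum_{i=1}^{L} f(x_i) \tag{I.3}
```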
In brute force Monte Carlo, this average is determined by evaluating f(x) at a large number, say L, of x values randomly distributed in the interval [a, b]. This is equivalent to giving each x value the same weight in the summation, just like in the average in Eq. (I.3). No x values are preferred to others, and in the limit as L → ∞ there will be the same number of x values in each interval, no matter where the interval is chosen between a and b. Hence, an estimate of the integral may be found from
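As a concrete illustration of brute force Monte Carlo (not part of the original text; the test integral, function name, and sample size are chosen for the example), the estimate can be sketched in Python for the integral of x² over [0, 1], whose exact value is 1/3:

```python
import random

def mc_brute_force(f, a, b, L, seed=0):
    """Estimate the integral of f over [a, b] by averaging f
    at L uniformly distributed random points."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(L):
        x = rng.uniform(a, b)   # every x value carries the same weight
        total += f(x)
    return (b - a) * total / L  # (b - a) times the average of f

estimate = mc_brute_force(lambda x: x * x, 0.0, 1.0, 100_000)
print(estimate)  # close to the exact value 1/3
```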
However, like conventional quadrature, this method is of little use for the evaluation of averages such as in Eq. (I.1), because most of the computing time is spent at points where the Boltzmann factor is negligible and the contribution to the integral therefore vanishingly small. Obviously, it would be preferable to sample many points in regions where the Boltzmann factor is non-negligible and few elsewhere. This is the basic idea behind importance sampling.
That is, instead of sampling the independent variable x in Eq. (I.2), or the position coordinates q in Eq. (I.1), uniformly as in brute force Monte Carlo, we should choose the independent variables according to some distribution function; that is, some values are chosen more frequently than others.
Let us begin with the one-dimensional case in Eq. (I.2). Suppose we want to compute the definite integral by Monte Carlo importance sampling with sampling points distributed non-uniformly over the interval [a, b], according to some non-negative normalized probability density w(x), that is,
Clearly, we may multiply and divide the integrand by w(x), and thereby rewrite the integral in the form
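The two relations referred to above (the normalization of w and the rewritten integral) are elided in this extract; a reconstruction in standard notation:

```latex
\int_a^b w(x)\,\mathrm{d}x = 1, \qquad w(x) \ge 0

I = \int_a^b \frac{f(x)}{w(x)}\, w(x)\,\mathrm{d}x
```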
Let us now assume that there exists a function u(x) satisfying the relation
In Eq. (I.8) we have written x(u) to indicate that, if we consider u as the integration variable, then x must be expressed as a function of u by inverting the relation u = u(x). The expression in Eq. (I.8) is now similar to the expression in Eq. (I.3) and may be evaluated by generating L random values u_i of u in the interval [u(a), u(a) + 1]. We then obtain the following estimate of the integral:
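The relations referred to as Eqs. (I.7)–(I.9) are elided in this extract; with u defined as the cumulative of w, so that u(b) = u(a) + 1 by normalization, they would read (a reconstruction):

```latex
u(x) = u(a) + \int_a^x w(x')\,\mathrm{d}x', \qquad
\frac{\mathrm{d}u}{\mathrm{d}x} = w(x) \tag{I.7}

I = \int_{u(a)}^{u(a)+1} \frac{f(x(u))}{w(x(u))}\,\mathrm{d}u \tag{I.8}

I \approx \frac{1}{L}\sum_{i=1}^{L} \frac{f(x(u_i))}{w(x(u_i))} \tag{I.9}
```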
An obvious question now is: what have we gained by rewriting the integral in Eq. (I.2) in the form of Eq. (I.8)? To see this, let us form the variance σ^2 in the estimate of the integral. We find
Here the bracket ⟨…⟩ denotes the true average as obtained for L → ∞. Since the different samples i and j are random and therefore statistically independent, all cross terms between i and j vanish when we evaluate the expression in Eq. (I.10). It may therefore be written
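The resulting expression, Eq. (I.11), is elided in this extract; with ⟨…⟩ denoting the average over the density w, it would read (a reconstruction):

```latex
\sigma^2 = \frac{1}{L}\left[
  \left\langle \left(\frac{f}{w}\right)^{2} \right\rangle
  - \left\langle \frac{f}{w} \right\rangle^{2}
\right] \tag{I.11}
```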
This shows that the variance behaves as 1/L, independent of the formulation of the integral, which is clear when we set w(x) = 1. However, we may reduce the variance significantly, for a given L, by a proper choice of w. For instance, if w ∝ f, so that f/w is a constant, then the variance will be zero. This is the ideal situation, but if we choose a w such that the ratio f/w is a smoothly varying function, we may still get a very small variance. In contrast, if w(x) is chosen constant, as in brute force Monte Carlo sampling, there may be large fluctuations in the ratio f/w, which is equivalent to a poor determination of the integral.
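The ideal case w ∝ f can be demonstrated in a short Python sketch (an illustration, not from the original text): estimating the integral of f(x) = x² over [0, 1] with w(x) = 3x², where the sampling points are generated by inverting u = x³.

```python
import random

def importance_sample(L, seed=0):
    """Estimate the integral of f(x) = x**2 on [0, 1] using w(x) = 3x**2.

    The cumulative of w is u(x) = x**3, so x(u) = u**(1/3) turns
    uniform random numbers u into x values distributed as w(x).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(L):
        u = 1.0 - rng.random()            # uniform in (0, 1], avoids x = 0
        x = u ** (1.0 / 3.0)              # invert u = u(x)
        total += (x * x) / (3.0 * x * x)  # f/w = 1/3 for every sample
    return total / L

print(importance_sample(1000))  # essentially exact: f/w is constant, zero variance
```

Because f/w is the same for every sample, the estimate is exact for any L, whereas the brute force estimate of the same integral fluctuates as 1/sqrt(L).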
This becomes even clearer when we consider the multidimensional integrals encountered in statistical mechanics. There, only a very small fraction ρ of the points in phase space is accessible, that is, corresponds to states with a non-zero Boltzmann factor. So, even if L is very large, only ρL of the points will fall in such a region; this corresponds to sampling the function with only ρL points, and thus to a variance which behaves as 1/(ρL) rather than 1/L. The variance in a random sampling may indeed be very large: for a liquid of 100 atoms it has been estimated that the Boltzmann factor will be non-zero for only about 1 out of 10^260 points in phase space, that is, ρ = 10^(-260).
Hence, it would clearly be advisable to carry out a non-uniform Monte Carlo importance sampling of configuration space with a w approximately proportional to the Boltzmann factor.
Unfortunately, the simple importance sampling as described above cannot be used to sample multidimensional integrals over configuration space as in Eq. (I.1). The reason is simply that we do not know how to construct the transformation in Eq. (I.7) that will enable us to generate points in configuration space with a probability density as given by the Boltzmann factor. In fact, in order to do so, we must be able to compute analytically the partition function of the system. If we could do that, there would hardly be any need for computer simulations!
In many cases in statistical mechanics, we are not interested in the configurational part of the partition function itself, but in averages of the type in Eq. (I.1), where a ratio between integrals is involved. Metropolis et al. showed that it is possible to devise an efficient Monte Carlo scheme to sample such a ratio even when the normalization of the probability density P(q) in configuration space is unknown:
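The average and the probability density P(q) referred to above are not reproduced in this extract; in standard notation, with U(q) the potential energy, they would read (a reconstruction):

```latex
\langle A \rangle = \int A(q)\, P(q)\,\mathrm{d}q, \qquad
P(q) = \frac{e^{-U(q)/k_{\mathrm B}T}}{\int e^{-U(q)/k_{\mathrm B}T}\,\mathrm{d}q}
```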