MD Tutorials – Lets Play with Digital Biology

Molecular dynamics is a computational technique which study the time dependent behavior of bio molecules and other chemical entities. MD is one of the essential tools to study the time dependent behavior of the microscopic system like bio-molecules.

Molecular dynamics simulations permit the study of complex, dynamic processes that occur in biological systems. These include, for example,

Protein stability
Conformational changes
Protein folding
Molecular recognition: proteins, DNA, membranes, complexes
Ion transport in biological systems
Drug Design

Historical background

The molecular dynamics method was first introduced by Alder and Wainwright in the late 1950’s (Alder and Wainwright, 1957,1959) to study the interactions of hard spheres. Many important insights concerning the behavior of simple liquids emerged from their studies. The next major advance was in 1964, when Rahman carried out the first simulation using a realistic potential for liquid argon (Rahman, 1964). The first molecular dynamics simulation of a realistic system was done by Rahman and Stillinger in their simulation of liquid water in 1974 (Stillinger and Rahman, 1974). The first protein simulations appeared in 1977 with the simulation of the bovine pancreatic trypsin inhibitor (BPTI) (McCammon, et al, 1977). Today in the literature, one routinely finds molecular dynamics simulations of solvated proteins, protein-DNA complexes as well as lipid systems addressing a variety of issues including the thermodynamics of ligand binding and the folding of small proteins. The number of simulation techniques has greatly expanded; there exist now many specialized techniques for particular problems, including mixed quantum mechanical – classical simulations, that are being employed to study enzymatic reactions in the context of the full protein. Molecular dynamics simulation techniques are widely used in experimental procedures such as X-ray crystallography and NMR structure determination.

3. Statistical Mechanics

Molecular dynamics simulations generate information at the microscopic level, including atomic positions and velocities. The conversion of this microscopic information to macroscopic observables such as pressure, energy, heat capacities, etc., requires statistical mechanics. Statistical mechanics is fundamental to the study of biological systems by molecular dynamics simulation. In this section, we provide a brief overview of some main topics. For more detailed information, refer to the numerous excellent books available on the subject.

Introduction to Statistical Mechanics:

In a molecular dynamics simulation, one often wishes to explore the macroscopic properties of a system through microscopic simulations, for example, to calculate changes in the binding free energy of a particular drug candidate, or to examine the energetics and mechanisms of conformational change. The connection between microscopic simulations and macroscopic properties is made via statistical mechanics which provides the rigorous mathematical expressions that relate macroscopic properties to the distribution and motion of the atoms and molecules of the N-body system; molecular dynamics simulations provide the means to solve the equation of motion of the particles and evaluate these mathematical formulas. With molecular dynamics simulations, one can study both thermodynamic properties and/or time dependent (kinetic) phenomenon.

Statistical mechanics is the branch of physical sciences that studies macroscopic systems from a molecular point of view. The goal is to understand and to predict macroscopic phenomena from the properties of individual molecules making up the system. The system could range from a collection of solvent molecules to a solvated protein-DNA complex. In order to connect the macroscopic system to the microscopic system, time independent statistical averages are often introduced. We start this discussion by introducing a few definitions.

Definitions

The thermodynamic state of a system is usually defined by a small set of parameters, for example, the temperature, T, the pressure, P, and the number of particles, N. Other thermodynamic properties may be derived from the equations of state and other fundamental thermodynamic equations.

The mechanical or microscopic state of a system is defined by the atomic positions, q, and momenta, p; these can also be considered as coordinates in a multidimensional space called phase space. For a system of N particles, this space has 6N dimensions. A single point in phase space, denoted by G, describes the state of the system. An ensemble is a collection of points in phase space satisfying the conditions of a particular thermodynamic state. A molecular dynamics simulations generates a sequence of points in phase space as a function of time; these points belong to the same ensemble, and they correspond to the different conformations of the system and their respective momenta. Several different ensembles are described below.

An ensemble is a collection of all possible systems which have different microscopic states but have an identical macroscopic or thermodynamic state.There exist different ensembles with different characteristics.

Calculating Averages from a Molecular Dynamics Simulation

An experiment is usually made on a macroscopic sample that contains an extremely large number of atoms or molecules sampling an enormous number of conformations. In statistical mechanics, averages corresponding to experimental observables are defined in terms of ensemble averages; one justification for this is that there has been good agreement with experiment. An ensemble average is average taken over a large number of replicas of the system considered simultaneously.

In statistical mechanics, average values are defined as ensemble averages.

The ensemble average is given by

where

is the observable of interest and it is expressed as a function of the momenta, p, and the positions, r, of the system. The integration is over all possible variables of r and p.

The probability density of the ensemble is given by

where H is the Hamiltonian, T is the temperature, kB is Boltzmann’s constant and Q is the partition function

This integral is generally extremely difficult to calculate because one must calculate all possible states of the system. In a molecular dynamics simulation, the points in the ensemble are calculated sequentially in time, so to calculate an ensemble average, the molecular dynamics simulations must pass through all possible states corresponding to the particular thermodynamic constraints.

Another way, as done in an MD simulation, is to determine a time average of A, which is expressed as

where t is the simulation time, M is the number of time steps in the simulation and A(pN,rN) is the instantaneous value of A.

The dilemma appears to be that one can calculate time averages by molecular dynamics simulation, but the experimental observables are assumed to be ensemble averages. Resolving this leads us to one of the most fundamental axioms of statistical mechanics, the ergodic hypothesis, which states that the time average equals the ensemble average.

The Ergodic hypothesis states

Ensemble average = Time average

The basic idea is that if one allows the system to evolve in time indefinitely, that system will eventually pass through all possible states. One goal, therefore, of a molecular dynamics simulation is to generate enough representative conformations such that this equality is satisfied. If this is the case, experimentally relevant information concerning structural, dynamic and thermodynamic properties may then be calculated using a feasible amount of computer resources. Because the simulations are of fixed duration, one must be certain to sample a sufficient amount of phase space.

Some examples of time averages:

Average potential energy

where M is the number of configurations in the molecular dynamics trajectory and Vi is the potential energy of each configuration.

Average kinetic energy

where M is the number of configurations in the simulation, N is the number of atoms in the system, mi is the mass of the particle i and vi is the velocity of particle i.

A molecular dynamics simulation must be sufficiently long so that enough representative conformations have been sampled.

Classical Mechanics

The molecular dynamics simulation method is based on Newton’s second law or the equation of motion, F=ma, where F is the force exerted on the particle, m is its mass and a is its acceleration. From a knowledge of the force on each atom, it is possible to determine the acceleration of each atom in the system. Integration of the equations of motion then yields a trajectory that describes the positions, velocities and accelerations of the particles as they vary with time. From this trajectory, the average values of properties can be determined. The method is deterministic; once the positions and velocities of each atom are known, the state of the system can be predicted at any time in the future or the past. Molecular dynamics simulations can be time consuming and computationally expensive. However, computers are getting faster and cheaper. Simulations of solvated proteins are calculated up to the nanosecond time scale, however, simulations into the millisecond regime have been reported.

Newton’s equation of motion is given by

where F_i is the force exerted on particle i, m_iis the mass of particle i and a_iis the acceleration of particle i. The force can also be expressed as the gradient of the potential energy,

Combining these two equations yields

where V is the potential energy of the system. Newton’s equation of motion can then relate the derivative of the potential energy to the changes in position as a function of time.

Newton’s Second Law of motion: a simple application

Taking the simple case where the acceleration is constant,

we obtain an expression for the velocity after integration

and since

we can once again integrate to obtain

Combining this equation with the expression for the velocity, we obtain the following relation which gives the value of x at time t as a function of the acceleration, a, the initial position, x₀ , and the initial velocity, v₀..

The acceleration is given as the derivative of the potential energy with respect to the position, r,

Therefore, to calculate a trajectory, one only needs the initial positions of the atoms, an initial distribution of velocities and the acceleration, which is determined by the gradient of the potential energy function. The equations of motion are deterministic, e.g., the positions and the velocities at time zero determine the positions and velocities at all other times, t. The initial positions can be obtained from experimental structures, such as the x-ray crystal structure of the protein or the solution structure determined by NMR spectroscopy.

The initial distribution of velocities are usually determined from a random distribution with the magnitudes conforming to the required temperature and corrected so there is no overall momentum, i.e.,

The velocities, vi, are often chosen randomly from a Maxwell-Boltzmann or Gaussian distribution at a given temperature, which gives the probability that an atom i has a velocity v_xin the x direction at a temperature T.

The temperature can be calculated from the velocities using the relation

where N is the number of atoms in the system.