## Large time-step molecular dynamics

Computational modeling of complex chemical systems has become widespread in the molecular sciences both as a tool to rationalize experimental results that are difficult to interpret on their own and as a means of predicting new phenomena and engineering new materials. Examples include predicting the positive and negative effects of certain drugs by modeling their interactions with different enzymes in the body, understanding the workings of proteins by modeling how they fold into their biologically active forms or misfold into shapes that lead to diseases such as Alzheimer’s, Parkinson’s, and cataracts, and accelerating, via the synergy between computation and experiment, the design of novel polymers for use in energy technologies, such as fuel cells, or environmental technologies, such as water desalinators.

One of the most commonly used computational modeling approaches, Molecular Dynamics (MD), starts with the basic Newtonian laws of mechanics and employs high-speed computers to solve the complex equations that generate the motion of each atom in the system. These equations require the input of the complete set interatomic forces, which are often intricate functions of the atomic positions. The accuracy of these forces is often of paramount importance for a reliable model of a complex system, and unfortunately, as a general rule, the more accurate the forces, the greater the time needed to calculate them and run the associated computer simulation. This is a significant bottleneck in computational modeling, as it is typically important to generate very long computer simulations in order to gain a deep understanding of the chemical processes and microscopic mechanisms at work in a complex system.

In principle, the exact interatomic forces arise from the solution of the highly celebrated Schrödinger equation of quantum mechanics that describes the distribution of electrons in the system. However, this equation is so complex that exact solution of it is simply impossible for systems containing more than about ten atoms, and approximations are needed. The approximation employed should be based on the phenomena one hopes to capture with it. The force model employed in this project is an empirical one that captures enough of the physics encoded in the Schrödinger equation to model structure and to approximate dynamics in a complex system provided no chemical bond breaking, i.e., no actual chemistry, occurs, and this is sufficient for many of the problems mentioned above. However, this model still carries a high computational overhead, and it has been our objective in this project to develop algorithms for reducing the cost of these force calculations.

The forces in our model can be broken down into contributions that are rapidly varying yet very fast to calculate and contributions that are slowly varying and highly time-consuming to calculate. Thus, this natural separation of the forces can be exploited to increase the efficiency of the computer simulations by evaluating the slow forces, which are the computational bottleneck, less often than the fast forces. In principle, it should be possible to calculate the slow forces roughly once for every 100-200 evaluations of the fast forces! In practice, however, this ratio can never be achieved with standard algorithms due to numerical stability issues, known as *resonances*, that cause the energy of the system to increase without bounds resulting in a calculation that generates incorrect results. We have developed an algorithm that implements a new energy stabilizing condition that entirely removes this numerical instability, thus allowing the full ratio between fast and slow force evaluations to be achieved. The breakthrough allows simulations that might otherwise take a year to run to be completed in less than one month.

For realistic force fields, the limit on the time step is around 3-5 fs when standard MTS methods are employed. We previously developed a resonance-free MTS integration scheme that solves this problem and allows time steps for the slow forces to be increased to as large as 100 fs, which greatly improves the efficiency of these schemes. The first version of the method, which was purely deterministic, was published in *Phys. Rev. Lett*. **93**, 150201 (2004), and more recently, a stochastic version of the method was published in *Mol. Phys.*. **111**, 3579 (2013). The method has now been adapted for polarizable models, as recently published in *J. Chem. Theor. Comput*. **12**, 2170 (2016). The basic scheme couples an isokinetic constraint with a Nosé-Hoover Langevin scheme. Given a coordinate *q* and velocity *v*, the basic equations of motion are

Here, λ is Lagrange multiplier given by

which enforces the following kinetic energy constraint:

In these equations, *v*_{1} and *v*_{2} are extended phase-space variables that act as a thermostat on the system. These equations can be shown to generate a proper canonical distribution in the potential energy, and because of the isokinetic constraint, no mode is allowed to accumulate too much energy, thereby circumventing resonances. Thus, when these equations are integrated within a multiple time-step algorithm, very large time steps can be achieved.

We demonstrate the performance of the algorithm on MD simulations of a fixed-charge water model, SPC/E. In these simulations, the time step for the bonds and bends is 0.5 fs, that for all non-bonded interactions within 6 Å is 3.0 fs, and then we check to see how large the time step Δ*T *can be chosen for all non-bonded interactions between 6 Å and 12.4 Å. Note that the division into short-range and long-range non-bonded forces is also applied to the reciprocal-space part of the Ewald sum, which is divided into large and small reciprocal-space vectors. The accompanying figure shows the radial distribution functions for different large time steps compared to a benchmark NVT simulation with a single time step of 0.5 fs. We also show the *L*_{1} error of each RDF compared to the benchmark as a function of time for the different large time steps.

Note that there is no change in the quality of the RDFs even if the large time step is 99 fs.

The method has also been adapted for the polarizable AMOEBA model by dividing the self-consistent polarization forces into short-range and long-range contributions, both of which need to be solved self-consistently. In these simulations, we designate the forces to which the Nosé-Hoover Langevin thermostat is directly coupled by XO (coupled to the long-range forces), XM (coupled to the short-range forces), and XI (coupled to the intramolecular forces). The large time step for XO is 75 fs while that for XM and XI is 120 fs. The RDFs are shown in the accompanying figure, together with the hydrogen-bond angles, α, β, and θ, all shown in the figure.

The method has now been implemented in the TINKER package, for which we have assessed the gains in computational time for the different schemes shown in this figure. These timings are shown below.