Carbon is a unique and, in many ways, a fundamental element. It can form several different structures, or allotropes. Up to the end of the 1970s, diamond, graphite and graphite-based fibres were the only known forms of carbon assemblies. The discovery in 1985, and subsequent synthesis, of the class of cage-like fullerene molecules, starting with the discovery of the spherically shaped C60 buckyball molecule composed of 20 hexagonal and 12 pentagonal carbon rings, led to the emergence of the third form of condensed carbon. By the beginning of the 1990s, elongated forms of the fullerene molecule, consisting of several layers of graphene sheets, i.e. graphite basal planes, rolled into multi-walled cylindrical structures and closed at both ends with caps containing pentagonal carbon rings, were discovered. Soon afterwards, another form of these cylindrical structures, made from only a single graphene sheet, was also discovered. These two structures have come to be known as carbon nanotubes, and they form the fourth allotrope of carbon.
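The ring counts quoted for the C60 cage can be checked against Euler's polyhedron formula, V - E + F = 2. The short sketch below is only an arithmetic illustration: the function name `cage_counts` is hypothetical, and it assumes a closed cage in which every atom belongs to exactly three rings and every bond is shared by exactly two rings.

```python
def cage_counts(n_pentagons, n_hexagons):
    """Return (vertices, edges, faces) for a closed cage of pentagons and hexagons.

    Assumes each atom (vertex) is shared by three rings and each
    bond (edge) is shared by two rings.
    """
    ring_corners = 5 * n_pentagons + 6 * n_hexagons
    faces = n_pentagons + n_hexagons
    edges = ring_corners // 2      # every bond is counted by two rings
    vertices = ring_corners // 3   # every atom is counted by three rings
    return vertices, edges, faces


vertices, edges, faces = cage_counts(n_pentagons=12, n_hexagons=20)
euler_characteristic = vertices - edges + faces
print(vertices, edges, faces, euler_characteristic)
```

With 12 pentagons and 20 hexagons the counts are 60 atoms, 90 bonds and 32 rings, so the Euler characteristic is 2, as required for a closed, sphere-like surface.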
Three experimental techniques have so far become available for the growth of carbon nanotubes. These are the arc-discharge technique, the laser ablation technique and, recently, the chemical vapour deposition technique. The use of these techniques has led to the worldwide availability of this material, and this has ushered in new, very active and truly revolutionary areas of fundamental and applied research within many diverse and already established fields of science and technology, and also within the new sciences and technologies associated with the new century.
Fluid flow through nanoscopic structures, such as carbon nanotubes, is very different from the corresponding flow through microscopic and macroscopic structures. For example, the flow of fluids through nanomachines is expected to be fundamentally different from the flow through large-scale machines since, for the latter flow, the atomistic degrees of freedom of the fluids can be safely ignored, and the flow in such structures can be characterised by viscosity, density and other bulk properties. Furthermore, for flows through large-scale systems, the no-slip boundary condition is often implemented, according to which the velocity of the fluid relative to the wall vanishes at the fluid–wall boundary.
Reducing the length scales immediately introduces new phenomena, such as diffusion, into the physics of the problem, in addition to the fact that at nanoscopic scales the motion of both the walls and the fluid, and their mutual interaction, must be taken into account. It is interesting to note that the movement of the walls is strongly size-dependent. On the conceptual front, the use of standard classical concepts, such as pressure and viscosity, might also be ambiguous at nanoscopic scales, since, for example, the surface area of a nanostructure, such as a nanotube, may not be amenable to a precise definition.
In Chapter 9 we saw how to perform molecular dynamics simulations, although they were not very sophisticated because there was no means of including the effect of the environment. Ways of overcoming this limitation were introduced in the last chapter, in which we discussed methods for calculating the energy and its derivatives for a system within the PBC approximation. As an example, a molecular dynamics simulation for a periodically replicated cubic box of water molecules was performed. Here, we shall start by describing in more detail the type of information that can be computed from molecular dynamics trajectories and also by indicating how the quality of that information can be assessed. Later we shall talk about more advanced molecular dynamics techniques including those that allow simulations to be carried out in various thermodynamic ensembles and those that permit the calculation of free energies.
Analysis of molecular dynamics trajectories
We have seen how to generate trajectories of data for a system, either in vacuum or with an environment, with the molecular dynamics technique. The point of performing a simulation is, of course, that we want to use these data to calculate some interesting quantities, preferably those which can be compared with experimentally measured ones. The aim of this section is to give a brief overview of some of the techniques that can be used to analyse molecular dynamics trajectories and some of the types of quantities that can be calculated.
In Section 9.4 we defined a time series for a property as a sequence of values for the property obtained from successive frames of a molecular dynamics trajectory.
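As a minimal sketch of how the quality of an average taken over such a time series might be assessed, the example below estimates a standard error by block averaging. Block averaging is a common technique that is assumed here for illustration and is not necessarily the method used in the book; the function names and the short series of numbers are hypothetical.

```python
import math


def mean(values):
    """Arithmetic mean of a sequence of numbers."""
    return sum(values) / len(values)


def block_standard_error(values, n_blocks):
    """Estimate the standard error of the mean from block averages.

    The series is cut into n_blocks consecutive blocks of equal length
    (trailing frames that do not fill a block are dropped) and the
    scatter of the block means is used as the error estimate.
    """
    block_length = len(values) // n_blocks
    block_means = [
        mean(values[i * block_length:(i + 1) * block_length])
        for i in range(n_blocks)
    ]
    grand_mean = mean(block_means)
    variance = sum((m - grand_mean) ** 2 for m in block_means) / (n_blocks - 1)
    return math.sqrt(variance / n_blocks)


# Hypothetical property values taken from successive trajectory frames.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
average = mean(series)
error = block_standard_error(series, n_blocks=4)
print(average, error)
```

Successive frames of a trajectory are usually correlated, so the blocks should be long compared with the correlation time of the property if the estimate is to be meaningful.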
This edition of A Practical Introduction has two major differences from the previous one. The first is a discussion of quantum chemical and hybrid potential methods for calculating the potential energies of molecular systems. Quantum chemical approaches are more costly than molecular mechanical techniques but are, in principle, more ‘exact’ and greatly extend the types of phenomena that can be studied with the other algorithms described in the book. The second difference is the replacement of Fortran 90 by Python as the language in which the Dynamo module library and the book's computer programs are written. This change was intended to make the library more accessible and easier to use. As well as these major changes, there have been many minor modifications, some of which I wanted to make myself but many that were inspired by the suggestions of readers of the first edition.
Once again, I would like to acknowledge my collaborators at the Institut de Biologie Structurale in Grenoble and elsewhere for their comments and feedback. Special thanks go to all members, past and present, of the Laboratoire de Dynamique Moléculaire at the IBS, to Konrad Hinsen at the Centre de Biophysique Moléculaire in Orléans and to Troy Wymore at the Pittsburgh Supercomputing Center. I would also like to thank Michelle Carey, Anna Littlewood and the staff of Cambridge University Press for their help during the preparation of this edition, Johny Sebastian from TechBooks for answers to my many LaTeX questions, and, of course, my family for their support.
In the last chapter we explored various ways of specifying the composition of a molecular system. Many of these representations contained not only information about the number and type of atoms in the system but also the atoms' coordinates, which are an essential element for most molecular simulation studies. Given the nature of the system's atoms and their coordinates the molecular structure of the system is known and it is possible to deduce information about the system's physical properties and its chemistry. The generation of sets of coordinates for particular systems is the major goal of a number of important experimental techniques, including X-ray crystallography and NMR spectroscopy, and there are data banks, such as the PDB and the Cambridge Structural Database (CSD), that act as repositories for the coordinate sets of molecular systems obtained by such methods.
There are several alternative coordinate systems that can be used to define the atom positions. For the most part in this book, Cartesian coordinates are employed. These give the absolute position of an atom in three-dimensional space in terms of its x, y and z coordinates. Other schemes include crystallographic coordinates in which the atom positions are given in a coordinate system that is based upon the crystallographic symmetry of the system and internal coordinates that define the position of an atom by its position relative to a number of other atoms (usually three).
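The relation between the two descriptions can be illustrated by computing internal coordinates (a distance, an angle and a dihedral) from Cartesian positions. The sketch below uses only the Python standard library; the function names are hypothetical and are not part of pDynamo, and the sign convention chosen for the dihedral is an assumption, since several conventions are in use.

```python
import math


def sub(a, b):
    """Component-wise difference a - b of two 3-vectors."""
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def distance(p, q):
    """Distance between atoms p and q."""
    return norm(sub(p, q))


def angle(p, q, r):
    """Angle p-q-r in degrees, measured at the central atom q."""
    u, v = sub(p, q), sub(r, q)
    return math.degrees(math.acos(dot(u, v) / (norm(u) * norm(v))))


def dihedral(p, q, r, s):
    """Dihedral angle p-q-r-s in degrees (one possible sign convention)."""
    b1, b2, b3 = sub(q, p), sub(r, q), sub(s, r)
    n1, n2 = cross(b1, b2), cross(b2, b3)      # normals to the two planes
    b2_unit = tuple(x / norm(b2) for x in b2)
    m1 = cross(n1, b2_unit)
    return math.degrees(math.atan2(dot(m1, n2), dot(n1, n2)))


# Hypothetical four-atom chain with positions in arbitrary length units.
a, b, c, d = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)
print(distance(a, b), angle(a, b, c), dihedral(a, b, c, d))
```

Three such quantities, defined relative to three previously placed atoms, are enough to fix the position of a fourth atom, which is why internal coordinates are usually specified with respect to three reference atoms.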
The aim of this chapter is to describe the various ways in which coordinates can be analysed and manipulated. Because numerous analyses can be performed on a set of coordinates, only a sampling of some of the more common ones will be covered here.
The reason that I have written this book is simple. It is the book that I would have liked to have had when I was learning how to carry out simulations of complex molecular systems. There was certainly no lack of information about the theory behind the simulations but this was widely dispersed in the literature and I often discovered it only long after I needed it. Equally frustrating, the programs to which I had access were often poorly documented, sometimes not at all, and so they were difficult to use unless the people who had written them were available and preferably in the office next door! The situation has improved somewhat since then (the 1980s) with the publication of some excellent monographs but these are primarily directed at simple systems, such as liquids or Lennard-Jones fluids, and do not address many of the problems that are specific to larger molecules.
My goal has been to provide a practical introduction to the simulation of molecules using molecular mechanical potentials. After reading the book, readers should have a reasonably complete understanding of how such simulations are performed, how the programs that perform them work and, most importantly, how the example programs presented in the text can be tailored to perform other types of calculation. The book is an introduction aimed at advanced undergraduates, graduate students and confirmed researchers who are newcomers to the field. It does not purport to cover comprehensively the entire range of molecular simulation techniques, a task that would be difficult in 300 or so pages.
We saw in Chapter 7 how it was possible to explore relatively small parts of a potential energy surface and in Chapter 8 how to use some of this information to obtain approximate dynamical and thermodynamic information about a system. These methods, though, are local – they consider only a limited portion of the potential energy surface and the dynamics of the system within it. It is possible to go beyond these ‘static’ approximations to study the dynamics of a system directly. Some of these techniques will be introduced in the present chapter.
Molecular dynamics
As we discussed in Chapter 4, complete knowledge of the behaviour of a system can be obtained, in principle, by solving its time-dependent Schrödinger equation (Equation (4.1)), which governs the dynamics of all the particles in the system, both electrons and nuclei. To progress in the solution of this equation we introduced the Born–Oppenheimer approximation, which allows the electronic and the nuclear problems to be treated separately. This separation leads to the concept of a potential energy surface, which is the effective potential that the nuclei experience once the electronic problem has been solved. In principle, it is possible to study the dynamics of the nuclei under the influence of the effective electronic potential using an equivalent equation to Equation (4.1) but for the nuclei only. This can be done for systems consisting of a very small number of particles but proves impractical otherwise.
Fortunately, whereas it is difficult or impossible to study the dynamics of the system quantum mechanically, a classical dynamical study is relatively straightforward and provides much useful information.
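To give a flavour of what a classical dynamical study involves, the sketch below integrates Newton's equation of motion for a single particle in one dimension. The velocity Verlet integrator and the harmonic force used here are assumptions chosen for illustration; the passage above does not name a particular algorithm, and the function names and parameter values are hypothetical.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Propagate position x and velocity v for n_steps of length dt.

    force is a function returning the force at a given position.
    Returns the list of (x, v) pairs, including the starting point.
    """
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory


# Hypothetical harmonic oscillator in reduced units (mass = k = 1).
k, mass = 1.0, 1.0


def harmonic_force(x):
    return -k * x


def total_energy(x, v):
    return 0.5 * mass * v * v + 0.5 * k * x * x


trajectory = velocity_verlet(1.0, 0.0, harmonic_force, mass, dt=0.01, n_steps=1000)
e_start = total_energy(*trajectory[0])
e_end = total_energy(*trajectory[-1])
print(e_start, e_end)
```

The total energy should stay very nearly constant along the trajectory, which is a useful first check on the timestep and on the integrator.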
In the last chapter we dealt with how to manipulate a set of coordinates and how to compare the structures defined by two sets of coordinates. This is useful for distinguishing between two different structures but it gives little indication regarding which structure is the more probable; i.e. which structure is most likely to be found experimentally. To do this, it is necessary to be able to evaluate the intrinsic stability of a structure, which is determined by its potential energy. The differences between the energies of different structures, their relative energies, will then determine which structure is the more stable and, hence, which structure is most likely to be observed.
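The link between relative energies and the likelihood of observing a structure can be made concrete with Boltzmann weights. The sketch below is a simplified illustration under stated assumptions: it uses potential energies only, neglecting entropic and degeneracy contributions, and the function name, the energies and the temperature are hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618e-3  # kJ mol^-1 K^-1


def boltzmann_populations(energies, temperature=300.0):
    """Relative populations of structures from their energies in kJ/mol.

    Only energy differences matter, so energies are shifted by the
    minimum before exponentiation to avoid numerical overflow.
    """
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / (GAS_CONSTANT * temperature))
               for e in energies]
    total = sum(weights)
    return [w / total for w in weights]


# Two hypothetical structures separated by 5 kJ/mol.
populations = boltzmann_populations([0.0, 5.0], temperature=300.0)
print(populations)
```

The lower-energy structure receives the larger population, and adding the same constant to every energy leaves the result unchanged.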
This chapter starts off by giving an overview of the various strategies that are available for calculating the potential energy of molecular systems and then goes on to describe a specific class of techniques based upon the application of the theory of quantum mechanics to chemical systems.
The Born–Oppenheimer approximation
Quantum mechanics was developed during the first decades of the twentieth century as a result of shortcomings in the existing classical mechanics and, as far as is known, it is adequate to explain all atomic and molecular phenomena. In an oft-quoted, but nevertheless pertinent, remark, P. A. M. Dirac, one of the founders of quantum mechanics, said in 1929:
The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble. It therefore becomes desirable that approximate practical methods of applying quantum mechanics should be developed, which can lead to an explanation of the main features of complex atomic systems without too much computation.
The last two chapters considered two distinct classes of methods for calculating the potential energy of a system. Chapter 4 discussed QC techniques. These are, in principle, the most ‘exact’ methods but they are expensive and so are limited to studying systems with relatively small numbers of atoms. MM approaches were introduced in Chapter 5. These represent the interactions between particles in a simpler way than do QC methods and so are more rapid and, hence, applicable to much larger systems. They have the disadvantage, though, of being unsuitable for treating some processes, notably chemical reactions. Hybrid potentials, which are described in this chapter, seek to overcome some of the limitations of QC and MM methods by putting them together.
Combining QC and MM potentials
Although a hybrid potential is, in principle, any method that employs different potentials to treat a system, either spatially or temporally, the potentials we focus upon in this chapter use a combination of QC and MM techniques. These methods are also known as QC/MM or QM/MM potentials. The first potential of this type was developed in the 1970s by A. Warshel and M. Levitt who were studying the mechanism of the chemical reaction catalyzed by the enzyme lysozyme. Enzymes are proteins that can greatly accelerate the rate of certain chemical reactions. How they achieve this is still a matter of active research but the reaction itself occurs when the substrate species are bound close together in a specific part of the enzyme called the active site.
The aim of this book is to give a practical introduction to performing simulations of molecular systems. This is accomplished by summarizing the theory underlying the various types of simulation method and providing a programming library, called pDynamo, which can be used to perform the calculations that are described. The style of the book is pragmatic. Each chapter, in general, contains some theory about related simulation topics together with descriptions of example programs that illustrate their use. Suggestions for further work (or exercises) are listed at the end.
By the end of the book, readers should have a good idea of how to simulate molecular systems as well as some of the difficulties that are involved. The pDynamo library should also be a reasonably convenient starting point for those wanting to write programs to study the systems they are interested in. The fact that users have to write their own programs to do their simulations has advantages and disadvantages. The major advantage is flexibility. Many molecular modeling programs come with interfaces that supply only a limited range of options. In contrast, the simulation algorithms in pDynamo can be combined arbitrarily and much of the data generated by the program is available for analysis. The drawback is that the programs have to be written – a task that many readers may not be familiar with or have little inclination to do themselves. However, those who fall into the latter category are urged to read on. pDynamo has been designed to be easy to use and should be accessible to everyone even if they have only a minimum amount of computing experience.
Models and representations are crucial in all areas of science. They are employed whenever one thinks about an object or a phenomenon and whenever one wants to interpret or to predict how it is going to behave. Models need not be unique. In fact, it is normal to have multiple representations of varying complexity and to choose the one that is most appropriate given the circumstances.
The same is true when thinking about chemical and molecular systems. Representations encompass the traditional chemical ones that employ atoms and bonds, and also span the range from the very fundamental, in which a molecule is considered as a collection of nuclei and electrons, through to the highly abstract, in which a molecule is treated as a chemical graph. Several of these are illustrated in Figure 2.1.
The purpose of this chapter is twofold. First, we introduce the way in which pDynamo represents and manipulates molecular systems. Of course, we can only start to discuss these topics here as pDynamo has a diversity of approaches that will take the rest of the book to explain. Second, we describe several common molecular representations that are used in modeling and simulation studies and show how pDynamo can transform between them.
The System class
System is the central class of the pDynamo program and it will be used directly or indirectly in most of the examples in this book. Its purpose is to gather together the data that are required to represent, to model and to simulate a molecular system. The fundamental data in a system are the sequence of atoms that it contains.
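The idea of a container whose fundamental data are a sequence of atoms can be sketched with a toy data structure. The classes below, `ToyAtom` and `ToySystem`, are hypothetical and written only for illustration; they are not pDynamo's System class and make no claim about its attributes or methods.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class ToyAtom:
    """A hypothetical atom record identified by its atomic number."""
    atomic_number: int
    label: str = ""


@dataclass
class ToySystem:
    """A hypothetical container gathering the data that describe a system."""
    atoms: List[ToyAtom] = field(default_factory=list)
    label: str = ""

    def element_counts(self) -> Dict[int, int]:
        """Count the atoms of each element, keyed by atomic number."""
        counts: Dict[int, int] = {}
        for atom in self.atoms:
            counts[atom.atomic_number] = counts.get(atom.atomic_number, 0) + 1
        return counts


# A water molecule as an ordered sequence of three atoms.
water = ToySystem(atoms=[ToyAtom(8, "O"), ToyAtom(1, "H1"), ToyAtom(1, "H2")],
                  label="Water")
print(len(water.atoms), water.element_counts())
```

Other data, such as coordinates or an energy model, could be attached to the same object, which is the sense in which a single class gathers together everything needed to represent and simulate a system.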