There are two main methods that are used to move molecules in a simulation;
Molecular Dynamics moves molecules by calculating forces on the atoms, converting those forces into accelerations (using Newton's second law) and then integrating those accelerations over time to generate a trajectory.
Monte Carlo moves molecules using random numbers, and by performing random moves. In this workshop you will be introduced to Monte Carlo sampling, and will study a simple Monte Carlo program (written in either Perl or C++). This will help you to understand exactly how Monte Carlo works, and to play with the key parameters that control a Monte Carlo simulation. To run this workshop, you will need access to a Perl interpreter, to a molecule (PDB) viewer (e.g. VMD), and to a spreadsheet program (e.g. Excel or Libreoffice Calc). If you want to use the C++ version of the code, then you will need access to a C++ compiler, such as g++ (part of GCC).
Monte Carlo is named after the city Monte Carlo, which is part of the principality of Monaco in the south of France. Monte Carlo is famous both for the Formula 1 Grand Prix, and for the Monte Carlo casinos. The Monte Carlo casinos are very famous in Europe and the US, and so the Monte Carlo method is named after this casinos, because, like gambling, it is based on lots of random numbers.
Monte Carlo methods have a long history, and have been applied to the modelling of everything from economics to traffic jams. In molecular science, we use a version of Monte Carlo that is called "Metropolis Monte Carlo". This method was developed in 1953 by Metropolis, Rosenbluth, Rosenbluth, Teller and Teller. Rosenbluth and Rosenbluth were man and wife, as were Teller and Teller, and the story is that the four of them, with Metropolis, developed the method one day while having a pic-nic meal together one sunny afternoon. The paper describing the method is one of the greatest, and shortest papers in the history of molecular modelling, and should be read by any serious student of computational science.
Here is a simple Monte Carlo program. It simulates a periodic box of krypton atoms. There are two versions;
The first is the program (metropolis_pl) written using Perl. If you don't know Perl, you can learn about Perl by working through this Perl workshop. Perl is a scripting language, so is easy to understand. However, programs written in Perl are quite slow. C++ is a compiled language. The second program (metropolis.cpp) is the C++ version of the program. This is identical to the Perl version, except it is written in C++, and will run about 100 times faster than the Perl version.
To use the Perl version, type on the command line;
$ perl metropolis_pl
This will print a lot of energies to the screen, and will produce a large number of output PDB coordinate files (called output??????.pdb, where ?????? is the number of the step for which the coordinates were generated).
To use the C++ version, you must first compile it. You can compile it by typing on the command line;
g++ -O3 metropolis.cpp -o metropolis
You can then run the C++ version by typing;
./metropolis
This too will print a large number of energies, and will produce a large number of output PDB files (named in the same way as the Perl script).
There are many different types of Monte Carlo. The one used in molecular simulation is called "Metropolis Monte Carlo". This is named after Metropolis, who was one of the five authors of the famous 1953 paper that first introduced the method.
Metropolis Monte Carlo is based on random numbers and has a very simple algorithm.
For Metropolis Monte Carlo, in a NVT (canonical) ensemble, the Monte Carlo test is;
exp( -(E_{new} - E_{old}) / kT ) >= random(0,1)
where E_{new} is the "new energy", E_{old} is the "old energy", kT is Boltzmann's constant (k) multiplied by temperature (T), and random(0,1) is a random number between 0 and 1.
Here is the code from the Perl Monte Carlo program that implements the above test;
# calculate the old energy
$old_energy = calculate_energy();
# Pick a random atom
$atom = int( rand($n_atoms) );
# save the old coordinates
@old_coords = ( $coords[$atom][0], $coords[$atom][1],
$coords[$atom][2] );
# Make the move - translate by a delta in each dimension
$delta_x = $max_translate * ( 2.0 * rand(1.0) - 1.0 );
$delta_y = $max_translate * ( 2.0 * rand(1.0) - 1.0 );
$delta_z = $max_translate * ( 2.0 * rand(1.0) - 1.0 );
$coords[$atom][0] = $coords[$atom][0] + $delta_x;
$coords[$atom][1] = $coords[$atom][1] + $delta_y;
$coords[$atom][2] = $coords[$atom][2] + $delta_z;
# wrap the coordinates back into the box
$coords[$atom][0] = wrap_into_box($coords[$atom][0], $box_size[0]);
$coords[$atom][1] = wrap_into_box($coords[$atom][1], $box_size[1]);
$coords[$atom][2] = wrap_into_box($coords[$atom][2], $box_size[2]);
# calculate the new energy
$new_energy = calculate_energy();
$accept = 0;
# Automatically accept if the energy goes down
if ($new_energy <= $old_energy)
{
$accept = 1;
}
else
{
# Now apply the Monte Carlo test - compare
# exp( -(E_new - E_old) / kT ) >= rand(0,1)
$x = exp( -($new_energy - $old_energy) / $kT );
if ($x >= rand(1.0))
{
$accept = 1;
}
else
{
$accept = 0
}
}
if ($accept == 1)
{
# accept the move
$naccept = $naccept + 1;
$total_energy = $new_energy;
}
else
{
# reject the move - restore the old coordinates
$nreject = $nreject + 1;
$coords[$atom][0] = $old_coords[0];
$coords[$atom][1] = $old_coords[1];
$coords[$atom][2] = $old_coords[2];
$total_energy = $old_energy;
}
Find this code in metropolis_pl (it is near the bottom of the script). The "old energy" is calculated using the line;
# calculate the old energy
$old_energy = calculate_energy();
Note that calculate_energy() is a function that calculates the total energy of the molecular system.
An atom to move is chosen at random using the line;
# Pick a random atom
$atom = int( rand($n_atoms) );
Note that int( rand($n_atoms) ) is a function that generates a random number between 0 and $n_atoms - 1. This chooses the index of the random atom in the $coords array, and saves that index in the variable $atom.
The coordinates of the chosen atom are saved into the array @old_coords using the line;
# save the old coordinates
@old_coords = ( $coords[$atom][0], $coords[$atom][1],
$coords[$atom][2] );
In this program, a random move is just a random translation along the x, y and z axis by a random value between -$max_translate and $max_translate Angstroms. This is performed using these lines;
# Make the move - translate by a delta in each dimension
$delta_x = $max_translate * ( 2.0 * rand(1.0) - 1.0 );
$delta_y = $max_translate * ( 2.0 * rand(1.0) - 1.0 );
$delta_z = $max_translate * ( 2.0 * rand(1.0) - 1.0 );
$coords[$atom][0] = $coords[$atom][0] + $delta_x;
$coords[$atom][1] = $coords[$atom][1] + $delta_y;
$coords[$atom][2] = $coords[$atom][2] + $delta_z;
The krypton atoms are in a periodic box. This random translation may have moved them outside of the box, so the atoms have to be wrapped back into the box. This is achieved using these lines;
# wrap the coordinates back into the box
$coords[$atom][0] = wrap_into_box($coords[$atom][0], $box_size[0]);
$coords[$atom][1] = wrap_into_box($coords[$atom][1], $box_size[1]);
$coords[$atom][2] = wrap_into_box($coords[$atom][2], $box_size[2]);
The "new energy" is calculated using this line, and stored in the variable $new_energy;
# calculate the new energy
$new_energy = calculate_energy();
The Metropolis Monte Carlo test is performed using these lines;
$accept = 0;
# Automatically accept if the energy goes down
if ($new_energy <= $old_energy)
{
$accept = 1;
}
else
{
# Now apply the Monte Carlo test - compare
# exp( E_new - E_old / kT ) >= rand(0,1)
$x = exp( -($new_energy - $old_energy) / $kT );
if ($x >= rand(1.0))
{
$accept = 1;
}
else
{
$accept = 0
}
}
If the energy goes down ($new_energy is less than or equal to $old_energy) then the test is automatically passed ($accept is set to 1, which means "pass"). Otherwise, the exponential of the difference of $new_energy and $old_energy, divided by kT is calculated, and stored in the variable $x. This is compared to a random number between 0 and 1 (rand(1.0)). If it is greater than or equal to the random number, then the test is passed. Otherwise the test fails (and $accept is set equal to 0, which means "fail").
If the Metropolis Monte Carlo test is passed, then the following code is run;
if ($accept == 1)
{
# accept the move
$naccept = $naccept + 1;
$total_energy = $new_energy;
}
This code increases the number of accepted moves (stored in $naccept) by one. It also saves the total energy of the system into the variable $total_energy.
If the Metropolis Monte Carlo test is failed, then the following code is run;
else
{
# reject the move - restore the old coordinates
$nreject = $nreject + 1;
$coords[$atom][0] = $old_coords[0];
$coords[$atom][1] = $old_coords[1];
$coords[$atom][2] = $old_coords[2];
$total_energy = $old_energy;
}
The number of rejected moves (stored in $nreject) is increased by one. The "old coordinates" of the atom are restored from the @old_coords array. The total energy of the system is then saved into the variable $total_energy.
That is Metropolis Monte Carlo :-).
Now you understand the algorithm, please run the metropolis_pl Monte Carlo program again using the command;
$ perl metropolis_pl
Watch what is printed out (the move number, then the total energy, then the number of accepted moves, then the number of rejected moves), e.g.
10: 263471.098167728 5 5 20: 114030.78800927 11 9 30: 5999.19636795258 17 13 40: 4719.87647170638 26 14 50: 4686.88406567588 31 19 60: 539.640252029864 37 23 70: 388.498103662198 43 27 80: 279.303981060543 51 29 90: 257.443473831169 60 30 100: 255.040363808318 68 32
Keep running the program until is has finished. It will produce a lot of PDB output coordinate files, called output??????.pdb. View these files in VMD. You can make a movie of these files in Linux by typing;
$ vmd output*.pdb
If you are using Windows, then you have to create a VMD animation script. You can do this by downloading the script animatepdb_pl. Then type;
$ perl animatepdb_pl output*.pdb > movie.vmd
Open VMD, then use the Tcl shell to change into the directory that contains movie.vmd. In the Tcl shell then type;
play movie.vmd
The movie is of a periodic box of krypton. To see the periodic cells, open the "Graphical Representations" window in VMD and select the "Periodic" tab. Under "Select periodic images to draw" click the "+X", "-X", "+Y", "-Y", "+Z" and "-Z" boxes. This will display those periodic images. Click on the "Play" icon in the "VMD Main" window to play the movie.
Metropolis Monte Carlo is controlled by a number of variables. These effect the efficiency of the sampling, and so the number of moves that are required to converge a thermodynamic average.
The key variables are set at the top of the metropolis_pl script;
# Set the number of atoms in the box $n_atoms = 25; # Set the number of Monte Carlo moves to perform $num_moves = 5000; # Set the size of the box (in Angstroms) @box_size = ( 15.0, 15.0, 15.0 ); # The maximum amount that the atom can be translated by $max_translate = 0.5; # angstroms # Simulation temperature $temperature = 298.15; # kelvin
The number of krypton atoms in the box is set by the variable $n_atoms. Currently there are only 25 atoms. Try increasing or decreasing the number and re-running the script. How does this change the way the program runs? How does this change the number of accepted and rejected moves?
The total number of Monte Carlo moves to perform is set by the $num_moves variable. Currently it is 5000. This is not a lot. Typical Monte Carlo simulations use millions or billions of moves. Perl is too slow to run this many moves. The C++ version of this program (metropolis.cpp) runs 100 times faster, and is set to up 500,000 moves. Use the C++ program if you want to run lots of moves.
The size of the periodic box is set in the @box_size array. Try making the box bigger and smaller, and re-running metropolis_pl. How does this change the energy? How does it change the movie? How does it change the number of accepted and rejected moves?
The maximum amount by which to translate an atom is set in the $max_translate variable. This is currently 0.5 Angstroms. Try increasing and decreasing this value and re-running metropolis_pl How does this affect the movie, and the number of accepted and rejected moves?
Finally, the simulation temperature is set in the $temperature variable. It is currently 298.15 Kelvin (room temperature). Try increasing and decreasing the temperature, and re-running metropolis_pl. How does this affect the energy, and the number of accepted and rejected moves?
Now you have an understanding of how the Metropolis Monte Carlo program works, it is time to make it do something useful. Try to change the temperature ($temperature, box size (@box_size) and maximum translation ($max_translate) variables so that you can make solid, then gas, then liquid krypton. You will need to change the variables, and then run the metropolis_pl script. For Monte Carlo to be efficient, you need about 40%-60% of the moves to be accepted. If too many moves are rejected, then reduce the value of $max_translate. If too many moves are accepted, then increase $max_translate.
Once you have working parameters, copy them into the C++ program metropolis.cpp. You can set the variables in the lines near the top of this program;
// Set the number of atoms in the box
const int n_atoms = 25;
// Set the number of Monte Carlo moves to perform
const int num_moves = 500000;
// Set the size of the box (in Angstroms)
const double box_size[3] = { 10.0, 10.0, 10.0 };
// The maximum amount that the atom can be translated by
const double max_translate = 0.5; // angstroms
// Simulation temperature
const double temperature = 298.15; // kelvin
The variables have the same name as in Perl.
Compile the C++ program and run the program by typing;
$ g++ -O3 metropolis.cpp -o metropolis $ ./metropolis
Just like with the Perl program, you will get a print out of the moves, energy, and number of accepted and rejected steps, e.g.
1000: 19.876323 264 736 2000: -1.941197 420 1580 3000: -8.801116 565 2435 4000: -10.333685 688 3312 5000: -10.266989 838 4162 6000: -13.508316 976 5024 7000: -3.880514 1141 5859 8000: -11.871016 1300 6700 9000: -9.966295 1455 7545 10000: -7.182279 1618 8382
Copy these into a spreadsheet program (like Excel) and graph the total energy versus move number. Also get the spreadsheet to calculate the acceptance ratio of the Monte Carlo moves. This is the number of accepted moves divided by the total number of moves (e.g. $naccept / ($naccept + $nreject)). Is the acceptance ratio constant during the simulation? Does the simulation need to be equilibrated, or is the energy stable after 500,000 moves?
You have simulated a periodic box of krypton. Each atom has three dimensions of motion; left-right, up-down and front-back. This means that the space of all possible locations of one atom is three dimensional. The space for all possible locations for two atoms is six dimensional. The space for all possible locations for N atoms is 3xN-dimensional(*). This space is given the name "phase space". A Monte Carlo simulation samples configurations from this phase space. If the simulation was run for an infinite time, then every possible configuration of the system would be sampled. The set of configurations sampled is called an "ensemble" (another word that means "collection"). The configurations in the ensemble appear with different probabilities. High energy structures appear with a low probability, while low energy structures appear with a high probability. The probability of a structure appearing in the collection can be calculated using the Boltzmann equation (download this document for a derivation and explanation of the Boltzmann equation).
In your simulation, the simulation was run using a constant number of particles (constant N), constant volume (constant V) and constant temperature (constant T). This meant that the simulation sampled an NVT ensemble, and the NVT Boltzmann equation was used to control the probability of whether a structure appeared in the ensemble. There are several different ensembles;
In a Monte Carlo simulation, the ensemble generated depends on the types of moves performed, and on the form of the Monte Carlo acceptance test. There are different Monte Caro tests for different ensembles. For example, for the canonical (NVT) ensemble, the Monte Carlo test is;
exp( -ΔE / kT ) >= rand(0,1)
The exponential of the change in energy (ΔE) divided by Boltzmann's constant (k) times temperature (T), must be greater than or equal to a random number drawn uniformly between 0 and 1 for the test to be passed.
For the isothermal-isobaric ensemble, the Monte Carlo test is;
exp( -( ΔE - pΔV ) / kT + N Δ ln V ) >= rand(0,1)
The exponential of the change in energy (ΔE) with the pressure (p) times the change in volume (ΔV), combined with the number of particles (N) and the change in the natural logarithm of the volume (Δ ln V) must be greater or equal to a random number drawn uniformly between 0 and 1 for the test to be passed.
Download this document for a derivation and explanation of the different Monte Carlo tests.
(*) Note that translating or rotating the entire set of all atoms does not change the atoms (as they all remain in relatively the same position). As there are three dimensions of translating the entire box of atoms, and three dimensions of rotating the entire box of atoms, the set of all atoms only has 3N-6 dimensions.
The Monte Carlo simulation you have run sampled configurations from the canonical ensemble. This meant that the volume of the periodic box was constant, and it was not possible to specify the pressure acting on the krypton atoms. We live in a constant pressure world, not a constant volume world. To be able to specify pressure, we need to sample from the isothermal-isobaric (NPT) ensemble. This can be achieved by making two changes to the metropolis_pl script;
The first part is easy. Currently, the script uses the canonical MC acceptance test;
# Now apply the Monte Carlo test - compare
# exp( -(E_new - E_old) / kT ) >= rand(0,1)
$x = exp( -($new_energy - $old_energy) / $kT );
This must be changed to the isothermal-isobaric test;
# Now apply the Monte Carlo test - compare
# exp( -(E_new - E_old + P(V_new - V_old)) / kT
# + N ln (V_new - V_old) ) >= rand(0,1)
$x = exp( -($new_energy - $old_energy + $pressure * ($V_new - $V_old)) / $kT
+ $n_atoms * (log($V_new) - log($V_old) ) );
(note that natural logarithm in Perl in "log")
This uses the old and new volumes of the box, that are calculated and stored in the V_old and V_new variables via;
# calculate the old energy
$old_energy = calculate_energy();
# calculate the old volume of the box
$V_old = $box_size[0] * $box_size[1] * $box_size[2];
and
# calculate the new energy
$new_energy = calculate_energy();
# calculate the new volume of the box
$V_new = $box_size[0] * $box_size[1] * $box_size[2];
(note here that you can see the old and new volumes calculated after the old and new energies, so that you can work out where in the script these lines should be.).
As well as the volumes, the test also requires the pressure, which can go at the top of the script underneath the temperature;
# Simulation temperature $temperature = 298.15; # kelvin # Simulation pressure $pressure = 1; # atmospheres # Convert pressure to internal units (kcal mol-1 A-3) $pressure = $pressure * 1.458397506863647E-005;
(note that pressure is normally given in units of atmospheres, while the internal units of this program are kcal mol-1 for energy, and angstrom for length. This is why the pressure has to be converted to internal units above).
Once you have made the changes and run the simulation, you should have found that adding in the volume test on its own has not changed the simulation at all. This is because the simulation has only one type of move - the translation of a single atom of krypton. This move does not change the volume of the periodic box, so V_new will always equal V_old. This means that V_new - V_old is always zero, so the NPT acceptance test becomes equal to the NVT acceptance test.
To have an NPT simulation, we need to add in a MC move that changes the volume of the box. This can be achieved by adding in a move that changes the volume of the box by a random amount. This move should only occur once for every $n_atoms moves. This can be achieved using this code;
# save the old coordinates
@old_coords = ( $coords[$atom][0], $coords[$atom][1],
$coords[$atom][2] );
# save the old box dimensions
@old_box_size = @box_size;
# Decide if we are performing an atom move, or a volume move
if (rand(1.0) <= 1.0 / $n_atoms)
{
$is_volume_move = 1;
# 1 in $n_atoms chance of being here. Perform a volume move
# by changing the volume for a random amount
$delta_vol = rand($max_volume_change);
# Volume is the cube of the box length, so add the cube root
# of this change onto the box size
$delta_vol = $delta_vol**(1.0/3.0);
# Are we making the box bigger or smaller?
if (rand(1.0) >= 0.5)
{
$box_size[0] = $box_size[0] + $delta_vol;
$box_size[1] = $box_size[1] + $delta_vol;
$box_size[2] = $box_size[2] + $delta_vol;
}
else
{
$box_size[0] = $box_size[0] - $delta_vol;
$box_size[1] = $box_size[1] - $delta_vol;
$box_size[2] = $box_size[2] - $delta_vol;
}
# wrap the atoms back into the box
wrap_all_atoms();
}
else
{
$is_volume_move = 0;
# Make the atom move - translate by a delta in each dimension
$delta_x = $max_translate * ( 2.0 * rand(1.0) - 1.0 );
$delta_y = $max_translate * ( 2.0 * rand(1.0) - 1.0 );
$delta_z = $max_translate * ( 2.0 * rand(1.0) - 1.0 );
$coords[$atom][0] = $coords[$atom][0] + $delta_x;
$coords[$atom][1] = $coords[$atom][1] + $delta_y;
$coords[$atom][2] = $coords[$atom][2] + $delta_z;
# wrap the coordinates back into the box
$coords[$atom][0] = wrap_into_box($coords[$atom][0], $box_size[0]);
$coords[$atom][1] = wrap_into_box($coords[$atom][1], $box_size[1]);
$coords[$atom][2] = wrap_into_box($coords[$atom][2], $box_size[2]);
}
We start saving the old coordinates as before, but now we must save the old box size too, using the code;
# save the old box dimensions
@old_box_size = @box_size;
We then have to choose between performing an atom move, or a volume move. Each atom should move once between each volume move, so we choose a volume move with a probability of 1 / n_atoms, using the code;
# Decide if we are performing an atom move, or a volume move
if (rand(1.0) <= 1.0 / $n_atoms)
{
$is_volume_move = 1;
...
}
else
{
$is_volume_move = 0;
...
}
The atom move code is the same as before. The volume move code is straightforward. This adds or subtracts a random volume chosen to have a maximum value of $max_vol_change to each dimension of the box, using the code;
# 1 in $n_atoms chance of being here. Perform a volume move
# by changing the volume for a random amount
$delta_vol = rand($max_volume_change);
# Volume is the cube of the box length, so add the cube root
# of this change onto the box size
$delta_vol = $delta_vol**(1.0/3.0);
# Are we making the box bigger or smaller?
if (rand(1.0) >= 0.5)
{
$box_size[0] = $box_size[0] + $delta_vol;
$box_size[1] = $box_size[1] + $delta_vol;
$box_size[2] = $box_size[2] + $delta_vol;
}
else
{
$box_size[0] = $box_size[0] - $delta_vol;
$box_size[1] = $box_size[1] - $delta_vol;
$box_size[2] = $box_size[2] - $delta_vol;
}
Because the box has changed size, it may be necessary to wrap some atoms back into the box. This is achieved by calling the wrap_all_atoms() function, which has been added using;
sub wrap_all_atoms
{
for ($i=0; $i<$n_atoms; $i = $i + 1)
{
# wrap the coordinates back into the box
$coords[$i][0] = wrap_into_box($coords[$i][0], $box_size[0]);
$coords[$i][1] = wrap_into_box($coords[$i][1], $box_size[1]);
$coords[$i][2] = wrap_into_box($coords[$i][2], $box_size[2]);
}
}
The flag $is_volume_move is used to say whether a volume move has been performed. It is equal to one if the move was a volume move, or zero if the move was an atom move. This is used in the code run when a move is accepted or rejected to increment the number of accepted and rejected volume moves, and also to restore the old box dimensions if the move failed, e.g.
if ($accept == 1)
{
# accept the move
$naccept = $naccept + 1;
if ($is_volume_move){ $nvolaccept = $nvolaccept + 1; }
$total_energy = $new_energy;
}
else
{
# reject the move - restore the old coordinates and box size
$nreject = $nreject + 1;
if ($is_volume_move){ $nvolreject = $nvolreject + 1; }
$coords[$atom][0] = $old_coords[0];
$coords[$atom][1] = $old_coords[1];
$coords[$atom][2] = $old_coords[2];
@box_size = @old_box_size;
if ($is_volume_move){ wrap_all_atoms(); }
$total_energy = $old_energy;
}
The parameter for the move is the maximum volume change. This is normally 10% of the number of atoms, e.g. the script should also be modified to add in the maximum volume change;
# The maximum amount that the atom can be translated by
$max_translate = 0.5; # angstroms
# The maximum amount to change the volume - the
# best value is 10% of the number of atoms
$max_volume_change = 0.1 * $n_atoms; # Angstroms**3
Finally, for completeness, it is a good idea to print out information about the box size and number of accepted and rejected volume moves during a simulation, e.g.
# print the energy every 10 moves
if ($move % 10 == 0)
{
print "$move: $total_energy ($naccept : $nreject) ".
"($nvolaccept : $nvolreject)\n";
$vol = $box_size[0] * $box_size[1] * $box_size[2];
print " Box size = @box_size. Volume = $vol A^3\n";
}
2. ADVANCED. Make the same changes to the C++ metropolis.cpp program. Run the program and see what happens during a longer simulation. Note that you can download the program if you get stuck.
Use the NPT version of metropolis_pl and metropolis.cpp to run Monte Carlo simulations of krypton at different temperatures and pressures. If you haven't made your own versions of these programs, you can download them from here;
Remember, you have to compile the C++ version every time you change it, using;
$ g++ -O2 npt_metropolis.cpp -o npt_metropolis
Change the temperature, pressure and move delta to run Monte Carlo simulations of krypton in the gas, solid and liquid phases. Do the temperatures and pressures of the phase changes correspond to the experimental melting and boiling points of krypton? If not, why do you think they could be wrong?
ProtoMS is a program that can be used to run Monte Carlo simulations. It is very fast and very powerful. It is relatively easy to use, and the manual can be downloaded from here.
If you want to install ProtoMS yourself, you can do so by going to the project webpage and clicking on the downloads tab. You should download the latest current version of the code (currently version 2.22).
ProtoMS can run on Linux, Windows (using cygwin), or Mac OS X. Once you have downloaded the code, untar it using the command;
tar -zxvf ProtoMS_2.22.tgz
(obviously replace 2.22 with whatever version number you downloaded)
To compile ProtoMS, follow the instructions in the README file.
Note that you only need to download and compile ProtoMS is a copy of the program is not already installed on your computer.
Now you should download a set of tutorial input files for ProtoMS. They can be found here. Download and unpack the files and change into the protoms_tutorial/oscillator directory. This contains a simple Monte Carlo simulation of a harmonic oscillator. The simulation is described in the file input.cmd, with the input coordinates coming from the PDB file harm1t2.pdb, and the forcefield parameters in oscillators.ff. The simulation performs one million MC moves on the harmonic oscillator, which should take about 1-3 seconds to complete (ProtoMS is very fast!).
E_{average} = 0.5 * k * T
where k is Boltzmann's constant (1.987 * 10-3 kcal mol-1 K-1), and T is temperature (298.15 K). The theoretical average energy should thus be 0.2961 kcal mol-1. How close does ProtoMS get using one million moves?
Question
How many moves does ProtoMS need to run to calculate the correct average energy to 4 decimal places?
To perform Monte Carlo simulations of molecules, you must have a way of changing the internal coordinates of the molecule, e.g. Monte Carlo moves that change bond lengths, angles and dihedrals in a molecule must be performed.
Change into the directory protoms_tutorial/ethane_methanol/gas_phase. In this directory you will see a ProtoMS input file called input.cmd. This input file describes a gas-phase simulation of ethane. Run the input file by typing
$ /path/protoms2 input.cmd
where /path/ is the full directory path to the location in which the protoms2 executable is installed.
This script will have run 500 intramolecular Monte Carlo moves of methane. Look at the trajectory by copying across the snapshot-*.pdb files and viewing them in a viewer like VMD (note that there is a movie.vmd VMD movie file already in this directory in case you are using VMD on windows).
Look at how the molecule moves. The movement is controlled by a z-matrix. This z-matrix is present in the protoms_tutorial/ethane_methanol/in directory. For ethane, it is called ethane.ff. Take a look at this file. There are many different parameters in this file. The z-matrix is contained in the lines;
mode template solute ethane info translate 0.0 rotate 0.0 atom C01 ETH 3000 3000 DM3 DUM DM2 DUM DM1 DUM atom C05 ETH 3001 3001 C01 ETH DM3 DUM DM2 DUM atom H06 ETH 3002 3002 C05 ETH C01 ETH DM3 DUM atom H07 ETH 3003 3003 C05 ETH C01 ETH H06 ETH atom H08 ETH 3004 3004 C05 ETH C01 ETH H06 ETH atom H02 ETH 3005 3005 C01 ETH C05 ETH H06 ETH atom H03 ETH 3006 3006 C01 ETH C05 ETH H02 ETH atom H04 ETH 3007 3007 C01 ETH C05 ETH H02 ETH bond H02 ETH C01 ETH flex 0.01 bond H03 ETH C01 ETH flex 0.01 bond H04 ETH C01 ETH flex 0.01 bond C05 ETH C01 ETH flex 0.01 bond H06 ETH C05 ETH flex 0.01 bond H07 ETH C05 ETH flex 0.01 bond H08 ETH C05 ETH flex 0.01 angle H06 ETH C05 ETH C01 ETH flex 0.5 angle H07 ETH C05 ETH C01 ETH flex 0.5 angle H08 ETH C05 ETH C01 ETH flex 0.5 angle H02 ETH C01 ETH C05 ETH flex 0.5 angle H03 ETH C01 ETH C05 ETH flex 0.5 angle H04 ETH C01 ETH C05 ETH flex 0.5 dihedral H02 ETH C01 ETH C05 ETH H06 ETH flex 15.0
The first set of lines describe how the molecule is constructed, e.g.;
atom H07 ETH 3003 3003 C05 ETH C01 ETH H06 ETH
this line says that the atom called H07 is bonded to the atom called C05, forms an angle with C05 and C01 and a dihedral with C05, C01 and H06.
Further down in the file is a line that reads;
angle H07 ETH C05 ETH C01 ETH flex 0.5
This says that the angle H07-C05-C01, which is contained in the z-matrix, is flexible, and that Monte Carlo moves that change this angle should be performed, using a maximum angle delta of 0.5 degrees.
Similarly, the line;
dihedral H02 ETH C01 ETH C05 ETH H06 ETH flex 15.0
says that the dihedral H02-C01-C05-H06, which is contained in the z-matrix, is flexible, and Monte Carlo moves that change this dihedral should be performed using a maximum dihedral angle change of 15 degrees.
Question
The flex lines in the z-matrix specify the maximum bond, angle and dihedral deltas for intramolecular Monte Carlo moves. The number of accepted and attempted Monte Carlo moves is printed in the output of ProtoMS on lines that look like this;
INFO Data collection complete. Attempted/accepted moves 100 77
Try changing the bond, angle and dihedral maximum deltas and see if you can improve the quality of sampling of ethane and methanol in the gas-phase.
Do you think that the optimum bond, angle and dihedral deltas would be suitable for use for a water phase simulation?
Molecular simulation nearly always requires modelling of solvent. Solvent plays a key and often little understood role in many molecular and biomolecular processes, and its correct modelling is crucial to allow the results of molecular simulation to be properly compared with experiment.
Solvent in Monte Carlo simulations can be modelled in many ways. The most common is to use an explicit model of the solvent. This means that a model of each solvent molecule is constructed, and these molecules are used to solvate the solute or system being studied.
Change to the protoms_tutorial/ethane_methanol/solvated directory. This contains an input.cmd file that runs a simulation of ethane solvated in a periodic box of water molecules. The TIP4P model is used to represent water. Run the simulation by typing;
$ /path/protoms2 input.cmd
(remember to replace /path/ with the full directory path to your protoms2 executable).
Visualise the resulting snapshot PDB files from the simulation using a molecular viewer such as VMD (and again, a movie.vmd file is already provided for you if you are using VMD on windows).
You will not see much movement in the simulation because it was very short. However, what you should see is that the water molecules move as rigid bodies. ProtoMS moves solvent molecules using rigid body translation and rotation moves. The maximum amount to translate and rotate each solvent molecule is set in the parameter file for the solvent. This is contained in the file protoms_tutorial/ethane_methanol/in/solvents.ff. This contains parameters for many different solvent molecules. The relevant lines for the TIP4P water model are;
# # TIP4P (T4P) # # O00 dist(OH) = 0.9572 A # / | \ dist(OM) = 0.15 A # H01 M03 H02 ang(HOH) = 104.52 deg # mode clj par 2003 OW 8 0.000 3.15363 0.1550 par 2004 HW 1 0.520 0.0 0.0 par 2005 ?? 0 -1.040 0.0 0.0 mode template solvent T4P info translate 0.15 rotate 15.0 atom O00 2003 2003 atom H01 2004 2004 atom H02 2004 2004 atom M03 2005 2005
These lines provide the change and Lennard Jones parameters for each of the atoms in water. They also provide the maximum amount by which to translate and rotate the molecule, e.g.
info translate 0.15 rotate 15.0
means that each water molecule will be translated by a maximum of 0.15 angstroms, and rotated about a random vector by a maximum of 15 degrees.
Question
Increase the number of Monte Carlo moves performed during the simulation. This is specified in the input.cmd file. Increase the number until the movie shows that the waters are diffusing through the box.
Question
Edit the input.cmd file so that you can simulate methanol in a box of water. Run a long simulation to allow the water to equilibrate around the methanol. How does the water structure around the final snapshot from the simulation of methanol compare to the water structure from the final snapshot from the simulation of ethane?
Question
Edit the z-matrix of ethane and methanol to change the bond, angle and dihedral maximum deltas. Try to optimise these values to get a good acceptance ratio and to get a large amount of motion of the solute. Also edit the solvents.ff file to change the translation and rotation deltas of TIP4P. Again, aim to get the most motion of the water, while keeping a high acceptance ratio (about 50-60%).
Monte Carlo simulations can be used to create large numbers of configurations of a molecular system, with each configuration appearing with its correct (relative) Boltzmann probability. This collection of configurations is called an ensemble. You can do many things with a correct Boltzmann weighted ensemble of configurations. One possibility is to use it to calculate a free energy.
Change to the protoms_tutorial/ethane_methanol/free_energy directory. In here, you will see the ProtoMS input command file input.cmd. This script performs a simulation that calculates the relative hydration free energy of ethane and methanol. The simulation works by simulating both ethane and methanol simultaneously in the centre of a box of water. A lambda coordinate is used to turn off the interactions between ethane and water as the interactions between methanol and water are switched on. This is a "dual topology" method of calculating a relative free energy. The input file simulates the system at lambda=0.5. This means that 50% of the methanol and 50% of the ethane interactions with solvent are calculated. The energy difference for each configuration and lambda=0 and lambda=1 is calculated. This energy difference is averaged using the equation;
dG = -kT ln < exp( -dE / kT ) >_lambda
where dE is the energy difference between lambda=1 or lambda=0 and lambda=0.5. kT is Boltzmann's constant multiplied by temperature. The angle brackets < ... > mean "average over configurations in an ensemble", with _lambda meaning the ensemble generated for a specific value of lambda. The resulting value, dG is the difference in free energy between lambda=1 and lambda=0.5, or lambda=0 and lambda=0.5. The sum of these two free energy differences is the relative hydration free energy.
Run the free energy simulation by typing;
$ /path/protoms2 input.cmd
The final lines of the output contain the results. Look for the lines;
RESULTS Backwards Free Energy = 1.9251908496670806503 ( 2.013836560 ) RESULTS Forwards Free Energy = 0.72142800990255362414 ( 1.045066552 )
These give the free energy average (and error in brackets) for the 0.5->0.0 perturbation and the 0.5->1.0 perturbation. As you can see, the error is very large, because the simulation is too short.
Question
Another way to reduce the size of the errors is to reduce the gap between the reference state (lambda=0.5) and the perturbed states (lambda=0 and lambda=1). Run a series of simulations that break the relative free energy into a series of steps, e.g. 0.1->0.0 and 0.1->0.2, then 0.3->0.2 and 0.3->0.4, then ... 0.9->0.8 and 0.9->1.0.
Run the simulations and see whether the resulting relative hydration free energy agrees more closely with experiment.
This workshop has provided only a short introduction to Monte Carlo. To learn more, I would recommend that you follow a workshop that uses ProtoMS to teach free energy methods. The workshop is available here, and the workshop files are available from here. I helped produce this workshop (with Dr. Julien Michel), and both he and I are happy to answer any questions you may have if you run into problems trying to run this workshop on your own computers.
Finally, while you have now learned a bit about Monte Carlo, and seen some programs, this workshop has not provided much of the mathematical background behind Monte Carlo. Daan Frenkel has produced an excellent book chapter that provides a complete background on Monte Carlo methods. You can download it from here, or from its original location here (this last link is to the complete book, which is well worth reading).
I would also recommend the following books;
The best way to learn Monte Carlo is to use it. Please play with ProtoMS (or other Monte Carlo programs, such as MCPRO or MCCCS Towhee). Read and work through the tutorials and then try to adapt the examples so that you can model your own systems. Above all, have fun!