# Introduction to Monte Carlo

There are two main methods that are used to move molecules in a simulation;

• Molecular Dynamics (MD)
• Monte Carlo (MC)

Molecular Dynamics moves molecules by calculating forces on the atoms, converting those forces into accelerations (using Newton's second law) and then integrating those accelerations over time to generate a trajectory.

Monte Carlo moves molecules using random numbers, and by performing random moves. In this workshop you will be introduced to Monte Carlo sampling, and will study a simple Monte Carlo program (written in either Perl or C++). This will help you to understand exactly how Monte Carlo works, and to play with the key parameters that control a Monte Carlo simulation. To run this workshop, you will need access to a Perl interpreter, to a molecule (PDB) viewer (e.g. VMD), and to a spreadsheet program (e.g. Excel or Libreoffice Calc). If you want to use the C++ version of the code, then you will need access to a C++ compiler, such as g++ (part of GCC).

Monte Carlo is named after the city Monte Carlo, which is part of the principality of Monaco in the south of France. Monte Carlo is famous both for the Formula 1 Grand Prix, and for the Monte Carlo casinos. The Monte Carlo casinos are very famous in Europe and the US, and so the Monte Carlo method is named after this casinos, because, like gambling, it is based on lots of random numbers.

Monte Carlo methods have a long history, and have been applied to the modelling of everything from economics to traffic jams. In molecular science, we use a version of Monte Carlo that is called "Metropolis Monte Carlo". This method was developed in 1953 by Metropolis, Rosenbluth, Rosenbluth, Teller and Teller. Rosenbluth and Rosenbluth were man and wife, as were Teller and Teller, and the story is that the four of them, with Metropolis, developed the method one day while having a pic-nic meal together one sunny afternoon. The paper describing the method is one of the greatest, and shortest papers in the history of molecular modelling, and should be read by any serious student of computational science.

# Software

Here is a simple Monte Carlo program. It simulates a periodic box of krypton atoms. There are two versions;

The first is the program (metropolis_pl) written using Perl. If you don't know Perl, you can learn about Perl by working through this Perl workshop. Perl is a scripting language, so is easy to understand. However, programs written in Perl are quite slow. C++ is a compiled language. The second program (metropolis.cpp) is the C++ version of the program. This is identical to the Perl version, except it is written in C++, and will run about 100 times faster than the Perl version.

To use the Perl version, type on the command line;

```\$ perl metropolis_pl
```

This will print a lot of energies to the screen, and will produce a large number of output PDB coordinate files (called output??????.pdb, where ?????? is the number of the step for which the coordinates were generated).

To use the C++ version, you must first compile it. You can compile it by typing on the command line;

```g++ -O3 metropolis.cpp -o metropolis
```

You can then run the C++ version by typing;

```./metropolis
```

This too will print a large number of energies, and will produce a large number of output PDB files (named in the same way as the Perl script).

# Metropolis Monte Carlo

There are many different types of Monte Carlo. The one used in molecular simulation is called "Metropolis Monte Carlo". This is named after Metropolis, who was one of the five authors of the famous 1953 paper that first introduced the method.

Metropolis Monte Carlo is based on random numbers and has a very simple algorithm.

1. Calculate the energy of the molecular system. Save this as the "old energy".
2. Randomly choose a molecule in the system to move, and save its coordinates (the "old coordinates").
3. Randomly move this molecule (e.g. randomly translate or rotate it). This results in the "new coordinates".
4. Calculate the energy of the molecular system. Save this as the "new energy".
5. Use the difference between the "new" and "old" energies in a Monte Carlo test. If this test passes, then keep the "new coordinates", else, restore the "old coordinates".

For Metropolis Monte Carlo, in a NVT (canonical) ensemble, the Monte Carlo test is;

```exp( -(E_{new} - E_{old}) / kT ) >= random(0,1)
```

where E_{new} is the "new energy", E_{old} is the "old energy", kT is Boltzmann's constant (k) multiplied by temperature (T), and random(0,1) is a random number between 0 and 1.

Here is the code from the Perl Monte Carlo program that implements the above test;

```    # calculate the old energy
\$old_energy = calculate_energy();

# Pick a random atom
\$atom = int( rand(\$n_atoms) );

# save the old coordinates
@old_coords = ( \$coords[\$atom][0], \$coords[\$atom][1],
\$coords[\$atom][2] );

# Make the move - translate by a delta in each dimension
\$delta_x = \$max_translate * ( 2.0 * rand(1.0) - 1.0 );
\$delta_y = \$max_translate * ( 2.0 * rand(1.0) - 1.0 );
\$delta_z = \$max_translate * ( 2.0 * rand(1.0) - 1.0 );

\$coords[\$atom][0] = \$coords[\$atom][0] + \$delta_x;
\$coords[\$atom][1] = \$coords[\$atom][1] + \$delta_y;
\$coords[\$atom][2] = \$coords[\$atom][2] + \$delta_z;

# wrap the coordinates back into the box
\$coords[\$atom][0] = wrap_into_box(\$coords[\$atom][0], \$box_size[0]);
\$coords[\$atom][1] = wrap_into_box(\$coords[\$atom][1], \$box_size[1]);
\$coords[\$atom][2] = wrap_into_box(\$coords[\$atom][2], \$box_size[2]);

# calculate the new energy
\$new_energy = calculate_energy();

\$accept = 0;

# Automatically accept if the energy goes down
if (\$new_energy <= \$old_energy)
{
\$accept = 1;
}
else
{
# Now apply the Monte Carlo test - compare
# exp( -(E_new - E_old) / kT ) >= rand(0,1)
\$x = exp( -(\$new_energy - \$old_energy) / \$kT );

if (\$x >= rand(1.0))
{
\$accept = 1;
}
else
{
\$accept = 0
}
}

if (\$accept == 1)
{
# accept the move
\$naccept = \$naccept + 1;
\$total_energy = \$new_energy;
}
else
{
# reject the move - restore the old coordinates
\$nreject = \$nreject + 1;

\$coords[\$atom][0] = \$old_coords[0];
\$coords[\$atom][1] = \$old_coords[1];
\$coords[\$atom][2] = \$old_coords[2];
\$total_energy = \$old_energy;
}
```

Find this code in metropolis_pl (it is near the bottom of the script). The "old energy" is calculated using the line;

```    # calculate the old energy
\$old_energy = calculate_energy();
```

Note that calculate_energy() is a function that calculates the total energy of the molecular system.

An atom to move is chosen at random using the line;

```    # Pick a random atom
\$atom = int( rand(\$n_atoms) );
```

Note that int( rand(\$n_atoms) ) is a function that generates a random number between 0 and \$n_atoms - 1. This chooses the index of the random atom in the \$coords array, and saves that index in the variable \$atom.

The coordinates of the chosen atom are saved into the array @old_coords using the line;

```    # save the old coordinates
@old_coords = ( \$coords[\$atom][0], \$coords[\$atom][1],
\$coords[\$atom][2] );
```

In this program, a random move is just a random translation along the x, y and z axis by a random value between -\$max_translate and \$max_translate Angstroms. This is performed using these lines;

```    # Make the move - translate by a delta in each dimension
\$delta_x = \$max_translate * ( 2.0 * rand(1.0) - 1.0 );
\$delta_y = \$max_translate * ( 2.0 * rand(1.0) - 1.0 );
\$delta_z = \$max_translate * ( 2.0 * rand(1.0) - 1.0 );

\$coords[\$atom][0] = \$coords[\$atom][0] + \$delta_x;
\$coords[\$atom][1] = \$coords[\$atom][1] + \$delta_y;
\$coords[\$atom][2] = \$coords[\$atom][2] + \$delta_z;
```

The krypton atoms are in a periodic box. This random translation may have moved them outside of the box, so the atoms have to be wrapped back into the box. This is achieved using these lines;

```    # wrap the coordinates back into the box
\$coords[\$atom][0] = wrap_into_box(\$coords[\$atom][0], \$box_size[0]);
\$coords[\$atom][1] = wrap_into_box(\$coords[\$atom][1], \$box_size[1]);
\$coords[\$atom][2] = wrap_into_box(\$coords[\$atom][2], \$box_size[2]);
```

The "new energy" is calculated using this line, and stored in the variable \$new_energy;

```    # calculate the new energy
\$new_energy = calculate_energy();
```

The Metropolis Monte Carlo test is performed using these lines;

```    \$accept = 0;

# Automatically accept if the energy goes down
if (\$new_energy <= \$old_energy)
{
\$accept = 1;
}
else
{
# Now apply the Monte Carlo test - compare
# exp( E_new - E_old / kT ) >= rand(0,1)
\$x = exp( -(\$new_energy - \$old_energy) / \$kT );

if (\$x >= rand(1.0))
{
\$accept = 1;
}
else
{
\$accept = 0
}
}
```

If the energy goes down (\$new_energy is less than or equal to \$old_energy) then the test is automatically passed (\$accept is set to 1, which means "pass"). Otherwise, the exponential of the difference of \$new_energy and \$old_energy, divided by kT is calculated, and stored in the variable \$x. This is compared to a random number between 0 and 1 (rand(1.0)). If it is greater than or equal to the random number, then the test is passed. Otherwise the test fails (and \$accept is set equal to 0, which means "fail").

If the Metropolis Monte Carlo test is passed, then the following code is run;

```    if (\$accept == 1)
{
# accept the move
\$naccept = \$naccept + 1;
\$total_energy = \$new_energy;
}
```

This code increases the number of accepted moves (stored in \$naccept) by one. It also saves the total energy of the system into the variable \$total_energy.

If the Metropolis Monte Carlo test is failed, then the following code is run;

```    else
{
# reject the move - restore the old coordinates
\$nreject = \$nreject + 1;

\$coords[\$atom][0] = \$old_coords[0];
\$coords[\$atom][1] = \$old_coords[1];
\$coords[\$atom][2] = \$old_coords[2];
\$total_energy = \$old_energy;
}
```

The number of rejected moves (stored in \$nreject) is increased by one. The "old coordinates" of the atom are restored from the @old_coords array. The total energy of the system is then saved into the variable \$total_energy.

That is Metropolis Monte Carlo :-).

# Running metropolis_pl

Now you understand the algorithm, please run the metropolis_pl Monte Carlo program again using the command;

```\$ perl metropolis_pl
```

Watch what is printed out (the move number, then the total energy, then the number of accepted moves, then the number of rejected moves), e.g.

```10: 263471.098167728  5   5
20: 114030.78800927  11   9
30: 5999.19636795258  17   13
40: 4719.87647170638  26   14
50: 4686.88406567588  31   19
60: 539.640252029864  37   23
70: 388.498103662198  43   27
80: 279.303981060543  51   29
90: 257.443473831169  60   30
100: 255.040363808318  68   32
```

Keep running the program until is has finished. It will produce a lot of PDB output coordinate files, called output??????.pdb. View these files in VMD. You can make a movie of these files in Linux by typing;

```\$ vmd output*.pdb
```

If you are using Windows, then you have to create a VMD animation script. You can do this by downloading the script animatepdb_pl. Then type;

```\$ perl animatepdb_pl output*.pdb > movie.vmd
```

Open VMD, then use the Tcl shell to change into the directory that contains movie.vmd. In the Tcl shell then type;

```play movie.vmd
```

The movie is of a periodic box of krypton. To see the periodic cells, open the "Graphical Representations" window in VMD and select the "Periodic" tab. Under "Select periodic images to draw" click the "+X", "-X", "+Y", "-Y", "+Z" and "-Z" boxes. This will display those periodic images. Click on the "Play" icon in the "VMD Main" window to play the movie.

# Control Variables

Metropolis Monte Carlo is controlled by a number of variables. These effect the efficiency of the sampling, and so the number of moves that are required to converge a thermodynamic average.

The key variables are set at the top of the metropolis_pl script;

```# Set the number of atoms in the box
\$n_atoms = 25;

# Set the number of Monte Carlo moves to perform
\$num_moves = 5000;

# Set the size of the box (in Angstroms)
@box_size = ( 15.0, 15.0, 15.0 );

# The maximum amount that the atom can be translated by
\$max_translate = 0.5;  # angstroms

# Simulation temperature
\$temperature = 298.15;   # kelvin
```

The number of krypton atoms in the box is set by the variable \$n_atoms. Currently there are only 25 atoms. Try increasing or decreasing the number and re-running the script. How does this change the way the program runs? How does this change the number of accepted and rejected moves?

The total number of Monte Carlo moves to perform is set by the \$num_moves variable. Currently it is 5000. This is not a lot. Typical Monte Carlo simulations use millions or billions of moves. Perl is too slow to run this many moves. The C++ version of this program (metropolis.cpp) runs 100 times faster, and is set to up 500,000 moves. Use the C++ program if you want to run lots of moves.

The size of the periodic box is set in the @box_size array. Try making the box bigger and smaller, and re-running metropolis_pl. How does this change the energy? How does it change the movie? How does it change the number of accepted and rejected moves?

The maximum amount by which to translate an atom is set in the \$max_translate variable. This is currently 0.5 Angstroms. Try increasing and decreasing this value and re-running metropolis_pl How does this affect the movie, and the number of accepted and rejected moves?

Finally, the simulation temperature is set in the \$temperature variable. It is currently 298.15 Kelvin (room temperature). Try increasing and decreasing the temperature, and re-running metropolis_pl. How does this affect the energy, and the number of accepted and rejected moves?

# Phase Changes

Now you have an understanding of how the Metropolis Monte Carlo program works, it is time to make it do something useful. Try to change the temperature (\$temperature, box size (@box_size) and maximum translation (\$max_translate) variables so that you can make solid, then gas, then liquid krypton. You will need to change the variables, and then run the metropolis_pl script. For Monte Carlo to be efficient, you need about 40%-60% of the moves to be accepted. If too many moves are rejected, then reduce the value of \$max_translate. If too many moves are accepted, then increase \$max_translate.

Once you have working parameters, copy them into the C++ program metropolis.cpp. You can set the variables in the lines near the top of this program;

```// Set the number of atoms in the box
const int n_atoms = 25;

// Set the number of Monte Carlo moves to perform
const int num_moves = 500000;

// Set the size of the box (in Angstroms)
const double box_size[3] = { 10.0, 10.0, 10.0 };

// The maximum amount that the atom can be translated by
const double max_translate = 0.5;  // angstroms

// Simulation temperature
const double temperature = 298.15;   // kelvin
```

The variables have the same name as in Perl.

Compile the C++ program and run the program by typing;

```\$ g++ -O3 metropolis.cpp -o metropolis
\$ ./metropolis
```

Just like with the Perl program, you will get a print out of the moves, energy, and number of accepted and rejected steps, e.g.

```1000: 19.876323  264  736
2000: -1.941197  420  1580
3000: -8.801116  565  2435
4000: -10.333685  688  3312
5000: -10.266989  838  4162
6000: -13.508316  976  5024
7000: -3.880514  1141  5859
8000: -11.871016  1300  6700
9000: -9.966295  1455  7545
10000: -7.182279  1618  8382
```

Copy these into a spreadsheet program (like Excel) and graph the total energy versus move number. Also get the spreadsheet to calculate the acceptance ratio of the Monte Carlo moves. This is the number of accepted moves divided by the total number of moves (e.g. \$naccept / (\$naccept + \$nreject)). Is the acceptance ratio constant during the simulation? Does the simulation need to be equilibrated, or is the energy stable after 500,000 moves?

# Phase Space and Ensembles

You have simulated a periodic box of krypton. Each atom has three dimensions of motion; left-right, up-down and front-back. This means that the space of all possible locations of one atom is three dimensional. The space for all possible locations for two atoms is six dimensional. The space for all possible locations for N atoms is 3xN-dimensional(*). This space is given the name "phase space". A Monte Carlo simulation samples configurations from this phase space. If the simulation was run for an infinite time, then every possible configuration of the system would be sampled. The set of configurations sampled is called an "ensemble" (another word that means "collection"). The configurations in the ensemble appear with different probabilities. High energy structures appear with a low probability, while low energy structures appear with a high probability. The probability of a structure appearing in the collection can be calculated using the Boltzmann equation (download this document for a derivation and explanation of the Boltzmann equation).

In your simulation, the simulation was run using a constant number of particles (constant N), constant volume (constant V) and constant temperature (constant T). This meant that the simulation sampled an NVT ensemble, and the NVT Boltzmann equation was used to control the probability of whether a structure appeared in the ensemble. There are several different ensembles;

• NVE: The microcanonical ensemble. Constant number of particles (N), volume (V) and energy (E).
• NVT: The canonical ensemble. Constant number of particles (N), volume (V) and temperature (T).
• NPT: The isothermal-isobaric ensemble. Constant number of particles (N), pressure (P) and temperature (T).
• μVT: The grand canonical ensemble. Constant chemical potential (μ), volume (V) and temperature (T).

In a Monte Carlo simulation, the ensemble generated depends on the types of moves performed, and on the form of the Monte Carlo acceptance test. There are different Monte Caro tests for different ensembles. For example, for the canonical (NVT) ensemble, the Monte Carlo test is;

```exp( -ΔE / kT )  >= rand(0,1)
```

The exponential of the change in energy (ΔE) divided by Boltzmann's constant (k) times temperature (T), must be greater than or equal to a random number drawn uniformly between 0 and 1 for the test to be passed.

For the isothermal-isobaric ensemble, the Monte Carlo test is;

```exp( -( ΔE - pΔV ) / kT  +  N Δ ln V ) >= rand(0,1)
```

The exponential of the change in energy (ΔE) with the pressure (p) times the change in volume (ΔV), combined with the number of particles (N) and the change in the natural logarithm of the volume (Δ ln V) must be greater or equal to a random number drawn uniformly between 0 and 1 for the test to be passed.

Download this document for a derivation and explanation of the different Monte Carlo tests.

(*) Note that translating or rotating the entire set of all atoms does not change the atoms (as they all remain in relatively the same position). As there are three dimensions of translating the entire box of atoms, and three dimensions of rotating the entire box of atoms, the set of all atoms only has 3N-6 dimensions.

# Volume Moves

The Monte Carlo simulation you have run sampled configurations from the canonical ensemble. This meant that the volume of the periodic box was constant, and it was not possible to specify the pressure acting on the krypton atoms. We live in a constant pressure world, not a constant volume world. To be able to specify pressure, we need to sample from the isothermal-isobaric (NPT) ensemble. This can be achieved by making two changes to the metropolis_pl script;

1. The acceptance test must be changed from the NVT test to the NPT test.
2. Monte Carlo moves that change the volume of the box need to be added.

The first part is easy. Currently, the script uses the canonical MC acceptance test;

```        # Now apply the Monte Carlo test - compare
# exp( -(E_new - E_old) / kT ) >= rand(0,1)
\$x = exp( -(\$new_energy - \$old_energy) / \$kT );
```

This must be changed to the isothermal-isobaric test;

```        # Now apply the Monte Carlo test - compare
# exp( -(E_new - E_old + P(V_new - V_old)) / kT
#             +  N ln (V_new - V_old) ) >= rand(0,1)
\$x = exp( -(\$new_energy - \$old_energy + \$pressure * (\$V_new - \$V_old)) / \$kT
+ \$n_atoms * (log(\$V_new) - log(\$V_old) ) );
```

(note that natural logarithm in Perl in "log")

This uses the old and new volumes of the box, that are calculated and stored in the V_old and V_new variables via;

```    # calculate the old energy
\$old_energy = calculate_energy();

# calculate the old volume of the box
\$V_old = \$box_size[0] * \$box_size[1] * \$box_size[2];
```

and

```    # calculate the new energy
\$new_energy = calculate_energy();

# calculate the new volume of the box
\$V_new = \$box_size[0] * \$box_size[1] * \$box_size[2];
```

(note here that you can see the old and new volumes calculated after the old and new energies, so that you can work out where in the script these lines should be.).

As well as the volumes, the test also requires the pressure, which can go at the top of the script underneath the temperature;

```# Simulation temperature
\$temperature = 298.15;   # kelvin

# Simulation pressure
\$pressure = 1;   # atmospheres

# Convert pressure to internal units (kcal mol-1 A-3)
\$pressure = \$pressure * 1.458397506863647E-005;
```

(note that pressure is normally given in units of atmospheres, while the internal units of this program are kcal mol-1 for energy, and angstrom for length. This is why the pressure has to be converted to internal units above).

1. Make the above changes to your metropolis_pl script and see if this makes any difference to the simulation.
2. ADVANCED. Make the same changes to the metropolis.cpp program and see if this makes any difference.

Once you have made the changes and run the simulation, you should have found that adding in the volume test on its own has not changed the simulation at all. This is because the simulation has only one type of move - the translation of a single atom of krypton. This move does not change the volume of the periodic box, so V_new will always equal V_old. This means that V_new - V_old is always zero, so the NPT acceptance test becomes equal to the NVT acceptance test.

To have an NPT simulation, we need to add in a MC move that changes the volume of the box. This can be achieved by adding in a move that changes the volume of the box by a random amount. This move should only occur once for every \$n_atoms moves. This can be achieved using this code;

```    # save the old coordinates
@old_coords = ( \$coords[\$atom][0], \$coords[\$atom][1],
\$coords[\$atom][2] );

# save the old box dimensions
@old_box_size = @box_size;

# Decide if we are performing an atom move, or a volume move
if (rand(1.0) <= 1.0 / \$n_atoms)
{
\$is_volume_move = 1;

# 1 in \$n_atoms chance of being here. Perform a volume move
# by changing the volume for a random amount
\$delta_vol = rand(\$max_volume_change);

# Volume is the cube of the box length, so add the cube root
# of this change onto the box size
\$delta_vol = \$delta_vol**(1.0/3.0);

# Are we making the box bigger or smaller?
if (rand(1.0) >= 0.5)
{
\$box_size[0] = \$box_size[0] + \$delta_vol;
\$box_size[1] = \$box_size[1] + \$delta_vol;
\$box_size[2] = \$box_size[2] + \$delta_vol;
}
else
{
\$box_size[0] = \$box_size[0] - \$delta_vol;
\$box_size[1] = \$box_size[1] - \$delta_vol;
\$box_size[2] = \$box_size[2] - \$delta_vol;
}

# wrap the atoms back into the box
wrap_all_atoms();
}
else
{
\$is_volume_move = 0;

# Make the atom move - translate by a delta in each dimension
\$delta_x = \$max_translate * ( 2.0 * rand(1.0) - 1.0 );
\$delta_y = \$max_translate * ( 2.0 * rand(1.0) - 1.0 );
\$delta_z = \$max_translate * ( 2.0 * rand(1.0) - 1.0 );

\$coords[\$atom][0] = \$coords[\$atom][0] + \$delta_x;
\$coords[\$atom][1] = \$coords[\$atom][1] + \$delta_y;
\$coords[\$atom][2] = \$coords[\$atom][2] + \$delta_z;

# wrap the coordinates back into the box
\$coords[\$atom][0] = wrap_into_box(\$coords[\$atom][0], \$box_size[0]);
\$coords[\$atom][1] = wrap_into_box(\$coords[\$atom][1], \$box_size[1]);
\$coords[\$atom][2] = wrap_into_box(\$coords[\$atom][2], \$box_size[2]);
}
```

We start saving the old coordinates as before, but now we must save the old box size too, using the code;

```    # save the old box dimensions
@old_box_size = @box_size;
```

We then have to choose between performing an atom move, or a volume move. Each atom should move once between each volume move, so we choose a volume move with a probability of 1 / n_atoms, using the code;

```    # Decide if we are performing an atom move, or a volume move
if (rand(1.0) <= 1.0 / \$n_atoms)
{
\$is_volume_move = 1;
...
}
else
{
\$is_volume_move = 0;
...
}
```

The atom move code is the same as before. The volume move code is straightforward. This adds or subtracts a random volume chosen to have a maximum value of \$max_vol_change to each dimension of the box, using the code;

```        # 1 in \$n_atoms chance of being here. Perform a volume move
# by changing the volume for a random amount
\$delta_vol = rand(\$max_volume_change);

# Volume is the cube of the box length, so add the cube root
# of this change onto the box size
\$delta_vol = \$delta_vol**(1.0/3.0);

# Are we making the box bigger or smaller?
if (rand(1.0) >= 0.5)
{
\$box_size[0] = \$box_size[0] + \$delta_vol;
\$box_size[1] = \$box_size[1] + \$delta_vol;
\$box_size[2] = \$box_size[2] + \$delta_vol;
}
else
{
\$box_size[0] = \$box_size[0] - \$delta_vol;
\$box_size[1] = \$box_size[1] - \$delta_vol;
\$box_size[2] = \$box_size[2] - \$delta_vol;
}
```

Because the box has changed size, it may be necessary to wrap some atoms back into the box. This is achieved by calling the wrap_all_atoms() function, which has been added using;

```sub wrap_all_atoms
{
for (\$i=0; \$i<\$n_atoms; \$i = \$i + 1)
{
# wrap the coordinates back into the box
\$coords[\$i][0] = wrap_into_box(\$coords[\$i][0], \$box_size[0]);
\$coords[\$i][1] = wrap_into_box(\$coords[\$i][1], \$box_size[1]);
\$coords[\$i][2] = wrap_into_box(\$coords[\$i][2], \$box_size[2]);
}
}
```

The flag \$is_volume_move is used to say whether a volume move has been performed. It is equal to one if the move was a volume move, or zero if the move was an atom move. This is used in the code run when a move is accepted or rejected to increment the number of accepted and rejected volume moves, and also to restore the old box dimensions if the move failed, e.g.

```    if (\$accept == 1)
{
# accept the move
\$naccept = \$naccept + 1;

if (\$is_volume_move){ \$nvolaccept = \$nvolaccept + 1; }

\$total_energy = \$new_energy;
}
else
{
# reject the move - restore the old coordinates and box size
\$nreject = \$nreject + 1;

if (\$is_volume_move){ \$nvolreject = \$nvolreject + 1; }

\$coords[\$atom][0] = \$old_coords[0];
\$coords[\$atom][1] = \$old_coords[1];
\$coords[\$atom][2] = \$old_coords[2];

@box_size = @old_box_size;

if (\$is_volume_move){ wrap_all_atoms(); }

\$total_energy = \$old_energy;
}
```

The parameter for the move is the maximum volume change. This is normally 10% of the number of atoms, e.g. the script should also be modified to add in the maximum volume change;

```# The maximum amount that the atom can be translated by
\$max_translate = 0.5;  # angstroms

# The maximum amount to change the volume - the
# best value is 10% of the number of atoms
\$max_volume_change = 0.1 * \$n_atoms;   # Angstroms**3
```

Finally, for completeness, it is a good idea to print out information about the box size and number of accepted and rejected volume moves during a simulation, e.g.

```    # print the energy every 10 moves
if (\$move % 10 == 0)
{
print "\$move: \$total_energy  (\$naccept : \$nreject)  ".
"(\$nvolaccept : \$nvolreject)\n";

\$vol = \$box_size[0] * \$box_size[1] * \$box_size[2];

print "    Box size = @box_size. Volume = \$vol A^3\n";
}
```

1. Make the above changes to your metropolis_pl script so that you can perform volume moves. Run the script. Does it make any difference to the simulation? Note that you can download the program if you get stuck.

2. ADVANCED. Make the same changes to the C++ metropolis.cpp program. Run the program and see what happens during a longer simulation. Note that you can download the program if you get stuck.

# NPT Simulations

Use the NPT version of metropolis_pl and metropolis.cpp to run Monte Carlo simulations of krypton at different temperatures and pressures. If you haven't made your own versions of these programs, you can download them from here;

Remember, you have to compile the C++ version every time you change it, using;

```\$ g++ -O2 npt_metropolis.cpp -o npt_metropolis
```

Change the temperature, pressure and move delta to run Monte Carlo simulations of krypton in the gas, solid and liquid phases. Do the temperatures and pressures of the phase changes correspond to the experimental melting and boiling points of krypton? If not, why do you think they could be wrong?

# ProtoMS

ProtoMS is a program that can be used to run Monte Carlo simulations. It is very fast and very powerful. It is relatively easy to use, and the manual can be downloaded from here.

If you want to install ProtoMS yourself, you can do so by going to the project webpage and clicking on the downloads tab. You should download the latest current version of the code (currently version 2.22).

ProtoMS can run on Linux, Windows (using cygwin), or Mac OS X. Once you have downloaded the code, untar it using the command;

```tar -zxvf ProtoMS_2.22.tgz
```

Note that you only need to download and compile ProtoMS is a copy of the program is not already installed on your computer.

Now you should download a set of tutorial input files for ProtoMS. They can be found here. Download and unpack the files and change into the protoms_tutorial/oscillator directory. This contains a simple Monte Carlo simulation of a harmonic oscillator. The simulation is described in the file input.cmd, with the input coordinates coming from the PDB file harm1t2.pdb, and the forcefield parameters in oscillators.ff. The simulation performs one million MC moves on the harmonic oscillator, which should take about 1-3 seconds to complete (ProtoMS is very fast!).

Question
The average energy of the harmonic oscillator is accumulated from the energies of each configuration of the simulation. The average energy of a harmonic oscillator can be calculated analytically from thermodynamics by using the equation;
```E_{average} = 0.5 * k * T
```

where k is Boltzmann's constant (1.987 * 10-3 kcal mol-1 K-1), and T is temperature (298.15 K). The theoretical average energy should thus be 0.2961 kcal mol-1. How close does ProtoMS get using one million moves?

Question
How many moves does ProtoMS need to run to calculate the correct average energy to 4 decimal places?

# Intramolecular Moves

To perform Monte Carlo simulations of molecules, you must have a way of changing the internal coordinates of the molecule, e.g. Monte Carlo moves that change bond lengths, angles and dihedrals in a molecule must be performed.

Change into the directory protoms_tutorial/ethane_methanol/gas_phase. In this directory you will see a ProtoMS input file called input.cmd. This input file describes a gas-phase simulation of ethane. Run the input file by typing

```\$ /path/protoms2 input.cmd
```

where /path/ is the full directory path to the location in which the protoms2 executable is installed.

This script will have run 500 intramolecular Monte Carlo moves of methane. Look at the trajectory by copying across the snapshot-*.pdb files and viewing them in a viewer like VMD (note that there is a movie.vmd VMD movie file already in this directory in case you are using VMD on windows).

Look at how the molecule moves. The movement is controlled by a z-matrix. This z-matrix is present in the protoms_tutorial/ethane_methanol/in directory. For ethane, it is called ethane.ff. Take a look at this file. There are many different parameters in this file. The z-matrix is contained in the lines;

```
mode template
solute ethane
info translate 0.0 rotate 0.0
atom  C01 ETH 3000 3000  DM3 DUM DM2 DUM DM1 DUM
atom  C05 ETH 3001 3001  C01 ETH DM3 DUM DM2 DUM
atom  H06 ETH 3002 3002  C05 ETH C01 ETH DM3 DUM
atom  H07 ETH 3003 3003  C05 ETH C01 ETH H06 ETH
atom  H08 ETH 3004 3004  C05 ETH C01 ETH H06 ETH
atom  H02 ETH 3005 3005  C01 ETH C05 ETH H06 ETH
atom  H03 ETH 3006 3006  C01 ETH C05 ETH H02 ETH
atom  H04 ETH 3007 3007  C01 ETH C05 ETH H02 ETH
bond  H02 ETH  C01 ETH flex 0.01
bond  H03 ETH  C01 ETH flex 0.01
bond  H04 ETH  C01 ETH flex 0.01
bond  C05 ETH  C01 ETH flex 0.01
bond  H06 ETH  C05 ETH flex 0.01
bond  H07 ETH  C05 ETH flex 0.01
bond  H08 ETH  C05 ETH flex 0.01
angle H06 ETH C05 ETH C01 ETH flex 0.5
angle H07 ETH C05 ETH C01 ETH flex 0.5
angle H08 ETH C05 ETH C01 ETH flex 0.5
angle H02 ETH C01 ETH C05 ETH flex 0.5
angle H03 ETH C01 ETH C05 ETH flex 0.5
angle H04 ETH C01 ETH C05 ETH flex 0.5
dihedral H02 ETH C01 ETH C05 ETH H06 ETH flex 15.0
```

The first set of lines describe how the molecule is constructed, e.g.;

```atom  H07 ETH 3003 3003  C05 ETH C01 ETH H06 ETH
```

this line says that the atom called H07 is bonded to the atom called C05, forms an angle with C05 and C01 and a dihedral with C05, C01 and H06.

Further down in the file is a line that reads;

```angle H07 ETH C05 ETH C01 ETH flex 0.5
```

This says that the angle H07-C05-C01, which is contained in the z-matrix, is flexible, and that Monte Carlo moves that change this angle should be performed, using a maximum angle delta of 0.5 degrees.

Similarly, the line;

```dihedral H02 ETH C01 ETH C05 ETH H06 ETH flex 15.0
```

says that the dihedral H02-C01-C05-H06, which is contained in the z-matrix, is flexible, and Monte Carlo moves that change this dihedral should be performed using a maximum dihedral angle change of 15 degrees.

Question
The protoms_tutorial/ethane_methanol/in directory also contains coordinate and parameter files for methanol. Update input.cmd so that you can run a gas-phase simulation of methanol.

Question
The flex lines in the z-matrix specify the maximum bond, angle and dihedral deltas for intramolecular Monte Carlo moves. The number of accepted and attempted Monte Carlo moves is printed in the output of ProtoMS on lines that look like this;

```INFO     Data collection complete. Attempted/accepted moves          100          77
```

Try changing the bond, angle and dihedral maximum deltas and see if you can improve the quality of sampling of ethane and methanol in the gas-phase.

Do you think that the optimum bond, angle and dihedral deltas would be suitable for use for a water phase simulation?

# Solvent Rigid Body Moves

Molecular simulation nearly always requires modelling of solvent. Solvent plays a key and often little understood role in many molecular and biomolecular processes, and its correct modelling is crucial to allow the results of molecular simulation to be properly compared with experiment.

Solvent in Monte Carlo simulations can be modelled in many ways. The most common is to use an explicit model of the solvent. This means that a model of each solvent molecule is constructed, and these molecules are used to solvate the solute or system being studied.

Change to the protoms_tutorial/ethane_methanol/solvated directory. This contains an input.cmd file that runs a simulation of ethane solvated in a periodic box of water molecules. The TIP4P model is used to represent water. Run the simulation by typing;

```\$ /path/protoms2 input.cmd
```

(remember to replace /path/ with the full directory path to your protoms2 executable).

Visualise the resulting snapshot PDB files from the simulation using a molecular viewer such as VMD (and again, a movie.vmd file is already provided for you if you are using VMD on windows).

You will not see much movement in the simulation because it was very short. However, what you should see is that the water molecules move as rigid bodies. ProtoMS moves solvent molecules using rigid body translation and rotation moves. The maximum amount to translate and rotate each solvent molecule is set in the parameter file for the solvent. This is contained in the file protoms_tutorial/ethane_methanol/in/solvents.ff. This contains parameters for many different solvent molecules. The relevant lines for the TIP4P water model are;

```#
# TIP4P (T4P)
#
#     O00       dist(OH) = 0.9572 A
#   /  |  \     dist(OM) = 0.15 A
# H01 M03 H02   ang(HOH) = 104.52 deg
#

mode clj
par 2003  OW  8  0.000  3.15363  0.1550
par 2004  HW  1  0.520  0.0      0.0
par 2005  ??  0 -1.040  0.0      0.0

mode template
solvent T4P
info translate 0.15 rotate 15.0
atom O00 2003 2003
atom H01 2004 2004
atom H02 2004 2004
atom M03 2005 2005
```

These lines provide the change and Lennard Jones parameters for each of the atoms in water. They also provide the maximum amount by which to translate and rotate the molecule, e.g.

```info translate 0.15 rotate 15.0
```

means that each water molecule will be translated by a maximum of 0.15 angstroms, and rotated about a random vector by a maximum of 15 degrees.

Question
Increase the number of Monte Carlo moves performed during the simulation. This is specified in the input.cmd file. Increase the number until the movie shows that the waters are diffusing through the box.

Question
Edit the input.cmd file so that you can simulate methanol in a box of water. Run a long simulation to allow the water to equilibrate around the methanol. How does the water structure around the final snapshot from the simulation of methanol compare to the water structure from the final snapshot from the simulation of ethane?

Question
Edit the z-matrix of ethane and methanol to change the bond, angle and dihedral maximum deltas. Try to optimise these values to get a good acceptance ratio and to get a large amount of motion of the solute. Also edit the solvents.ff file to change the translation and rotation deltas of TIP4P. Again, aim to get the most motion of the water, while keeping a high acceptance ratio (about 50-60%).

# Free Energy Calculations

Monte Carlo simulations can be used to create large numbers of configurations of a molecular system, with each configuration appearing with its correct (relative) Boltzmann probability. This collection of configurations is called an ensemble. You can do many things with a correct Boltzmann weighted ensemble of configurations. One possibility is to use it to calculate a free energy.

Change to the protoms_tutorial/ethane_methanol/free_energy directory. In here, you will see the ProtoMS input command file input.cmd. This script performs a simulation that calculates the relative hydration free energy of ethane and methanol. The simulation works by simulating both ethane and methanol simultaneously in the centre of a box of water. A lambda coordinate is used to turn off the interactions between ethane and water as the interactions between methanol and water are switched on. This is a "dual topology" method of calculating a relative free energy. The input file simulates the system at lambda=0.5. This means that 50% of the methanol and 50% of the ethane interactions with solvent are calculated. The energy difference for each configuration and lambda=0 and lambda=1 is calculated. This energy difference is averaged using the equation;

```dG = -kT ln < exp( -dE / kT ) >_lambda
```

where dE is the energy difference between lambda=1 or lambda=0 and lambda=0.5. kT is Boltzmann's constant multiplied by temperature. The angle brackets < ... > mean "average over configurations in an ensemble", with _lambda meaning the ensemble generated for a specific value of lambda. The resulting value, dG is the difference in free energy between lambda=1 and lambda=0.5, or lambda=0 and lambda=0.5. The sum of these two free energy differences is the relative hydration free energy.

Run the free energy simulation by typing;

```\$ /path/protoms2 input.cmd
```

The final lines of the output contain the results. Look for the lines;

```RESULTS Backwards Free Energy =      1.9251908496670806503     ( 2.013836560     )
RESULTS Forwards  Free Energy =     0.72142800990255362414     ( 1.045066552     )
```

These give the free energy average (and error in brackets) for the 0.5->0.0 perturbation and the 0.5->1.0 perturbation. As you can see, the error is very large, because the simulation is too short.

Question
Increase the number of steps of the simulation and see if you can reduce the size of the errors. What is the predicted relative hydration free energy of ethane and methanol from your simulation? How does this compare to experiment? What sources of error are there in the simulation?

Question
Another way to reduce the size of the errors is to reduce the gap between the reference state (lambda=0.5) and the perturbed states (lambda=0 and lambda=1). Run a series of simulations that break the relative free energy into a series of steps, e.g. 0.1->0.0 and 0.1->0.2, then 0.3->0.2 and 0.3->0.4, then ... 0.9->0.8 and 0.9->1.0.

Run the simulations and see whether the resulting relative hydration free energy agrees more closely with experiment.

# What's next?

This workshop has provided only a short introduction to Monte Carlo. To learn more, I would recommend that you follow a workshop that uses ProtoMS to teach free energy methods. The workshop is available here, and the workshop files are available from here. I helped produce this workshop (with Dr. Julien Michel), and both he and I are happy to answer any questions you may have if you run into problems trying to run this workshop on your own computers.

Finally, while you have now learned a bit about Monte Carlo, and seen some programs, this workshop has not provided much of the mathematical background behind Monte Carlo. Daan Frenkel has produced an excellent book chapter that provides a complete background on Monte Carlo methods. You can download it from here, or from its original location here (this last link is to the complete book, which is well worth reading).

I would also recommend the following books;

The best way to learn Monte Carlo is to use it. Please play with ProtoMS (or other Monte Carlo programs, such as MCPRO or MCCCS Towhee). Read and work through the tutorials and then try to adapt the examples so that you can model your own systems. Above all, have fun!