Step 3: QM/MM Monte Carlo

In the last step, you calculated the difference between the QM/MM and MM interaction energy for only a single configuration of the side-chain interacting with the water box. This difference will depend on the configuration of the waters around the side-chain. We thus need to sample different configurations of the water molecules. Unfortunately, QM/MM energies are computationally expensive to calculate, making direct Monte Carlo sampling of the QM/MM model too slow. We will therefore use multiscale Monte Carlo to accelerate sampling of the QM/MM model by running most Monte Carlo moves using the MM model.

Multiscale Monte Carlo works by defining a “slow” model and a “fast” model. In this case, the “slow” model is QM/MM, while the “fast” model is MM.

where “SIDECHAIN” is the name of the side-chain (“PHE” for phenylalanine and “TRP” for tryptophan) and “METHOD” is one of the five semi-empirical methods (“AM1”, “AM1/d”, “PM3”, “PM6” or “DFTB”). For example, to sample water configurations around phenylalanine using the DFTB model, you should type;

The script will perform 20 multiscale Monte Carlo moves, using 5000 MM moves per multiscale move (so 100,000 moves in total). The energies used for each multiscale Monte Carlo move are printed, together with whether the move is accepted or rejected. The average energy difference between the QM/MM and MM models is printed at the end. The coordinates from each move are saved to the files “coordsXXX.pdb”, where “XXX” is the multiscale Monte Carlo move number.

EXERCISE 2 - Use multiscale Monte Carlo to calculate the average energy difference

Choose one of the semi-empirical models and remember this choice (you must use this same semi-empirical model for the rest of the practical). If possible, choose a different semi-empirical model to the people sitting next to you, so that you can later compare your result. Use “sample.py” to sample the water molecules around each of the side-chains. Calculate the average difference in QM/MM and MM interaction energy for each side-chain for your choice of semi-empirical model. Place these differences into a table. If possible, talk to the other people in your class to get the average energy differences for the other semi-empirical models, and add these to your table. If you want, you can also view a movie of your simulation using VMD. You can do this by typing;

QUESTION 2.1 - How does the average difference between the QM/MM and MM interaction energies compare with the single-point energy difference calculated in the last exercise?

QUESTION 2.2 - Which of the semi-empirical models show the strongest average interaction energy between the side-chains and water? How does this compare to the MM interaction energy?

QUESTION 2.3 - Do you expect the average difference between the QM/MM and MM interaction energies to be the same every time you run sample.py? If not, why not? How could you calculate a statistical error?

If you get stuck or need help, you can take a peek at this model answer for exercise 2.

A diagram of a multiscale Monte Carlo move is shown above. The Monte Carlo move starts at configuration i. The energy of this configuration is evaluated using the “slow” QM/MM model giving EQM/MM(i) and on the “fast” MM model giving EMM(i). Standard Metropolis Monte Carlo moves are then attempted from configuration i using only the “fast” MM model, until after a set number of moves (in our case, 2000), the system is in configuration j. The energy of configuration j is evaluated using both the QM/MM and MM models giving EQM/MM(j) and EMM(j). Configuration j is then accepted into the QM/MM ensemble according to the probability min(1, exp{-βΔΔE}), where ΔΔE = (EQM/MM(j) - EMM(j)) - (EQM/MM(i) - EMM(i)).

We are now going to run a simple python script that uses multiscale Monte Carlo to sample the water using the QM/MM model. This script is called “sample.py” and you should find it in the current directory (the qmmm directory). If you are curious, you can take a look at this script. In particular, the “multiMCMove” function defined near the end of the script shows the detail of how to perform a multiscale Monte Carlo move.

You can run the script using the command;