Efficient Vectorisation with C++
Welcome to a short course that will teach you what vectorisation is, and how you can use it to speed up your C++ programs. Learning how to efficiently vectorise your code is important to allow you to make good use of the increasingly large vector units found on modern multicore and massively multicore processors.
This course will teach you what vectorisation means, and how it can be achieved at a variety of levels. This will also discuss why processor manufacturers are increasingly turning to vectorisation to improve processor speed, and you will also learn how to lay out the data in your program in memory to ensure optimal performance. By combinging efficient vectorisation with parallel programming (i.e. as described in my Parallel Programming in C++ course), you will ensure that your program makes optimal use of modern processors such as Xeon and Xeon Phi.
To follow this course you should already have a good basic understanding of C++, e.g. loops, functions, containers and classes. In addition, this course will use modern C++ (C++ 2014).
NOTE - this course will assume that you are compiling using the g++ command, via gcc version 5 or above, or clang version 3.7 or above. This is available for Windows (e.g. via MSYS2), Linux or OS X. The course will also assume that you are comfortable using the command line, and a text editor, such as nano
or vim
.
To start, you will need to download all of the course material. This is available by clicking here. This will download a file called workshop.tgz
. Unpack this file using the command
tar -zxvf workshop.tgz
(if you are on windows, type this into an MSYS2
command shell)
This will unpack a directory called workshop
. Change into this directory by typing
cd workshop
Typing ls
should show you the following files;
include test.cpp
You can test that your compiler is installed and working by typing on Linux or Windows (MSYS2) with GCC;
g++ -O2 --std=c++14 -fopenmp-simd test.cpp -Iinclude -o test
./test
or by typing on OS X (or on Linux with clang);
g++ -O2 --std=c++14 -openmp-simd test.cpp -Iinclude -o test
./test
(GCC uses -fopenmp-simd
while clang uses -openmp-simd
)
If this works, you should see output
Everything is ok :-)
If not, then something went wrong. Double-check your installation of GCC or clang.