`numba`

is the first tool we will explore to accelerate the code.

`numba`

is a “just-in-time” compiler. It works at the level of a function, by compiling that function to machine code just before it is executed (just in time!).

The machine code is executed directly on the processor, thus bypassing the Python virtual machine, and therefore running quicker. The inputs to the function are passed into this function, this all executes as machine code, and then the results are passed by to Python.

You use `numba`

by marking functions that you want to be “just-in-time” (jit) compiled using the `numba.jit`

decorator. For example, here is a very simple function that calculates the square root of an array of numbers;

```
import math
import numpy as np
def calculate_roots(numbers):
num_vals = len(numbers)
result = np.zeros(num_vals, "f")
for i in range(0, num_vals):
result[i] = math.sqrt(numbers[i])
return result
```

Let’s see how long this takes to calculate 10 million square roots. We’ll do this by asking `numpy`

to generate an array of 10 million random numbers between 0 and 500.

`numbers = 500.0 * np.random.rand(10000000)`

Now let’s time the function using `timeit`

`timeit(calculate_roots(numbers)`

On my computer I get this result;

`1.42 s ± 25.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)`

So, it takes ~1.4 seconds to calculate 10 million square roots.

We can speed this up by asking `numba`

to jit our `calculate_roots`

function. We do this by adding the `@numba.jit()`

decorator to the function, e.g.

```
import numba
@numba.jit()
def calculate_roots(numbers):
num_vals = len(numbers)
result = np.zeros(num_vals, "f")
for i in range(0, num_vals):
result[i] = math.sqrt(numbers[i])
return result
```

Now lets time the function…

`timeit(calculate_roots(numbers))`

On my computer I get this result;

`5.16 ms ± 35.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)`

It now takes ~5 milliseconds(!) to calculate 10 million square roots. This is almost 300 times faster, just by adding a single `@numba.jit()`

to the top of the function.

Add a `@numba.jit()`

decorator to the `calculate_scores`

function.

Using the

`timeit`

function, measure how long the function now takes to complete. How many times faster is the function compared to before you added the`@numba.jit()`

decorator?Now measure how long the total script takes to run, using, e.g. the

`time`

function on MacOS/Linux, or`Measure-Command`

on Windows. How many times faster is the script compared to before you added the`@numba.jit()`

decorator?Does the speed up of the function match the speed up of the overall script?

Part of the runtime of the script is the just-in-time compilation of the `calculate_scores`

function. The more complex the function, the longer it takes for `numba`

to create and then compile the function to machine code.

You can cache the result of compilation by passing `cache=True`

to the decorator, e.g.

`@numba.jit(cache=True)`

You will still pay the cost of compilation the first time you run your script. But subsequent runs will load the cached machine code and will use that (thereby avoiding compiling the code again).

Make the change to cache the results of JIT compilation in your copy of `slow.py`

.

- How does this affect the runtime of your script?

The line

`(ids, varieties, data) = slow.load_and_parse_data(5)`

loads only 5% of the data. This is because, before numba, processing more than 5% of the data took too long. Now that you have accelerated the code, increase this to 100% of the data, e.g.

`(ids, varieties, data) = slow.load_and_parse_data(100)`

Using

`timeit`

in your Jupyter notebook, how much long does it take`calculate_scores`

to run now? Is this about what you expect (twenty times longer to process twenty times the amount of data)?Make the change to your copy of

`slow.py`

. Does increasing the amount of data processed by twenty times change the total runtime of this script?