line_profiler is an excellent tool that can help you quickly profile your python code and find where the performance bottlenecks are. In this blog post I will walk through a simple example and a few tips about using this tool within the Jupyter notebook.

## Installation¶

To install `line_profiler`

with Anaconda, simply do:

`conda install line_profiler`

or if you use `pip`

, install with:

`pip install line-profiler`

## Using it with Jupyter notebook¶

To use `line_profiler`

, normally you'd need to modify your code and decorate the functions you want to profile with `@profile`

.

```
@profile
def slow_function(a, b, c):
...
```

However one trick that I find especially useful is to use the `line_profiler`

extension within Jupyter. This eliminates the need to modify the code with the `@profile`

decorator, and instead you simply load the extension by doing

```
%load_ext line_profiler
```

## Example¶

Here is a simple (and somewhat contrived) example to illustrate how to use this.

Suppose you have a sequence of coordinates in (latitude / longitude) that represents a random walk from an origin. At each step the latitude and longitude change is determined by drawing from a Normal distribution:

```
import numpy as np
import pandas as pd
```

```
origin = {'lat': 34, 'lon': -120}
np.random.seed(1)
n = 100000
changes = np.random.randn(n, 2) / 30
changes[0] = [0, 0]
trace = pd.DataFrame.from_records(changes, columns=['lat', 'lon']).cumsum()
trace['lat'] += origin['lat']
trace['lon'] += origin['lon']
trace.head()
```

And suppose you are interested in computing the maximum distance from the origin for the duration of the random walk. To compute distances between two points on Earth, we can use the haversine formula

```
from math import radians, cos, sin, asin, sqrt
def haversine(lat1, lon1, lat2, lon2):
"""
Calculate the great circle distance between two points
on the earth (specified in lat/lon)
"""
# convert to radians
lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
# haversine formula
dlon = lon2 - lon1
dlat = lat2 - lat1
a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlon / 2)**2
c = 2 * asin(sqrt(a))
earth_radius = 6367
distance_km = earth_radius * c
return distance_km
```

this is what the trace of the random walk looks like:

```
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn; seaborn.set()
import matplotlib as mpl
mpl.rcParams['axes.formatter.useoffset'] = False
```

```
plt.plot(trace.lat, trace.lon)
```

We can then implement a few funtions to find the maximum distance from the random walk to the origin:

```
def get_distances(trace, origin):
distances = {}
for i in trace.index:
distances[i] = haversine(trace['lat'].loc[i], trace['lon'].loc[i], origin['lat'], origin['lon'])
distances = pd.Series(distances)
return distances
def get_farthest(trace, origin):
distance = get_distances(trace, origin)
max_idx = distance.argmax()
return trace.loc[max_idx], distance.loc[max_idx]
```

```
get_farthest(trace, origin)
```

The performance of this implementation is not great, however. We can use the timeit extension to quickly time the runtime performance:

```
%timeit get_farthest(trace, origin)
```

To further understand why this is slow, we can profile the code line by line with `%lprun`

, where "lp" stands for "line profiler", and the `-f`

argument specifies the function you'd like to profile

```
%lprun -f get_farthest get_farthest(trace, origin)
```

Timer unit: 1e-06 s Total time: 18.9899 s File:Function: get_farthest at line 8 Line # Hits Time Per Hit % Time Line Contents ============================================================== 8 def get_farthest(trace, origin): 9 1 18985235 18985235.0 100.0 distance = get_distances(trace, origin) 10 1 720 720.0 0.0 max_idx = distance.argmax() 11 1 3905 3905.0 0.0 return trace.loc[max_idx], distance.loc[max_idx]

We therefore see that the time to do `argmax()`

is negligible, but the distance computation `get_distances()`

is the hot spot. We can then further analyze `get_distances()`

by doing:

```
%lprun -f get_distances get_farthest(trace, origin)
```

Timer unit: 1e-06 s Total time: 19.2948 s File:Function: get_distances at line 1 Line # Hits Time Per Hit % Time Line Contents ============================================================== 1 def get_distances(trace, origin): 2 1 2 2.0 0.0 distances = {} 3 100001 99179 1.0 0.5 for i in trace.index: 4 100000 19165708 191.7 99.3 distances[i] = haversine(trace['lat'].loc[i], trace['lon'].loc[i], origin['lat'], origin['lon']) 5 1 29945 29945.0 0.2 distances = pd.Series(distances) 6 1 2 2.0 0.0 return distances

and we see that the overhead of creating the `pd.Series`

is negligible, but the loop with the `haversine()`

computation is the hot spot. So we chase it down further:

```
%lprun -f haversine get_farthest(trace, origin)
```

Timer unit: 1e-06 s Total time: 0.941073 s File:Function: haversine at line 2 Line # Hits Time Per Hit % Time Line Contents ============================================================== 2 def haversine(lat1, lon1, lat2, lon2): 3 """ 4 Calculate the great circle distance between two points 5 on the earth (specified in lat/lon) 6 """ 7 # convert to radians 8 100000 281846 2.8 29.9 lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2]) 9 # haversine formula 10 100000 80497 0.8 8.6 dlon = lon2 - lon1 11 100000 60033 0.6 6.4 dlat = lat2 - lat1 12 100000 233382 2.3 24.8 a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlon / 2)**2 13 100000 112446 1.1 11.9 c = 2 * asin(sqrt(a)) 14 100000 56156 0.6 6.0 earth_radius = 6367 15 100000 61922 0.6 6.6 distance_km = earth_radius * c 16 100000 54791 0.5 5.8 return distance_km

Now notice the "Total time" value has dropped from around 19s to 0.94s. This suggests that the expensive part is not actually within `haversine()`

itself, but the loop around it.

## Vectorization¶

The above example illustrates a key concept when using `numpy`

and `pandas`

known as vectorization. Essentially, we want to avoid writing loops in python, and instead give `numpy`

or `pandas`

an entire "vector" (i.e. `numpy`

array) to work with. There are many excellent resources on this topic including Modern Pandas by Tom Augspurger, Why Python is Slow by Jake VanderPlas, and Performance Pandas by Jeff Reback.

In this example, vectorization means being able to pass the entire lat/lon arrays into the `haversine()`

computation somehow, instead of looping through one point at a time.

Luckily it is not too hard to modify `haversine()`

to make it work with arrays:

```
def haversine_vectorized(lat1, lon1, lat2, lon2):
"""
Calculate the great circle distance between two points
on the earth. Note that lat1/lon1/lat2/lon2 can either be
scalars or numpy arrays.
"""
# convert to radians
lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
# haversine formula
dlon = lon2 - lon1
dlat = lat2 - lat1
a = np.sin(dlat / 2)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2)**2
c = 2 * np.arcsin(np.sqrt(a))
earth_radius = 6367
distance_km = earth_radius * c
return distance_km
```

Note that the code change is fairly minimal. Essentially all we have done here is replacing all the `math`

functions with the `numpy`

equivalent which can operate on entire arrays instead of just scalars.

In fact most of the functions have identical names except for `math.asin()`

and `np.arcsin()`

. Also note that `numpy`

broadcasting will take care of the case where `lat1`

is an array and `lat2`

is a scalar.

Now our implementation becomes: (note the loop is gone)

```
def get_distances_vectorized(trace, origin):
distances = haversine_vectorized(trace['lat'], trace['lon'], origin['lat'], origin['lon'])
return distances
def get_farthest_vectorized(trace, origin):
distance = get_distances_vectorized(trace, origin)
max_idx = distance.argmax()
return trace.loc[max_idx], distance.loc[max_idx]
```

Let's make sure the results are the same:

```
get_farthest(trace, origin)
```

```
get_farthest_vectorized(trace, origin)
```

And we are getting a tremendous increase in performance with vectorization, going from more than 9 seconds to 15 milliseconds!

```
%timeit get_farthest_vectorized(trace, origin)
```