# Fibonacci Numbers in Python

## Fibonacci numbers¶

The Fibonacci numbers are defined recursively by the following difference equation:

\left\{ \begin{aligned} F_{n} & = F_{n-1} + F_{n-2} \\ F_1 & = 1 \\ F_0 & = 0 \\ \end{aligned} \right.

It is easy to compute the first few elements in the sequence:

$0, 1, 1, 2, 3, 5, 8, 13, 21, 34 \cdots$

## Derivation of the general formula¶

It is possible to derive a general formula for $F_n$ without computing all the previous numbers in the sequence. If a gemetric series (i.e. a series with a constant ratio between consecutive terms $r^n$) is to solve the difference equation, we must have

\begin{aligned} r^n = r^{n-1} + r^{n-2} \\ \end{aligned}

which is equivalent to

\begin{aligned} r^2 = r + 1 \\ \end{aligned}

This equation has two unique solutions \begin{aligned} \varphi = & \frac{1 + \sqrt{5}}{2} \approx 1.61803\cdots \ \psi = & \frac{1 - \sqrt{5}}{2} = 1 - \varphi = - {1 \over \varphi} \approx -0.61803\cdots \ \end{aligned}

In particular the larger root is known as the golden ratio \begin{align} \varphi = \frac{1 + \sqrt{5}}{2} \approx 1.61803\cdots \end{align}

Now, since both roots solve the difference equation for Fibonacci numbers, any linear combination of the two sequences also solves it

\begin{aligned} a \left(\frac{1 + \sqrt{5}}{2}\right)^n + b \left(\frac{1 - \sqrt{5}}{2}\right)^n \\ \end{aligned}

It's not hard to see that all Fibonacci numbers must be of this general form because we can uniquely solve for $a$ and $b$ such that the initial conditions of $F_1 = 1$ and $F_0 = 0$ are met

\left\{ \begin{aligned} F_0 = 0 = a \left(\frac{1 + \sqrt{5}}{2}\right)^0 + b \left(\frac{1 - \sqrt{5}}{2}\right)^0 \\ F_1 = 1 = a \left(\frac{1 + \sqrt{5}}{2}\right)^1 + b \left(\frac{1 - \sqrt{5}}{2}\right)^1 \\ \end{aligned} \right.

yielding

\left\{ \begin{aligned} a = \frac{1}{\sqrt{5}} \\ b = \frac{-1}{\sqrt{5}} \\ \end{aligned} \right.

We have therefore derived the general formula for the $n$-th Fibonacci number

\begin{aligned} F_n = \frac{1}{\sqrt{5}} \left(\frac{1 + \sqrt{5}}{2}\right)^n - \frac{1}{\sqrt{5}} \left(\frac{1 - \sqrt{5}}{2}\right)^n \\ \end{aligned}

Since the second term has an absolute value smaller than $1$, we can see that the ratios of Fibonacci numbers converge to the golden ratio

\begin{aligned} \lim_{n \rightarrow \infty} \frac{F_n}{F_{n-1}} = \frac{1 + \sqrt{5}}{2} \end{aligned}

## Various implementations in Python¶

Writing a function in Python that outputs the $n$-th Fibonacci number seems simple enough. However even in this simple case one should be aware of some of the computational subtleties in order to avoid common pitfalls and improve efficiency.

### Common pitfall #1: inefficient recursion¶

Here's a very straight-forward recursive implementation

In [1]:
import math
from __future__ import print_function
In [2]:
def fib_recursive(n):
if n == 0:
return 0
elif n == 1:
return 1
else:
return fib_recursive(n-1) + fib_recursive(n-2)
In [3]:
print([fib_recursive(i) for i in range(20)])
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181]

this seems to work fine, however the recursion overhead is actually very significant when $n$ is just slightly large. Here I'm computing $F_{34}$ and it takes more than 3 seconds! (on a 2013 model Macbook Air)

In [4]:
%timeit fib_recursive(34)
1 loops, best of 3: 3.58 s per loop

The overhead incurred by creating a large number of stack frames is tremendous. Python by default does not perform what's known as tail recursion elimination http://stackoverflow.com/questions/13543019/why-is-recursion-in-python-so-slow, and therefore this is a very inefficient implemenation. In contrast, if we have an iterative implementation, the speed is dramatically faster

In [5]:
def fib_iterative(n):
a, b = 0, 1
while n > 0:
a, b = b, a + b
n -= 1
return a
In [6]:
%timeit fib_iterative(34)
100000 loops, best of 3: 4.59 µs per loop

Now, let's see if we can make it even faster by eliminating the loop altogether and just go straight to the general formula we derived earlier

In [7]:
def fib_formula(n):
golden_ratio = (1 + math.sqrt(5)) / 2
val = (golden_ratio**n - (1 - golden_ratio)**n) / math.sqrt(5)
return int(round(val))
In [8]:
%timeit fib_formula(34)
1000000 loops, best of 3: 1.36 µs per loop

Even faster, great! And since we are not looping anymore, we should expect to see the computation time to scale better as $n$ increases. That's indeed what we see:

In [9]:
import pandas as pd
import numpy as np
In [10]:
%matplotlib inline
import matplotlib.pyplot as plt
from IPython.core.pylabtools import figsize
figsize(15, 5)
In [11]:
elapsed = {}
elapsed['iterative'] = {}
elapsed['formula'] = {}
for i in range(34):
result = %timeit -n 10000 -q -o fib_iterative(i)
elapsed['iterative'][i] = result.best
result = %timeit -n 10000 -q -o fib_formula(i)
elapsed['formula'][i] = result.best
In [12]:
elapased_ms = pd.DataFrame(elapsed) * 1000
elapased_ms.plot(title='time taken to compute the n-th Fibonaccis number')
plt.ylabel('time taken (ms)')
plt.xlabel('n')
Out[12]:
<matplotlib.text.Text at 0x107c72ed0>

Indeed as we expect, the iterative approach scales linearly, while the formula approach is basically constant time.

However we need to be careful with using a numerical formula like this for getting integer results.

### Common pitfall #2: numerical precision¶

Here we compare the actual values obtained by fib_iterative() and fib_formula(). Notice that it does not take a very large n for us to run into numerical precision issues.

When n is 71 we are starting to get different results from the two implementations!

In [13]:
df = {}
df['iterative'] = {}
df['formula'] = {}
df['diff'] = {}

for i in range(100):
df['iterative'][i] = fib_iterative(i)
df['formula'][i] = fib_formula(i)
df['diff'][i] = df['formula'][i] - df['iterative'][i]
df = pd.DataFrame(df, columns=['iterative', 'formula', 'diff'])
df.index.name = 'n-th Fibonacci'
df.ix[68:74]
Out[13]:
iterative formula diff
n-th Fibonacci
68 72723460248141 72723460248141 0
69 117669030460994 117669030460994 0
70 190392490709135 190392490709135 0
71 308061521170129 308061521170130 1
72 498454011879264 498454011879265 1
73 806515533049393 806515533049395 2
74 1304969544928657 1304969544928660 3

You can see that fib_iterative() produces the correct result by eyeballing the sum of $F_{69}$ and $F_{70}$, while fib_formual() starts to have precision errors as the number gets larger. So, be mindful with precision issues when doing numerical computing. Here's a nice article on this topic http://www.codeproject.com/Articles/25294/Avoiding-Overflow-Underflow-and-Loss-of-Precision

Also notice that unlike C/C++, in Python there's technically no limit in the precision of its integer representation. In Python 2 any overflowing operation on int is automatically converted into long, and long has arbitrary precision. In Python 3 it is just int. More information on Python's arbitrary-precision integers can be found here http://stackoverflow.com/questions/9860588/maximum-value-for-long-integer