# Introduction to Financial Python

## Rate of Return, Mean and Variance

### Introduction

In this chapter we introduce some basic concepts in quantitative finance, starting with rate of return, mean and variance. These values may look simple to calculate, but there are a number of different methods for each of them, and it's important to choose the appropriate method case by case.

### Rate of Return

#### Single-period Return

The single-period rate of return is calculated as follows:

\[r = \frac{p_t}{p_0} - 1 = \frac{p_t - p_0}{p_0}\]

where \(r\) is the rate of return, \(p_t\) is the asset price at time \(t\), and \(p_0\) is the asset price at time 0.

    import numpy as np

    # single-period return for a price move from 100 to 102
    rate_return = 102.0 / 100 - 1
    print(round(rate_return, 4))
    [out]: 0.02

Let's say we bought a stock at $100, and half a year later it has grown to $102. A year after the purchase the price reaches $104. How do we calculate our total return? We can treat it either as a single period:

\[r = \frac{104}{100} - 1 = 0.04\]

or as two consecutive periods:

\[r = (1+r_1)(1+r_2) - 1 = \frac{102}{100} \times \frac{104}{102} - 1 = 0.04\]

Here we calculate the return twice a year, which is called semi-annual compounding. How about quarterly compounding? Let's assume the stock prices at the end of each quarter are \(p_1, p_2, p_3, p_4\) respectively.

\[r = (1+r_1)(1+r_2)(1+r_3)(1+r_4) - 1\]

where \(r_i = \frac{p_i}{p_{i-1}} - 1\) is the return in quarter \(i\). The rate of return we calculate here is called **cumulative return** or **overall return**. It measures the total return of this asset over a period of time.
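As a minimal sketch of the chaining above, the snippet below derives each period's return from a list of prices and compounds them. The quarterly prices and the helper name `cumulative_return` are made up for illustration.

```python
def cumulative_return(prices):
    """Compound the single-period returns implied by a list of prices."""
    growth = 1.0
    for prev, curr in zip(prices[:-1], prices[1:]):
        period_return = curr / prev - 1   # single-period return r_i
        growth *= 1 + period_return       # chain (1 + r_1)(1 + r_2)...
    return growth - 1


# Hypothetical prices: purchase price followed by four quarter-end prices
prices = [100.0, 101.0, 102.0, 103.0, 104.0]
total = cumulative_return(prices)
print(round(total, 4))  # 0.04, the same as 104/100 - 1
```

However many intermediate prices we include, the intermediate terms cancel and the result depends only on the first and last price.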

Now consider the following situation: we have two strategies, A and B. We backtested strategy A for 1 year and its cumulative return is 20%, while we backtested strategy B for 3 years and its cumulative return is 65%. Which strategy has the higher rate of return? The common approach is to convert all returns into a **compounding annual return**, regardless of the investment horizon of each strategy, so that returns over different time horizons can be compared:

\[r_A = (1 + 0.20)^{1/1} - 1 = 20\%\]

\[r_B = (1 + 0.65)^{1/3} - 1 \approx 18.167\%\]

Strategy A has a higher compounding annual return!
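The comparison can be reproduced with a short sketch. The helper name `annualized_return` is hypothetical, and the 3-year horizon for strategy B is taken from the backtesting period mentioned in the text.

```python
def annualized_return(cumulative_return, years):
    """Convert a cumulative return over `years` into a compounding annual return."""
    return (1 + cumulative_return) ** (1.0 / years) - 1


strategy_a = annualized_return(0.20, 1)  # 20% over 1 year
strategy_b = annualized_return(0.65, 3)  # 65% over 3 years

print(round(strategy_a, 5))  # 0.2
print(round(strategy_b, 5))  # 0.18167
```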

#### Logarithm Return

We introduced compounding annual return above, which is a kind of **effective rate of return**. You can regard it as a 'hypothetical return'. Strategy B might never have had an 18.167% rate of return in any single year during the 3-year backtesting period. However, if we assume that the strategy earns 18.167% every year, it has the same cumulative return over the 3 years. As we mentioned previously, if we assume a strategy compounds quarterly, the relation between the quarterly effective rate of return \(r_{quarterly}\) and the annual return \(r_{annual}\) is:

\[(1 + r_{quarterly})^4 = 1 + r_{annual}\]

More generally, if the number of compounding periods in one year is \(n\) and the stated annual rate is \(r\), each period earns \(\frac{r}{n}\), and the resulting annual return \(r_{annual}\) is given by:

\[(1+\frac{r}{n})^n = 1+r_{annual}\]
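To get a feel for this relation, here is a small sketch that evaluates the left-hand side for several compounding frequencies. The 10% rate and the function name `effective_annual_rate` are assumptions chosen for illustration.

```python
def effective_annual_rate(stated_rate, n):
    """Annual return when `stated_rate` is split into n compounding periods."""
    return (1 + stated_rate / n) ** n - 1


stated_rate = 0.10  # hypothetical 10% stated annual rate
for n in [1, 2, 4, 12, 365]:
    # annual, semi-annual, quarterly, monthly, daily compounding
    print(n, round(effective_annual_rate(stated_rate, n), 6))
```

The annual return rises as \(n\) increases, but each increase is smaller than the last.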
Now imagine the stock markets. The prices of your assets are changing every second, or even every millisecond. If the number of compounding periods \(n\) approaches infinity, we call it **continuous compounding**. The formula is given by the limit:

\[\lim_{n\to\infty}(1+\frac{r}{n})^n = e^r\]

From the limit above, if we assume continuous compounding while the price moves from \(p_0\) to \(p_t\), the growth factor \(e^r\) equals the price ratio, which is one plus the single-period return defined earlier:

\[e^r = \frac{p_t}{p_0}\]

Then we take the natural logarithm of both sides of the equation:

\[r = \ln\frac{p_t}{p_0} = \ln p_t - \ln p_0\]
This gives us the **logarithmic return**, or **continuously compounded return**. It is frequently used when calculating returns, because once we take the logarithm of asset prices, we can calculate the logarithmic return with a simple subtraction. Here we use Apple stock prices as an example:

    import numpy as np
    import quandl

    # replace the placeholder with your own Quandl API key
    quandl.ApiConfig.api_key = 'YOUR_API_KEY'

    # get quandl data
    aapl_table = quandl.get('WIKI/AAPL')
    # keep March 2017 only; copy so new columns are not added to a slice
    aapl = aapl_table.loc['2017-3', ['Open', 'Close']].copy()

    # take log return
    aapl['log_price'] = np.log(aapl.Close)
    aapl['log_return'] = aapl.log_price.diff()
    print(aapl)

The output is:

    Date           Open   Close  log_price  log_return
    2017-03-01  137.890  139.79   4.940141         NaN
    2017-03-02  140.000  138.96   4.934186   -0.005955
    2017-03-03  138.780  139.78   4.940070    0.005884
    2017-03-06  139.365  139.34   4.936917   -0.003153
    2017-03-07  139.060  139.52   4.938208    0.001291
    2017-03-08  138.950  139.00   4.934474   -0.003734
    2017-03-09  138.740  138.68   4.932169   -0.002305
    2017-03-10  139.250  139.14   4.935481    0.003311
    2017-03-13  138.850  139.20   4.935912    0.000431
    2017-03-14  139.300  138.99   4.934402   -0.001510
    2017-03-15  139.410  140.46   4.944923    0.010521
    2017-03-16  140.720  140.69   4.946559    0.001636
    2017-03-17  141.000  139.99   4.941571   -0.004988
    2017-03-20  140.400  141.46   4.952017    0.010446
    2017-03-21  142.110  139.84   4.940499   -0.011518
    2017-03-22  139.845  141.42   4.951734    0.011235
    2017-03-23  141.260  140.92   4.948192   -0.003542
    2017-03-24  141.500  140.64   4.946203   -0.001989
    2017-03-27  139.390  140.88   4.947908    0.001705
    2017-03-28  140.910  143.80   4.968423    0.020515
    2017-03-29  143.680  144.12   4.970646    0.002223
    2017-03-30  144.190  143.93   4.969327   -0.001319
    2017-03-31  143.720  143.66   4.967449   -0.001878

Here we calculated the daily logarithmic return of Apple stock. Given the daily logarithmic returns in this month, we can calculate the monthly return by simply summing all the daily returns.

    # pandas skips the NaN in the first row when summing
    month_return = aapl.log_return.sum()
    print(month_return)
    [out]: 0.0273081001636

It may sound incorrect to sum up the daily returns, but we can prove that it's mathematically correct. Let's assume the stock prices over a period of time are represented by \([p_0, p_1, p_2, p_3, \ldots, p_n]\). Then the cumulative logarithmic return is given by:

\[r = \ln\frac{p_n}{p_0} = \ln\frac{p_n}{p_{n-1}} + \ln\frac{p_{n-1}}{p_{n-2}} + \cdots + \ln\frac{p_1}{p_0}\]

According to the equation above, we can simply sum up the logarithmic returns in a period to get the cumulative logarithmic return. The convenience of this method is also one of the reasons why we use logarithmic return in quantitative finance.
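The identity can also be checked numerically. The sketch below uses a made-up price series and only the standard library, so it runs without the Quandl data; it also converts the summed logarithmic return back into a simple return.

```python
import math

# Hypothetical closing prices p_0 ... p_4
prices = [100.0, 102.0, 101.0, 105.0, 104.0]

# ln(p_i / p_{i-1}) for each consecutive pair of prices
log_returns = [math.log(curr / prev) for prev, curr in zip(prices[:-1], prices[1:])]

cumulative_log_return = sum(log_returns)              # sum of per-period log returns
direct_log_return = math.log(prices[-1] / prices[0])  # ln(p_n / p_0)
print(math.isclose(cumulative_log_return, direct_log_return))  # True

# Convert the cumulative log return back to a simple return
simple_return = math.exp(cumulative_log_return) - 1
print(round(simple_return, 4))  # 0.04
```

Note that the summed value is a logarithmic return; exponentiating and subtracting 1 recovers the simple cumulative return \(\frac{p_n}{p_0} - 1\).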

### Mean

#### Arithmetic Mean

Mean is a measure of the central tendency of a data series. It captures a key characteristic of the distribution of the data series. When we talk about mean, by default it refers to **arithmetic mean**. It's defined as the sum of the values divided by the number of observations:

\[\mu = \frac{\sum_{i=1}^{n} x_i}{n}\]

where \((x_1, x_2, x_3, \ldots, x_n)\) is our data series.

In Python we can use numpy.mean() to do the calculation:

    print(np.mean(aapl.log_price))
    [out]: 4.94597446551

#### Geometric Mean

The geometric mean is an average that is useful for data series of positive numbers that are better interpreted according to their product, such as growth rates. It's calculated by:

\[\bar{x} = \sqrt[n]{x_1 x_2 x_3 \cdots x_n}\]

Let's calculate the geometric mean of a series of single-period growth factors \(1 + r_i = \frac{p_i}{p_{i-1}}\):

\[1+\bar{r} = \sqrt[n]{\frac{p_n}{p_{n-1}} \times \frac{p_{n-1}}{p_{n-2}} \times \cdots \times \frac{p_1}{p_0}}\]

\[1+\bar{r} = \sqrt[n]{\frac{p_n}{p_0}}\]

Now the equation takes a form we are familiar with:

\[(1+\bar{r})^n = \frac{p_n}{p_0}\]

This is why we said it makes sense when applied to growth rates.
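A short sketch makes the difference from the arithmetic mean concrete. The three returns below are made up for illustration.

```python
returns = [0.10, -0.05, 0.20]  # hypothetical single-period returns

growth = 1.0
for r in returns:
    growth *= 1 + r  # product of growth factors, i.e. p_n / p_0

n = len(returns)
geometric_mean = growth ** (1.0 / n) - 1  # solves (1 + r_bar)^n = p_n / p_0
arithmetic_mean = sum(returns) / n

print(round(growth - 1, 4))       # 0.254  (cumulative return)
print(round(arithmetic_mean, 4))  # 0.0833
print(round(geometric_mean, 4))   # 0.0784
```

Compounding the geometric mean over the three periods reproduces the cumulative return of 25.4%, whereas compounding the arithmetic mean would overstate it.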

### Variance and Standard Deviation

#### Variance

**Variance** is a measure of dispersion. In finance, most of the time variance is a synonym for risk. The higher the variance of an asset price is, the higher the risk the asset bears. Variance is usually represented by \(\sigma^2\), and it's calculated by

\[\sigma^2 = \frac{\sum_{i=1}^{n}(x_i - \mu)^2}{n}\]

where \(\mu\) is the arithmetic mean of the data series.

In Python we can use numpy.var() to calculate it:

    print(np.var(aapl.log_price))
    [out]: 0.000142032804482

#### Standard Deviation

The most commonly used measure of dispersion in finance is **standard deviation**. It's usually represented by \(\sigma\). The standard deviation is simply the square root of the variance:

\[\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \mu)^2}{n}}\]

NumPy also provides a function to calculate standard deviation:

    print(np.std(aapl.log_price))
    [out]: 0.0119 (approximately)

This is the square root of the variance of the same series, 0.000142032804482.
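To see what these functions compute, here is a minimal sketch that applies the formulas by hand to a small made-up data series, using only the standard library. It assumes the divide-by-\(n\) (population) convention, which is also the default of numpy.var() and numpy.std(); a sample estimate would divide by \(n - 1\) instead.

```python
import math
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical data series

mean = sum(data) / len(data)
variance = sum((x - mean) ** 2 for x in data) / len(data)  # divide by n
std_dev = math.sqrt(variance)                              # sigma = sqrt(sigma^2)

print(mean, variance, std_dev)  # 5.0 4.0 2.0

# The standard library's population formulas give the same values
assert math.isclose(variance, statistics.pvariance(data))
assert math.isclose(std_dev, statistics.pstdev(data))
```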

### Summary

In this chapter we introduced different types of rate of return, which can be a little tricky to calculate. Mean and standard deviation are also very important concepts when we conduct hypothesis tests or measure the risk associated with an asset. We will use these concepts intensively in later chapters.
