Computing the hedge ratio by regression and checking spread stationarity with the adfuller test

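This post covers the code side of pairs trading: computing the spread and testing it for stationarity with the ADF test.
keywords: # hedge_ratio # spread # adfuller test
For the underlying theory, see this article.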

Imports

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from statsmodels.tsa.stattools import adfuller
import matplotlib.pyplot as plt

%matplotlib inline
plt.rc('figure', figsize=(16, 9))

Generating random data

np.random.seed(2020)
# use returns to create a price series
drift = 100
r1 = np.random.normal(0, 1, 1000) 
s1 = pd.Series(np.cumsum(r1), name='s1') + drift
s1.plot(figsize=(14,6))
plt.show()

offset = 10
noise = np.random.normal(0, .95, 1000)
s2 = s1 + offset + noise
s2.name = 's2'
pd.concat([s1, s2], axis=1).plot(figsize=(15,6))
plt.show()

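Computing the ratios

In fact, both the price ratio and the hedge ratio computed here serve the later steps. There is a small difference between them:

price_ratio is computed from the prices at the current moment only.
hedge_ratio takes prices over a longer horizon into account: it is estimated from a window of prices. It is more involved to compute and, in theory, works better; it can loosely be thought of as an upgraded price_ratio.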
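
To make the difference concrete, here is a minimal sketch on hypothetical toy prices (the numbers are made up for illustration and are not the article's data). It uses np.polyfit instead of sklearn's LinearRegression purely to keep the example dependency-free; for a single regressor both fit the same least-squares line.

```python
import numpy as np

# Hypothetical toy prices: s2 is exactly 2 * s1 + 10 (values chosen for illustration).
s1 = np.array([100.0, 105.0, 110.0, 120.0, 125.0])
s2 = 2.0 * s1 + 10.0

# Price ratio: one value per day, computed from that day's prices only.
price_ratio = s2 / s1

# Hedge ratio: a single slope fitted over the whole window.
# np.polyfit with degree 1 returns [slope, intercept].
hedge_ratio, intercept = np.polyfit(s1, s2, 1)

print(price_ratio)  # changes from day to day because of the constant offset
print(hedge_ratio)  # close to 2.0
print(intercept)    # close to 10.0
```

The price ratio absorbs the constant offset and therefore moves whenever the price level moves, while the regression separates the slope from the offset.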

price_ratio: the price ratio

price_ratio = s2/s1
price_ratio.plot(figsize=(15,7)) 
plt.axhline(price_ratio.mean(), color='black') 
plt.xlabel('Days')
plt.legend(['s2/s1 price ratio', 'average price ratio'])
plt.show()
print(f"average price ratio {price_ratio.mean():.4f}")

Computing the hedge ratio by regression

lr = LinearRegression()
lr.fit(s1.values.reshape(-1,1),s2.values.reshape(-1,1))
hedge_ratio = lr.coef_[0][0]
print(hedge_ratio)
intercept = lr.intercept_[0]
print(intercept)
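
For a single regressor, the ordinary least squares slope also has a closed form: the covariance of the two series divided by the variance of the regressor. The sketch below checks this on hypothetical data generated the same way as above (the seed and the names x, y, slope, const are assumptions of this example), using np.polyfit as the reference fit rather than sklearn.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data in the spirit of the article: a random-walk price plus a noisy, shifted copy.
x = 100.0 + np.cumsum(rng.normal(0, 1, 500))
y = x + 10.0 + rng.normal(0, 0.95, 500)

# Closed-form OLS slope and intercept for one regressor.
slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
const = y.mean() - slope * x.mean()

# Reference fit with numpy's least-squares polynomial fit (degree 1).
ref_slope, ref_const = np.polyfit(x, y, 1)

print(slope, ref_slope)
print(const, ref_const)
```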

Computing the spread

spread = s2 - s1 * hedge_ratio  # note: the intercept should not be included here
spread.plot(figsize=(15,7)) 
plt.axhline(spread.mean(), color='black') 
plt.xlabel('Days')
plt.legend(['Spread: s2 - hedge_ratio * s1', 'average spread'])
plt.show()

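Note that the first line of this code does not use the intercept. Since s2 = hedge_ratio * s1 + intercept + residual, the spread s2 - hedge_ratio * s1 equals the intercept plus the regression residual, so the intercept is effectively the level that the spread fluctuates around.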
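
The relationship between the spread and the intercept can be checked numerically. This is a minimal sketch on hypothetical data (the seed and series are assumptions of this example, and np.polyfit stands in for the sklearn fit): when the regression includes an intercept, the residuals average to zero, so the mean of the spread coincides with the intercept.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical price series, built the same way as in the article.
s1 = 100.0 + np.cumsum(rng.normal(0, 1, 500))
s2 = s1 + 10.0 + rng.normal(0, 0.95, 500)

# Least-squares line s2 ~ hedge_ratio * s1 + intercept.
hedge_ratio, intercept = np.polyfit(s1, s2, 1)

spread = s2 - hedge_ratio * s1                   # intercept left out on purpose
residual = s2 - (hedge_ratio * s1 + intercept)   # what remains after removing the intercept

print(spread.mean(), intercept)  # the two values agree up to floating-point error
print(residual.mean())           # approximately zero
```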

Checking stationarity with the Augmented Dickey-Fuller test

adf_result = adfuller(spread)
pvalue = adf_result[1]  # the second element is the p-value
print(pvalue)  # pvalue <= p_level (.05): spread is stationary; pvalue > .05: not stationary

Here we wrap this check in a function that returns a boolean, so that later code can call it to make the decision.

def is_spread_stationary(spread, p_level=0.05):
    adf_result = adfuller(spread)
    pvalue = adf_result[1]
    if pvalue <= p_level:
        print(f"pvalue is <= {p_level}, assume spread is stationary")
        return True
    else:
        print(f"pvalue is > {p_level}, assume spread is not stationary")
        return False