Problem with data extraction from Quandl


Hello,

 

I couldn't run this algo because the crude oil time series is shifted by almost one year.

Can you help me fix this problem, please?

 

Thank you 


The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by QuantConnect. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. QuantConnect makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances. All investments involve risk, including loss of principal. You should consult with an investment professional before making any investment decisions.


import numpy as np
import pandas as pd
import statsmodels.api as sm
import decimal
from datetime import datetime, timedelta

class PcaStatArbitrageAlgorithm(QCAlgorithm):

    def Initialize(self):
        self.SetStartDate(2006, 1, 1)   # Set Start Date
        self.SetCash(100000)            # Set Strategy Cash

        self.nextRebalance = self.Time  # Initialize next rebalance time
        self.rebalance_days = 7         # Rebalance every 7 days

        self.lookback = 252*3           # Length (days) of historical data
        # Pandas objects (index: symbol) that store the weights
        self.weights_long, self.weights_short = pd.DataFrame(), pd.DataFrame()
        self.Portfolio.MarginModel = PatternDayTradingMarginModel()
        self.AGG = self.AddEquity("AGG", Resolution.Daily).Symbol

        self.UniverseSettings.Resolution = Resolution.Daily  # Use daily resolution for speed
        self.AddUniverse(self.CoarseSelectionAndPCA)
        self.oil = self.AddData(Oil, "oil", Resolution.Daily).Symbol


    def CoarseSelectionAndPCA(self, coarse):

        # Before the next rebalance time, keep the current universe
        if self.Time < self.nextRebalance:
            return Universe.Unchanged

        ### Simple coarse selection first

        # Sort the equities by DollarVolume in descending order
        selected = sorted([x for x in coarse if x.HasFundamentalData and x.Price > 5],
                          key=lambda x: x.DollarVolume, reverse=True)

        symbols = [x.Symbol for x in selected[:250]]

        ### After coarse selection, we run a linear regression to get our selected symbols

        # Get historical data of the selected symbols
        history = self.History(symbols, self.lookback, Resolution.Daily).close.unstack(level=0)
        #self.Debug(history.iloc[:,0])

        # Get historical data of the crude oil series (the variable keeps its old SP500 name)
        SP500_history = self.History(self.oil, self.lookback, Resolution.Daily).close.unstack(level=0)
        self.Debug(SP500_history.iloc[:2,0])

        # Select the desired symbols and their weights from the coarse-selected symbols
        self.weights_long, self.weights_short = self.GetWeights(history, SP500_history)

        # If there are no final selected symbols, return the unchanged universe
        if self.weights_long.empty or self.weights_short.empty:
            return Universe.Unchanged

        BTK = pd.concat([self.weights_long, self.weights_short])

        return [x for x in symbols if str(x) in BTK.index]


    def GetWeights(self, history, SP500_history):

        # Weekly percentage returns of the equities
        sample = history.dropna(axis=1).resample('1W').last().pct_change().dropna()
        #self.Debug(sample.iloc[-10:,1])

        # Weekly percentage returns of the crude oil series
        market_returns = SP500_history.resample('1W').last().pct_change().dropna()

        # Train an Ordinary Least Squares linear model for each stock
        OLSmodels = {ticker: sm.OLS(sample[ticker], market_returns).fit() for ticker in sample.columns}

        Betas = pd.DataFrame({ticker: model.params for ticker, model in OLSmodels.items()}).iloc[0,:]

        # Keep only the stocks whose beta is at least -0.10
        Betas = Betas[Betas >= -0.10]

        # The original called .drop(columns = self.SPY) here, but self.SPY is never
        # defined and Series.drop(columns=...) does not change a Series, so it is omitted.
        selected_long = Betas[Betas <= Betas.quantile(0.10)]

        selected_short = Betas[Betas >= Betas.quantile(0.95)]

        # Return the (equal) weights for each selected stock
        weights_long = selected_long * (1 / len(selected_long)) / selected_long

        weights_short = selected_short * (-1 / len(selected_short)) / selected_short

        return weights_long.sort_values(), weights_short.sort_values()


    def OnData(self, data):
        '''
        Rebalance every self.rebalance_days
        '''
        ### Do nothing until next rebalance
        if self.Time < self.nextRebalance:
            return

        ### Open positions
        for symbol, weight in self.weights_long.items():
            self.SetHoldings(symbol, 0.40*weight)

        #for symbol, weight in self.weights_short.items():
        #    self.SetHoldings(symbol,0)

        self.SetHoldings('AGG', 0.60)

        ### Update next rebalance time
        self.nextRebalance = self.Time + timedelta(self.rebalance_days)


    def OnSecuritiesChanged(self, changes):
        '''
        Liquidate when the symbols are not in the universe
        '''
        for security in changes.RemovedSecurities:
            if security.Invested:
                self.Liquidate(security.Symbol, 'Removed from Universe')

class Oil(PythonData):
    def GetSource(self, config, date, isLiveMode):
        return SubscriptionDataSource("https://www.quandl.com/api/v3/datasets/OPEC/ORB.csv?order=asc", SubscriptionTransportMedium.RemoteFile)

    def Reader(self, config, line, date, isLiveMode):
        oil = Oil()
        oil.Symbol = config.Symbol
        # Skip blank lines and the CSV header row
        if not (line.strip() and line[0].isdigit()): return None
        try:
            data = line.split(',')
            value = float(data[1])
            value = decimal.Decimal(value)
            if value == 0: return None
            oil.Time = datetime.strptime(data[0], "%Y-%m-%d")
            oil.Value = value
            oil["close"] = float(value)
            return oil

        except ValueError:
            return None

The code is attached.


Hi Wawes,

I am unsure what you mean by the time series being shifted by one year, as the values seem to be correct for their dates. If you could elaborate on this, that would be greatly appreciated. Furthermore, we can use the built-in support for Quandl data through the PythonQuandl class, as shown in the attached backtest.

Best,
Shile Wen




import numpy as np
import pandas as pd
import statsmodels.api as sm
from datetime import timedelta

class Oilsensibiltiy(QCAlgorithm):

    def Initialize(self):

        self.SetStartDate(2010, 12, 7)  # Set Start Date
        self.SetEndDate(2020, 10, 5)    # Set End Date

        self.SetCash(100000)            # Set Strategy Cash

        self.nextRebalance = self.Time  # Initialize next rebalance time
        self.rebalance_days = 30        # Rebalance every 30 days

        self.lookback = 252*3           # Length (days) of historical data
        self.weights_long = pd.DataFrame()  # Pandas object (index: symbol) that stores the weights
        self.Portfolio.MarginModel = PatternDayTradingMarginModel()
        self.AGG = self.AddEquity("AGG", Resolution.Daily).Symbol

        self.UniverseSettings.Resolution = Resolution.Daily  # Use daily resolution for speed
        self.AddUniverse(self.CoarseSelection)
        self.oil = self.AddData(QuandlOil, 'OPEC/ORB', Resolution.Daily).Symbol
        self.selectedequity = 250


    def CoarseSelection(self, coarse):

        # Before the next rebalance time, keep the current universe
        if self.Time < self.nextRebalance:
            return Universe.Unchanged

        ### Simple coarse selection first

        # Sort the equities by DollarVolume in descending order
        selected = sorted([x for x in coarse if x.HasFundamentalData and x.Price > 5],
                          key=lambda x: x.DollarVolume, reverse=True)

        symbols = [x.Symbol for x in selected[:self.selectedequity]]

        # Get historical data of the selected symbols
        history = self.History(symbols, self.lookback, Resolution.Daily).close.unstack(level=0)

        self.Debug(history.index[0])

        # Get the crude oil data
        crudeoil_history = self.History(QuandlOil, self.oil, self.lookback, Resolution.Daily).droplevel(level=0)

        # Keep the last entry when a date appears more than once
        crudeoil_history = crudeoil_history[~crudeoil_history.index.duplicated(keep='last')]

        self.Debug(crudeoil_history.index[0])

        self.weights_long = self.GetWeights(history, crudeoil_history)

        # If there are no final selected symbols, return the unchanged universe
        if self.weights_long.empty:
            return Universe.Unchanged

        #BTK = (self.weights_long).append(self.weights_short)

        return [x for x in symbols if str(x) in self.weights_long]  #or self.weights_short.index]


    def GetWeights(self, history, crudeoil_history):

        # Weekly percentage returns of the equities
        sample = history.dropna(axis=1).resample('1W').last().pct_change().dropna()

        # Weekly percentage returns of crude oil
        crudeoil_history = crudeoil_history.resample('1W').last().pct_change().dropna()

        # Train an Ordinary Least Squares linear model for each stock
        OLSmodels = {ticker: sm.OLS(sample[ticker], crudeoil_history).fit() for ticker in sample.columns}

        Betas = pd.DataFrame({ticker: model.params for ticker, model in OLSmodels.items()}).iloc[0,:]

        # We want stocks decorrelated from oil, so rank by absolute beta
        Betas = abs(Betas)

        selected_long = Betas[Betas <= Betas.quantile(0.10)].drop(columns = self.oil)

        #selected_short = Betas[Betas >= Betas.quantile(0.95) ].drop(columns = self.SPY)

        # Return the (equal) weights for each selected stock
        weights_long = selected_long * (1 / len(selected_long)) / selected_long

        #weights_short = selected_short* (-1 / len(selected_short))/selected_short

        return weights_long.sort_values()  #,weights_short.sort_values()


    def OnData(self, data):

        ### Do nothing until next rebalance
        if self.Time < self.nextRebalance:
            return

        ### Open positions
        for symbol, weight in self.weights_long.items():
            self.SetHoldings(symbol, 0.40*weight)

        #for symbol, weight in self.weights_short.items():
        #    self.SetHoldings(symbol,0)

        self.SetHoldings('AGG', 0.60)

        ### Update next rebalance time
        self.nextRebalance = self.Time + timedelta(self.rebalance_days)

    def OnSecuritiesChanged(self, changes):
        '''
        Liquidate when the symbols are not in the universe
        '''
        for security in changes.RemovedSecurities:
            if security.Invested:
                self.Liquidate(security.Symbol, 'Removed from Universe')

class QuandlOil(PythonQuandl):
    def __init__(self):
        self.ValueColumnName = 'Value'

Thank you for your answer, but this time there is a discrepancy between the time frames of the crude oil and the equities.

For example, with the attached code, the first crude oil price is recorded at 2007-12-07 00:00:00, while the first equity price is recorded at 2008-11-12 00:00:00.
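As a quick sanity check on those two dates, the arithmetic below measures how far back each series reaches. It is only a sketch: it assumes the debug lines were printed on the backtest start date (2010-12-07, taken from SetStartDate) and uses nothing but the dates quoted in this thread. It says nothing about why the engine returned those ranges.

```python
from datetime import datetime

# Dates quoted in the thread; the start date comes from SetStartDate(2010, 12, 7).
# Assumption: the Debug output was produced on the backtest start date.
backtest_start = datetime(2010, 12, 7)
first_oil_bar = datetime(2007, 12, 7)
first_equity_bar = datetime(2008, 11, 12)
lookback = 252 * 3  # same value as self.lookback in the algorithm

# Calendar days covered by each history request.
oil_span_days = (backtest_start - first_oil_bar).days
equity_span_days = (backtest_start - first_equity_bar).days

# Calendar days between the first bar of each series.
gap_days = (first_equity_bar - first_oil_bar).days

print("lookback requested:", lookback)
print("oil history spans (calendar days):", oil_span_days)
print("equity history spans (calendar days):", equity_span_days)
print("gap between first bars (calendar days):", gap_days)
```

With these inputs the oil history spans 1096 calendar days (three calendar years), the equity history spans 755 calendar days, and the two first bars are 341 days apart, which is consistent with the shift of almost one year reported in the first post.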


The issue is probably that the two data sets do not come from the same data provider, so their history is not handled in the same way.

You could try the following:
self.lookbackEquities = 252*3    # Length(days) of historical data for Equities
self.lookbackOil = 365*3 + 1      # Length(days) of historical data for Oil (3 years + 1 day for leap year)

Then replace self.lookback in your code with self.lookbackEquities or self.lookbackOil, as appropriate, each time History is called.


This far-from-perfect solution should work for a few years of backtesting, until leap years and/or a difference in the number of bank holidays produce the same error: "Runtime Error: ValueError : The indices for endog and exog are not aligned"
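The error quoted above appears when the two inputs passed to sm.OLS do not share the same index. One possible way to sidestep it, whatever date range each History call returns, is to keep only the dates present in both series before fitting. The sketch below uses plain pandas with made-up data; the helper name align_on_common_dates, the tickers AAA and BBB, and all the values are hypothetical and not part of the original algorithm.

```python
import pandas as pd

def align_on_common_dates(equity_returns, oil_returns):
    """Keep only the rows whose dates appear in both inputs (hypothetical helper)."""
    common = equity_returns.index.intersection(oil_returns.index)
    return equity_returns.loc[common], oil_returns.loc[common]

# Toy weekly returns with deliberately different date ranges:
# the oil series starts one week earlier than the equities.
equity_returns = pd.DataFrame(
    {"AAA": [0.01, -0.02, 0.03], "BBB": [0.00, 0.01, -0.01]},
    index=pd.to_datetime(["2020-01-12", "2020-01-19", "2020-01-26"]),
)
oil_returns = pd.Series(
    [0.05, 0.02, -0.04, 0.01],
    index=pd.to_datetime(["2020-01-05", "2020-01-12", "2020-01-19", "2020-01-26"]),
    name="oil",
)

aligned_equities, aligned_oil = align_on_common_dates(equity_returns, oil_returns)

# Both objects now share exactly the same index, so a per-ticker regression
# of aligned_equities[ticker] on aligned_oil would see aligned inputs.
print(aligned_equities.index.equals(aligned_oil.index))
```

In GetWeights, a call like this could go right after the two pct_change().dropna() lines, so that the regression only ever sees weeks for which both returns exist.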

Maybe you could handle this much better either by writing a function that handles all edge cases, or by using a reverse method: request the history for the equities, check the first date of the data received, then set self.lookbackOil to the difference between the current date and that oldest date, so that both data sources always cover the same dates.
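The reverse method can be sketched as follows. This is only an illustration: the function name oil_lookback_days is hypothetical, and it assumes, as the suggestion above does, that the oil lookback should be expressed in calendar days.

```python
from datetime import datetime

def oil_lookback_days(current_time, oldest_equity_time):
    """Calendar days between the oldest equity bar and the current time.

    Hypothetical helper: the result is meant to be used as the lookback
    for the oil history request, assuming that request counts calendar days.
    """
    if oldest_equity_time > current_time:
        raise ValueError("oldest_equity_time must not be later than current_time")
    return (current_time - oldest_equity_time).days

# Example with the dates reported earlier in the thread.
current_time = datetime(2010, 12, 7)
oldest_equity_time = datetime(2008, 11, 12)
lookback_oil = oil_lookback_days(current_time, oldest_equity_time)
print(lookback_oil)
```

Inside the universe selection function, current_time would be self.Time and oldest_equity_time would be history.index[0]. If I remember the API correctly, History also accepts explicit start and end datetimes instead of a bar count, which might be a simpler way to request the same window for both data sources.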


Hope this helps!




This discussion is closed