LEAN: backtesting not pulling historical data from Alpaca


Hi, so I reconfigured everything to run from inside the Docker container (see my other post), which I have running on GCP in live mode using the Alpaca API. That is working fine, at least in paper mode so far. I set up Docker on my local Kubuntu machine because I figured it would be easier to load the HTML output of a Plotly graph I created to display the algo's JSON output. It is all working pretty well, except there is a large gap in the historical data starting around July 2018.

I know this was about the time I stopped working on LEAN last time because there seemed to be no way to get historical data due to the shutdown of Google and Yahoo historical sources. (btw, I'm only using daily data). So, I could check the Data folder, and will, but I'm pretty sure this is the last date of the default included data set for SPY. 

So I've checked config.json for errors, tried several permutations of 'backtesting' and 'live-alpaca' with 'live-mode' true or false, and added "data-queue-handler": "AlpacaBrokerage" to the backtesting environment, with no luck. 'live-alpaca' works fine and 'backtesting' runs with no errors, but no historical data is added to fill this gap.

I'm going to try messing with the "history-provider": "QuantConnect.Lean.Engine.HistoricalData.SubscriptionDataReaderHistoryProvider" under 'backtesting', but shouldn't this work out of the box? Or am I setting up the config file incorrectly? Or is this another bug with the new Alpaca implementation? Any assistance appreciated, thanks.

PS, is there another historical data provider I should be pulling from? I have my QC API info added. Or maybe the Alpaca history has constraints? I thought they provided a year or two of daily history, which should show up since my timeframe is from 2010 to yesterday.

 

0

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by QuantConnect. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. QuantConnect makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances. All investments involve risk, including loss of principal. You should consult with an investment professional before making any investment decisions.


I'm starting to think there isn't a "history-provider" implemented for Alpaca. I vaguely remember IB pulling in historical data. I guess that should be opened as an issue on GitHub.

Also, I feel like config.json should be clearer about this, as there is a section for "// live data configuration" but none for a historical data provider. I know it has been difficult to find reliable sources, but didn't the IB historical data work? And was that set as the default? Or does the "history-provider" try whichever services have non-empty config information like API keys?

0

Okay, I remember now. Pulling data from QC isn't coded into LEAN for some reason, so you have to download it manually, even if you have the QC API keys set. I understand this may be for bandwidth cost reasons, but a warning message would be really helpful here. Just something saying 'full data is not available for the dates set for algorithm run'.
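Until something like that exists, a small pre-flight check can produce the same kind of warning. This is only a minimal sketch: the helper names `last_bar_date` and `coverage_warning` are made up for illustration and are not part of LEAN, and it assumes the zip holds a single CSV whose first column starts with a YYYYMMDD date, like the daily files discussed in this thread.

```python
import io
import zipfile
from datetime import date, datetime


def last_bar_date(zip_path):
    """Return the date of the last non-empty row in a daily data zip.

    Assumes (not verified against LEAN itself) that the first member of
    the zip is the CSV and that each row starts with 'YYYYMMDD ...'.
    """
    with zipfile.ZipFile(zip_path) as zf:
        name = zf.namelist()[0]
        with zf.open(name) as fh:
            rows = [ln for ln in io.TextIOWrapper(fh, "utf-8") if ln.strip()]
    stamp = rows[-1].split(",")[0]
    return datetime.strptime(stamp[:8], "%Y%m%d").date()


def coverage_warning(zip_path, backtest_end):
    """Return a warning string if the data stops before backtest_end, else None."""
    last = last_bar_date(zip_path)
    if last < backtest_end:
        return (f"Data in {zip_path} ends on {last}, "
                f"but the backtest end date is {backtest_end}.")
    return None
```

For example, calling `coverage_warning('/root/Lean/Data/equity/usa/daily/spy.zip', date.today())` before launching a backtest would flag a stale SPY file.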

There still may be a bug with the Alpaca historical data ingestion code.

My end date is actually the last time I manually downloaded SPY. Here is a snippet that will do that:

wget -O /root/Lean/Data/equity/usa/daily/spy.zip "https://www.quantconnect.com/data/tree/equity/usa/daily/s*/spy.zip"


The QC Data Explorer is really awesome, no doubt. But the search box defaults to minute data, and after clicking on daily, it only works again once you drill down to the first-letter directory of the ticker.

quantconnect.com/data/tree/equity/usa/daily/s*

 

0

wget -O /root/Lean/Data/equity/usa/daily/spy.zip "https://www.quantconnect.com/data/tree/equity/usa/daily/s*/spy.zip"

Also, comment out line 196 in /Queues/JobQueue.cs so your algo does not stall waiting for a keypress and runs to completion, which helps with further scripting.

// System.Console.Read();

 

0

Nope, couldn't be that easy. That wget method does not work. But you can pull from Alpaca:

alpaca.markets/docs/api-documentation/how-to/market-data/

alpaca.markets/docs/api-documentation/api-v2/market-data/bars/

polygon.io/docs/#get_v2_ticks_stocks_trades__ticker___date__anchor

import alpaca_trade_api as tradeapi
import pandas as pd

# Replace 'xx' with your Alpaca key ID and secret key
api = tradeapi.REST('xx', 'xx', 'https://paper-api.alpaca.markets')
df = api.get_barset('SPY', 'day', start=pd.Timestamp('2008-01-02', tz='America/New_York').isoformat(), end=pd.to_datetime('now').isoformat()).df

# Daily timestamps are written as 'YYYYMMDD 00:00'
df.index = pd.to_datetime(df.index).strftime('%Y%m%d 00:00')
print(df.head())
print(df.tail())
# Lowercase file name so it matches the zip command below
df.to_csv('/qc/spy.csv', header=None)

# Then, from a shell:
zip /qc/spy.zip /qc/spy.csv
cp /qc/spy.zip /root/Lean/Data/equity/usa/daily/spy.zip

This could be added to the Custom Data guide:

quantconnect.com/docs/algorithm-reference/importing-custom-data#Importing-Custom-Data

0

Better: extract the old data, format the new bars correctly, append, and zip:

 

import alpaca_trade_api as tradeapi
import pandas as pd
import zipfile

# Extract the existing archive (the paths below assume it unpacks to /qc/spy.csv)
with zipfile.ZipFile('/root/Lean/Data/equity/usa/daily/spy.zip', 'r') as zip_ref:
    zip_ref.extractall('/')
wait = input("PRESS ENTER TO CONTINUE.")
data = pd.read_csv("/qc/spy.csv", header=None)
lastDate = data.iloc[-1][0]  # timestamp of the last bar already on disk
print(lastDate)

api = tradeapi.REST('xx', 'xx', 'https://paper-api.alpaca.markets')
# Full history instead of an incremental update:
#df = api.get_barset('SPY', 'day', start=pd.Timestamp('2008-01-02', tz='America/New_York').isoformat(), end=pd.to_datetime('now').isoformat()).df
df = api.get_barset('SPY', 'day', start=pd.Timestamp(lastDate, tz='America/New_York').isoformat(), end=pd.to_datetime('now').isoformat()).df

df.index = pd.to_datetime(df.index).strftime('%Y%m%d 00:00')
# The request starts at lastDate, so drop any bar that is already in the file
df = df[df.index > lastDate]
# Scale open, high, low, close by 10000; round first so float error does not truncate a price
df.iloc[:,0] = (df.iloc[:,0]*10000).round().astype(int)
df.iloc[:,1] = (df.iloc[:,1]*10000).round().astype(int)
df.iloc[:,2] = (df.iloc[:,2]*10000).round().astype(int)
df.iloc[:,3] = (df.iloc[:,3]*10000).round().astype(int)

print(df.head())
print(df.tail())
wait = input("PRESS ENTER TO CONTINUE.")
# Append the new rows to the extracted CSV
df.to_csv('/qc/spy.csv', header=None, mode='a')
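To make the target row layout explicit, here is a minimal sketch of the formatting done by the script above, written without pandas. The function name `to_lean_row` is hypothetical, and the layout (a 'YYYYMMDD 00:00' timestamp, then open, high, low, close scaled by 10000, then volume) is taken from the scripts in this thread rather than checked against the LEAN source.

```python
from datetime import date


def to_lean_row(day, open_, high, low, close, volume):
    """Format one daily bar as a CSV row in the layout used in this thread.

    Prices are multiplied by 10000 and rounded to the nearest integer so
    that floating-point error cannot shave a unit off a price.
    """
    prices = [int(round(p * 10000)) for p in (open_, high, low, close)]
    fields = [day.strftime("%Y%m%d") + " 00:00"] + prices + [int(volume)]
    return ",".join(str(f) for f in fields)


if __name__ == "__main__":
    # Illustrative numbers only, not real market data
    print(to_lean_row(date(2018, 7, 2), 270.63, 272.04, 269.24, 271.86, 63554800))
```

Rounding before the integer conversion is a deliberate choice here: plain truncation can turn a product that lands just below a whole number into a value that is one unit too low.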

 

 

0

Gah, and change "zip /qc/spy.zip /qc/spy.csv" to "cd /qc ; zip spy.zip spy.csv" or whatever your dir is. zip stores the paths you give it, so the subdirs end up inside your zipfile.
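If you would rather stay in Python for this step, the standard zipfile module can write the CSV without any directory prefix. A minimal sketch, with `zip_flat` being a made-up helper name and the paths just the ones used in this thread:

```python
import os
import zipfile


def zip_flat(csv_path, zip_path):
    """Write csv_path into zip_path under its bare file name (no subdirs)."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # arcname controls the name stored inside the archive
        zf.write(csv_path, arcname=os.path.basename(csv_path))
    return zip_path
```

For example, `zip_flat('/qc/spy.csv', '/qc/spy.zip')` stores the file as plain spy.csv inside the archive.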

0

And set all the timestamps to noon on the daily data if it is going to be used as benchmark data under ../usa/hour/spy.zip
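A minimal sketch of that timestamp change, assuming rows that start with a 'YYYYMMDD HH:MM' timestamp as in the scripts earlier in this thread. The function name `to_noon` is hypothetical, and whether LEAN needs anything beyond the noon timestamp for the hour folder is not something this sketch verifies.

```python
def to_noon(row):
    """Rewrite the leading 'YYYYMMDD HH:MM' timestamp of a CSV row to 12:00."""
    stamp, sep, rest = row.partition(",")
    return stamp[:8] + " 12:00" + sep + rest


def convert_lines(lines):
    """Apply to_noon to every non-empty line, preserving order."""
    return [to_noon(ln.rstrip("\n")) for ln in lines if ln.strip()]
```

Reading the daily CSV with `open(path).readlines()` and passing the result through `convert_lines` gives the rows to write into the hour file.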

0

Hey John Smith Mitoch did you ever figure this out? Any insight you can provide on using brokerage provided data for backtesting would be much appreciated. Thanks mate!

0

Seems like you got it John =) 

0
