Machine Learning
Stable Baselines
Get Historical Data
Get some historical market data to train and test the model. For example, to get data for the SPY ETF during 2020 and 2021, run:
qb = QuantBook()
symbol = qb.AddEquity("SPY", Resolution.Daily).Symbol
history = qb.History(symbol, datetime(2020, 1, 1), datetime(2022, 1, 1)).loc[symbol]
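The rest of this example also uses numpy, gym, matplotlib, and the Stable Baselines DQN implementation. If these names are not already provided by your research environment, a minimal set of imports, assuming the classic stable-baselines (TensorFlow-based) package is installed, might look like this:

import gym
import numpy as np
from datetime import datetime
from matplotlib import pyplot as plt

# Assumes the original stable-baselines package (not stable-baselines3);
# the DQN MlpPolicy lives in the deepq submodule there.
from stable_baselines import DQN
from stable_baselines.deepq.policies import MlpPolicy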
Prepare Data
You need some historical data to prepare the data for the model. If you have historical data, manipulate it to train and test the model. In this example, calculate the log return time series of the security:
ret = np.log(history/history.shift(1)).iloc[1:].close
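As a quick sanity check (not part of the original steps), you can confirm that the return series has one fewer row than the price history and contains no missing values:

# The return series should be one row shorter than the price history
# and should contain no NaN values after dropping the first row.
print(history.shape, ret.shape)
print(ret.isna().sum())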
Train Models
You need to prepare the historical data for training before you train the model. If you have prepared the data, build and train the environment and the model. In this example, create a gym environment to initialize the training environment, agent, and reward. Then, create an RL model with the DQN algorithm. Follow these steps to create the environment and the model:
- Split the data for training and testing to evaluate the model.
- Create a custom gym environment class.
- Initialize the environment.
- Train the model.
X_train = history.iloc[:-50].values
X_test = history.iloc[-50:].values
y_train = ret.iloc[:-50].values
y_test = ret.iloc[-50:].values
In this example, create a custom environment that uses the previous 5 OHLCV data points as the observation and the running portfolio value as the reward.
class TradingEnv(gym.Env):
    metadata = {'render.modes': ['console']}
    FLAT = 0
    LONG = 1
    SHORT = 2

    def __init__(self, ohlcv, ret):
        super(TradingEnv, self).__init__()

        self.ohlcv = ohlcv
        self.ret = ret
        self.trading_cost = 0.01
        self.reward = 1
        # The number of steps the training has taken, starts at 5 since we're using the previous 5 data points as the observation.
        self.current_step = 5
        # The last action
        self.last_action = 0

        # Define action and observation space
        # Example when using discrete actions, we have 3: LONG, SHORT and FLAT.
        n_actions = 3
        self.action_space = gym.spaces.Discrete(n_actions)
        # The observation is the previous 5 OHLCV data points, shape (5 previous data points, OHLCV)
        self.observation_space = gym.spaces.Box(low=-np.inf, high=np.inf, shape=(5, 5), dtype=np.float64)

    def reset(self):
        # Reset the number of steps the training has taken
        self.current_step = 5
        # Reset the last action
        self.last_action = 0
        # Reset the reward to the initial portfolio value
        self.reward = 1
        # must return np.array type
        return self.ohlcv[self.current_step-5:self.current_step].astype(np.float32)

    def step(self, action):
        if action == self.LONG:
            self.reward *= 1 + self.ret[self.current_step] - (self.trading_cost if self.last_action != action else 0)
        elif action == self.SHORT:
            self.reward *= 1 + -1 * self.ret[self.current_step] - (self.trading_cost if self.last_action != action else 0)
        elif action == self.FLAT:
            self.reward *= 1 - (self.trading_cost if self.last_action != action else 0)
        else:
            raise ValueError("Received invalid action={} which is not part of the action space".format(action))

        self.last_action = action
        self.current_step += 1

        # Have we iterated through all the data points?
        done = (self.current_step == self.ret.shape[0]-1)

        # Reward as return
        return self.ohlcv[self.current_step-5:self.current_step].astype(np.float32), self.reward, done, {}

    def render(self, mode='console'):
        if mode != 'console':
            raise NotImplementedError()
        print(f'Equity Value: {self.reward}')
env = TradingEnv(X_train, y_train)
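Before training, you can optionally step the custom environment with a few random actions to confirm that the observations, reward, and done flag behave as expected. This is a minimal sketch, not part of the original steps:

# Step the environment with random actions to verify the observation shape,
# the running reward, and the done flag.
obs = env.reset()
for _ in range(3):
    action = env.action_space.sample()
    obs, reward, done, info = env.step(action)
    print(obs.shape, reward, done)

# Reset again so training starts from the beginning of the data.
obs = env.reset()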
In this example, create an RL model and train it with the DQN algorithm and an MLP policy.
model = DQN(MlpPolicy, env, verbose=1)
model.learn(total_timesteps=1000)
Test Models
You need to build and train the model before you test its performance. If you have trained the model, test it on the out-of-sample data. Follow these steps to test the model:
- Initialize a list to store the equity value at each timestep (starting with the initial capital), and variables to store the last action and the trading cost.
- Iterate each testing data point for prediction and trading.
- Plot the result.
equity = [1]
last_action = 0
trading_cost = 0.01
for i in range(5, X_test.shape[0]):
    action, _ = model.predict(X_test[i-5:i], deterministic=True)

    if action == 0:      # FLAT
        new = equity[-1] * (1 - (trading_cost if last_action != action else 0))
    elif action == 1:    # LONG
        new = equity[-1] * (1 + y_test[i] - (trading_cost if last_action != action else 0))
    elif action == 2:    # SHORT
        new = equity[-1] * (1 + -1 * y_test[i] - (trading_cost if last_action != action else 0))

    equity.append(new)
    last_action = action
plt.figure(figsize=(15, 10))
plt.title("Equity Curve")
plt.xlabel("timestep")
plt.ylabel("equity")
plt.plot(equity)
plt.show()
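Optionally, you can also report the total out-of-sample return implied by the final equity value; this assumes the initial capital of 1 used above:

# Final equity relative to the initial capital of 1 gives the total return
# over the out-of-sample period.
print(f"Out-of-sample total return: {equity[-1] - 1:.2%}")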

Store Models
You can save and load Stable Baselines models using the Object Store.
Save Models
- Set the key name of the model to be stored in the Object Store.
- Call the GetFilePath method with the key.
- Call the save method with the file path.
model_key = "model"
file_name = qb.ObjectStore.GetFilePath(model_key)
This method returns the file path where the model will be stored.
model.save(file_name)
Load Models
You must save a model into the Object Store before you can load it from the Object Store. If you saved a model, follow these steps to load it:
- Call the ContainsKey method with the key.
- Call the GetFilePath method with the key.
- Call the load method with the file path, environment, and policy.
qb.ObjectStore.ContainsKey(model_key)
This method returns a boolean that indicates whether the model_key is in the Object Store. If the Object Store does not contain the model_key, save the model using the model_key before you proceed.
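For example, a minimal guard, assuming the model and model_key from the previous steps, might look like this:

# Save the model under model_key if the Object Store does not already contain it.
if not qb.ObjectStore.ContainsKey(model_key):
    model.save(qb.ObjectStore.GetFilePath(model_key))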
file_name = qb.ObjectStore.GetFilePath(model_key)
This method returns the path where the model is stored.
loaded_model = DQN.load(file_name, env=env, policy=MlpPolicy)
This method returns the saved model.
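As a quick check, assuming the X_test array from the earlier testing steps is still available, you can confirm the loaded model produces predictions just like the original model:

# Predict an action from the last 5 rows of the out-of-sample data
# with the re-loaded model.
action, _ = loaded_model.predict(X_test[-5:], deterministic=True)
print(action)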