Dan,
Thank you so much! That was just what I was looking for.
This is the code I ended up with:
# (assumes "import pandas as pd" near the top of the algorithm file)
def CoarseSelectionFunction(self, coarse):
    # First narrow the original coarse universe down to a manageable level.
    # HasFundamentalData also screens out ETFs, which carry no fundamentals.
    init_select = [stock for stock in coarse if stock.HasFundamentalData
                   and float(stock.Price) >= self.MyLeastPrice
                   and float(stock.Price) <= self.MyMostPrice]
    # Second, convert the initial selection to a pandas DataFrame.
    stock_symbols = [stock.Symbol for stock in init_select]
    stock_data = [(stock.DollarVolume,) for stock in init_select]
    column_names = ['dollar_volume']
    # Use coerce_float=True to convert data objects to numbers.
    data_df = pd.DataFrame.from_records(
        stock_data,
        index=stock_symbols,
        columns=column_names,
        coerce_float=True)
    # Use pandas methods to select the assets we want.
    # First find the values of dollar_volume at our desired bounds...
    lower_percent = data_df.dollar_volume.quantile(self.LowVar)
    upper_percent = data_df.dollar_volume.quantile(self.HighVar)
    # ...then simply query using those values (@ references the locals above).
    my_universe = data_df.query(
        '(dollar_volume >= @lower_percent) & (dollar_volume <= @upper_percent)')
    # See how many securities are found in our universe.
    self.Debug("{} securities found".format(my_universe.shape[0]))
    # A list of symbols is expected as the return value.
    return my_universe.index.tolist()
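For anyone following along, the quantile/query step can be tried in isolation outside QC. This is a minimal sketch with made-up dollar-volume figures and fixed 20th/80th percentile bounds standing in for self.LowVar and self.HighVar:

```python
import pandas as pd

# Synthetic dollar-volume data standing in for the filtered coarse universe
data_df = pd.DataFrame.from_records(
    [(1_000.0,), (50_000.0,), (250_000.0,), (900_000.0,), (5_000_000.0,)],
    index=['AAA', 'BBB', 'CCC', 'DDD', 'EEE'],
    columns=['dollar_volume'],
    coerce_float=True)

# Percentile bounds (illustrative stand-ins for self.LowVar / self.HighVar)
lower_percent = data_df.dollar_volume.quantile(0.2)
upper_percent = data_df.dollar_volume.quantile(0.8)

# query() references local variables with the @ prefix
my_universe = data_df.query(
    '(dollar_volume >= @lower_percent) & (dollar_volume <= @upper_percent)')

print(my_universe.index.tolist())  # the middle three names survive
```

The same `@variable` syntax works for any local in scope, which is what lets the bounds stay as plain Python floats rather than being baked into the query string.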
I think it's slightly more efficient than your code, especially if QC has difficulties with DataFrames: instead of loading three columns of data for all ~6,000 stocks into a DataFrame, it first narrows the list down using HasFundamentalData and Price, which are easy checks to do without one, and only then builds the DataFrame from the survivors.
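To illustrate that pre-filtering idea on its own, here is a sketch using a namedtuple as a hypothetical stand-in for QC's coarse fundamental objects (the attribute names mirror the real ones, but the data and thresholds are invented):

```python
import pandas as pd
from collections import namedtuple

# Hypothetical stand-in for a QC coarse fundamental object
Stock = namedtuple('Stock', ['Symbol', 'Price', 'HasFundamentalData', 'DollarVolume'])

coarse = [
    Stock('AAA', 6.0, True, 1_000.0),
    Stock('BBB', 10.0, True, 50_000.0),
    Stock('CCC', 500.0, True, 250_000.0),   # price above the upper bound
    Stock('DDD', 15.0, False, 900_000.0),   # no fundamentals (e.g. an ETF)
]

least, most = 5.0, 100.0  # illustrative price bounds

# Cheap plain-Python pre-filter: no DataFrame needed for these checks
init_select = [s for s in coarse
               if s.HasFundamentalData
               and least <= float(s.Price) <= most]

# Only the survivors are loaded into pandas for the dollar-volume work
data_df = pd.DataFrame.from_records(
    [(s.DollarVolume,) for s in init_select],
    index=[s.Symbol for s in init_select],
    columns=['dollar_volume'])

print(data_df.index.tolist())  # only AAA and BBB pass the pre-filter
```

The DataFrame here holds two rows instead of four; at QC's scale that's the difference between building a frame over ~6,000 rows and one over a few hundred.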
It actually sped up my backtest significantly, because I was able to narrow down my coarse universe much more before getting to the fine fundamental data, although it's still running at about 4k data points per second.