[Important] Universe Selection in Python Algorithms

We have improved the Universe Selection feature to address a complaint that Python quants have made. Until now, we were forced to create a C# List of Symbol objects, populate it with the Symbol of each member of the universe, and finally return the list:

def CoarseSelectionFunction(self, coarse):
    '''Take the top 50 by dollar volume using coarse'''
    # sort descending by daily dollar volume
    sortedByDollarVolume = sorted(coarse, \
        key=lambda x: x.DollarVolume, reverse=True) 

    # we need to return only the symbol objects
    list = List[Symbol]()
    for x in sortedByDollarVolume[:50]: list.Add(x.Symbol)
    return list

With the update, we can return a list which is more python friendly:

def CoarseSelectionFunction(self, coarse):
    '''Take the top 50 by dollar volume using coarse'''
    # sort descending by daily dollar volume
    sortedByDollarVolume = sorted(coarse, \
        key=lambda x: x.DollarVolume, reverse=True) 

    # we need to return only the symbol objects
    return [ x.Symbol for x in sortedByDollarVolume[:50] ]

Unfortunately, we were not able to maintain backward compatibility, so this change will "break" algorithms that rely on the former return type. The following runtime error message will be shown:

20170926 14:29:42 ERROR:: Runtime Error: Python.Runtime.ConversionException: could not convert Python result to System.Object[]

This means that pythonnet could not convert the List<Symbol> into System.Object[], which corresponds to the Python list.

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by QuantConnect. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. QuantConnect makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances. All investments involve risk, including loss of principal. You should consult with an investment professional before making any investment decisions.



Thanks for this,

If I understand correctly, even though the selection functions are returning a list of symbols, the universe change function is putting out a list of equity objects right?

I had a lot of trouble figuring out which functions deal with "equity_object"s, which ones use "equity_object.Symbol"s, and which ones use "equity_object.Symbol.Value"s

0

Thanks Lukel -- to clear that up:
1) You shouldn't ever need to use "equity_object.Symbol.Value".
2) The universe only ever deals with Symbol objects (light, low impact objects).
3) When you add a security you get the security object (once it's been through selection in the universe). Securities are heavy objects which we don't throw around.
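
To make the Symbol versus security distinction concrete, here is a minimal sketch. It uses stand-in classes rather than the real LEAN types: FakeSecurity, FakeChanges and track_changes are hypothetical names made up for illustration, and the AddedSecurities / RemovedSecurities attributes are assumed to mirror what the changes object passed to the universe change handler exposes. The idea is that the handler receives security objects and you keep only their light Symbol.

```python
from collections import namedtuple

# Stand-ins for illustration only; these are NOT the real LEAN classes.
FakeSecurity = namedtuple("FakeSecurity", ["Symbol", "Price"])
FakeChanges = namedtuple("FakeChanges", ["AddedSecurities", "RemovedSecurities"])

def track_changes(active_symbols, changes):
    """Update a set of active symbols from a changes object holding securities."""
    for security in changes.AddedSecurities:
        # The change event carries security objects; keep only the light Symbol.
        active_symbols.add(security.Symbol)
    for security in changes.RemovedSecurities:
        active_symbols.discard(security.Symbol)
    return active_symbols

changes = FakeChanges(
    AddedSecurities=[FakeSecurity("SPY", 250.0), FakeSecurity("AAPL", 155.0)],
    RemovedSecurities=[FakeSecurity("IBM", 146.0)],
)
active = track_changes({"IBM"}, changes)
```

In a real algorithm the same loop body would presumably live inside the universe change handler, with real Symbol objects in place of the strings used here.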

0



Thumbs up for this change. This definitely is more 'pythonic'. I prefer to work with pandas dataframes and use the powerful built-in pandas methods. This change now allows using the 'tolist' method to easily return the desired universe.

Here's an example.

def CoarseSelectionFunction(context, coarse):
    # First convert the 'coarse' data to a pandas dataframe.
    # (This way we can use all the powerful built-in dataframe methods)

    # Select 20 largest dollar_volume stocks with prices higher than $5.
    # Use 'has_fundamentals' to filter out ETFs and the like.
    # This function expects to return a list of symbols so use the 'tolist' method.

    data_df = to_dataframe(coarse)
    my_universe = (data_df.
        query("(price > 5) and has_fundamentals").
        nlargest(20, 'dollar_volume').
        index.tolist())

    return my_universe

The 'to_dataframe' function is simply a helper I added to make the code more readable.

import pandas as pd

def to_dataframe(data_c):
    # Helper function to make a dataframe from the coarse object.
    # Then we can use all the handy dataframe methods.
    # Set the 'coerce_float' parameter to get meaningful datatypes
    # otherwise all the fields will be un-useful generic objects.
    data = [(
        stock.Price,
        stock.Volume,
        stock.DollarVolume,
        stock.HasFundamentalData)
        for stock in data_c]

    symbols = [stock.Symbol for stock in data_c]
    labels = ['price', 'volume', 'dollar_volume', 'has_fundamentals']

    data_df = pd.DataFrame.from_records(
        data,
        index=symbols,
        columns=labels,
        coerce_float=True)

    return data_df

I'd now like to see the 'coarse' object passed as a dataframe. That would make this much more friendly.

Thanks.

2

Dan Whitnable, we have thought of that.
Universe Selection is mostly a filter and/or sort operation, and we can do it with lambda expressions.
Since community members have experienced memory and speed issues when using pandas.DataFrame, we don't want to impose it, since, as I said, lambda expressions can be used instead.
We could create a helper method that converts the current type to pandas.DataFrame, but it is probably better to leave it to users. Thank you for providing the code for others to use!
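
As a rough illustration of the lambda-based alternative, the sketch below applies the same filter as the pandas example (price above $5, has fundamental data, 20 largest by dollar volume) with plain Python. The Coarse namedtuple and the function name select_top_by_dollar_volume are stand-ins invented for this example so it can run outside the platform; only the field names are taken from the code in this thread.

```python
from collections import namedtuple

# Stand-in for a coarse record (illustration only); field names follow the thread's code.
Coarse = namedtuple("Coarse", ["Symbol", "Price", "DollarVolume", "HasFundamentalData"])

def select_top_by_dollar_volume(coarse, count=20, min_price=5):
    """Return the symbols of the `count` largest dollar-volume stocks above `min_price`."""
    # Keep only stocks with fundamental data and a price above the threshold.
    filtered = [x for x in coarse if x.HasFundamentalData and x.Price > min_price]
    # Sort descending by dollar volume.
    ranked = sorted(filtered, key=lambda x: x.DollarVolume, reverse=True)
    # Return only the symbols, as a plain Python list.
    return [x.Symbol for x in ranked[:count]]

sample = [
    Coarse("AAA", 10.0, 5e6, True),
    Coarse("BBB", 3.0, 9e6, True),    # dropped: price not above 5
    Coarse("CCC", 50.0, 8e6, False),  # dropped: no fundamental data
    Coarse("DDD", 20.0, 7e6, True),
]
top = select_top_by_dollar_volume(sample, count=2)
```

This avoids building a dataframe on every selection call, which is the memory and speed concern mentioned above.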

0



I've been having problems implementing this change into my code. Would love some help.

def CoarseSelectionFunction(self, coarse):
    # if the rebalance flag is not 1, return null list to save time.
    if self.reb != 1:
        return (List[Symbol]())

    # make universe selection once a month
    # drop stocks which have no fundamental data or have too low prices
    selected = [x for x in coarse if (x.HasFundamentalData)
                and (float(x.Price) > 5)]

    sortedByDollarVolume = sorted(coarse, \
        key=lambda x: x.DollarVolume, reverse=True)

    # we need to return only the symbol objects
    return [ x.Symbol for x in sortedByDollarVolume[:50] ]

Build Error: File: n/a Line:0 Column:0 - return [ x.Symbol for x in sortedByDollarVolume[:50] ]

Thanks

0

Jonathon Rogane, please do not return a List of Symbol: return (List[Symbol]()).
If you want to return an empty list, please use return [].
However, please note that returning an empty list will remove all the symbols from the universe. If you want to keep the current universe, you need to return the previous list. For example:

# In Initialize:
self.symbols = []

def CoarseSelectionFunction(self, coarse):
    # if the rebalance flag is not 1, return current symbols to save time.
    if self.reb != 1:
        return self.symbols

    # make universe selection once a month
    # drop stocks which have no fundamental data or have too low prices
    selected = [x for x in coarse
                if x.HasFundamentalData and float(x.Price) > 5]

    # sort the filtered stocks (not the raw coarse data) by dollar volume
    sortedByDollarVolume = sorted(selected, \
        key=lambda x: x.DollarVolume, reverse=True)

    # Save the symbols for rebalance flag not 1
    self.symbols = [ x.Symbol for x in sortedByDollarVolume[:50] ]

    # we need to return only the symbol objects
    return self.symbols

1



Thanks Alexandre Catarino for sharing the improvement. I ran into two problems in my algo and can't find the solution in the docs.

1. In your example above, the stocks are sorted by dollar volume and the top 50 stocks are returned. Is there a way to sort the stocks based on the past 20 days' average dollar volume?
2. The top 50 most liquid stocks are stored in self.symbols. Could you kindly share an example of how to get the historical close price data (for example, the past 90 days) of these 50 selected stocks and calculate their 90-day moving average before the market opens?
Thanks!

0

Please check out the EmaCrossUniverseSelectionAlgorithm example and the attached backtest.
Since you want the moving average of both dollar volume and price, you could use a helper class to keep these indicators:

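class SymbolData(object):
    def __init__(self, symbol):
        self.symbol = symbol
        self.vol = SimpleMovingAverage(20)
        self.sma = SimpleMovingAverage(90)
        self.is_ready = False

    def update(self, value):
        # Update both indicators on every call; a chained 'and' would skip
        # the second update whenever the first indicator is not ready yet
        sma_ready = self.sma.Update(value.EndTime, value.Price)
        vol_ready = self.vol.Update(value.EndTime, value.DollarVolume)
        self.is_ready = sma_ready and vol_ready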

And you need to modify the universe selection function accordingly:

# sort the data by 20-day dollar volume average and take the top 'NumberOfSymbols'
def CoarseSelectionFunction(self, coarse):
    # We are going to use a dictionary to refer the object that will keep the moving averages
    for cf in coarse:
        if cf.Symbol not in self.averages:
            self.averages[cf.Symbol] = SymbolData(cf.Symbol)
        # Updates the SymbolData object with current EOD price
        avg = self.averages[cf.Symbol]
        avg.update(cf)

    # Filter the values of the dict: wait for indicator to be ready
    values = filter(lambda x: x.is_ready, self.averages.values())

    # Sorts the values of the dict: we want those with greater mean dollar volume
    values.sort(key=lambda x: x.vol.Current.Value, reverse=True)

    for x in values[:self.coarse_count]:
        self.Log('symbol: ' + str(x.symbol.Value) + ' mean vol: ' + str(x.vol.Current.Value) + ' mean price: ' + str(x.sma.Current.Value))

    # we need to return only the symbol objects
    return [ x.symbol for x in values[:self.coarse_count] ]

If you want to access the 90-day moving average of price, you just need to refer to the averages dictionary:
sma = self.averages[symbol].sma
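
As a hedged sketch of how that stored average might be used later, the snippet below compares a price for each symbol against the moving average kept in the dictionary. Everything here is a stand-in so the example runs on its own: make_symbol_data and symbols_above_sma are hypothetical helpers, the prices dict is made-up data, and only the sma.Current.Value and is_ready attribute names come from the code in this thread.

```python
from types import SimpleNamespace

def make_symbol_data(sma_value, is_ready=True):
    """Build a stand-in for SymbolData holding one moving average (illustration only)."""
    sma = SimpleNamespace(Current=SimpleNamespace(Value=sma_value))
    return SimpleNamespace(sma=sma, is_ready=is_ready)

def symbols_above_sma(averages, prices):
    """Return the symbols whose price is above their stored moving average."""
    result = []
    for symbol, price in prices.items():
        data = averages.get(symbol)
        # Skip symbols we have no data for, or whose indicator is not ready yet.
        if data is not None and data.is_ready and price > data.sma.Current.Value:
            result.append(symbol)
    return result

averages = {
    "AAA": make_symbol_data(100.0),
    "BBB": make_symbol_data(50.0),
    "CCC": make_symbol_data(10.0, is_ready=False),
}
prices = {"AAA": 105.0, "BBB": 48.0, "CCC": 12.0}
above = symbols_above_sma(averages, prices)
```

In an algorithm, the prices would come from whatever current price data you have for the selected symbols rather than from a hard-coded dict.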

0



Hi Alexandre, thanks for your help.

When I tried to run your sample, there was a runtime error: "'filter' object has no attribute 'sort' at CoarseSelectionFunction in main.py:line 72".

The sample code you shared is very instructive and helpful; now I understand how to calculate a custom indicator in "Coarse Universe Selection". My further question is how to use these "pre-calculated" indicators to trade when the market opens. For example, in your code, x.sma is the 90-day moving average. Can I return it in a dataframe and compare it with the open price when the market opens? For example, if the open price of a stock returned from CoarseSelection is higher than its sma, we will trade it.

In the documentation and all the coarse selection samples, only the symbol is returned, and there is no further sample showing how to use these symbols to get current price data (like the open price) for further calculations with the indicator calculated in the coarse selection. It is also very hard for us to figure out how to use the returned symbols in calculations because the QC historical dataframe seems to be multi-layered, which makes it very complicated when combined with coarse selection. So please share an example of this, which I believe would be very helpful.
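
On the "'filter' object has no attribute 'sort'" error reported above: in Python 3, filter() returns a lazy iterator rather than a list, so it has no sort method and cannot be sliced. One possible fix, sketched below with stand-in objects, is to build a real list and use sorted(). The make_entry helper and the sample data are hypothetical; the attribute names (is_ready, vol.Current.Value, symbol) follow the sample code in this thread.

```python
from types import SimpleNamespace

def make_entry(symbol, mean_dollar_volume, is_ready):
    """Build a stand-in for SymbolData (hypothetical helper, illustration only)."""
    vol = SimpleNamespace(Current=SimpleNamespace(Value=mean_dollar_volume))
    return SimpleNamespace(symbol=symbol, vol=vol, is_ready=is_ready)

averages = {
    "AAA": make_entry("AAA", 1.0e6, True),
    "BBB": make_entry("BBB", 3.0e6, True),
    "CCC": make_entry("CCC", 9.0e6, False),  # indicator not ready yet, so excluded
}

# A list comprehension yields a real list, and sorted() returns a list too,
# so the result supports slicing, unlike the iterator returned by filter().
ready = [x for x in averages.values() if x.is_ready]
values = sorted(ready, key=lambda x: x.vol.Current.Value, reverse=True)

coarse_count = 1
selected = [x.symbol for x in values[:coarse_count]]
```

Wrapping the original call as list(filter(...)) and keeping values.sort(...) should work as well; the version above just avoids the in-place sort.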

1

self.sma.Update(value.EndTime, value.Price)

Hi Alexandre, if I understand this correctly, the input value.Price here is the EOD close price according to the documentation. Can I use other data inputs here (for example, historical EOD high/low instead of just the close) to calculate a custom indicator in CoarseSelectionFunction?

0

This might be one of the best on SimpleMovingAverage, which is something I need to move forward with QC. Fairly new here. But as noted by Michael Zhang, the code immediately above has that sort error. Removing the offending line produces another error: 'filter' object is not subscriptable

Meanwhile I really need an example with two moving averages (both on close price) with math between the two. The easiest example would be to simply subtract one sma from the other. Can that be done in the class SymbolData() or should that be around line 62?

Part of the problem is that it isn't clear to me what these objects consist of and how to get a glimpse into their contents.

0

https://github.com/QuantConnect/Lean/blob/master/Algorithm.Python/EmaCrossUniverseSelectionAlgorithm.py

The sample EmaCrossUniverseSelectionAlgorithm.py on github should get you started.

0

It doesn't make sense that these would be different types and that math between them is impossible.

Runtime Error: Trying to perform a summation, subtraction, multiplication or division between 'SimpleMovingAverage' and 'SimpleMovingAverage' objects throws a TypeError exception. To prevent the exception, ensure that both values share the same type.

def CoarseSelectionFunction(self, coarse):
    for cf in coarse:
        if cf.Symbol not in self.averages:
            self.averages[cf.Symbol] = SymbolData(cf.Symbol)
        avg = self.averages[cf.Symbol]
        avg.update(cf)

        prc = 0.0
        if self.Securities.ContainsKey(cf.Symbol):
            prc = self.Securities[cf.Symbol].Price
        sma1 = self.averages[cf.Symbol].sma1
        sma2 = self.averages[cf.Symbol].sma2
        sma1 = sma1 if sma1 else None
        sma2 = sma2 if sma2 else None
        if sma1 and sma2:
            ratio = (sma1 - sma2) / sma2
            self.Log('{} prc {} sma1 {} sma2 {} ratio {} '.format(cf.Symbol, prc, sma1, sma2, ratio))

    vals = filter(lambda x: x.is_ready, self.averages.values())
    return [ x.symbol for x in vals] #[:self.coarse_count] ]

class SymbolData(object):
    def __init__(self, symbol):
        self.symbol = symbol
        self.sma1 = SimpleMovingAverage(3)
        self.sma2 = SimpleMovingAverage(10)
        self.is_ready = False

    def update(self, value):
        self.is_ready = self.sma1.Update(value.EndTime, value.Price) and self.sma2.Update(value.EndTime, value.Price)
0

Hi Garyha, you should use self.sma1.Current.Value to get the indicator value; otherwise you are performing the math calculation on the indicator object itself.
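
To illustrate the difference, here is a small sketch with a stand-in class. FakeSma is NOT the real SimpleMovingAverage; it is a hypothetical placeholder that only mimics the Current.Value attribute, and the numbers are made up. Subtracting the objects raises a TypeError because they define no arithmetic, while subtracting the numeric values works.

```python
from types import SimpleNamespace

class FakeSma:
    """Stand-in for an indicator object (NOT the real SimpleMovingAverage)."""
    def __init__(self, value):
        self.Current = SimpleNamespace(Value=value)

sma1 = FakeSma(41.987)
sma2 = FakeSma(41.752)

try:
    diff = sma1 - sma2  # arithmetic on the objects themselves
except TypeError:
    diff = None  # these stand-in objects define no subtraction

# Arithmetic on the numeric values works as expected.
ratio = (sma1.Current.Value - sma2.Current.Value) / sma2.Current.Value
```

The same pattern applies to any comparison or formatting: read Current.Value first, then do the math on the resulting number.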

0



That fixes the error so arithmetic can be done on the SMAs, many thanks to Jing Wu.

    sma1 = self.averages[cf.Symbol].sma1.Current.Value
    sma2 = self.averages[cf.Symbol].sma2.Current.Value

Next, TODO
    See Error: 'filter' object is not subscriptable, in code. Why?
    Place diff's in a sortable object
    Sort them (and fix the sort error mentioned by Michael Zhang)
    Select top or bottom coarse_max
    Understand why log is this limited currently, a single stock ...
        2017-10-11 00:00:00 Z UYE69C59FN8L  prc 42.14  sma1 41.987  sma2 41.752   diff 0.23467
        2017-10-11 00:00:00 Z UYE69C59FN8L  prc 42.14  sma1 41.810  sma2 41.829   diff -0.01900
        2017-10-12 00:00:00 Z UYE69C59FN8L  prc 41.87  sma1 41.633  sma2 41.783   diff -0.14967
        2017-10-12 00:00:00 Z UYE69C59FN8L  prc 41.87  sma1 41.433  sma2 41.745   diff -0.31167
        2017-10-13 00:00:00 Z UYE69C59FN8L  prc 41.42  sma1 41.463  sma2 41.747   diff -0.28367
        2017-10-13 00:00:00 Z UYE69C59FN8L  prc 41.42  sma1 41.467  sma2 41.741   diff -0.27433
        2017-10-14 00:00:00 Z UYE69C59FN8L  prc 41.70  sma1 41.550  sma2 41.680   diff -0.13000
        2017-10-14 00:00:00 Z UYE69C59FN8L  prc 41.70  sma1 41.460  sma2 41.634   diff -0.17400
        2017-10-17 00:00:00 Z UYE69C59FN8L  prc 41.52  sma1 41.363  sma2 41.534   diff -0.17067
        2017-10-17 00:00:00 Z UYE69C59FN8L  prc 41.52  sma1 41.213  sma2 41.446   diff -0.23267
        2017-10-17 00:00:00 Z UYE69C59FN8L  prc 41.14  sma1 41.283  sma2 41.423   diff -0.13967
        2017-10-17 00:00:00 Z UYE69C59FN8L  prc 41.14  sma1 41.407  sma2 41.413   diff -0.00633       

0




This discussion is closed