How do I get a percentile of dollarvolume universe?

I'm looking to get a result similar to Quantopian's:

pipe.set_screen(AverageDollarVolume(window_length=3).percentile_between(85, 100))

 

I found this in the LEAN documentation: https://www.quantconnect.com/lean/documentation/topic98.html which seems to be what I want, but I don't know where to implement it in the coarse universe function.

Currently my coarse universe just has:

DollarVolume = sorted(selected, key=lambda x: x.DollarVolume, reverse=True)

Lukel, please take a look at Yan Xiaowei's code in his backtest. See here. His backtest generates error messages, but you can at least see how to filter your selection on some factors using CoarseSelectionFunction. Hope this helps. Thanks :)

I'm a fan of pandas and dataframes, so here's another approach. Coming from Quantopian, this was always fast and easy. There's debate about that here at QC: being C# under the hood may make that approach slower. Anyway, here is my implementation:

# Use pandas methods to select the assets we want
# First find the values of dollar_volume at our desired bounds
lower_percent = data_df.dollar_volume.quantile(.85)
upper_percent = data_df.dollar_volume.quantile(1.0)

# Now simply query using those values
# Filter for has_fundamentals to remove ETFs
my_universe = (data_df.query('has_fundamentals & (dollar_volume >= @lower_percent) & (dollar_volume <= @upper_percent)'))

Just one line of code once you find the upper and lower percents. See the attached backtest. Note the logs to see how many stocks it's filtering and the cut-offs.
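For anyone worried about dataframe overhead, the same percentile band can be sketched in plain Python. This is only an illustration: the Stock tuple and the quantile and percentile_between helpers are hypothetical names, not part of LEAN, and quantile uses linear interpolation between neighbouring values on the assumption that this matches the pandas default.

```python
from collections import namedtuple

# Hypothetical stand-in for a coarse universe entry (not a LEAN type).
Stock = namedtuple("Stock", ["symbol", "dollar_volume"])


def quantile(values, q):
    """Return the q-quantile (0 <= q <= 1) using linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def percentile_between(stocks, lower_q, upper_q):
    """Keep the symbols whose dollar volume lies between the two quantiles."""
    if not stocks:
        return []
    volumes = [s.dollar_volume for s in stocks]
    low = quantile(volumes, lower_q)
    high = quantile(volumes, upper_q)
    return [s.symbol for s in stocks if low <= s.dollar_volume <= high]


# Ten made-up stocks with dollar volumes 1.0 through 10.0.
stocks = [Stock("S{}".format(i), float(i)) for i in range(1, 11)]
selected = percentile_between(stocks, 0.85, 1.0)
```

In a coarse selection function the list would be built from the coarse objects and their DollarVolume values instead of the made-up data used here.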

One issue I have with QC is the data. I'm not entirely convinced it's reliable. For example, in the attached backtest I use a 'has_fundamentals' check to exclude ETFs. It does exclude ETFs, from what I see, but in this case it also excludes AAPL. Ouch. I haven't looked closely into the issue but... beware. (You can check this by deleting the 'has_fundamentals' check from the query. AAPL magically appears in the results. BTW, it also takes a lot longer, because 900+ stocks are returned vs. a bit more than 100.)

Will look into that tomorrow, Dan. 99% of the time it's a misunderstanding of what things mean, or poorly named variables =).

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by QuantConnect. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. QuantConnect makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances. All investments involve risk, including loss of principal. You should consult with an investment professional before making any investment decisions.


Dan,

Thank you so much! That was just what I was looking for.

This is the code I ended up with:

import pandas as pd

def CoarseSelectionFunction(self, coarse):
    # First narrow the original coarse universe down to a manageable level.
    init_select = [stock for stock in coarse
                   if stock.HasFundamentalData
                   and float(stock.Price) >= self.MyLeastPrice
                   and float(stock.Price) <= self.MyMostPrice]

    # Second, convert the initial selection to a pandas dataframe.
    stock_symbols = [stock.Symbol for stock in init_select]
    stock_data = [(stock.DollarVolume,) for stock in init_select]
    column_names = ['dollar_volume']

    # Use coerce_float=True to convert data objects to numbers
    data_df = pd.DataFrame.from_records(
        stock_data,
        index=stock_symbols,
        columns=column_names,
        coerce_float=True)

    # Use pandas methods to select the assets we want
    # First find the values of dollar_volume at our desired bounds
    lower_percent = data_df.dollar_volume.quantile(self.LowVar)
    upper_percent = data_df.dollar_volume.quantile(self.HighVar)

    # Now simply query using those values
    # (ETFs were already removed by the HasFundamentalData check above)
    my_universe = data_df.query(
        '(dollar_volume >= @lower_percent) & (dollar_volume <= @upper_percent)')

    # See how many securities are found in our universe
    # (self.data is assumed to reference the algorithm; on the algorithm class itself this would be self.Debug)
    self.data.Debug("{} securities found ".format(my_universe.shape[0]))

    # Expects a list of symbols returned
    return my_universe.index.tolist()

 

I think it's slightly more efficient than your code, especially if QC has difficulties with dataframes. Instead of adding 3 columns of data from all ~6000 stocks into a dataframe, it first narrows down the list of desired stocks by HasFundamentalData and Price before adding them to a dataframe, since those checks are easy to do without one.

It actually sped up my backtest significantly, because I was able to narrow down my coarse universe much more before getting to the fine fundamental data. It is still at 4k data points per second, though.

Lukel, great approach. Narrowing down the selection with an 'if' condition in the initial iterator is nice. It gets rid of all the securities you don't want right away.

The other benefit of this approach is that the dollar volume range is calculated only across the stocks of interest and not the entire coarse universe. That was a flaw in my original logic.
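To make that difference concrete, here is a small sketch with made-up numbers. The quantile helper is a hypothetical name and uses linear interpolation, which I'm assuming matches the pandas default. With two expensive, very liquid stocks in the list, the cutoff computed over the whole universe sits far above anything in the price range of interest.

```python
def quantile(values, q):
    """Return the q-quantile (0 <= q <= 1) using linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


# Made-up universe: (symbol, price, dollar_volume)
universe = [
    ("A", 500.0, 1000.0), ("B", 300.0, 900.0),
    ("C", 5.0, 100.0), ("D", 8.0, 80.0), ("E", 3.0, 60.0),
    ("F", 9.0, 40.0), ("G", 4.0, 20.0),
]
max_price = 10.0
q = 0.75

# Cutoff over the whole universe, price filter applied afterwards.
whole_cutoff = quantile([dv for _, _, dv in universe], q)
whole_then_filter = [sym for sym, price, dv in universe
                     if dv >= whole_cutoff and price <= max_price]

# Price filter first, cutoff over the remaining stocks only.
cheap = [row for row in universe if row[1] <= max_price]
subset_cutoff = quantile([dv for _, _, dv in cheap], q)
filter_then_cutoff = [sym for sym, _, dv in cheap if dv >= subset_cutoff]
```

Here the whole-universe cutoff is 500.0 and nothing under the price cap survives, while the subset cutoff is 80.0 and keeps C and D.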

You may want to look at the 'context.UniverseSettings.MinimumTimeInUniverse' property of the universe. It's set in the 'Initialize' section. I had it set to 0, but maybe consider a higher value. This should keep securities from moving in and out so much. It makes them 'sticky'. This is a bit more important in QC than in Quantopian, because the 'dollar_volume' in QC is for a single day and could vary a lot. Quantopian allowed one to take an average dollar_volume (you noted in your initial post that you had used a window_length of 3). Just a thought.
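One way to approximate Quantopian's averaged dollar volume is to keep a short history per symbol and average it yourself. The sketch below is plain Python, not a LEAN API; RollingDollarVolume and its methods are hypothetical names. The idea would be to call update once per symbol on each coarse selection call and filter on average once it returns a value.

```python
from collections import defaultdict, deque


class RollingDollarVolume:
    """Keeps the last `window_length` daily dollar volumes per symbol."""

    def __init__(self, window_length=3):
        self.window_length = window_length
        # deque(maxlen=...) drops the oldest value automatically.
        self._history = defaultdict(lambda: deque(maxlen=window_length))

    def update(self, symbol, dollar_volume):
        """Record one day's dollar volume for a symbol."""
        self._history[symbol].append(float(dollar_volume))

    def average(self, symbol):
        """Return the window average, or None until the window is full."""
        window = self._history.get(symbol)
        if not window or len(window) < self.window_length:
            return None
        return sum(window) / len(window)


tracker = RollingDollarVolume(window_length=3)
tracker.update("XYZ", 100)
tracker.update("XYZ", 200)
not_ready = tracker.average("XYZ")   # only two days so far
tracker.update("XYZ", 300)
tracker.update("XYZ", 400)           # the 100 drops out of the window
avg = tracker.average("XYZ")
```

The window still needs window_length days of data before it produces a value, so the first few days of a backtest would have no averaged figure.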

Lukel, Dan, great work. @Dan, glad to see you here at QC. I hope we Quantopian migrants get used to QC as fast as we can and develop what we want. Thanks guys! :)

Dan,

Since my strategy isn't a long-term holder, more of a swing strategy, it needs a fresh list of stocks that meet the criteria every day. I just read about the change to dollar volume. I had no idea that the original value was a 30-day EMA. A premade EMA isn't very flexible, but it is preferable to a single-day dollar volume if you can't "warm up" the dollar volume and are stuck waiting 30 days to get an accurate picture of the stocks you would have wanted to purchase on day 1. If that's the case, I don't even know how you would go about storing the dollar volume data of 5000 stocks to detect when one eventually meets the long-term moving average criteria. My original dollar volume filter is MUCH longer than 3 days, but I don't know whether it's a very sensitive parameter. Guess I will find out as I go.
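On the storage question, an EMA only needs one running value per symbol, so tracking 5000 stocks is a small dictionary. The sketch below is plain Python with hypothetical names (DollarVolumeEma, is_ready). It assumes the common smoothing factor 2 / (period + 1) and seeds the EMA with the first observation, which may not be how QC computed its original value.

```python
class DollarVolumeEma:
    """Tracks an exponential moving average of dollar volume per symbol."""

    def __init__(self, period=30):
        self.period = period
        self.alpha = 2.0 / (period + 1)  # assumed smoothing convention
        self._ema = {}     # symbol -> current EMA value
        self._count = {}   # symbol -> number of observations seen

    def update(self, symbol, dollar_volume):
        """Fold one day's dollar volume into the symbol's EMA."""
        value = float(dollar_volume)
        previous = self._ema.get(symbol)
        if previous is None:
            self._ema[symbol] = value  # seed with the first observation
        else:
            self._ema[symbol] = previous + self.alpha * (value - previous)
        self._count[symbol] = self._count.get(symbol, 0) + 1

    def is_ready(self, symbol):
        """True once the symbol has at least `period` observations."""
        return self._count.get(symbol, 0) >= self.period

    def value(self, symbol):
        """Return the current EMA, or None if the symbol is unknown."""
        return self._ema.get(symbol)


ema = DollarVolumeEma(period=3)  # short period so the numbers are easy to follow
ema.update("XYZ", 100)
ema.update("XYZ", 200)
ready_after_two = ema.is_ready("XYZ")
ema.update("XYZ", 300)
ready_after_three = ema.is_ready("XYZ")
current = ema.value("XYZ")
```

The warm-up wait doesn't go away with this approach: the values still need roughly period days of updates before they mean much, which is what is_ready is meant to flag.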

This discussion is closed