
Does data consolidation save memory? Data consolidation on Universe selection?

I am trying to optimize my data usage so I can get as long a backtest as possible with as many stocks as possible. There are a couple of situations where I think I can save data:

1) I've got about 10 stocks for which I only need daily historical data, but when I execute orders I need the last minute price, just once a day, to calculate how many shares of each stock I can buy based on risk-management principles. I am currently subscribed to receive minute data for these stocks, which I think is a waste.

Can I get an up-to-the-minute price with a daily data subscription?

2) I have a universe spitting out many different stocks each day (currently 50; I would like to push this to >150), and I only need the price every 5 minutes. I know I can consolidate individual stocks with self.SubscriptionManager.AddConsolidator(stock, consolidator), but how do I apply that to the stocks selected by a universe, and does that even save me any data?

This is what I am trying (one consolidator per symbol, registered as the universe adds securities and removed as they leave):

def Initialize(self):
    # ...other initialization... (requires: from datetime import timedelta)
    # one consolidator per symbol, keyed by Symbol
    self.consolidators = {}

def OnSecuritiesChanged(self, changes):
    # register a 5-minute consolidator for each security the universe adds
    for security in changes.AddedSecurities:
        consolidator = TradeBarConsolidator(timedelta(minutes=5))
        consolidator.DataConsolidated += self.OnDataConsolidated
        self.consolidators[security.Symbol] = consolidator
        self.SubscriptionManager.AddConsolidator(security.Symbol, consolidator)

    # unregister consolidators for securities that leave the universe
    for security in changes.RemovedSecurities:
        consolidator = self.consolidators.pop(security.Symbol, None)
        if consolidator is not None:
            self.SubscriptionManager.RemoveConsolidator(security.Symbol, consolidator)

Related question:

What price does self.Securities[security].Price return (at which interval)? The last seen price at the subscription interval, or at the consolidated interval? And what about self.History(security, 1, Resolution.Minute)['close'][0]?


1) No; that's look-ahead bias.

2) Consolidators use almost no memory; each one is maybe 100 bytes.

self.Securities[security].Price is the last price seen at the subscription's data resolution.

Do you have trouble selecting even a 150-stock universe? Can you share an example with support so we can debug why?


The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by QuantConnect. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. QuantConnect makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances. All investments involve risk, including loss of principal. You should consult with an investment professional before making any investment decisions.


On #2, the consolidators don't use any memory themselves, but do they reduce the memory of the subscribed data? For example, if I have SPY minute data and I consolidate to 5-minute bars because I only need the price every 5 minutes, does that use 5x less memory?

Actually, after typing that, I think the answer is no, because you'd still need each of the 5 minutes of data to compile into a 5-minute bar... That's probably also why I need to subscribe to minute data even though I only need the current price and daily history: I have to have the price every minute of every day in order to get the current price just one time.
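That reasoning can be sanity-checked outside LEAN with a toy consolidator (a plain-Python sketch; the MinuteBar class and aggregation rule here are illustrative assumptions, not LEAN's actual classes). Each 5-minute bar is built from all five underlying minute bars, so consolidation changes what your handlers see, not how much data must stream:

```python
from dataclasses import dataclass

@dataclass
class MinuteBar:
    open: float
    high: float
    low: float
    close: float

def consolidate(minute_bars, n=5):
    """Aggregate every n minute bars into one bar -- each output
    bar still requires all n inputs to have been streamed."""
    out = []
    for i in range(0, len(minute_bars) - n + 1, n):
        chunk = minute_bars[i:i + n]
        out.append(MinuteBar(
            open=chunk[0].open,
            high=max(b.high for b in chunk),
            low=min(b.low for b in chunk),
            close=chunk[-1].close,
        ))
    return out

bars = [MinuteBar(o, o + 1, o - 1, o + 0.5) for o in range(10)]
five_min = consolidate(bars)
print(len(five_min))  # 2 five-minute bars built from 10 minute bars
```

So a 5:1 consolidator reduces event-handler traffic by 5x, but every minute bar is still read to build each consolidated bar.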

 

"""
DEFINE UNIVERSE
"""
def CoarseSelectionFunction(self, coarse):
    selected = [x for x in coarse if x.HasFundamentalData
                and float(x.Price) >= self.MyLeastPrice
                and float(x.Price) <= self.MyMostPrice]

    DollarVolume = sorted(selected, key=lambda x: x.DollarVolume, reverse=True)
    #self.data.Log('initial select %d'%(len(DollarVolume)))
    top = DollarVolume[:self.Top_Stocks]
    #self.data.Log('top select %d'%(len(top)))

    return [x.Symbol for x in top]

def FineSelectionFunction(self, fine):
    filtered_fine = [x for x in fine if x.ValuationRatios.PriceChange1M]

    #self.data.Log('filtered_fine select %d'%(len(filtered_fine)))
    sortedByChange = sorted(filtered_fine, key=lambda x: x.ValuationRatios.PriceChange1M, reverse=True)

    # take the top entries from our sorted collection
    topFine = sortedByChange[:self.MaxOrders]
    #self.data.Log('topFine select %d'%(len(topFine)))
    stocks = [x.Symbol for x in topFine]

    return stocks

 

My universe selection right now is fairly simple. I don't have many problems running out of memory; it's more an issue of speed and backtest length.

Currently I limit coarse to 300 stocks and fine to 50. Coarse selection can't narrow down my universe very much because it can't look at historical price movements over time, so it must pass a large number of stocks to fine. In fine I substituted ValuationRatios.PriceChange1M to fill that gap, and it seems to work all right.

Increasing the coarse limit gives me more stocks available to check against my fine criteria but slows down the backtest significantly. At this time I can't add more to the fine selection because IB's fees make it cost-prohibitive given my capital, but if I were on Robinhood, with no transaction fees, I would increase the fine selection limit. That would cause the memory to run out after only a few weeks' worth of backtest length, because it would be pulling minute-level price data for potentially >150 stocks to make purchases.
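For a rough sense of scale, here is a back-of-the-envelope sketch: 390 is the number of regular-session minutes in a US trading day, while the ~100 bytes/bar figure is purely an illustrative assumption, not a measured LEAN number.

```python
# Rough estimate of raw minute-bar volume for a universe backtest.
BARS_PER_DAY = 390    # regular-session minutes per US trading day
BYTES_PER_BAR = 100   # illustrative assumption, not a LEAN figure

def approx_minute_bytes(num_stocks, trading_days):
    """Approximate bytes of raw minute bars streamed over a backtest."""
    return num_stocks * trading_days * BARS_PER_DAY * BYTES_PER_BAR

# 150 stocks over a one-year (~252 trading day) backtest:
total = approx_minute_bytes(150, 252)
print(f"~{total / 1e9:.1f} GB of raw minute bars")  # ~1.5 GB
```

Whatever the true per-bar cost, the total scales linearly with both the stock count and the backtest length, which is why pushing the universe past 150 at minute resolution bites so quickly.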

I think the main reason some of this is more challenging in QC is that in Quantopian my pipeline would select a list of stocks based on a moving average (~200 selected out of the full ~8000), and there was a function called data.current(stock, 'price') that gave the last seen price of a stock or list of stocks as a float. I used it a lot just to get a one-off value without needing to see ALL the data; I would then use that single price to place the order for the calculated number of shares I could afford.


Interesting, thank you Lukel. We synchronize all the data in real time as your backtest runs. This makes it 10x faster for small batches of symbols but slower for universes like this. We will make universe selection an order of magnitude faster (10-100x) by pre-synchronizing all the data. This is in our plan for the next 3-6 months.





This discussion is closed