I'm currently working on an approach that involves massive data sets. The current algo searches and performs data comparisons (moving averages, Bollinger Bands, standard deviations, and several others) on one year's data for some 1,500+ equities. This is done before market open using the CoarseUniverseSelection process. Data objects are created to store these indicators for all 1,500 symbols. Then, in OnData, we examine the incoming minute-level prices on this universe and compare them against our stored indicator objects to make intraday buy/sell trading decisions.
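To make the shape of this concrete, here is a minimal plain-Python sketch (outside the Lean API, with hypothetical function and field names) of the idea: precompute indicator snapshots per symbol from a year of daily closes, store them in objects, then compare each incoming minute price against the stored values.

```python
import statistics

def precompute_indicators(daily_closes, window=20):
    """Build the stored indicator object for one symbol (pre-open step).

    Hypothetical sketch: a 20-day moving average, standard deviation,
    and 2-sigma Bollinger Bands computed from daily closing prices.
    """
    recent = daily_closes[-window:]
    ma = statistics.fmean(recent)
    sd = statistics.pstdev(recent)
    return {
        "ma": ma,
        "sd": sd,
        "upper_band": ma + 2 * sd,  # Bollinger upper
        "lower_band": ma - 2 * sd,  # Bollinger lower
    }

def intraday_signal(price, ind):
    """Compare a live minute price to the precomputed object."""
    if price > ind["upper_band"]:
        return "sell"
    if price < ind["lower_band"]:
        return "buy"
    return "hold"

# Toy stand-in for the real universe: 252 trading days per symbol.
universe = {
    "AAA": [100 + 0.1 * i for i in range(252)],
    "BBB": [50 - 0.05 * i for i in range(252)],
}

# Built once before market open, then read-only during the session.
indicators = {sym: precompute_indicators(closes)
              for sym, closes in universe.items()}
```

In the real algorithm the same separation applies: the expensive history fetch and indicator construction happen once pre-open, and the per-minute path only does cheap lookups and comparisons.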

So we are essentially fetching 1,500 stocks x 252 days of data before market open. In the first minute of the day we then reduce that universe down to a select number (e.g. 30) of securities we want to trade.
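That first-minute reduction step could be sketched like this (again a hypothetical, simplified version with made-up names: here the score is just the z-score distance of the opening price from the stored moving average):

```python
def select_tradables(open_prices, indicators, n=30):
    """Keep the n symbols whose open deviates most from the stored MA.

    open_prices: {symbol: first-minute price}
    indicators:  {symbol: {"ma": float, "sd": float}} built pre-open
    """
    scored = []
    for sym, price in open_prices.items():
        ind = indicators[sym]
        if ind["sd"] == 0:
            continue  # no variability, nothing to rank on
        z = abs(price - ind["ma"]) / ind["sd"]
        scored.append((z, sym))
    scored.sort(reverse=True)          # largest deviation first
    return [sym for _, sym in scored[:n]]

# Toy data standing in for the 1,500-symbol universe.
indicators = {
    "AAA": {"ma": 100.0, "sd": 2.0},
    "BBB": {"ma": 50.0, "sd": 1.0},
    "CCC": {"ma": 75.0, "sd": 5.0},
}
opens = {"AAA": 101.0, "BBB": 54.0, "CCC": 75.5}
```

The key property for the live-data question is that only the ~30 survivors need minute data for the rest of the session; the other ~1,470 symbols were only touched via historical data pre-open.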

My question is not whether the QC-AWS system can handle the task efficiently. It can and does: the whole thing runs in under a few seconds. My question is about going live. Will we have to pay large market-data scanning costs because we have surpassed certain data-feed limitations, or will we be subjected to data throttling by IB?

If either is the case, we will have to re-architect our whole approach. Hopefully someone here can allay my fears and tell me I need not be concerned.