Hi,
I’m hoping to ask for some help, please, with filtering the Coarse Universe of US stocks in QuantConnect's 'research' section and displaying the stock info (open, close, volume, etc.) in a dataframe. I’m new to Python (taking the edX course), new to pandas/numpy (watching a lot of YouTube tutorials) and new to trading (listening to lots of quant podcasts). My background is in website analytics, and I'm trying to learn quant trading and Python at the same time (learn and practice together). I'm finding that most of the tutorials on QuantConnect are for full backtesting of algorithms in the 'lab', so I'm struggling to get started in the 'research' section (I'm not ready to backtest an algorithm before researching the signals themselves).

I've outlined below what I'm trying to achieve, with my work-in-progress code (which isn't working) further below; I figure it's best practice to give more info rather than less. If you're able to help with any of the sections, that would be an amazing help, but ideally I'd just like a basic piece of working code that filters the Coarse Universe of US stocks in QuantConnect's 'research' notebooks. Also, if you think a newbie like me should learn in a different way, please advise.

*****The rest of this thread post is extra detail on what I'm trying to do:*****

In the 'research' notebook:
1. Load the Coarse Universe of US stocks (~16k I believe)
2. Apply a basic filter for stocks with a close price >=$45 and <=$50 on a given day (e.g. 2017/5/15)
3. Print a count of how many stocks are returned
4. Sort these stocks by close price, high to low
5. Filter the top 50
6. Pull the daily stock info (symbol, open, close, high, low, volume) for a date range (e.g. 2017/3/15 to 2017/6/15) for these top 50 stocks
7. Add an additional column for day over day return rate
8. Display this data for the 50 stocks in a dataframe
9. Add the same SPY data (same date range, same columns) into the same dataframe as the 51st row
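
Steps 2 to 5 of the list above (filter by close price, count, sort high to low, keep the top N) can be sketched in plain pandas. This is only an illustration on synthetic data: the tickers, prices, and the helper name `top_by_close` are made up for the example, and it does not show how to load the Coarse Universe itself inside a QuantConnect notebook.

```python
import pandas as pd

# Synthetic stand-in for one day's coarse data (hypothetical tickers and prices).
# In a QuantConnect notebook these rows would have to come from the platform's data.
coarse = pd.DataFrame({
    "symbol": ["AAA", "BBB", "CCC", "DDD", "EEE", "FFF"],
    "close": [44.10, 45.00, 47.25, 49.90, 50.00, 51.30],
})

def top_by_close(frame, low=45.0, high=50.0, n=50):
    """Keep rows with low <= close <= high, sort high to low, return the top n."""
    in_range = frame[frame["close"].between(low, high)]  # inclusive on both ends
    ranked = in_range.sort_values("close", ascending=False)
    return ranked.head(n).reset_index(drop=True)

# Step 3: count how many symbols pass the price filter.
match_count = int(coarse["close"].between(45.0, 50.0).sum())
print("Symbols matching the price filter:", match_count)

# Steps 4 and 5: sort and keep the top n (n=3 here only because the sample is tiny).
selected = top_by_close(coarse, n=3)
print(selected)
```

Doing the filter before the sort keeps the count in step 3 independent of the top-N cut in step 5, so the two numbers can differ when more than N symbols fall in the price band.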

Here’s my (newbie) first stab at the code (it's a work in progress that obviously isn’t working yet):

#Start: initial general imports
##taken from: EmaCrossUniverseSelectionAlgorithm.py https://github.com/QuantConnect/Lean/blob/master/Algorithm.Python/EmaCrossUniverseSelectionAlgorithm.py
from clr import AddReference
AddReference("System")
AddReference("QuantConnect.Algorithm")
AddReference("QuantConnect.Indicators")
AddReference("QuantConnect.Common")

from System import *
from QuantConnect import *
from QuantConnect.Data import *
from QuantConnect.Algorithm import *
from QuantConnect.Indicators import *
from System.Collections.Generic import List
import decimal as d
#end 'initial general imports'

#Start: QuantBook research imports
##taken from initial https://www.quantconnect.com/research# notebook. I think these are the standard settings for creating a QuantBook research books
###removed some duplicates with initial general imports.
%matplotlib inline
# Imports
AddReference("QuantConnect.Jupyter")
from QuantConnect.Data.Custom import *
from QuantConnect.Data.Market import TradeBar, QuoteBar
from QuantConnect.Jupyter import *
from datetime import datetime, timedelta
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Create an instance - needed (I believe) in order to use the research tab
qb = QuantBook()
#End: QuantBook research imports

#need to create a class to set universe criteria within. All examples use 'QCAlgorithm' (presumably for 'lab' testing) so copying this.
class MyUniverseSelectionResearch(QCAlgorithm):

    def Initialize(self): #need to initialize
        self.AddUniverse(self.CoarseSelectionFunction) #needs to happen within initialize section according to: https://www.quantconnect.com/docs/algorithm-reference/universes#Universes-Coarse-Universe-Selection
        self.SetStartDate(2017,5,15) #Set Start Date for the initial filtering of stocks
        self.SetEndDate(2017,5,15) #Set End Date for the initial filtering of stocks
        #figure out how to Set Start & end Date for pulling the stock info from 2017/3/15 to 2017/6/15, SetStartDate2 likely won't work as its a method, perhaps use a variable like: Date1 = self.SetStartDate2(2017,3,15)

    def CoarseSelectionFunction(self, coarse):
        # sort descending by daily close price
        sortedByClosePrice = sorted(coarse, \
            key=lambda x: x.Price, reverse=True)
        #missing how to link the date, does self.SetEndDate1 work or
        # we need to return only the symbol objects (as the Universe doesnt have things like high/low, open - presume need to pull this info from the stock symbols, pulling anything more than stock symbols at this stage seems a waste
        return [ x.Symbol for x in sortedByClosePrice[:50] ]

        #How many stock symbols does this $45-$50 filter return? (Figure out how to do)
        self.Debug("Stock Sybmols matching criteria >>> print numpy.pi: " + print(count(x))
        #restrict to symbols >=$45
        return [ x.Symbol for x in sortedByClosePrice[45:50] ]

#MISSING SECTION:Pull the daily stock info (Symbol, Open, close, high, low, volume) for a date range (e.g. 2017/3/15 to 2017/6/15) for these top 50 stocks
#(no progress due to needing to understand universe methods first)
#Do I need to do a loop 50 times, looping something like x = qb.History([symbol], 90, Resolution.Daily)

#Add an additional column for day over day return rate
#this code worked for a previous research notebook for a single SPY symbol, should be adaptable (perhaps needs looping too) once know what to latch to
##drop everyting from SPY except closing, rename column to spy_closing
###spy_close = hist.drop(['open','high','low','volume'],1).rename(columns={'close': 'spy_closing'})
#create a percent change for SPY
###p_change = spy_close.pct_change(1).rename(columns={'spy_closing': 'spy_pct_change'})
#Need to add this new column into the dataframe once I know what to latch onto (figure out how to do it)

#Display this data for the 50 stocks in a dataframe
dftop50 = pd.DataFrame(hist)
print(dftop50)
##Figure out: Add the same SPY data (same data range, columns) into the same dataframe as the 51st row
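
The two unfinished pieces in the code comments above, the day-over-day return column and adding SPY to the same dataframe, can be sketched with pandas alone, without a loop over symbols. The sketch assumes the daily bars are already in a "long" dataframe with one row per symbol per day; the tickers and prices are synthetic, and the column names `symbol`, `time`, `close`, and `daily_return` are choices made for this example rather than anything QuantConnect guarantees.

```python
import pandas as pd

# Synthetic daily bars in long format (hypothetical values): one row per symbol per day.
dates = pd.to_datetime(["2017-03-15", "2017-03-16", "2017-03-17"])
bars = pd.DataFrame({
    "symbol": ["AAA"] * 3 + ["BBB"] * 3,
    "time": list(dates) * 2,
    "close": [46.0, 46.92, 46.92, 48.0, 47.04, 49.392],
})
spy = pd.DataFrame({
    "symbol": ["SPY"] * 3,
    "time": dates,
    "close": [238.0, 240.38, 240.38],
})

# Stack SPY under the selected stocks so everything lives in one dataframe.
combined = pd.concat([bars, spy], ignore_index=True)
combined = combined.sort_values(["symbol", "time"]).reset_index(drop=True)

# Day-over-day return, computed within each symbol so that one symbol's first
# day is never compared against another symbol's last day.
combined["daily_return"] = combined.groupby("symbol")["close"].pct_change()
print(combined)
```

In this layout SPY ends up as an extra group of rows (one per day) rather than a single 51st row, and the first day of every symbol has no return because there is no previous close to compare against.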

Thanks in advance for any help, tips or advice that you can offer.

Dean
