Research Guide
Introduction
QuantConnect aims to teach and inspire our community to create high-performing algorithmic trading strategies. We measure our success by the profits created by the community through their live trading. As such, we try to build the best quantitative research techniques possible into the product to encourage a robust research process.
Hypothesis-Driven Research
We recommend you develop an algorithmic trading strategy based on a central hypothesis. You should develop an algorithm hypothesis at the start of your research and spend the remaining time exploring how to test your theory. If you find yourself deviating from your core theory or introducing code that isn't based around that hypothesis, you should stop and go back to thesis development.
Wang et al. (2014) illustrate the danger of creating your hypothesis based on test results. In their research, they examined the earnings yield factor in the technology sector over time. During 1998-1999, before the tech bubble burst, the factor was unprofitable. If you saw those results and then decided to bet against the factor during 2000-2002, you would have lost a lot of money because the factor performed extremely well during that time.
Hypothesis development is somewhat of an art and requires creativity and great observation skills. It is one of the most powerful skills a quant can learn. We recommend that an algorithm hypothesis follow the pattern of cause and effect. Your aim should be to express your strategy in the following sentence:
A change in {cause} leads to an {effect}.
To search for inspiration, consider causes from your own experience, intuition, or the media. Generally, causes of financial market movements fall into the following categories:
 Human psychology
 Real-world events/fundamentals
 Invisible financial actions
Consider the following examples:
Cause  leads to  Effect 

Share class stocks are the same company, so any price divergence is irrational...  A perfect pairs trade. Since they are the same company, the price will revert.  
New stock addition to the S&P 500 Index causes fund managers to buy up stock...  An increase in the price of the new asset in the universe from buying pressure.  
Increase in sunshine hours increases the production of oranges...  An increase in the supply of oranges, decreasing the price of Orange Juice Futures.  
Allegations of fraud by the CEO cause investor faith in the stock to fall...  A collapse of stock prices for the company as people panic.  
FDA approval of a new drug opens up new markets for the pharmaceutical company...  A jump in stock prices for the company.  
Increasing federal interest rates restrict lending from banks, raising interest rates...  Restricted REIT leverage and lower REIT ETF returns. 
There are millions of potential alpha strategies to explore, each of them a candidate for an algorithm. Once you have chosen a strategy, we recommend exploring it for no more than 8-32 hours, depending on your coding ability.
Research Panel
We launched the Research Guide in 2019 to inform you about common quantitative research pitfalls. It displays a power gauge for the number of backtests performed, the number of parameters used, and the time invested in the strategy. These measures can give a ballpark estimate of the overfitting risk of the project. Generally, as a strategy becomes more overfit on historical data, it is less likely to perform well in live trading. These properties were selected based on the recommended best practices of the global quantitative research community.
Restricting Backtests
According to current research, you should limit the number of backtests you perform on an idea to prevent overfitting. In theory, each backtest moves the idea one step closer to being overfit because you begin selecting for variations written into your code instead of testing a central thesis. For more information, see the paper Probability of Backtest Overfitting (Bailey, Borwein, López de Prado, & Zhu, 2015).
QuantConnect does not restrict the number of backtests performed on a project, but we have implemented the counter as a guide for your reference. Your coding skills are a factor in how many backtests constitute overfitting, so if you are a new programmer you can increase these targets.
Backtest Count Overfit Reference  

0-30: Likely Not Overfit  30-70: Possibly Overfitting  70+: Probably Overfitting 
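To see why the backtest count matters, consider a minimal simulation. It is an illustration only and not part of the QuantConnect product. It generates strategies whose daily returns are pure zero-mean noise, so none of them has a real edge, and reports the best annualized Sharpe ratio found after a given number of attempts. The function names, the 1% daily volatility, the 252-day year, and the zero risk-free rate are all assumptions chosen for this sketch.

```python
import random
import statistics


def sharpe_ratio(daily_returns):
    """Annualized Sharpe ratio of daily returns (risk-free rate assumed to be zero)."""
    mean = statistics.fmean(daily_returns)
    stdev = statistics.stdev(daily_returns)
    return (mean / stdev) * (252 ** 0.5)


def best_random_backtest(num_backtests, num_days=252, seed=0):
    """Best Sharpe ratio among `num_backtests` strategies that have no edge."""
    rng = random.Random(seed)
    best = float("-inf")
    for _ in range(num_backtests):
        # Zero-mean noise: by construction, no strategy has real skill.
        returns = [rng.gauss(0.0, 0.01) for _ in range(num_days)]
        best = max(best, sharpe_ratio(returns))
    return best


for count in (1, 30, 70, 200):
    print(count, round(best_random_backtest(count), 2))
```

Because every call reuses the same seed, a larger count only adds more attempts to the same sequence, so the best score can never decrease as the count grows. Selecting the best of many backtests rewards luck in the same way.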
Reducing Strategy Parameters
With just a handful of parameters, it is possible to create an algorithm that perfectly models historical markets. Current research suggests keeping your parameter count to a minimum to decrease the risk of overfitting.
Parameter Overfit Reference  

0-10: Likely Not Overfit  10-20: Possibly Overfitting  20+: Probably Overfitting 
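A toy example can make the parameter risk concrete. The sketch below is not a trading model. It fits a polynomial with six coefficients (six parameters) through six hypothetical closing prices using Lagrange interpolation, then extrapolates the same polynomial two days ahead. The prices and the `interpolate` helper are invented for illustration.

```python
def interpolate(days, prices, day):
    """Evaluate the polynomial that passes through every (day, price) point."""
    total = 0.0
    for i, (day_i, price_i) in enumerate(zip(days, prices)):
        weight = 1.0
        for j, day_j in enumerate(days):
            if j != i:
                weight *= (day - day_j) / (day_i - day_j)
        total += price_i * weight
    return total


# Hypothetical closing prices for six consecutive days.
days = [0, 1, 2, 3, 4, 5]
prices = [100, 102, 101, 105, 103, 106]

# In sample: six coefficients reproduce all six prices.
in_sample = [interpolate(days, prices, d) for d in days]

# Out of sample: the same model extrapolated to days 6 and 7.
forecast = [interpolate(days, prices, d) for d in (6, 7)]

print(in_sample)
print(forecast)
```

With these made-up prices the fit is exact in sample, yet the extrapolated values for days 6 and 7 land near 188 and 527, far outside anything in the history. A model with as many free parameters as data points has memorized the past, not learned from it.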
Limiting Research Time Invested
As you spend more time on one algorithm, research suggests you are more likely to be overfitting the strategy to the data. It is common to become attached to an idea and spend weeks or months tuning it to perform well in a backtest. Assuming you are a proficient coder who fully understands the QuantConnect API, we recommend no more than 16 hours of work per experiment. In theory, within two full working days, you should be able to test a single hypothesis thoroughly.
Research Time Overfitting Reference  

0-8 Hours: Likely Not Overfit  8-16 Hours: Possibly Overfitting  16+ Hours: Probably Overfitting 
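The three reference tables can be combined into one small lookup. The following sketch is one possible way to express them and is not the product's own implementation. The names `THRESHOLDS`, `overfit_label`, and `project_report` are hypothetical, and because the published ranges share their endpoints, the code assumes that a value exactly on a boundary falls into the higher-risk band.

```python
# (possibly_from, probably_from) thresholds taken from the three reference tables.
THRESHOLDS = {
    "backtests": (30, 70),
    "parameters": (10, 20),
    "hours": (8, 16),
}


def overfit_label(metric, value):
    """Map one measurement to the label used in the reference tables."""
    possibly_from, probably_from = THRESHOLDS[metric]
    # Assumption: a value exactly on a boundary belongs to the higher-risk band.
    if value >= probably_from:
        return "Probably Overfitting"
    if value >= possibly_from:
        return "Possibly Overfitting"
    return "Likely Not Overfit"


def project_report(backtests, parameters, hours):
    """Label all three measurements for one project."""
    return {
        "backtests": overfit_label("backtests", backtests),
        "parameters": overfit_label("parameters", parameters),
        "hours": overfit_label("hours", hours),
    }


print(project_report(backtests=12, parameters=15, hours=20))
```

Treat the labels as a ballpark guide, in line with the note above that newer programmers can raise the backtest targets.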
Parameter Detection
Using parameters is almost unavoidable, but a strategy trends toward being overfit as more parameters get added or fine-tuned. You should only add or optimize parameters through a robust methodology such as walk-forward optimization. The parameter detection system is a general guide to inform you of how many parameters are present in the algorithm. It flags code that matches criteria suggesting a potential parameter. The following table shows the criteria for parameters:
Parameter Types  Example Instances 

Numeric Comparison  Numeric operators used to compare numeric arguments: <= < > >= 
Time Span  Setting the interval of TimeSpan or timedelta 
Order Event  Inputting numeric arguments when placing orders 
Scheduled Event  Inputting numeric arguments when scheduling an algorithm event to occur 
Variable Assignment  Assigning numeric values to variables 
Mathematical Operation  Any mathematical operation involving explicit numbers 
Lean API  Numeric arguments passed to Indicators, Consolidators, Rolling Windows, etc. 
The following table shows common expressions that are not parameters:
Non-Parameter Types  Example Instances 

Common APIs  SetStartDate, SetEndDate, SetCash, etc. 
Boolean Comparison  Testing for True or False conditions 
String Numbers  Numbers formatted as part of Log or Debug statements 
Variable Names  Any variable names that use numbers as part of the name (for example, smaIndicator200) 
Common Functions  Rounding, array indexing, boolean comparison using 1/0 for True/False, etc. 
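To get a feel for how such criteria might be applied, the following sketch counts numeric literals in Python source with the standard `ast` module. It is a rough approximation under stated assumptions and not QuantConnect's detection system: the `IGNORED_CALLS` set, the `ParameterCounter` class, and the `count_parameters` function are hypothetical, the sample algorithm code is invented, and the sketch does not try to recognize 1/0 used as True/False.

```python
import ast

# Calls treated as non-parameters in this sketch; illustrative, not exhaustive.
IGNORED_CALLS = {"SetStartDate", "SetEndDate", "SetCash", "Log", "Debug", "round"}


class ParameterCounter(ast.NodeVisitor):
    """Count numeric literals that look like tunable parameters."""

    def __init__(self):
        self.count = 0

    def visit_Call(self, node):
        func = node.func
        name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
        if name in IGNORED_CALLS:
            return  # skip the whole call, including its numeric arguments
        self.generic_visit(node)

    def visit_Subscript(self, node):
        self.visit(node.value)  # ignore the index itself, for example window[0]

    def visit_Constant(self, node):
        # bool is a subclass of int, so exclude True/False explicitly.
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            self.count += 1


def count_parameters(source):
    counter = ParameterCounter()
    counter.visit(ast.parse(source))
    return counter.count


sample = '''
self.SetStartDate(2020, 1, 1)
self.SetCash(100000)
self.sma200 = self.SMA("SPY", 200)
if self.sma200.Current.Value > 1.05 * price:
    self.SetHoldings("SPY", 0.5)
    self.Debug("Bought 50% at bar 0")
last = window[0]
flag = True
'''

print(count_parameters(sample))
```

In the sample, only the indicator period (200), the comparison multiplier (1.05), and the order size (0.5) are counted. The setup calls, the number inside the Debug string, the digits in the variable name, the array index, and the boolean are all skipped.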
Overfitting
Overfitting occurs when you fine-tune the parameters of an algorithm to fit the detail and noise of backtesting data to the extent that it negatively impacts the performance of the algorithm on new data. The problem is that the parameters don't necessarily apply to new data and thus negatively impact the algorithm's ability to generalize and trade well in all market conditions. The following table shows ways that overfitting can manifest itself:
Data Practice  Description 

Data Dredging  Performing many statistical tests on data and only paying attention to those that come back with significant results. 
Hyper-Tuning Parameters  Manually changing algorithm parameters to produce better results without altering the test data. 
Overfit Regression Models  Regression, machine learning, or other statistical models with too many variables will likely introduce overfitting to an algorithm. 
Stale Testing Data  Not changing the backtesting data set when testing the algorithm. Any improvements might not generalize to different datasets. 
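Walk-forward optimization, mentioned under Parameter Detection, is one way to reduce the hyper-tuning and stale-data problems in the table above: each parameter choice is made on one window of data and then judged on the window that follows it. The sketch below shows the idea with plain lists. The function names, the `momentum_score` rule, the candidate thresholds, and the return series are all hypothetical, and it does not use the QuantConnect API.

```python
def walk_forward_windows(num_bars, train_size, test_size):
    """Yield (train, test) index ranges that roll forward through the data."""
    start = 0
    while start + train_size + test_size <= num_bars:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # slide forward by one test period


def walk_forward(returns, candidates, score, train_size, test_size):
    """Pick the best candidate on each training window, then score it out of sample."""
    results = []
    for train, test in walk_forward_windows(len(returns), train_size, test_size):
        train_data = [returns[i] for i in train]
        test_data = [returns[i] for i in test]
        best = max(candidates, key=lambda c: score(c, train_data))
        results.append((best, score(best, test_data)))
    return results


def momentum_score(threshold, returns):
    """Hypothetical rule: hold the asset for one bar after a return above `threshold`."""
    return sum(r for prev, r in zip(returns, returns[1:]) if prev > threshold)


# Made-up daily returns, used only to exercise the functions.
returns = [0.01, -0.02, 0.015, 0.005, -0.01, 0.02, 0.0, -0.005, 0.01, 0.003]
for best, out_of_sample in walk_forward(returns, [0.0, 0.01], momentum_score, 4, 2):
    print(best, round(out_of_sample, 4))
```

The out-of-sample scores, not the in-sample ones, are the numbers to judge the strategy by, because the parameter was never chosen using the data it is evaluated on.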
An algorithm that is dynamic and generalizes to new data is more valuable to funds and individual investors. It is more likely to survive across different market conditions and apply to new asset classes and markets.