This thread is meant to continue the development of the In & Out strategy started on Quantopian. The first challenge for us will probably be to translate our ideas to QC code.
I'll start by attaching the version Bob Bob kindly translated on Vladimir's request.
Vladimir:
About your key error, did you also initialize UUP like this?
self.UUP = self.AddEquity('UUP', res).Symbol
Goldie Yalamanchi
Nathan Swenson I got you Nathan. I have NinjaTrader as well, but also a day job totally unrelated to quant stuff, so I want to just set this up in our Roth IRAs and run it without worrying about it. I want it to run passively rather than have to trade it myself. But I hear you, the trade frequency is so low -- so why have a cloud server and all that can go wrong? Idk, just because I want to be lazy and have the system run rather than think about it every day and trade it whenever that once-a-month cycle comes. Also, if it works well I want to run it across multiple IRAs and multiple other types of accounts. Still, based on QuantConnect's server model, I think I would need 5 server hosts to run 5 copies of the algo against 5 accounts.
Hello everybody, new member here. I work for a firm that develops proprietary trading strategies, and I have a general interest in the development of trading strategies. This thread has been a very interesting read, but after translating the general concepts of this strategy to our own machine-learning-based strategy validation methodology, which involves cross-validated out-of-sample testing among other things, I must caution against using this strategy. The results posted here cannot be reproduced out-of-sample, and thus our conclusion is that the results seen in-sample are mostly due to overfitting. To some extent this is not surprising, considering the relatively low frequency of the trades compared to the large number of parameters used even for the simplest version of the algorithm (lookback period, three exit levels, a waiting period before re-entry, an additional entry level). To illustrate the point, I've simplified the strategy to the three original signals (DBB, SHY, and XLI) with a lookback period of 63 days, and two assets that we switch between, namely SPY and IEF. The strategy is simple: exit SPY the next day if the three signals are below some set of thresholds. I optimized the thresholds, which results in the following equity curve:
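For concreteness, the simplified in/out rule just described can be sketched in a few lines of Python. This is a minimal illustration, not Menno's actual code; the signal data and threshold values in the comments are hypothetical.

```python
# Sketch of the simplified rule: hold SPY unless all three signals'
# 63-day returns fall below their thresholds, in which case switch to IEF.
# `prices` maps a signal name to a list of daily closes (hypothetical data);
# `thresholds` maps each name to an illustrative exit level, e.g. -0.07.

LOOKBACK = 63

def target_asset(prices, thresholds):
    below = []
    for name in ("DBB", "SHY", "XLI"):
        series = prices[name]
        ret = series[-1] / series[-1 - LOOKBACK] - 1  # return over the lookback
        below.append(ret < thresholds[name])
    return "IEF" if all(below) else "SPY"
```

Optimizing the three threshold values against a single historical path is exactly where the overfitting risk discussed below comes in.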
Looks good, right? Sadly, looks can deceive. I now generate a set of random returns from a normal distribution with the same covariance structure as our original three signals (DBB, SHY, and XLI), transform them into a price series, and use these to generate the same trading signals with a 63-day lookback period, and...
The "strategy" based on signals derived from random returns is even better than the one based on the real signals. Not all equity curves based on random signals are this good, but a significant proportion are, which shows how easy it is to overfit a strategy even with just three time-correlated signals. This is why we always optimize parameters, such as exit and entry points, on multiple subsets of the data, and then do multiple out-of-sample tests on other subsets of the data. You simply cannot trust in-sample backtests, particularly when the ratio of the number of trades to the number of parameters is relatively small.
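The random-signal construction Menno describes (normal returns with the same covariance structure as the real signals, cumulated into prices) can be reproduced with a hand-rolled Cholesky factor. The covariance numbers below are made up for illustration, not estimated from actual DBB/SHY/XLI data.

```python
import math
import random

# Hypothetical 3x3 covariance matrix of daily returns for the three
# signal series (illustrative numbers, NOT estimated from real data).
cov = [[1.0e-4, 5.0e-5, 6.0e-5],
       [5.0e-5, 9.0e-5, 4.0e-5],
       [6.0e-5, 4.0e-5, 1.2e-4]]

def cholesky(a):
    """Lower-triangular L such that L times its transpose equals the SPD matrix a."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(a[i][i] - s) if i == j else (a[i][j] - s) / L[j][j]
    return L

L = cholesky(cov)

def random_returns(n_days, seed=42):
    """Correlated normal returns: each day is L applied to a standard-normal vector z."""
    rng = random.Random(seed)
    days = []
    for _ in range(n_days):
        z = [rng.gauss(0.0, 1.0) for _ in range(3)]
        days.append([sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(3)])
    return days

# Cumulate the returns into bogus price series starting at 100, ready to be
# fed to the same 63-day-lookback trading signals as the real series:
prices = [[100.0, 100.0, 100.0]]
for day in random_returns(252):
    prices.append([p * (1.0 + r) for p, r in zip(prices[-1], day)])
```

By construction these series carry no information about SPY, so any optimized "edge" found on them is pure curve fitting.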
To add to what I stated above, I believe more time should be spent validating the assumptions made when we introduce a strategy, rather than accepting its preliminary result at face value and then spending our time adding or replacing instruments with higher returns, or adding leveraged instruments to outdo the previous poster with another in-sample backtest boasting returns of 1,000%, 10,000%, 50,000%. If a strategy aims to time the S&P 500, does it actually do that, without adding a bunch of instruments to boost returns upon an exit? If industrials are assumed to lead in predicting market crashes, why not first validate that relationship over a period of decades before adding other instruments like metals and short-term bonds? There's over two decades of ETF data available for SPY and XLI. Just my two cents...
Guy Fleury
Ran my latest version of In & Out with $10k to test its downside scalability. No further change was made to the program except for the initial capital (still used TQQQ for the long side).
Portfolio metrics gave a total net profit of 120,148.019%, the equivalent of a 73.358% CAGR over those 13.24 years. The win rate was a modest 65% (which is technically still interesting).
The beta remained relatively low at 0.369 while the Sharpe ratio sat at 1.855.
So, there is something that can be done with this strategy even with a small-sized trading account.
S.T.E
Hi Guy - Would you mind uploading your version so we could see what have you done differently on the shared Algo?
Thanks!
Nathan Swenson
Menno, I read your cautioning posts a couple of times, and you seem to be saying that the out-of-sample performs even better than the in-sample. With that you conclude that one should NOT use this strategy. I must not be understanding you correctly. The small out-of-sample I tested was also better; I take that as a good thing. Regarding low trade count, if you take the original version, which also performs well, it has around 5x the number of trades.
Edit: I see. You are saying that with random signals that are meaningless, you can use parameter fitting to get good results anyway. Interesting. I wonder how randomized the signal data really is. Sounds suspicious.
Nathan Swenson
I should add, this strategy's performance is highly affected by the Out holdings. By switching to IEF in place of the TLT/IEF combo, you take out a big part of what makes this system work. IEF has a much smaller impact than TLT.
Matthew Wormington
Nathan Swenson Agree per earlier comments, and the Out holdings are probably where the backtests will fall short with bond yields where they are. What other alternative Out assets are available that have not been so impacted by the Fed? Also, what do folks think about using the signals just to avoid drawdowns from too-sharp drops, maybe together with something like a 200-day SMA for prolonged downturns, i.e. use In-and-Out as an early warning indicator and then continue to stay out if the In holdings are below the 200 SMA, in the case of a recession? In accounts such as a 401k there are limited asset selections, so being able to time the market even just with SPY-like assets might do better than buy-and-hold over the long term, especially with 60/40 not working so well. Not a 1,000% gain use-case, I know, but I wonder what folks think for accounts with limited asset choice.
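The combined filter Matthew describes ("out" when either the early-warning signal fires or price breaks its 200-day SMA) is easy to sketch. The flag name and window size here are illustrative assumptions, not code from the thread.

```python
def stay_out(closes, signal_says_out, window=200):
    """Combined exit filter: be out when either the in/out early-warning
    signal fires or the latest close sits below its simple moving average.

    closes: daily closes, most recent last.
    signal_says_out: boolean from the thread's in/out indicator
    (a hypothetical stand-in here)."""
    recent = closes[-window:]
    sma = sum(recent) / len(recent)
    return signal_says_out or closes[-1] < sma
```

A filter like this trades some upside (late re-entries while waiting for the SMA to recover) for fewer whipsaws than either signal alone, which fits the capital-preservation use-case Matthew has in mind.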
Here I have used the TLT/IEF combo as the out alternative for SPY and applied the original strategy, which has six parameters to optimize: lookback period (1-100 days), out period (1-30 days), exit limit for DBB (-50% to 0%), limit for SHY (-5% to 0%), limit for XLI (-50% to 0%), and limit for re-entry before the waiting period is over (-100% to 0%). I generated random returns with the same covariance structure as DBB, SHY, and XLI, transformed them into price signals, and applied the strategy based on these bogus signals. I optimized over 100,000 parameter combinations, which yields the following optimal equity curve:
It should be noted that with six parameters to optimize, pretty much any bogus set of signals based on random returns will, after optimization, produce an equity curve with >1,000% return over 14 years, i.e. >20% CAGR.
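The brute-force search over 100,000 parameter combinations can be sketched as a random search over the stated ranges. The `backtest` argument is a hypothetical stand-in for running the strategy and returning its total return; the range table simply transcribes the six ranges from the post above.

```python
import random

# The six parameter ranges from the post; integer ranges are day counts,
# float ranges are return limits expressed as fractions.
RANGES = {
    "lookback":   (1, 100),      # days
    "out_period": (1, 30),       # days
    "dbb_limit":  (-0.50, 0.0),
    "shy_limit":  (-0.05, 0.0),
    "xli_limit":  (-0.50, 0.0),
    "reentry":    (-1.00, 0.0),
}

def sample_params(rng):
    """Draw one random parameter combination from the ranges above."""
    p = {}
    for name, (lo, hi) in RANGES.items():
        p[name] = rng.randint(lo, hi) if isinstance(lo, int) else rng.uniform(lo, hi)
    return p

def best_of(n_trials, backtest, seed=0):
    """Keep the combination whose backtest score is highest."""
    rng = random.Random(seed)
    candidates = (sample_params(rng) for _ in range(n_trials))
    return max(candidates, key=backtest)
```

With six free parameters and one historical path, the maximum over 100,000 draws is almost guaranteed to look spectacular in-sample, which is precisely Menno's point.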
Nathan Swenson
Menno, thank you for your feedback on this. You obviously have a great deal of experience in this field. I guess we will find out as we get live data rolling in. I don't have a lot of experience with long-term, low-trade-count algos; I generally create intraday algos built around price action and market internals such as ADV/DECL, VIX, and Cumulative Delta. The good thing about intraday is that you quickly know whether the algo is working, due to the high trade count in a short period of live trading. With these one-trade-a-month types, it takes a long time to validate. You seem quite certain of your findings, which makes me question things again.
Goldie Yalamanchi
Matthew Wormington you could always just roll over your 401k, or a portion of it, into any online broker as a rollover IRA; then you have total control -- except you can't short.
Menno Dreischor Are you saying that running backtests with too many parameters, over wide ranges, creates a false overperformance that doesn't hold up when extended to out-of-sample data? Which out-of-sample time frame did you try? I tried other years or sub-years as subsets and they had mixed performance, like the period from 2015-2017.
Interestingly, there was an earlier version of this algorithm which had just an SMA 200 cross on the S&P 500 to determine the in or out state, and it performed OK. I wonder if there is a purer way to combine something like an SMA cross and maybe a VIX % change to determine an OUT state that isn't overly curve-fit.
Yes, running backtests with too many parameters is likely to create false strategies. You end up accurately describing the past, but not accurately predicting the future. I use two types of out-of-sample tests. One is to optimize on a section of historical data and forward-test from then onwards. The other is cross-validation, where you leave out a period of data, optimize over the rest of the data, and predict the period you left out; you repeat this many times. This will not only tell you how well the strategy performs out of sample, but also how stable the parameters are. If the optimal parameters are all over the place, the strategy is not sufficiently robust.
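The leave-one-period-out procedure can be sketched as follows. Here `backtest` is a hypothetical stand-in for running the strategy with a given parameter set over a set of years and returning a score; everything else is just the cross-validation loop structure.

```python
def leave_one_year_out(years, candidate_params, backtest):
    """For each held-out year: pick the parameter set that scores best on
    all the other years, then record that year's out-of-sample score.

    Returns {year: (best_params, out_of_sample_score)} so you can inspect
    both performance and parameter stability across folds."""
    results = {}
    for test_year in years:
        train = [y for y in years if y != test_year]
        best = max(candidate_params, key=lambda p: backtest(p, train))
        results[test_year] = (best, backtest(best, [test_year]))
    return results
```

The per-fold `best_params` are as informative as the scores: if each fold picks a wildly different optimum, that is Menno's instability warning in action.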
Matthew Wormington
Goldie Yalamanchi thanks for the suggestion, but I was just using a 401k as an example where you don't have freedom over asset selection ;-) I'd certainly not risk my retirement any more than I would with a trip to Vegas and a spin of a roulette wheel. I'm trying to understand how things are impacted by the timing of the in/out signal rather than the choice of assets, and so SPY to cash seems like the simplest case. The original versions of the algo seemed like a neat idea: come up with some possible leading economic indicators that might get you out of some quick drawdowns. Doing that alone, e.g. avoiding the fast 2009 drop in Menno's post, and nothing else, might be more achievable and realistic than some of the huge gains that have been shown as the discussion progresses.
The question for me is this: if any number of nonsensical signals, for which we know no relationship exists with the instrument we're trading, can achieve similar results to the actual signals, what faith can we have that a relationship exists with the actual signals? The answer, of course, is: very little.
Jimothy
Hi all, I was wondering if we could prevent big drops of 20% or more, like in March of 2020, by using a stop loss. I tried doing something like this:
# set the stop once, when the position is opened (not on every bar,
# or it will trail the current price and never trigger):
self.stop_price = self.Securities[self.STKS].Price * 0.90
# later, in OnData, check the current price against the stored stop:
price = self.Securities[self.STKS].Price
if self.Portfolio[self.STKS].Invested and price <= self.stop_price:
    self.Liquidate(self.STKS)
But I can't seem to get it to work, is there a better way to do this in QC?
Joshua Tsai
One thing I'd like to note is that in real life you can't always find correlations between assets that will continue. Thus, if you can find assets that are highly correlated and logically consistent (their correlation makes sense), you can argue that even if the assumption that our signals precede SPY turns out to be false, the basic idea (that drops in a lot of corresponding assets point to possible bear markets) could still work. After all, we can be reasonably certain that these correlations will continue in the future. Of course, it does raise the question of why you're trying to use the other ETFs as predictors...
The problem here is that the so-called logical correlation depends on a host of unsubstantiated assumptions. Why use a three-month lookback period, and not two, four, or six? Why exit SPY if metals and/or industrials drop 7%, and not 5% or 10%? Why wait three weeks to re-enter the market, and not two or five? A strategy can only be viewed as robust if either a single set of parameters is a clear optimum that changes very little over time, or the success of the strategy is insensitive to the choice of these parameters. Neither is the case here.
I've performed an out-of-sample test to further illustrate the point I'm making. First I optimized the six parameters in the original version of the algo in-sample, taking the best value out of 100,000 possible combinations of parameters. This yields the following familiar equity curve:
Next I perform a leave-one-year-out cross-validation: I take out the year 2008, optimize the parameters over 2009-2020, and apply the strategy to 2008; then I take out the year 2009 and optimize over 2008 and 2010-2020; and so on, until all years have been tested out-of-sample. This yields the following out-of-sample equity curve:
The strategy clearly does not work out of sample. Moreover, unlike the in-sample result, the out-of-sample result is very sensitive to the choice of parameters: another set of 100,000 combinations will yield an almost identical in-sample equity curve, but a different out-of-sample curve. This is the hallmark of an overfitted model: poor out-of-sample performance and a numerically unstable solution.
Guy Fleury
Had to test the other side of scalability. This time running the strategy with $10 million as initial stake. No other modification.
Total net profit came in at 120,219.296% which would tend to confirm the strategy's upside scalability. This translated into a net profit of $1,243,714,357.98 over the 13.24 years.
The Sharpe ratio remained the same at 1.855, as well as the beta at 0.369. The win rate also came out the same at 65%. All indicating the trade mechanics and portfolio metrics were about the same as in the $10k case. This would suggest that there was not that much incremental risk since those numbers stayed the same. The main change was in the trade execution, in the bet sizing department (larger bets). CAGR stood at 73.366% which was very close to the $10k scenario with its 73.358%.
This does demonstrate that the strategy could scale from $10k to $10M (a factor of 1,000) and still perform as should be expected. Accepting its scalability, I even tried an initial capital selected almost at random ($237,815) and got back 120,215.286% for net profit indicating again the strategy's scalability. Its CAGR was 73.365%, also very close to either the $10k or the $10M scenario.
Note that the initial stake is not a program decision. That decision is up to the strategy designer, and therefore we are the ones making that decision.
Will proceed to the next level of testing in order to find the limits and boundaries of this program to then scale back to whatever performance level I might find acceptable. I know, it will be some compromise in the risk/reward space. We all have to make those choices. It does take time to explore a strategy's potential, pitfalls, and shortcomings to then code improvements.
@S.T.E, sorry, but understandably, once I've transformed a basic trading strategy above a certain performance level, I do not put out the code. However, I do provide outcome examples of what could be done and some data on what makes it reasonable and possible.
Of course, this is all in-sample, I assume, which, as I demonstrated, makes these dizzying numbers highly suspect.
John von Neumann famously said:
"With four parameters I can fit an elephant, and with five I can make him wiggle his trunk."
I've read many of your posts on your blog and on Quantopian; very interesting indeed! I would be interested to know where out-of-sample testing fits into your philosophy.