SECTION I – RISK MANAGEMENT
CHAPTER 3 – System Evaluation and Testing
CHAPTER AGENDA
▪ Choose factors for system testing, including objectives, parameters, and test
data
▪ Assess the use of in-sample and out-of-sample data
▪ Evaluate optimized test results for continuity and significance using a variety
of visualization methods
▪ Explain the basics of using genetic algorithms
▪ Illustrate the concept of robustness in a trading system
▪ Critique the use of performance and risk metrics based on a given objective
System Testing, Parameters, and Test Data
Identifying the Parameters
Once the test strategy is known, the parameters to be tested must be identified.
A parameter is a value within the strategy that can be changed in order to vary the
speed or timing of the system. For example, parameters include:
▪ Moving average calculation period (number of days or bars).
▪ Exponential smoothing constant (a percentage).
▪ Band width around a moving average (percentage or number of standard
deviations).
▪ RSI calculation period (number of days or bars).
▪ Stop-loss value (fixed number of points, a percentage, or a volatility factor).
▪ Maximum holding period for a trade (days or bars).
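The parameters listed above can be gathered into one structure and expanded into the set of combinations to test. The following is a minimal sketch only; the class name `StrategyParams`, its field names, and the sample values are assumptions made for illustration, not part of any particular testing platform.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class StrategyParams:
    """One combination of parameter values (hypothetical field names)."""
    ma_period: int        # moving average calculation period, in bars
    band_pct: float       # band width around the average, as a percentage
    stop_loss_pct: float  # stop-loss distance, as a percentage
    max_hold_bars: int    # maximum holding period for a trade, in bars


def parameter_grid(ma_periods, band_pcts, stop_loss_pcts, max_holds):
    """Return every combination of the supplied parameter values."""
    combos = product(ma_periods, band_pcts, stop_loss_pcts, max_holds)
    return [StrategyParams(*combo) for combo in combos]


# Illustrative values only: 3 x 2 x 2 x 1 = 12 combinations to test.
grid = parameter_grid([10, 20, 40], [1.0, 2.0], [2.0, 4.0], [20])
print(len(grid))
```

Each added parameter multiplies the number of tests, which is one reason to keep the parameter list short.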
Selecting the Test Data
Testing requires a database of prices and, for more ambitious strategies, a history of
economic statistics as well.
Data is readily available from numerous vendors and includes daily high, low, and
closing prices plus volume and open interest (futures) for nearly all exchange-
traded markets in the world.
Analysts may choose among different data formats for testing:
▪ Individual full contracts
▪ Backward- (or forward-) adjusted data
▪ Indexed data series
▪ Constant n-day forward series
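As an illustration of the backward-adjusted format, the sketch below joins individual contract segments into one continuous series by shifting older prices by the price gap on each roll date. It assumes (for this example only) that adjacent segments overlap by exactly one bar, the roll date, priced in the old and the new contract; the function name `back_adjust` is hypothetical.

```python
def back_adjust(segments):
    """Join contract segments into one backward-adjusted price series.

    segments: list of price lists, oldest contract first. Assumption: the
    last bar of each segment and the first bar of the next segment are the
    same date (the roll date), priced in the old and new contract.
    """
    adjusted = list(segments[-1])  # the newest contract is left unchanged
    offset = 0.0
    # Walk backward in time, accumulating the gap at each roll date.
    for older, newer in zip(reversed(segments[:-1]), reversed(segments[1:])):
        offset += newer[0] - older[-1]
        # Drop the older contract's roll-date bar; the newer one covers it.
        adjusted = [price + offset for price in older[:-1]] + adjusted
    return adjusted


contracts = [[100, 101, 102], [105, 106, 107], [104, 103]]
print(back_adjust(contracts))
```

Additive adjustment preserves point changes but not percentage changes, and it can push old prices toward zero or below, so percentage-based rules may need a ratio-adjusted or indexed series instead.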
Synthetic Data
The ultimate answer to getting test data is to create it yourself.
The more data the better. It would be ideal if synthetic data could be created that
had the same characteristics as the data that will be traded, yet was different.
▪ Monte Carlo Method
▪ Random Numbers and Distributions
▪ Synthetic data is an ideal solution to a difficult problem. It also brings to mind
two computers playing chess with one another.
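One simple Monte Carlo approach, shown here as a sketch rather than a recommended method, builds a synthetic series by drawing randomly (with replacement) from the historical bar-to-bar returns. The function name `synthetic_prices` and the sample prices are assumptions for illustration.

```python
import random


def synthetic_prices(prices, n_bars, seed=None):
    """Build a synthetic price path by resampling historical returns.

    Returns are drawn independently with replacement, so the path keeps
    the historical return distribution but not its ordering.
    """
    rng = random.Random(seed)
    returns = [b / a - 1.0 for a, b in zip(prices, prices[1:])]
    series = [prices[0]]
    for _ in range(n_bars):
        series.append(series[-1] * (1.0 + rng.choice(returns)))
    return series


history = [100.0, 101.0, 99.5, 102.0, 101.5]
path = synthetic_prices(history, n_bars=50, seed=1)
print(len(path))
```

Because each return is drawn independently, any trends or volatility clustering in the original data are lost; a series generated this way has the same distribution of returns but may not have the same characteristics a trend-following system depends on.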
Testing Integrity
▪ Analysts use as much data as possible to determine whether their proposed
strategy is sound.
▪ The use of long test periods ensures a good sample of markets, including long
periods of sideways movement, bull and bear markets, and a good number
of price shocks of various sizes.
▪ The testing process most often involves some level of cheating.
▪ Developing a system can be difficult, but there is a right way
and a wrong way to test in order to arrive at credible results.
▪ In-Sample and Out-of-Sample Data
▪ Out-of-Sample Testing
▪ Step-Forward Testing
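Step-forward testing can be sketched as a series of rolling windows: parameters are chosen on an in-sample block, applied to the out-of-sample block that follows, and the whole window then steps forward. The function name and the window sizes below are assumptions for illustration.

```python
def step_forward_windows(n_bars, in_sample, out_sample):
    """Return (in-sample, out-of-sample) index ranges for step-forward tests.

    Each window steps forward by the out-of-sample length, so the
    out-of-sample blocks never overlap.
    """
    windows = []
    start = 0
    while start + in_sample + out_sample <= n_bars:
        split = start + in_sample
        windows.append((range(start, split), range(split, split + out_sample)))
        start += out_sample
    return windows


for train, test in step_forward_windows(1000, in_sample=500, out_sample=100):
    print(f"optimize on bars {train.start}-{train.stop - 1}, "
          f"verify on bars {test.start}-{test.stop - 1}")
```

The out-of-sample results from all windows, taken together, are the closest a historical test comes to showing how the selection process would have performed on unseen data.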
System Testing, Parameters, and Test Data
Price Shocks
▪ Ignoring or underestimating the significance of price shocks will result in a
catastrophic loss, if not now, then sometime in the future.
▪ Price shocks are large changes in price caused by unpredictable and significant
events.
▪ The impact of price shocks on historical tests can change the results from
profits to losses and can change the risk from small to extremely large.
▪ Consider that a price shock only occurs because traders and investors are
liquidating their positions or entering new positions in the direction of the
price shock.
▪ Price shocks must be treated as a serious risk; otherwise, any trading can be
overleveraged and undercapitalized.
▪ This is the most common cause of catastrophic loss. A price shock that
causes a windfall profit could just as easily have produced a devastating loss.
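To see how many price shocks a test period contains, the bars can be screened for unusually large moves. The sketch below uses one possible working definition, which is an assumption of this example and not a standard: a one-bar price change larger than a multiple of the average absolute change over the preceding bars.

```python
def find_price_shocks(prices, lookback=20, multiple=3.0):
    """Return the indexes of bars whose price change qualifies as a shock.

    Assumed definition: the absolute one-bar change exceeds `multiple`
    times the average absolute change of the previous `lookback` bars.
    """
    changes = [b - a for a, b in zip(prices, prices[1:])]
    shocks = []
    for i in range(lookback, len(changes)):
        window = changes[i - lookback:i]
        avg_abs = sum(abs(c) for c in window) / lookback
        if avg_abs > 0 and abs(changes[i]) > multiple * avg_abs:
            shocks.append(i + 1)  # index in `prices` of the shock bar
    return shocks


# Thirty quiet bars alternating between 100 and 101, then a jump to 110.
quiet_then_jump = [100 + (i % 2) for i in range(30)] + [110]
print(find_price_shocks(quiet_then_jump))  # → [30]
```

Comparing test results with and without the profits and losses on the flagged bars gives a rough view of how much of the performance depended on events that could not have been predicted.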
System Smoothing
Robustness
Summarizing Robustness
▪ In showing test results, we concluded that a large number of profitable tests
is a good measure of robustness.
▪ For the testing process, it was necessary to set performance expectations and
make realistic assumptions about the parameter values that were likely to
work; realistic costs also must be used.
▪ A robust trading strategy is one that produces consistently good results
across a broad set of parameter values applied to many different markets
tested for many years.
▪ Robustness is a term used to describe a system or method that works under
many market conditions, one in which we have confidence.
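The idea that a large share of profitable tests indicates robustness can be expressed as a simple summary over all tested parameter values. This is a sketch under assumed inputs: `results` is a hypothetical mapping from a parameter value to the net profit of that test, and the numbers are made up.

```python
def robustness_summary(results):
    """Summarize a set of test results keyed by parameter value.

    results: dict mapping a parameter value (or tuple of values) to the
    net profit of that test, after costs.
    """
    profits = list(results.values())
    profitable = sum(1 for p in profits if p > 0)
    return {
        "tests": len(profits),
        "pct_profitable": 100.0 * profitable / len(profits),
        "average_profit": sum(profits) / len(profits),
    }


# Made-up net profits for four moving average periods.
results = {10: 500.0, 20: 1200.0, 40: -300.0, 80: 800.0}
print(robustness_summary(results))
```

The average of all tests is a more conservative performance expectation than the single best test, because it does not depend on having picked the peak.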
Summarizing Robustness
▪ The following is a checklist and brief explanation that outlines the steps for
testing, evaluating the test results, selecting parameter values, and
monitoring actual trading, consistent with the purpose of finding a robust
trading strategy:
Checklist 1 - Deciding what to test
- Is the underlying premise sound?
- Can you program all the rules?
- Does the strategy make sense only under certain conditions?
- Take a guess as to the expected results.
- Be critical of good results.
Summarizing Robustness
Checklist 2 - Deciding how to test
- Choose the testing tools and methods.
- Use as much of the right data as possible.
- Separate the data into in-sample and out-of-sample sets.
- Choose the range of parameter values that applies to the strategy.
- Decide the order in which to test the parameters.
- Be sure the parameter test values are distributed properly.
- Define the success criteria.
- Decide on the presentation and visualization of the results.
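The slide does not say what a proper distribution of test values looks like. One common reading, taken here as an assumption, is that calculation periods should be spaced by equal percentage steps rather than equal increments, since moving from 5 to 10 bars changes a system far more than moving from 95 to 100. A sketch with a hypothetical helper:

```python
def geometric_periods(start, stop, ratio=1.5):
    """Return calculation periods spaced by a constant ratio.

    Values are rounded to whole bars, and duplicates produced by
    rounding are skipped.
    """
    periods = []
    value = float(start)
    while value <= stop:
        period = round(value)
        if not periods or period != periods[-1]:
            periods.append(period)
        value *= ratio
    return periods


print(geometric_periods(5, 160, ratio=2.0))  # → [5, 10, 20, 40, 80, 160]
```

With equal increments, most of the tests fall among the slow values and the averaged results are weighted toward the long-term end of the range.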
Summarizing Robustness
Checklist 3 - Evaluating the results
- Verify a reasonable sample of calculations.
- Were there enough trades to have reliable results?
- Does the trading system produce profits for most combinations of parameters?
- Did rule changes improve overall test performance?
Summarizing Robustness
Checklist 4 - Choosing the specific parameter values to trade
- The final parameter selection can include all data.
- Choose values from a broad area of success, not biased by the one test that had peak profits.
- Profits must be distributed evenly throughout the test period.
- Verify whether a disproportionate amount of profits came from price shocks.
- Scale the risk to a personally acceptable risk level.
- When there is a choice between two sets of parameter values, choose the one that causes less time in the market.
- Out-of-sample performance must validate the results.
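A quick check on whether profits are concentrated in a few windfall trades is to measure how much of the total net profit comes from the largest winners. This sketch treats the largest trades as a stand-in for price-shock profits, which is an assumption of the example; the function name and trade values are hypothetical.

```python
def profit_concentration(trade_profits, top_n=1):
    """Return the fraction of total net profit from the top_n best trades."""
    total = sum(trade_profits)
    if total <= 0:
        raise ValueError("total net profit must be positive")
    largest = sorted(trade_profits, reverse=True)[:top_n]
    return sum(largest) / total


# Made-up trade results: one trade supplies 900 of the 1000 net profit.
trades = [100, -50, 900, 50, -100, 100]
print(profit_concentration(trades))  # → 0.9
```

A value near 1.0 for a single trade suggests the test result depends on one event that may not repeat, or that could as easily have gone the other way.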
Summarizing Robustness
Checklist 5 - Trading and performance monitoring
- Trade using exactly the same rules that were tested.
- Trade the same data that was used in testing.
- Monitor the difference between the system's entries and exits and the actual ones (slippage).
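Slippage can be tracked per order as the difference between the price the system expected and the price actually received, signed so that a positive number is always a cost. The function below is a minimal sketch; the `side` labels and the prices are assumptions for illustration.

```python
def slippage(side, system_price, fill_price):
    """Return slippage in price units; positive means a worse fill.

    side: "buy" or "sell". A buy filled above the system price, or a
    sell filled below it, is a cost.
    """
    if side == "buy":
        return fill_price - system_price
    if side == "sell":
        return system_price - fill_price
    raise ValueError("side must be 'buy' or 'sell'")


print(slippage("buy", 100.0, 100.25))  # → 0.25
print(slippage("sell", 100.0, 99.5))   # → 0.5
```

If the average slippage recorded in trading is larger than the cost assumed in testing, the test results overstate what the system can deliver.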
Thanks

Risk Management - CH 3 - System Evaluation and Testing | CMT Level 3 | Chartered Market Technician | Professional Training Academy
