Philip Genetic Programming In Statistical Arbitrage

  • 1. Genetic Programming in Statistical Arbitrage Philip Saks PhD Seminar 17.10.2007
  • 2. Contents
    • Introduction
    • Genetic Programming
    • Clustering of Financial Data
    • Data
    • Framework
    • Results
    • Conclusion
  • 3. Introduction
    • To develop an automated framework for trading strategy design, by employing evolutionary computation in conjunction with other machine learning paradigms
    • The present framework utilizes genetic programming (GP)
    • Much of the existing financial forecasting using GP has focused on high-frequency FX [Jonsson, 1997][Dempster and Jones, 2001][Bhattacharyya et al, 2002], and the general consensus is that there is predictability and that excess return is achievable in the presence of transaction costs
    • For stocks, the results are mixed: [Allen and Karjalainen, 1999] do not significantly outperform buy-and-hold on S&P500 daily data, but [Becker and Sheshadri, 2003] do on monthly data.
  • 4. GP I
    • Evolutionary computation (EC) is a concept inspired by the Darwinian principle of survival of the fittest. The rationale is that natural evolution has proved successful in solving a wide range of problems over time, so an algorithm that mimics this behavior might solve a wide range of artificial problems
    • The concept was pioneered by Holland (1975) in the form of Genetic Algorithms (GA)
    • A GA is essentially a population-based search method, where each candidate solution is encoded in a fixed-length binary string.
    • The population evolves mainly via three operators: selection, reproduction and mutation.
    • The selection process is based on the survival of the fittest principle.
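The GA loop described on this slide can be sketched as follows. This is a minimal illustration, not the setup used in the talk: the toy fitness function (`one_max`), tournament selection, one-point crossover as the reproduction step, elitism, and all parameter values are assumptions chosen for brevity.

```python
import random

def one_max(bits):
    """Toy fitness (assumed for illustration): the number of 1-bits."""
    return sum(bits)

def tournament(pop, fitness, rng, k=3):
    """Selection: return the fittest of k randomly drawn individuals."""
    return max(rng.sample(pop, k), key=fitness)

def crossover(a, b, rng):
    """Reproduction (assumed form): one-point crossover of two parents."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(bits, rng, rate=0.01):
    """Mutation: flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

def run_ga(n_bits=32, pop_size=50, generations=40, seed=0):
    rng = random.Random(seed)
    # Each candidate solution is a fixed-length binary string.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        elite = max(pop, key=one_max)  # carry the best individual over unchanged
        children = []
        for _ in range(pop_size - 1):
            mother = tournament(pop, one_max, rng)
            father = tournament(pop, one_max, rng)
            children.append(mutate(crossover(mother, father, rng), rng))
        pop = [elite] + children
    return max(pop, key=one_max)

best = run_ga()
```

Because the elite individual is always kept, the best fitness in the population never decreases from one generation to the next.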
  • 5. GP II
    • GPs are basically GAs in which the genome constitutes hierarchical computer programs
    • Using this representation, we can solve problems in a wide range of fields, such as symbolic or ordinary regression, classification, and optimal control theory, since each of these areas “can be viewed as requiring discovery of a computer program that produces some desired output for particular inputs” (Koza, 1992)
    • Tree representation of programs; function and terminal sets
    • Evolutionary operators: selection, cross-over & mutation
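To make the tree representation concrete, here is a small sketch in which a program is a nested tuple, internal nodes come from a function set, and leaves come from a terminal set. The particular function set (`add`, `sub`, `mul`), the terminals (`x`, `y`, `1.0`), and the helper names are hypothetical choices for illustration; the slides do not list the sets actually used.

```python
import operator
import random

# Assumed function set: name -> (callable, arity).
FUNCTIONS = {"add": (operator.add, 2), "sub": (operator.sub, 2), "mul": (operator.mul, 2)}
# Assumed terminal set: two input variables and one constant.
TERMINALS = ["x", "y", 1.0]

def random_tree(rng, depth=3):
    """Grow a random program tree as nested tuples: (function_name, child, ...)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    name = rng.choice(sorted(FUNCTIONS))
    arity = FUNCTIONS[name][1]
    return (name,) + tuple(random_tree(rng, depth - 1) for _ in range(arity))

def evaluate(tree, env):
    """Evaluate a tree recursively; `env` maps variable names to values."""
    if isinstance(tree, tuple):
        func = FUNCTIONS[tree[0]][0]
        return func(*(evaluate(child, env) for child in tree[1:]))
    return env[tree] if isinstance(tree, str) else tree

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node; a path is a tuple of child indices."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace(tree, path, new):
    """Return a copy of `tree` with the node at `path` replaced by `new`."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(a, b, rng):
    """Subtree crossover: graft a random subtree of b onto a random node of a."""
    path, _ = rng.choice(list(subtrees(a)))
    _, donor = rng.choice(list(subtrees(b)))
    return replace(a, path, donor)

def mutate(tree, rng):
    """Subtree mutation: replace a random node with a freshly grown subtree."""
    path, _ = rng.choice(list(subtrees(tree)))
    return replace(tree, path, random_tree(rng, depth=2))

# The program x * y + 1 as a tree.
expr = ("add", ("mul", "x", "y"), 1.0)
value = evaluate(expr, {"x": 2.0, "y": 3.0})
```

Unlike the fixed-length strings of a GA, crossover and mutation here change the size and shape of the individual, which is why the operators work on subtrees rather than on bit positions.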
  • 6. Clustering of Financial Data
  • 7. Data
    • Hourly VWAP prices and volume for banking stocks within the Euro Stoxx universe, covering the period from 01-Apr-2003 to 29-Jun-2007 (8648 observations).
  • 8. Framework
    • Evolve trading rules with binary decisions
    • We consider the classical single tree setup, but also a dual tree framework, where buy and sell rules are co-evolved.
    • The training set comprises 6000 samples, while the remaining 2647 are used for out-of-sample testing
    • 10 runs are performed for each experiment.
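The difference between the single-tree and dual-tree setups can be sketched as below. The slides do not say how the buy and sell rules are combined, so the state machine in `dual_tree_positions` (enter when flat and the buy rule fires, exit when invested and the sell rule fires) is an assumption, as are the function names. The split sizes follow the slide.

```python
def split(series, n_train=6000):
    """Chronological split: first n_train samples for training, the rest out-of-sample."""
    return series[:n_train], series[n_train:]

def single_tree_positions(signals):
    """Single tree: one boolean rule output maps directly to the position (1 = in, 0 = out)."""
    return [1 if s else 0 for s in signals]

def dual_tree_positions(buy_signals, sell_signals):
    """Dual tree (assumed semantics): the buy rule is consulted only when flat,
    the sell rule only when a position is held."""
    positions, state = [], 0
    for buy, sell in zip(buy_signals, sell_signals):
        if state == 0 and buy:
            state = 1
        elif state == 1 and sell:
            state = 0
        positions.append(state)
    return positions

train, test = split(list(range(8647)))
example = dual_tree_positions([True, False, False, True], [False, False, True, True])
```

Under this assumption each rule only has to solve half of the problem, entry or exit, which is one plausible reading of why co-evolving the two can help.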
  • 9. Results
    • Trading on VWAP, assuming 1bp market impact
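A minimal sketch of how a 1 bp market impact might enter the return calculation, assuming the cost is proportional to the size of each position change and that the strategy starts flat. The function name and the cost model are assumptions; the slides state only the 1 bp figure.

```python
def net_returns(positions, returns, cost=0.0001):
    """Per-period strategy returns net of a proportional cost.

    positions[t] is the position held over period t, returns[t] the asset
    return over that period, and `cost` (1 bp = 0.0001) is charged on the
    absolute change in position.
    """
    out, prev = [], 0
    for pos, r in zip(positions, returns):
        out.append(pos * r - cost * abs(pos - prev))
        prev = pos
    return out

pnl = net_returns([1, 1, 0], [0.01, -0.02, 0.03])
```

With this model, a rule that trades more often pays more in costs, so the same rule can look profitable at one cost level and unprofitable at another.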
  • 10. Sensitivity Analysis
  • 11. Stress Testing I
  • 12. Turnover Analysis
  • 13. Transaction Cost Implications
  • 14. Conclusion
    • It is possible to discover profitable arbitrage trading rules on the Euro Stoxx banking sector.
    • Cooperative co-evolution of buy and sell rules is beneficial compared with the classical single tree structure.
    • Optimizing in the presence of transaction costs makes a difference: for optimal performance, the cost assumption used in training should correspond to the costs faced in application.