This document discusses genetic algorithms and genetic programming. It begins by explaining how genetic algorithms are inspired by natural selection and evolution by creating populations of potential solutions and using selection, crossover and mutation to find better solutions. It then describes genetic programming, which evolves computer programs using the same principles. Examples of applications are given, such as modeling chemical kinetics for NASA. The document suggests genetic algorithms and programming may help invent the future through applications in domains like smart grids, intrusion detection, and protein modeling.
If The Singularity Arrives, Will It Be By Design Or Evolution?
1. IF THE SINGULARITY ARRIVES,
WILL IT BE BY DESIGN OR
EVOLUTION?
Bill Worzel billwzel@gmail.com
Evolution Enterprises http://evolver.biz
Data Day Texas 11 Jan 2014
Austin, TX
Monday, January 13, 14
2. NATURE HAS MANY ROOMS
• Animals solve the problem of survival in many ways
• Most are adapted to specific ecological niches
• Genetics forms the common language of living creatures
3. EVOLUTIONARY ALGORITHMS
(EA) BORROW FROM NATURE
• Based on natural selection and population dynamics
• Create a population of solutions
• Preferentially select and recombine better individuals to find better solutions
4. AN ELEGANT SEARCH
• EAs combine global search with local search
• Randomly generated individuals test many niches
• Selection and recombination home in on the best neighborhoods
5. GENETIC ALGORITHMS (GA)
• GAs encode information and then combine and mutate individuals
• In simplest case, encoding is a bit string mapped to variable values
• Initial population of individuals is created randomly
[Figure: Population. Example individual 101001011010, a bit string encoding the variables P/E and Trend]
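The bit-string encoding on this slide can be sketched as follows. This is a minimal illustration, not the encoding used in the talk: only the bit string 101001011010 and the labels P/E and Trend come from the slide, while the 6-bit field widths, the value ranges, and the names `FIELDS`, `decode`, and `random_population` are assumptions made for the example.

```python
import random

# Hypothetical layout: (name, number of bits, low, high) for each variable.
FIELDS = [("pe", 6, 0.0, 63.0), ("trend", 6, -1.0, 1.0)]

def decode(bits, fields=FIELDS):
    """Map a bit string to named real values by scaling each field linearly."""
    values, pos = {}, 0
    for name, n_bits, low, high in fields:
        raw = int(bits[pos:pos + n_bits], 2)
        values[name] = low + (high - low) * raw / (2 ** n_bits - 1)
        pos += n_bits
    return values

def random_population(size, n_bits, rng):
    """Create the initial population as random bit strings."""
    return ["".join(rng.choice("01") for _ in range(n_bits)) for _ in range(size)]

decoded = decode("101001011010")
population = random_population(size=8, n_bits=12, rng=random.Random(0))
```

Under these assumed ranges the first six bits (101001, i.e. 41) decode to a P/E of 41.0; a different layout would give different values from the same string.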
6. SELECTION & FITNESS
• A subset of individuals is selected at random from the population
• Fitness of each is calculated
• Best pair are combined to produce offspring
[Figure: selected individuals labeled with their fitness values; the best pair is crossed (x)]
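The selection step described here (random subset, fitness for each member, best pair chosen) might look like the sketch below. The fitness function is a placeholder that counts 1 bits, and the names `fitness` and `select_best_pair` are hypothetical; the talk does not say which fitness function or group size was used.

```python
import random

def fitness(bits):
    """Placeholder fitness, assumed for illustration: the number of 1 bits."""
    return bits.count("1")

def select_best_pair(population, group_size, rng, fitness=fitness):
    """Draw a random subset, score each member, and return the two fittest."""
    group = rng.sample(population, group_size)
    ranked = sorted(group, key=fitness, reverse=True)
    return ranked[0], ranked[1]

population = ["101001011010", "011101101110", "000000000001", "111111110000"]
parent_a, parent_b = select_best_pair(population, group_size=4, rng=random.Random(1))
```

With a group smaller than the population, weaker individuals still get an occasional chance to reproduce, which is what keeps the search from collapsing too early.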
7. CROSSOVER & MUTATION
• Crossover combines bit strings
• Mutation changes bits
• Both operations are stochastic
• Offspring replace parents or weaker individuals in population
[Figure: 101001011010 x 011101101110 = 011101111010 + 101011001110, with the crossover point and the mutation marked]
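The slide's example can be reproduced with one-point crossover and a single bit flip. Reading the figure as a crossover after bit 7 followed by a mutation of bit 5 in one child is an interpretation of the surviving bit strings, and the function names below are hypothetical.

```python
import random

def crossover_at(a, b, point):
    """One-point crossover: swap the tails of two equal-length bit strings."""
    return a[:point] + b[point:], b[:point] + a[point:]

def flip_bit(bits, index):
    """Mutate a single position: 0 becomes 1 and 1 becomes 0."""
    flipped = "1" if bits[index] == "0" else "0"
    return bits[:index] + flipped + bits[index + 1:]

def random_crossover(a, b, rng):
    """Stochastic version: the crossover point is drawn at random."""
    return crossover_at(a, b, rng.randrange(1, len(a)))

def random_mutation(bits, rate, rng):
    """Stochastic version: each bit flips independently with probability rate."""
    for i in range(len(bits)):
        if rng.random() < rate:
            bits = flip_bit(bits, i)
    return bits

# Reproduce the slide: cross after bit 7, then flip bit 5 (index 4) of one child.
child_1, child_2 = crossover_at("101001011010", "011101101110", point=7)
child_1 = flip_bit(child_1, 4)
```

Here `child_2` is 011101111010 and the mutated `child_1` is 101011001110, matching the two offspring shown on the slide.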
8. BUILDING BLOCKS AND
SCHEMAS
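• Building block hypothesis states that GAs find good simple components that confer better fitness on individuals
• The Schema Theorem shows that better building blocks accrue to produce best individuals: $E(m(H,t+1)) \ge \frac{m(H,t)\, f(H)}{a_t}\,[1-p]$, where $m(H,t)$ is the number of instances of schema $H$ at generation $t$, $f(H)$ is their average fitness, $a_t$ is the average fitness of the population, and $p$ is the probability that crossover or mutation disrupts $H$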
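A quick numeric reading of the Schema Theorem bound, taking m(H,t) as the number of instances of schema H, f(H) as their average fitness, a_t as the population's average fitness, and p as the probability that crossover or mutation disrupts H. The numbers and the function name `schema_lower_bound` are made up for illustration.

```python
def schema_lower_bound(m, f_schema, f_average, p_disrupt):
    """Lower bound on the expected number of instances of H in the next generation."""
    return m * f_schema / f_average * (1.0 - p_disrupt)

# 10 instances, 20% fitter than average, 10% chance of disruption.
bound = schema_lower_bound(m=10, f_schema=1.2, f_average=1.0, p_disrupt=0.1)
```

The bound comes out at roughly 10.8, so a schema that is 20% fitter than average is still expected to gain instances despite a 10% disruption rate.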
9. CASE STUDY: AGRICULTURAL
MODELING
• Decision support software for farmers: With large number of new hybrids, what to choose?
• Needed to integrate agronomic, weather, economic, personal factors
• GA not as an optimizer but as an optionizer in a multiobjective space
10. RICH LITERATURE FOR GA
• Many conferences, particularly GECCO (Genetic and Evolutionary Computation Conference)
• D. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, 1989
• J. Holland, Hidden Order: How Adaptation Builds Complexity, Helix Books, 1995
• J. Holland, Adaptation in Natural and Artificial Systems, MIT Press, 1975
11. GENETIC PROGRAMMING
(GP)
• GP evolves computer programs (usually functions)
• Essentially a program that produces programs as its output
• Extends idea of combining bit strings to parse trees
12. GP OVERVIEW
[Flowchart: Input Data and GP Parameters → Program Population → Select Mating Group → Select Two Best Programs → Crossover and Mutate → Replace Least Fit With Offspring → Terminate? If No, repeat the GP cycle; if Yes, Output Results.]
? = stochastic process (marks the stochastic steps in the flowchart)
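The cycle in the flowchart can be sketched as a steady-state loop. To stay short, this sketch evolves bit lists rather than program trees, uses a fixed cycle budget as the termination test, and replaces the two least fit members of the whole population; all of these choices, and every function name, are assumptions rather than details from the talk.

```python
import random

def fitness(bits):
    """Placeholder fitness, assumed for illustration: the number of 1 bits."""
    return sum(bits)

def crossover(a, b, rng):
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(bits, rate, rng):
    return [bit ^ 1 if rng.random() < rate else bit for bit in bits]

def evolve(population, group_size, cycles, rate, rng):
    """Run the cycle from the flowchart on a copy of the population."""
    population = [list(ind) for ind in population]
    for _ in range(cycles):                            # "Terminate?" as a fixed budget
        group = rng.sample(population, group_size)     # select mating group
        best, second = sorted(group, key=fitness, reverse=True)[:2]  # two best
        offspring = [mutate(c, rate, rng) for c in crossover(best, second, rng)]
        population.sort(key=fitness)                   # least fit first
        population[:2] = offspring                     # replace least fit with offspring
    return population

rng = random.Random(42)
start = [[rng.randint(0, 1) for _ in range(16)] for _ in range(20)]
final = evolve(start, group_size=5, cycles=200, rate=0.02, rng=rng)
```

Because only the two weakest members are overwritten each cycle, the best fitness in the population can never go down in this variant.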
13. CONSTRUCTING TREES
• Randomly assemble a population of function trees as constrained by GP parameters
[Figure from: ‘A Field Guide To Genetic Programming’]
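Random tree construction can be sketched with nested tuples. The function set, the terminal set, and the two "GP parameters" used here (`max_depth` and `p_terminal`) are assumptions chosen for the example; real systems expose many more parameters.

```python
import operator
import random

FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
TERMINALS = ["x", 1.0, 2.0]

def random_tree(max_depth, rng, p_terminal=0.3):
    """Grow a random expression tree no deeper than max_depth."""
    if max_depth == 0 or rng.random() < p_terminal:
        return rng.choice(TERMINALS)
    op = rng.choice(sorted(FUNCTIONS))
    return (op,
            random_tree(max_depth - 1, rng, p_terminal),
            random_tree(max_depth - 1, rng, p_terminal))

def evaluate(tree, x):
    """Evaluate a tree for a given value of the variable x."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return FUNCTIONS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree

def depth(tree):
    if isinstance(tree, tuple):
        return 1 + max(depth(tree[1]), depth(tree[2]))
    return 0

rng = random.Random(7)
population = [random_tree(max_depth=3, rng=rng) for _ in range(10)]
```

For example, `evaluate(("+", "x", 1.0), 3.0)` walks the tree for x + 1 and returns 4.0.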
16. THE DEVIL IN THE DETAILS
• How do you correct syntax errors?
• Type coherence?
• Control overfitting?
• Computationally intensive
17. BUT HEAVEN’S ON OUR SIDE
• Naturally parallel algorithm - linear speedup, mostly not iterative
• Sub-populations may be run asynchronously in parallel: m*n/p where m is individuals in a sub-population, n is the number of sub-populations, and p is number of processors
• Matches up well with cloud computing
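Independent sub-populations can be sketched as follows. Threads are used only to keep the example self-contained; in CPython a real speedup would need processes or separate machines. The island logic, the fitness function, and all names are assumptions for illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def fitness(bits):
    """Placeholder fitness, assumed for illustration: the number of 1 bits."""
    return sum(bits)

def evolve_island(seed, m=20, n_bits=16, steps=200):
    """Evolve one sub-population of m individuals with its own random stream."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(m)]
    for _ in range(steps):
        parent = max(rng.sample(pop, 3), key=fitness)
        child = parent[:]
        child[rng.randrange(n_bits)] ^= 1            # flip one random bit
        worst = min(range(m), key=lambda j: fitness(pop[j]))
        if fitness(child) >= fitness(pop[worst]):
            pop[worst] = child
    return max(pop, key=fitness)

def run_islands(n, p):
    """Run n sub-populations on p workers; no island waits on another."""
    with ThreadPoolExecutor(max_workers=p) as pool:
        return list(pool.map(evolve_island, range(n)))

def individuals_per_processor(m, n, p):
    """The slide's m*n/p: individuals handled by each processor."""
    return m * n / p

champions = run_islands(n=4, p=2)
```

Since each island owns its random stream, the result does not depend on how many workers are used, only the wall-clock time does.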
18. THE SKGP
• Uses purely functional combinators to represent programs
• Efficient, powerful, reusable code
• Algorithm becomes superlinear in parallel application because of code reuse
20. VARIABLE ABSTRACTION
• D.A. Turner showed that all bound variables could be removed completely using combinators (Turner 1979, A New Implementation Technique for Applicative Languages, Software–Practice and Experience, vol 9, 31-49)
• Essentially this provides a way to create expressions that are combinators applied to data with no reference to variables
21. EXAMPLE COMBINATOR
FUNCTION
Example:
‘S(S(K +)(K 1))I’ is the function that adds 1, so S(S(K +)(K 1))I applied to 3 is:
S(S(K +)(K 1))I 3
S(K +)(K 1) 3 (I 3)
K + 3 ((K 1) 3) (I 3)
+ (K 1 3) (I 3)
+ 1 (I 3)
+ 1 3
4
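The reduction above can be checked with curried Python functions standing in for the combinators. This is only a sketch of the S, K, I rules (S f g x = f x (g x), K a b = a, I x = x), not the SKGP's own representation; `plus` is a hypothetical curried stand-in for the slide's "+".

```python
# Curried combinators: S f g x = f x (g x), K a b = a, I x = x.
S = lambda f: lambda g: lambda x: f(x)(g(x))
K = lambda a: lambda b: a
I = lambda x: x

plus = lambda a: lambda b: a + b   # curried stand-in for "+"

# S(S(K +)(K 1))I, written with explicit Python applications.
add_one = S(S(K(plus))(K(1)))(I)
result = add_one(3)
```

Applying `add_one` to 3 follows the same steps as the slide and yields 4, without any named variable appearing in the definition of `add_one`.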
22. COMBINATOR FUNCTIONS
QUICKLY BECOME COMPLEX
Here is the function for factorial:
def fac = S(S(S(K cond)(S(S(K =)(K 0))I))(K 1))(S(S(K *)I)(S(K fac)(S(S(K -)I)(K 1))))
Evaluation is left as an “exercise to the reader.”
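For readers who want to try the exercise, here is one way to evaluate the factorial expression in Python. Because Python evaluates arguments eagerly, every argument is passed as a thunk (a zero-argument function) so that cond only evaluates the branch it needs. The thunk protocol and the helpers `prim` and `ap` are assumptions of this sketch, and the expression is read with balanced parentheses, i.e. with the test for zero written as S(S(K =)(K 0))I.

```python
# All arguments are thunks; S f g x = f x (g x), K a b = a, I x = x.
S = lambda f: lambda g: lambda x: f()(x)(lambda: g()(x))
K = lambda a: lambda b: a()
I = lambda x: x()

def prim(op):
    """Wrap a binary Python function as a curried function of thunks."""
    return lambda a: lambda b: op(a(), b())

cond = lambda c: lambda t: lambda e: t() if c() else e()
eq = prim(lambda a, b: a == b)
times = prim(lambda a, b: a * b)
minus = prim(lambda a, b: a - b)

def ap(f, *args):
    """Apply f to already-built values, wrapping each one in a thunk."""
    for arg in args:
        f = f(lambda arg=arg: arg)
    return f

# fac = S(S(S(K cond)(S(S(K =)(K 0))I))(K 1))
#        (S(S(K *)I)(S(K fac)(S(S(K -)I)(K 1))))
fac = ap(S,
         ap(S, ap(S, ap(K, cond), ap(S, ap(S, ap(K, eq), ap(K, 0)), I)), ap(K, 1)),
         ap(S, ap(S, ap(K, times), I),
            ap(S, K(lambda: fac), ap(S, ap(S, ap(K, minus), I), ap(K, 1)))))

result = ap(fac, 5)
```

The self-reference `K(lambda: fac)` is looked up only when the recursive branch is actually taken, which is what lets the definition refer to itself.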
23. THE SKGP
• Implements programs as graphs, using combinators together with GP to produce pure functional (combinator) expressions
• Combinators have the property of being ‘structure altering operators’
• There is evidence that GP can be limited in its search ability without such a capability
[Figure: Daida, unpublished, based on Daida 2004, Demonstrating Constraints to Diversity with a Tunably Difficult Problem for Genetic Programming]
24. CHURCH-ROSSER THEOREM
• The Church-Rosser Theorem says pure function evaluation can be order independent: Regardless of order of evaluation, result will be the same
• Because of this, each functional piece, when evaluated, can be stored for re-use since order of evaluation does not matter
• Because GP shares pieces across generations, reuse gives super-linear speed up: you don’t have to recompute each component
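The reuse idea can be illustrated with a cache keyed on the sub-expression and its input. This sketch uses tuple-based expression trees instead of combinator graphs, and the names `make_cached_evaluator`, `shared`, `parent`, and `child` are hypothetical; it shows the caching principle only and makes no claim about the size of the speedup.

```python
import operator

FUNCTIONS = {"+": operator.add, "*": operator.mul}

def make_cached_evaluator():
    """Return an evaluator that stores the result of every pure sub-expression."""
    cache = {}
    stats = {"computed": 0}

    def evaluate(tree, x):
        if not isinstance(tree, tuple):
            return x if tree == "x" else tree
        key = (tree, x)
        if key not in cache:                  # pure, so safe to store and reuse
            op, left, right = tree
            cache[key] = FUNCTIONS[op](evaluate(left, x), evaluate(right, x))
            stats["computed"] += 1
        return cache[key]

    return evaluate, stats

shared = ("*", "x", "x")                      # a piece inherited by both individuals
parent = ("+", shared, 1)
child = ("*", shared, ("+", "x", 2))

evaluate, stats = make_cached_evaluator()
first = evaluate(parent, 3)
second = evaluate(child, 3)
```

Evaluating both individuals touches five function nodes, but only four are computed because the shared piece x * x is looked up the second time.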
25. CASE STUDY: MODELING THE
MODEL
• Modeling chemical kinetics for NASA
• NASA had a set of first principle models used to simulate combustion of jet fuel and its exhaust gases: accurate but very slow
• By using the simulator to train the SKGP, it was able to produce a highly accurate function for predicting output gas amounts across a wide range of values
• Functional result was 2370x faster than running the simulation
• Function was a highly accurate empirical solution of PDEs
26. CASE STUDY: LISTENING TO DATA
• Collaboration with Dr. Richard Cote and USC to study bladder cancer
• Is there a molecular signature that matches T-stage of tumors? No! Attempt produced complicated, poorly performing functions
• Examining data showed that tumors with local metastasis were consistently misclassified
• Is there a signature in tumor that indicates local mets? Yes! Produced a set of concise, highly accurate, biologically sensitive functions that could identify when a tumor had metastasized
27. SOME APPLICATIONS
• Inferential sensors (Dow Chemical)
• Financial modeling (Analytic Research Foundation, State Street Global Advisors)
• Antenna design (NASA)
• Analog circuit layout (Solido Design)
• Solid State Memory management (NVMdurance)
28. OPEN SOURCE SOLUTIONS
• Java: ECJ - a well known Java implementation from one of the well known researchers in GP
• Python: DEAP - an “all-in one package” written in Python
• Clojure: PushGP - a stack-based version of GP with many nice features, also developed by a respected GP researcher
29. PROPRIETARY
• Evolver by Evolution Enterprises: http://evolver.biz
• Data Modeler by Evolved Analytics: http://www.evolved-analytics.com/
30. GP REFERENCES
• J. Koza, Genetic Programming I-IV, Morgan Kaufmann and Kluwer.
• R. Poli, W.B. Langdon and N.F. McPhee, A Field Guide to Genetic Programming
• <Various> Genetic Programming Theory and Practice I-XI, 2002-2013
• Mitra et al, The use of genetic programming in the analysis of quantitative gene expression profiles for identification of nodal status in bladder cancer, BMC Cancer, 6(159) 2006
31. POSSIBLE FUTURES
• Some immediate areas of application include Smart Grid and energy efficient designs, intrusion detection, discovery of protein-gene-SNP networks
• Since evolutionary algorithms give a multi-dimensional analysis in the form of a population of solutions, they provide more information than a single solution
• EAs can continuously analyze data as it comes in, adapting to a changing environment while still providing high performance solutions
• There is a bridge from functions to full programs, though functional methods reduce the gap and could lead to functional co-applications (an ecology of functions)
32. “THE BEST WAY TO PREDICT
THE FUTURE IS TO INVENT IT.”
-ALAN KAY