Confessions of an Interdisciplinary Researcher: The Case of High Performance Economics

Scaling up economics models to run on large input sizes, complex market and agent model settings, and on big computational resource pools is a demanding feat.
This presentation tells you what it takes to work as a computational economist.


  1. 1. Confessions of an interdisciplinary researcher
 The case of high-performance economics
 by Tibi Stef-Praun
 tiberius@ci.uchicago.edu
 Nov. 2009


  2. 2. The evolution of science: Specialized Modeling
 •  physical and biological sciences have proven successes with increasingly complex models
 •  economics lags because it is impractical to conduct experiments to validate theories, and data fitting and simulations remain the only tools available
 •  current state of the art in economics is in models (ref. any graduate textbook), which still need validation with multiple data sets
 •  economics Nirvana means integrating all models/theories and simulating and predicting realistic outcomes.


  3. 3. The setting for economic modeling
 •  modern growth theory
   –  model individual agents (households, firms, govt.)
   –  market with asymmetric information
   –  forward-looking agents
   –  stochastic shocks
 •  computational limitations
   –  above model features introduce alternative paths, each of which has to be evaluated and considered in the final choice
   –  curse of dimensionality
 •  advanced tools
   –  standard constructs: dynamic programming, max likelihood
   –  standard tools: optimization, solvers, statistics
   –  agent’s decision is an optimization problem
   –  exploit model structure, introduce approximations
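
The curse of dimensionality listed above can be made concrete with a simple count. The following is only an illustrative sketch: the function name `grid_points` and the choice of 100 grid points per state variable are assumptions for the example, not details from the slides.

```python
def grid_points(points_per_dim: int, n_dims: int) -> int:
    """Number of nodes in a full tensor-product grid over the state space."""
    return points_per_dim ** n_dims


# Each additional state variable multiplies the number of nodes
# (and hence the number of decision problems to solve) by points_per_dim.
for n_dims in (1, 2, 3, 6):
    print(n_dims, grid_points(100, n_dims))
```

With a fixed resolution per variable, the node count grows exponentially in the number of state variables, which is why exploiting model structure and approximations matters.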

  4. 4. Must work in all three areas
 •  economic theory
   –  need to understand the constraints of the model (agent’s decision model, timelines, resources involved)
   –  be able to generalize the model to solve related data and settings (add degrees of freedom)
 •  computational resources
   –  identify execution patterns (agent’s decision code, market setup and clearing, structural calibrations, etc.) and dependencies
   –  exploit parallel/distributed resources (Grid/Clouds, Swift/WS)
 •  mathematical tools
   –  familiarity with solving the mathematical formulations (optimization theory, solver libraries)
   –  understand implications of the tools used

  5. 5. Current involvement: Dynamic Mechanism Design Theory
 •  Attack the problem from the economic modeling side, provide (scalability) improvements to existing models (initiative of Rob Townsend)
 •  Evaluating choices of group organization for risk sharing purposes, by Madeira and Townsend. Paper: Accelerating solution of a moral hazard problem with Swift, eScience conference 2007. Contributed modest speedup (20x).
 •  Linking growth to financial deepening and inequality, by Ueda and Townsend. Poster with Victor at the Uncertainty workshop (2008). Contribution in parallelizing Matlab code (stochastic shocks).
 •  Borrowing choices, work in progress by Esteban Puentes (with Townsend). Contributed 70x speedup for remote expensive function evaluation (2009): http://www.mathworks.com/matlabcentral/fileexchange/24982-parallelizing-matlab-on-large-distributed-computing-clusters
 •  Incomplete financial markets, by Karaivanov and Townsend. Work in progress, contributing code reengineering for user-defined regime evaluation and parallel implementation and speedup.
 •  Wealth-constrained occupational choice (LEB). Contributed prototype of web-based user-driven input data generation and model execution (for interactive model evaluation).



  6. 6. Current involvement (cont): Dynamic Programming
 •  attacks the problem from the other end: provide high performance, scalable tools to economists (initiative of Ken Judd)
 •  dynamic programming is the current (rediscovered) wunderkind, as it allows realistic, forward-looking, stochastic decision modeling
 •  contribution is in designing a general platform (for many classes of DP problems) that is both scalable (in computational resources) and easy to use
   –  DP engine takes as parameters the problem description (state space grids and production, utility, stochastic transition callbacks)
   –  the parallelizable DP computation steps are mapped transparently (from the user’s perspective) onto the resources
 •  address curse of dimensionality by brute-force: throw resources at the problem
 •  it is only a temporary solution (offsets the real problem with the size of the computing resources). It needs to be combined with intelligent dimensionality reduction techniques (state space approx., multi-grid, etc)
 •  speedup advantage is a combination of resources and algorithmic approximation
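
The idea of a DP engine driven by a problem description and callbacks can be sketched as plain value function iteration. This is a minimal illustration under stated assumptions, not the platform described in the slides: the name `solve_dp`, its signature, and the two-state toy problem are all hypothetical, and the production callback is folded into the utility callback for brevity.

```python
def solve_dp(states, actions, utility, transition,
             beta=0.95, tol=1e-10, max_iter=10_000):
    """Value function iteration on a finite state grid.

    states     : list of hashable grid points
    actions    : callback s -> iterable of feasible actions
    utility    : callback (s, a) -> current-period payoff
    transition : callback (s, a) -> list of (probability, next_state)
    """
    value = {s: 0.0 for s in states}
    policy = {}
    for _ in range(max_iter):
        new_value = {}
        # The loop over states is the parallelizable step: each state's
        # maximization only reads the previous iteration's values.
        for s in states:
            best_a, best_v = None, float("-inf")
            for a in actions(s):
                expected = sum(p * value[s2] for p, s2 in transition(s, a))
                v = utility(s, a) + beta * expected
                if v > best_v:
                    best_a, best_v = a, v
            new_value[s] = best_v
            policy[s] = best_a
        gap = max(abs(new_value[s] - value[s]) for s in states)
        value = new_value
        if gap < tol:
            break
    return value, policy


# Hypothetical toy problem: state 1 pays 1 per period, state 0 pays nothing;
# "move" switches state with probability 0.5, "stay" keeps the state.
def toy_actions(s):
    return ("stay", "move")


def toy_utility(s, a):
    return 1.0 if s == 1 else 0.0


def toy_transition(s, a):
    if a == "stay":
        return [(1.0, s)]
    return [(0.5, s), (0.5, 1 - s)]


value, policy = solve_dp([0, 1], toy_actions, toy_utility, toy_transition, beta=0.5)
print(value, policy)
```

Because the user only supplies the grid and the callbacks, the engine is free to distribute the inner loop over states across whatever resources are available without changing the problem description.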

  7. 7. Technical adventures (I)
 •  large-scale (Grid) execution implies
   –  using open-source and redistributable software. A lot goes into replacing commercial alternatives or building fresh solutions
     •  open-source is less reputable/efficient/precise/available
     •  verification against commercial results is essential (huge debugging time)
     •  e.g. replace Matlab model+CPLEX with alternatives
     •  choosing the right “framework” language so that economists will adopt it
   –  replicating a proper model solving environment on those resources.
     •  install model components, dependency libraries
     •  e.g. install python adapters to hdf5 library on BlueGene, OR compile open-source solvers (CLP, LP_SOLVE) with Matlab MEX adapters on various Grid sites. Deal with 32 vs 64 bit or Windows vs Linux platform issues.
   –  acquiring the computing resources.
     •  In a shared academic environment the tragedy of the commons kicks in
     •  tools exist to assist with this: reservations, glide-ins, etc.
     •  alternatively, go commercial (cloud computing)
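
Verifying an open-source replacement against commercial results usually comes down to comparing two solution vectors within a tolerance. The sketch below assumes both solvers' outputs have already been loaded as lists of floats; the function name `compare_solutions`, the tolerances, and the sample numbers are hypothetical.

```python
import math


def compare_solutions(reference, candidate, rel_tol=1e-6, abs_tol=1e-9):
    """Return the indices where the candidate solution departs from the reference."""
    if len(reference) != len(candidate):
        raise ValueError("solutions have different lengths")
    return [
        i
        for i, (r, c) in enumerate(zip(reference, candidate))
        if not math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol)
    ]


# Made-up numbers standing in for a commercial and an open-source solver run.
commercial = [1.0, 2.0, 3.0]
open_source = [1.0, 2.0000001, 3.1]
mismatches = compare_solutions(commercial, open_source)
print(mismatches)
```

Reporting the mismatching indices, rather than a single pass/fail flag, helps narrow down which part of the model diverges during debugging.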



  8. 8. Technical Adventures (II)
 •  parallel/distributed model execution implies
   –  integration of diverse software (Matlab executables, optimization libraries, wrapper scripts, remote invocation facilities)
     •  complex management/lifecycle of the code base
     •  we use tools such as Swift or web services to choreograph model components
   –  a proper decomposition of the model that optimizes execution time (given the resources)
     •  must understand model’s logical blocks, inter-dependencies, and their significance in the economic problem (need a LOT of domain knowledge OR an economist to collaborate with)
     •  profiling the execution involves repeated measurements and code reorganization (spent 20k+ CPU hrs. on BlueGene on dynamic programming)
   –  transparent execution for the user
     •  economists do not (should not) have to know technical details: provide an operating-system-like abstraction: execute (optimally) this piece of code
     •  several options exist, all imply lifecycle management of the model library/service for the lifetime of the applications using it. Service-oriented science?
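
The "execute this piece of code" abstraction can be illustrated with a single helper that hides where the work runs. This sketch uses a local thread pool from Python's standard library purely as a stand-in for Grid or Swift execution; `run_parallel` and `solve_agent` are hypothetical names, and the agent problem is a placeholder.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(task, inputs, max_workers=4):
    """Run task on every input and return results in input order.

    The caller never sees how the work is scheduled; a real deployment
    would swap the thread pool for remote resources behind the same call.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(task, inputs))


def solve_agent(shock):
    # Placeholder for an expensive, independent agent decision problem.
    return max(0.0, 1.0 + shock) ** 0.5


shocks = [-0.5, 0.0, 0.5]
results = run_parallel(solve_agent, shocks)
print(results)
```

The economist writes `solve_agent` and the list of inputs; everything about worker counts and placement stays behind `run_parallel`.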



  9. 9. Technical Adventures (III)
 •  Data is essential
   –  Data enables model parameter estimation/calibration
   –  Data cleaning is a pain
   –  We need good/clean/validated data: survey designing, execution, and delivery can cause lots of pain. See Open-Data-Tool mobile collection
 •  Data access is essential
   –  Fast exploration / Visualization / Web http://age3.uchicago.edu:8080/thailand
   –  Model-dependent input generation (automated?)
   –  Database storage, organization, access
   –  Continuous data collection, schema expansion
   –  User data access: select and extract into favorite tools (Stata, Excel)
 •  Data has many dimensions
   –  cross-sectional/panel/spatial (GIS)
   –  identifiers for connecting fragmented record collections
 •  Data described and available at
   –  http://cier.uchicago.edu
   –  http://dvn.iq.harvard.edu/dvn/dv/rtownsend

  10. 10. Philosophical Musings
 •  The dimensionality of problem space hurts
   –  Structural estimation procedures (MLE, GMM) are the most expensive: they re-run the whole model with different structural parameters to find the best fit
   –  Optimization routines that drive these (non-linear with finite difference gradient evaluations) are dependent on the starting point and on the user’s mastery of the search algorithm’s knobs
   –  computational requirements grow exponentially with the number of free parameters
   –  discretizing the problem variable space affects computation requirements, results
 •  Knowledge of the economic problem and understanding of the tools that solve it can often lead to improvements that trump computational brute-force methods.
 •  Economists avoid integrating models or building complex systems because it becomes difficult to explain the results of such simulations (ceteris paribus assumption starts getting weak)
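
The dependence of a finite-difference local search on its starting point can be shown on a one-parameter example. This is a sketch under stated assumptions: the objective is a made-up function with two local minima standing in for an estimation criterion, and the step size and iteration count are arbitrary choices for the illustration.

```python
def objective(x):
    # Hypothetical fit criterion with two local minima (near x = -1 and x = 1).
    return (x * x - 1.0) ** 2 + 0.3 * x


def fd_gradient(f, x, h=1e-6):
    """Central finite-difference approximation of the derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def local_search(f, x0, step=0.01, iters=2000):
    """Plain gradient descent driven by finite-difference gradients."""
    x = x0
    for _ in range(iters):
        x -= step * fd_gradient(f, x)
    return x


# The same routine lands in different minima depending on where it starts.
starts = [-2.0, 2.0]
found = [local_search(objective, x0) for x0 in starts]
best = min(found, key=objective)
print(found, best)
```

Running the search from several starting points and keeping the best result is one common mitigation, at the cost of multiplying the number of model evaluations.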



  11. 11. What kind of research this is
 •  Paraphrasing from Office Space: “I deal with the resources, so that the economists don’t have to”
 •  For CS types, it is a combination of software engineering, parallel programming, systems integration, mainly applied to mathematical models.
 •  For the science addicts, it combines linear algebra, optimization theory, statistics, game theory, and behavioral theories into a big numerical model.
 •  For economists, it enables asking and answering big (in input size) questions and tackling complex models.
 •  Where is the fun in that? Applied Scalable Science
 •  At this stage, it’s an art


  12. 12. What kind of research this is not
 •  it is not a quick and easy way to publish an econ paper. Quite the opposite!
 •  it does not apply to the mainstream, reduced-form, analytical economics research; it is mainly cutting-edge, micro-foundations, numerical simulation
 •  it is not about validating parallel execution platforms, integration schemes, etc.
 •  it is not about showing high throughput/high performance capabilities of the models on massive resources (BlueGene, etc)
 •  it should not forget about the primary beneficiary: the researcher who needs to run his models with confidence and in manageable time
 •  it should not be a way to add buzzwords to your grant proposal (Grid/Economics)

  13. 13. Support
 •  The BAD:
   –  Generic (economic) domain tools are rarely funded by govt. agencies (NSF, etc)
   –  Since this is not pure economic research, and as it’s heavily slanted towards computational resources, there is little chance of publishing in economic journals (Econometrica, etc)
   –  Since this is not a generic computational platform, or a resource allocation mechanism, or a new Science 2.0, it receives little interest from the computer science community (HPDC, etc)
 •  The GOOD:
   –  a few initiatives support this kind of work (Townsend, Judd)
   –  lots of interest from the students (ICE @ UChicago)
   –  big institutions and government should be interested in the results; such work should be the policy evaluation tool they always needed.
 •  The OFFER:
   –  Join forces with the AGE3 group (Applied General Equilibrium for Enterprise Economics) and be involved in exciting science. We cater to the needs of big economists!
   –  http://age3.uchicago.edu


  14. 14. About me (example personal journey)
 •  Started as a CS (focus on systems), ended up with PhD thesis on market-based decentralized (in space and ownership) resource (web-service) allocation
 •  Moved to Grid technologies, worked on scaling up (parallelizing) applications for various clustered resources. Used Swift parallel workflow description and execution engine.
 •  Specialized in Economics applications (Growth Theory, Mechanism Design, DSGE, micro-foundation-based modeling) and their application to emerging economies (with incomplete financial markets, entrepreneurial growth potential, etc). 2+ years experience
 •  Close collaboration with the Enterprise Initiative http://enterpriseinitiative.org
 •  A related (earlier) presentation: http://www.youtube.com/watch?v=Uaw7VMZw7tQ
 •  Interested in joint grant proposals on the topics above.
 •  Interested in collaborations with large economics initiatives
 tiberius@ci.uchicago.edu

