Using Azure for Computationally Intensive Workloads

Presentation I gave at the NYC Microsoft Enterprise Developers Conference in 2009. Early cloud / Azure stuff.

  1. • Neil Palmer, Partner, SunGard Consulting Services, neil.palmer@sungard.com
     • Michael Heydt, Senior Manager, SunGard Consulting Services, michael.heydt@sungard.com
     • http://thecloudarchitect.com

  2. Agenda
     • Background
     • Cloud Architecture Overview
     • Functional Languages & F#
     • Problem Statement
     • Demo
     • Solution Architectures & Outcomes
     • Lessons Learned
     • Economics
     • Conclusions

  3. Background
     • This is a period of intense change with many disruptive technologies:
       • Cloud computing
       • Functional languages
       • Large data set processing
     • These have a large impact on applications, since storage and compute power are practically unlimited
     • This leads to questions:
       • What types of applications are technically feasible?
       • And economically feasible?

  4. Cloud Computing Stack
     • Applications (Salesforce)
     • Services (Google Maps)
     • Platforms (Azure)
     • Infrastructure (EC2)
     • Storage (S3)

  5. Cloud Computing Overview
     • Common components for cloud computing:
       • CPU and data managed by another provider
       • Redundant storage for availability
       • Dynamically allocated CPU based upon demand
       • Pay-for-use, pay-as-you-go model (no capital investment)
       • Additional application services provided to systems in the cloud (e.g. SQL services, payment processing…)
       • Operational support for failover

  6. Cloud Benefits
                                   Internal IT   Managed services   The cloud
     Capital investment            Significant   Moderate           Negligible
     On-going costs                Moderate      Significant        Based on Usage
     Provisioning time             Significant   Moderate           None
     Scalability                   Limited       Moderate           Flexible
     Staff expertise requirements  Significant   Limited            Moderate
     Reliability                   Varies        High               Moderate to High

  7. Applications suitable for the cloud
     • Web applications
     • Load testing
     • Monte Carlo Simulation
     • Financial Portfolio Analysis
     • Energy Demand Forecasting
     • Algorithmic Trading
     • Parallel / concurrent processing
     • Processing pipelines


  8. Functional Programming
     • What is functional programming?
       • Facilitates parallelism
       • Treats computation as the evaluation of functions
       • Avoids state and mutable data, which greatly simplifies parallel execution
       • The runtime handles parallelism, instead of it being written through explicit coding
     • Take better advantage of the growing number of cores available to you
     • Functional decomposition may be a better answer to leveraging the architecture of tomorrow
     • Beginning to think this way needs to start now
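
The point about avoiding state can be illustrated with a minimal sketch. It is written in Python rather than the F# discussed in the talk, and the function `squared_return` and the sample data are hypothetical. Because the function is pure (its result depends only on its argument and nothing is mutated), the same map can be handed to a pool of workers without locks, and the result matches the sequential version.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def squared_return(prices):
    """Pure function: the result depends only on the argument; nothing is mutated."""
    first, last = prices[0], prices[-1]
    return ((last - first) / first) ** 2


# Hypothetical sample data: (first price, last price) per instrument.
series = [
    (100.0, 110.0),
    (50.0, 45.0),
    (20.0, 25.0),
]

# Sequential evaluation.
sequential = list(map(squared_return, series))

# The same map, evaluated by a pool of workers. No locking is needed
# because the function shares no state between calls.
with ThreadPoolExecutor(max_workers=3) as pool:
    parallel = list(pool.map(squared_return, series))

# Combining the results is a fold over immutable values.
total = reduce(lambda acc, x: acc + x, parallel, 0.0)
```

A thread pool is used here only to keep the sketch short; for CPU-bound work in Python a process pool would be the more likely choice, and the map itself would not change.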

  9. Problem Statement
     • Compute historic price volatility correlations for stocks over many years.
     • Goal is to test the usefulness of cloud computing for:
       • Manipulating a large data set
       • Handling a solution that would otherwise be slow & expensive with dedicated servers
     • To determine:
       • What is the scalability of cloud architectures?
       • Cost effectiveness of scale out (many roles in the cloud) vs. scale up (high CPU, threading)
       • How does the solution differ from a non-cloud solution?
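
The slides do not give the formulas used, so the following is only a sketch under stated assumptions: volatility is taken to be the sample standard deviation of daily log returns, annualized over an assumed 252 trading days, and correlation is the Pearson coefficient between two return series. The price lists are made-up sample data.

```python
import math
from statistics import stdev


def log_returns(prices):
    """Log return between each pair of consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def volatility(prices, periods_per_year=252):
    """Annualized volatility (assumption: 252 trading days per year)."""
    return stdev(log_returns(prices)) * math.sqrt(periods_per_year)


def correlation(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical closing prices for two stocks over five days.
stock_a = [100.0, 101.0, 99.5, 102.0, 103.5]
stock_b = [50.0, 50.6, 49.7, 51.1, 51.8]

vol_a = volatility(stock_a)
corr_ab = correlation(log_returns(stock_a), log_returns(stock_b))
```

Each stock's volatility can be computed independently, and each pair's correlation can be computed independently of every other pair, which is what makes the problem a candidate for scale out.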

  10. Cloud Application Architecture

  11. Cloud Application Architecture
      • Horsepower
        • In Azure, these are the web and worker roles

  12. Cloud Application Architecture
      • Ingress – how to talk to the cloud
        • In Azure:
          • HTTP/s to web roles
          • .NET Service Bus and queues to worker roles

  13. Cloud Application Architecture
      • Cloud Storage Services
        • Fundamentally three types:
          • Tables
          • Blobs
          • Volumes (EC2)

  14. Cloud Application Architecture
      • Intra-cloud communications
        • Queues
        • .NET Service Bus

  15. Cloud Application Architecture
      • Cloud provided services
        • Value add from your cloud provider
        • You can also use these from other providers
          • Azure using EC2 nodes, AWS payment services
          • Google API using REST interfaces into Azure

  16. Solution Architecture
      Key points:
      • Scalability is goal #1
      • Partitioned dataset
      • Multiple workers
      • Work item based
      • Competitive processing
      • All asynchronous
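
The key points above (partitioned dataset, work items, multiple workers competing for work) can be sketched locally as follows. This is an illustration only: Python threads and an in-process queue stand in for Azure worker roles and an Azure queue, and the ticker list, block size, and the placeholder "processing" step are all hypothetical.

```python
import queue
import threading


def make_work_items(tickers, block_size):
    """Partition the dataset into fixed-size blocks, one work item per block."""
    return [tickers[i:i + block_size] for i in range(0, len(tickers), block_size)]


def worker(name, work_queue, results, lock):
    """Competing consumer: keep taking work items until the queue is empty."""
    while True:
        try:
            block = work_queue.get_nowait()
        except queue.Empty:
            return
        processed = [ticker.lower() for ticker in block]  # placeholder for real work
        with lock:
            results.append((name, processed))


tickers = ["MSFT", "IBM", "ORCL", "AAPL", "GOOG", "AMZN", "INTC"]

work_queue = queue.Queue()
for block in make_work_items(tickers, block_size=2):
    work_queue.put(block)

results = []
lock = threading.Lock()
threads = [
    threading.Thread(target=worker, args=(f"worker-{i}", work_queue, results, lock))
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every ticker was processed exactly once, by whichever worker got there first.
done = sorted(ticker for _, block in results for ticker in block)
```

Adding workers requires no change to the producer: more consumers simply drain the same queue faster, which is the property the design relies on for scale out.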

  17. Demo
      • Show the portal deployments
      • Explain web and worker roles
      • Show client doing volatilities
      • Explain parts of the application and interop with Azure
      • Show / delete existing blobs
      • Calculate some volatilities
      • Show messages received, blobs being created
      • Show data in one of the blobs

  18. Solution Architecture
      • Architecture was evolutionary
      • Started with EC2, evolved into a hybrid Azure / EC2, and then full Azure
      • This was valuable for seeing the differences between cloud platforms

  19. Solution Architecture
      Version 1.0: 100% EC2

  20. Version 1.0 Outcome
      • Linear scalability: double the nodes, half the time to complete
      • Took time to image and manage AMIs
      • Feels like you manage the servers in their entirety
      • No automatic failover or restarts provided by EC2
      • Bandwidth costs – must watch them
      • Security is your own responsibility
      • Table storage was considered too limited in max size to use
      • Cost effective for the problem, but minimum billing is per hour, so that can burn you

  21. Solution Architecture
      Version 2.0: 100% Azure with Table Data

  22. Version 2.0 Outcome
      • Table data did not perform well
      • Gave up before even getting historical data into the cloud
      • REST performance for table data was ~1000 puts per 50 seconds
      • 4.7 million historical price points to load
      • Multi-threading did not help
      • Scrapped and went for a blob model with ticker data blobs (version 3.0)
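
A quick back-of-the-envelope calculation shows why the table approach was abandoned. It uses only the two figures on the slide and assumes one put per price point at a constant rate (the slide notes that multi-threading did not improve it).

```python
# Figures taken from the slide.
PUTS_PER_BATCH = 1000        # ~1000 puts ...
SECONDS_PER_BATCH = 50       # ... per 50 seconds
PRICE_POINTS = 4_700_000     # historical price points to load

# Assumption: one REST put per price point, at a constant rate.
puts_per_second = PUTS_PER_BATCH / SECONDS_PER_BATCH   # 20 puts per second
load_seconds = PRICE_POINTS / puts_per_second          # 235,000 seconds
load_hours = load_seconds / 3600
load_days = load_hours / 24
```

Under these assumptions the initial load alone would take roughly 65 hours, or about 2.7 days, before any computation could start.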

  23. Solution Architecture
      Version 2.5: Azure and EC2 Hybrid

  24. Version 2.5 Outcome
      • Easy port of .NET code to Azure from EC2
      • Just pointed data layer to existing EC2 database image
      • Execution time about the same as with EC2, even with the database across the Internet
      • Not as much headache, since you are not managing as many virtual servers
      • But is having an RDBMS in the cloud “cloudy”?

  25. Solution Architecture
      Version 3.0: 100% Azure

  26. Version 3.0 Outcome
      • Same benefits as 2.5
      • No need for SQL Server
      • Blobs solved REST problems
      • Migration of data to Azure blobs took some work and redesign
        • Physical partitioning of data into blobs
        • Indexes to data also stored in blobs
        • Binary serialization of objects into blobs
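
The three redesign steps above can be sketched as follows. This is not the authors' implementation: a plain dictionary stands in for a blob container, `pickle` stands in for .NET binary serialization, and the partitioning key (ticker and year), the blob naming scheme, and the sample prices are all assumptions made for illustration.

```python
import pickle

# Hypothetical stand-in for a blob container: blob name -> bytes.
blob_store = {}


def blob_name(ticker, year):
    """Assumed naming scheme: one blob per ticker per year."""
    return f"prices/{ticker}/{year}"


def save_partitioned(prices):
    """prices: list of (ticker, 'YYYY-MM-DD', close) tuples."""
    # Physical partitioning: group rows by (ticker, year).
    partitions = {}
    for ticker, date, close in prices:
        partitions.setdefault((ticker, int(date[:4])), []).append((date, close))

    # Binary serialization of each partition into its own blob.
    index = {}
    for (ticker, year), rows in partitions.items():
        name = blob_name(ticker, year)
        blob_store[name] = pickle.dumps(sorted(rows))
        index.setdefault(ticker, []).append(name)

    # The index is itself stored as a blob.
    blob_store["index"] = pickle.dumps(index)


def load_ticker(ticker):
    """Read the index blob, then only the blobs that belong to the ticker."""
    index = pickle.loads(blob_store["index"])
    rows = []
    for name in index.get(ticker, []):
        rows.extend(pickle.loads(blob_store[name]))
    return sorted(rows)


save_partitioned([
    ("MSFT", "2008-01-02", 35.2),
    ("MSFT", "2007-12-31", 35.6),
    ("IBM", "2008-01-02", 104.7),
])
msft = load_ticker("MSFT")
```

Writing one blob per partition replaces many small record-level requests with a few large ones, which is the trade the slides describe when moving from tables to blobs.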

  27. Code Review
      • Let's look at some code
      • WCF service API
      • Silverlight service bridge
      • Web role – looks just like ASP.NET and Silverlight
      • Silverlight – show what happens when you press “start”
      • WCF service – show hooks to Azure (storage accounts)
      • Worker role – show how processing is done

  28. Potential Enhancements
      • Table data storage for statuses
      • .NET Service Bus integration for monitoring and communications to non-cloud systems
      • Multicast .NET Service Bus for broadcasting to all worker roles
      • Eventual use of SQL data services in cloud
      • Build F# libraries for quantitative analysis
        • Build locally, test against local data
        • Deploy to cloud
        • Connect as required

  29. Lessons Learned
      • Queues and Asynchronicity
        • Queues work on a different model than MSMQ
          • Retrieve with a delete window and then explicitly delete
          • Polling model (no blocking)
        • Get used to asynchronous processing
          • Scalability is obtained through asynchronous model
          • Queue based communication between web and worker roles
          • Asynchronous communications from Silverlight to Azure
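
The "retrieve with a delete window, then explicitly delete" model can be illustrated with a small in-memory simulation. The class `VisibilityQueue` and its methods are hypothetical and are not the Azure queue API; the clock is passed in explicitly so the behaviour is deterministic.

```python
import itertools


class VisibilityQueue:
    """Toy queue: a retrieved message is hidden for a window, not removed."""

    def __init__(self):
        self._messages = {}              # message id -> [body, visible_at]
        self._ids = itertools.count(1)

    def put(self, body):
        self._messages[next(self._ids)] = [body, 0.0]

    def get(self, now, visibility_timeout):
        """Polling read: return (id, body) for a visible message, or None.

        Never blocks. The returned message is hidden until
        now + visibility_timeout unless it is deleted first.
        """
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + visibility_timeout
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        """Explicit delete, called by the worker once processing succeeded."""
        self._messages.pop(msg_id, None)


q = VisibilityQueue()
q.put("volatility block 1")

first = q.get(now=0.0, visibility_timeout=30.0)    # a worker takes the message
hidden = q.get(now=10.0, visibility_timeout=30.0)  # None: still inside the window
retry = q.get(now=31.0, visibility_timeout=30.0)   # window expired, message reappears
q.delete(retry[0])                                 # processing done, remove it
empty = q.get(now=100.0, visibility_timeout=30.0)  # None: nothing left
```

If a worker dies before deleting, the message simply becomes visible again and another worker picks it up. The flip side is that a work item may be processed more than once, so processing should be safe to repeat.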


  30. Lessons Learned
      • EC2 vs Azure
        • Dynamic allocation in Azure is not as good as with EC2
        • EC2 billing is by the hour, so not too good for quick needs
        • .NET code was very portable between EC2 and Azure
        • Watch the bandwidth between storage zones
        • Management is difficult in both
          • But Azure management is easier than EC2
          • Azure monitors your roles and restarts them (EC2 doesn’t)
        • EC2 feels a lot heavier than Azure
          • Seems great for appliances
          • But if you are doing .NET, best to go Azure

  31. Lessons Learned
      • Data
        • Getting data into the cloud can be a lot of work
        • REST is not performant for large numbers of small records
        • Designing data for non-relational storage is cumbersome and requires a change of mindset

  32. Lessons Learned
      • Programming
        • URLs for WCF services must be rewritten in the various environments
        • .NET code for web and workers is very similar to normal .NET code
        • Lack of full trust can be a pain; many libraries caused failures
        • F# needs to be linked into the solution due to not being available in the Azure GAC / full trust
        • Can’t talk directly to Azure easily from Silverlight
        • Debugging is difficult: logs, writing to queues, or to SQL

  33. Things to look for in the future
      • Concern about Azure pricing for unutilized workers
      • Dynamic / API based allocation of roles
      • Management APIs and user interfaces
      • Caps on number of instances / roles available

  34. Economics
      • EC2
        • Ran a subset of the overall task
        • Used five instances as baseline
        • 100 units of work
          • 50 volatility blocks of work
          • 50 correlation blocks of work
        • Each instance handled 10 blocks of work for both volatility and correlation
        • Volatilities took 6.9 minutes per block
        • Correlations took 4.6 minutes per block

  35. Economics – Subset of Solution
      Calculation             Time (mins/block)   # Blocks   Total Time (mins)
      Volatility              6.9                 10         69
      Correlation             4.6                 10         46
      Total Time                                             115
      Cost / Billable Hour                                   $0.125
      Cost/Node                                              $0.25
      Total Cost (5 Nodes)                                   $1.25

  36. Economics – Full Solution
      Calculation             Time (mins/block)   # Blocks   Total Time (mins)
      Volatility              6.9                 10         69
      Correlation             4.6                 500        2300
      Total Time                                             2369
      Cost / Billable Hour                                   $0.125
      Cost/Node                                              $5.00
      Total Cost (5 Nodes)                                   $25.00

  37. Economics – Single System
      Calculation             Time (mins/block)   # Blocks   Total Time (mins)
      Volatility              6.9                 50         345
      Correlation             4.6                 2500       11,500
      Total Time (mins)                                      11,845
      Total Time (days)                                      8.25

  38. Economics – 2500 Nodes
      Calculation             Time (mins/block)   # Blocks   Total Time (mins)
      Volatility              6.9                 1          6.9
      Correlation             4.6                 1          4.6
      Total Time                                             11.5
      Cost / Billable Hour                                   $0.125
      Cost/Node                                              $0.125
      Total Cost (2500 Nodes)                                $312.50

  39. Economics
      • There is a crossover of speed vs. cost:
        • $312.50 (quickest) versus $25.00 (most cost effective)
      • Minimum billing hour granularity of EC2 introduces a fixed cost component
      • Isn’t necessarily cheaper compared to the cost of ‘fixed’ hardware over a long period of time
      • Compute time is not the only cost to take into account
        • Data transfer in & out of the cloud is equally costly
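
The cost figures on the economics slides can be reproduced with a short calculation. The per-block times and the $0.125 hourly rate come from the slides; the function `run_cost` is hypothetical, and rounding each node's run time up to whole billable hours is an assumption, although it is the one that matches the slides' numbers.

```python
import math

# Figures taken from the slides.
RATE_PER_BILLABLE_HOUR = 0.125     # dollars per node per billable hour
VOLATILITY_MIN_PER_BLOCK = 6.9
CORRELATION_MIN_PER_BLOCK = 4.6


def run_cost(vol_blocks_per_node, corr_blocks_per_node, nodes):
    """Return (minutes, billable_hours, cost_per_node, total_cost).

    Assumption: each node is billed for its run time rounded up to a whole hour.
    """
    minutes = (vol_blocks_per_node * VOLATILITY_MIN_PER_BLOCK
               + corr_blocks_per_node * CORRELATION_MIN_PER_BLOCK)
    billable_hours = math.ceil(minutes / 60)
    cost_per_node = billable_hours * RATE_PER_BILLABLE_HOUR
    return minutes, billable_hours, cost_per_node, cost_per_node * nodes


subset = run_cost(10, 10, nodes=5)       # subset of the solution on 5 nodes
full = run_cost(10, 500, nodes=5)        # full solution on 5 nodes
widest = run_cost(1, 1, nodes=2500)      # full solution on 2500 nodes
```

With these inputs the total costs come out at $1.25, $25.00 and $312.50 respectively. The 2500-node run finishes in 11.5 minutes but still pays for 2500 whole hours, which is the fixed cost component that hourly billing introduces.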

  40. Conclusions
      • What is the scalability of cloud architecture?
        • This problem was linearly scalable; double the nodes, roughly half the time
      • Cost structure – is it economical?
        • The numbers look good compared to investing in capital and humans
        • Make sure you don’t get billed for non-utilized time
        • Bandwidth still costs you, and could be significant
        • Watch for minimum billing times
      • How does the solution differ from a non-cloud solution?
        • With Azure, it’s very similar coding (more in lessons learned)
        • But you must learn to partition the problem set for scale out

  41. Q&A
