TIKAL COMMUNITY event
                                     1.10.09 CBI Center, Tel-Aviv




          The Cloud Dilemma for
        Developers – AWS Case Study


                   Dudi Landau
                        CTO
                   ClearForest


Hosted by Tikal.    www.tikalk.com               Cost-Benefit Open Source
Introduction

• Map of the world
• Why (should I care)?
• What is AWS?
• The road from the container to the cloud
• The “body and soul” architectural pattern
• The “Stem-Cell” architectural pattern
• Semantic Proxy
• Processing 11 million docs
• Things you’re not told ...




TIKAL COMMUNITY event 1.10.09                 The Cloud Dilemma for Developers
Map of the world

• SaaS

• PaaS

• IaaS
     • “Well, it is basically just a virtual machine, but elsewhere, in the clouds”

     • (…and multithreading is normal programming with just threads &
         locks)

     • (… and JEE is JSE on a server)

     • Cloud Computing in general, and IaaS in particular, is a paradigm
         shift!

     • Machines are created programmatically, through a web service!
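
To make "machines are created programmatically" concrete, here is a minimal sketch. It does not call AWS: `FakeCloud`, `run_instance`, `terminate_instance`, `resize_fleet` and the image id `ami-example` are all hypothetical names, an in-memory stand-in for an IaaS web service. The point is only that scaling out and shrinking back become ordinary function calls.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class FakeCloud:
    """In-memory stand-in for an IaaS web service (hypothetical, not the AWS API)."""
    _ids: itertools.count = field(default_factory=lambda: itertools.count(1))
    running: dict = field(default_factory=dict)

    def run_instance(self, image_id: str) -> str:
        """Pretend to boot a machine from an image and return its id."""
        instance_id = f"i-{next(self._ids):04d}"
        self.running[instance_id] = image_id
        return instance_id

    def terminate_instance(self, instance_id: str) -> None:
        """Pretend to destroy a machine; unknown ids raise KeyError."""
        del self.running[instance_id]


def resize_fleet(cloud: FakeCloud, image_id: str, target: int) -> None:
    """Grow or shrink the fleet until exactly `target` machines are running."""
    while len(cloud.running) < target:
        cloud.run_instance(image_id)
    while len(cloud.running) > target:
        # Terminate the most recently started machine first.
        newest = max(cloud.running)
        cloud.terminate_instance(newest)


cloud = FakeCloud()
resize_fleet(cloud, "ami-example", 5)   # scale out for a traffic spike
peak = len(cloud.running)
resize_fleet(cloud, "ami-example", 2)   # shrink back afterwards
print(peak, sorted(cloud.running))
```

A real control program would replace `FakeCloud` with calls to the provider's web service and would decide `target` from load measurements.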

Why (should I care)?

• Moving from CAPEX to OPEX, pay as you go

• Ability to scale in time (and shrink back!)
     • Cope with Slashdot effect

• HW resilience is not our problem anymore …
     • But even the NLB becomes a machine!




     • Story: Animoto, SmugMug



What is AWS?

• S3 – Simple Storage Service

• EC2 – Elastic Compute Cloud
     • AMI, Elastic IP

• EBS – Elastic Block Store

• SQS – Simple Queue Service

• SimpleDB - …

• CloudFront – CDN service (of S3)

• Mechanical Turk

• The lock-in problem
The road from the container to the cloud

• The container is “my world” (and I’m sharing it with other EARs/EJBs
   …, deployment lingo)

• “The Server is mine” (but I’ll focus on the container I know …)

• Cluster for robustness

• True farm: no n² chattiness (Memcached-like)

• From imaging to virtualization

• Scale matters - Enterprise scale vs. Web Scale.

• Cloud Computing – machines are generated and destroyed
   programmatically by an oversight control program
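
The "no n² chattiness" point can be illustrated with a short sketch. It assumes the Memcached-style arrangement in which the client picks the node for each key and nodes never talk to each other. The node names and the simple modulo hashing are illustrative assumptions, not the scheme used in the talk.

```python
import hashlib


def mesh_links(n: int) -> int:
    """Peer-to-peer links in a cluster where every node talks to every other node."""
    return n * (n - 1) // 2


def pick_node(key: str, nodes: list[str]) -> str:
    """Client-side sharding: hash the key to choose one node, with no inter-node traffic."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:4], "big") % len(nodes)]


for n in (4, 16, 64):
    print(n, "nodes:", mesh_links(n), "links in a full mesh, 0 in a farm")

nodes = ["cache-a", "cache-b", "cache-c"]  # hypothetical node names
owner = pick_node("doc:42", nodes)
print("doc:42 is stored on", owner)
```

Modulo hashing remaps most keys when the node count changes; consistent hashing is a common alternative when nodes come and go.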

The “body and soul” architectural pattern




The “Stem-Cell” architectural pattern




Semantic Proxy (logical)

[Diagram, logical view. Labels: DNS round robin and fail over; External Squid (incoming), a Squid farm; SP Logic (two boxes); Memcached farm; Storage Nodes; Internal Squid (outgoing), a Squid farm; Web; OpenCalais]


Semantic Proxy

[Diagram, physical view. Labels: DNS round robin and fail over; Squid root nodes (detail: Memcached process, Squid), forming the Squid farm and Memcached farm; SP nodes (detail: SP logic (Java/Tomcat), Memcached process, Squid); Web; OpenCalais]
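
The "DNS round robin and fail over" entry point can be sketched as follows. This models only the effect (requests rotate across the root nodes and skip a dead one), not DNS itself; the addresses, the `down` set and the `make_resolver` helper are hypothetical.

```python
from itertools import cycle


def make_resolver(addresses, is_healthy):
    """Return a function that hands out addresses in rotation, skipping unhealthy ones."""
    rotation = cycle(addresses)

    def next_address():
        # Try each address at most once per call before giving up.
        for _ in range(len(addresses)):
            candidate = next(rotation)
            if is_healthy(candidate):
                return candidate
        raise RuntimeError("no healthy address available")

    return next_address


down = {"10.0.0.2"}  # hypothetical: pretend this Squid root node failed
resolve = make_resolver(["10.0.0.1", "10.0.0.2", "10.0.0.3"], lambda a: a not in down)
picks = [resolve() for _ in range(4)]
print(picks)
```

In a real deployment the rotation usually happens in the DNS server's answers, and failover means removing the failed address from the record set.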



Processing 11 million docs

• Since you pay for compute time:
     • Running 1,000 machines for an hour

     is the same as

     • Running 10 machines for 100 hours



• Process control – the single machine doesn’t matter any more;
   you need to manage many of them, and they cannot be “spoiled”
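
The arithmetic behind "1,000 machines for an hour is the same as 10 machines for 100 hours" can be checked with a small sketch. The hourly rate and the per-machine throughput below are hypothetical numbers chosen for illustration, and billing per started hour is an assumption; only the 11 million documents and the two fleet sizes come from the slides.

```python
import math

DOCS = 11_000_000
DOCS_PER_MACHINE_HOUR = 11_000   # hypothetical throughput
RATE_CENTS_PER_HOUR = 10         # hypothetical price per machine-hour


def machine_hours(machines: int, hours: int) -> int:
    """Total compute consumed, which is what the bill is based on."""
    return machines * hours


def wall_clock_hours(docs: int, machines: int) -> int:
    """Elapsed hours to process `docs`, rounded up to whole hours."""
    return math.ceil(docs / (machines * DOCS_PER_MACHINE_HOUR))


def cost_cents(machines: int, hours: int) -> int:
    """Cost assuming every machine is billed for each started hour."""
    return machine_hours(machines, hours) * RATE_CENTS_PER_HOUR


for fleet in (1000, 10):
    hours = wall_clock_hours(DOCS, fleet)
    print(fleet, "machines:", hours, "hours,", cost_cents(fleet, hours), "cents")
```

Under these assumptions both fleets consume the same machine-hours and cost the same; the wide fleet simply finishes in one hour instead of a hundred.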




Things you’re not told ...

• You can easily spend 20k a month

• The machines are relatively weak

• Virtualization doesn’t work well with I/O-intensive loads

• Context switches and predictability issues
     • Bigger instances are better (why?)

• Need (some) IT/system-aware developers

• Design for big from the start (at a minimum, keep areas of non-linear
   scalability isolated and known).



Q&A

(1 slide and we’re done …)




Bottom Line

• Cloud Computing is here to stay



• One needs to know when it is applicable and when it is not



• Change is the only constant (unless you’re working in Cobol)



