ENTERPRISE ARCHITECTURE

THE 5 PRINCIPLES OF GOOGLE’S "CLOUD"
Patrik Svensson, 2011, ptrksvnssn@gmail.com
Thursday, 12 May 2011

THE VISION OF GOOGLE

THE 5 PRINCIPLES
•   Everything is a service (or an application in Android)
•   Relentless technical focus (thinking at nanoscale)
•   Data centers are the foundation
•   Code is king, data is King Kong
•   Identify and keep track of your users

#1 EVERYTHING IS A SERVICE (OR AN APPLICATION)

#2 RELENTLESS TECHNICAL FOCUS
•   Jedis build their own lightsabres
•   Parallelize, Distribute, Cache, Compress, Redundantize everything
•   Latency is VERY evil
Source: http://www.flickr.com/photos/60994749@N07/5557591956/

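The "parallelize, cache, compress" advice can be illustrated with a small sketch using only the Python standard library. This is not Google's implementation: the `render` function, the key names and the worker count are hypothetical and chosen purely for illustration.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache


@lru_cache(maxsize=1024)
def render(key: str) -> bytes:
    """Hypothetical expensive computation; repeated keys are served from the cache."""
    return (key * 100).encode("utf-8")


def compressed(key: str) -> bytes:
    """Compress the rendered payload so that fewer bytes travel over the wire."""
    return zlib.compress(render(key))


def fetch_all(keys):
    """Handle several requests in parallel on a small thread pool."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(compressed, keys))


# Hypothetical request keys; "search" appears twice on purpose.
payloads = fetch_all(["search", "gmail", "search"])
```

Each of the three techniques trades a little extra complexity for lower latency, which is the point of the slide.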
EXAMPLE: "NUMBERS EVERYONE SHOULD KNOW"
1,000,000 ns     = 1 ms
1,000,000,000 ns = 1 s
Source: Jeff Dean, "Designs, Lessons and Advice from Building Large Distributed Systems"

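The two conversions on this slide are enough to reason about latency budgets. A minimal sketch follows; the function names are assumptions made for this example and are not taken from the talk.

```python
NS_PER_MS = 1_000_000        # 1,000,000 ns = 1 ms
NS_PER_S = 1_000_000_000     # 1,000,000,000 ns = 1 s


def ns_to_ms(ns: int) -> float:
    """Convert nanoseconds to milliseconds."""
    return ns / NS_PER_MS


def ns_to_s(ns: int) -> float:
    """Convert nanoseconds to seconds."""
    return ns / NS_PER_S


def operations_per_second(latency_ns: int) -> float:
    """How many back-to-back operations of the given latency fit into one second."""
    return NS_PER_S / latency_ns
```

For example, an operation that takes 1 ms can run at most 1,000 times per second when executed sequentially, which is why the deck treats latency as "VERY evil".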
#3 DATA CENTERS ARE THE FOUNDATION

ECONOMIES OF SCALE
•   ~40 data centers in 2009, 1,000,000 machines
Source: http://techcrunch.com/2008/04/11/where-are-all-the-google-data-centers/

#4 CODE IS KING, DATA IS KING KONG

Enterprise Architecture

Technical Architecture, i.e. which technologies do we use
•   DATA CENTERS: "We need cooling, power, perimeter networks, containers, racks, switches & hardware at low cost that scale"
•   DATA: "We need one distributed file system, one distributed shared memory & common data formats to get scale and low cost"
•   CODE: "We need to build applications and services, application-, integration- & data platforms, parallel computing platforms & use an open source OS, upon our data center/data platform"
•   CONTROL: "We need scheduling, synchronization, lock services, i.e. various forms of control mechanisms for data and code"
•   USERS: "We need to identify our users to be able to interact, differentiate and customize the user experience"

Implementation Architecture, i.e. how do we implement the technologies
•   DATA CENTERS: Google container-based data centers
•   DATA: GFS, BigTable, Protocol Buffers
•   CODE: Android, Chrome; App Engine, Gmail, Search, Index; Python, Java, C++; Protocol Buffers, JSON; Sawzall, Dremel, Percolator; MapReduce; Linux
•   CONTROL: GFS master, Google Work Queue, Chubby, NetScaler, Google HTTP Server, (Spanner)
•   USERS: OpenID, OAuth, Google Accounts available for most services

ABOUT DATA
"Google's mission is to organize the world's information and make it available to all"

[Bar chart: data volume by type of data]
•   Structured, Numerical: ~2.5 Terabyte
•   Unstructured, Textual: ~10 Terabyte/day
•   Communication, Traffic: +20 Petabyte/day

DATA CENTER "ENTRY"
•   The same entry to each Data Center
•   ~50 caching (using Squid)
•   Built their own HTTP servers/farms
Source: Ed Austin, "The Anatomy of the Google Architecture"

INSIDE THE CONTAINERS
•   Customized commodity servers in customized racks in containers (+1000 servers), organized into clusters
•   All containers are "cloned" and look the same
Source: Jeff Dean, "Designs, Lessons and Advice from Building Large Distributed Systems"

THE SAME HW, OS AND FILESYSTEM EVERYWHERE
Source: Jeff Dean, "Designs, Lessons and Advice from Building Large Distributed Systems"

BIGDATA AS DATABASE
Source: Jeff Dean, "Designs, Lessons and Advice from Building Large Distributed Systems"

BIGDATA IS COLUMN-BASED
Source: Jeff Dean, "Designs, Lessons and Advice from Building Large Distributed Systems"

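The slide gives only the heading, so the following is a generic sketch of what "column-based" can mean, assuming it refers to grouping values by column rather than by row. The sample records and the `to_columns` helper are hypothetical and do not describe Google's actual storage format.

```python
from collections import defaultdict

# Hypothetical row-oriented records, one dict per row.
rows = [
    {"id": 1, "lang": "Python", "size": 10},
    {"id": 2, "lang": "Java", "size": 20},
    {"id": 3, "lang": "C++", "size": 30},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column name."""
    columns = defaultdict(list)
    for record in records:
        for name, value in record.items():
            columns[name].append(value)
    return dict(columns)


columns = to_columns(rows)

# An aggregate over one column now touches only that column's values.
total_size = sum(columns["size"])
```

The design benefit is that a query over a single attribute reads one contiguous list instead of scanning every full row.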
BIGDATA NEEDS GFS
•   Use GFS to store data and logs
Source: Jeff Dean, "Designs, Lessons and Advice from Building Large Distributed Systems"

MAPREDUCE - A PARALLEL COMPUTING PLATFORM
Source: Jeff Dean, "Designs, Lessons and Advice from Building Large Distributed Systems"

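The MapReduce programming model can be sketched in a few lines as a single-process word count. This is only an illustration of the map, shuffle and reduce phases; it is not Google's MapReduce, it runs nothing in parallel, and all function names are assumptions made for this example.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all emitted values by key, as a framework would between the phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["code is king", "data is king kong"])
```

Because each call to `map_phase` depends on one document only, and each call to `reduce_phase` on one key only, a real platform can spread both phases across many machines.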
ABOUT CODING AT GOOGLE
•   Linux as operating system everywhere - open source, highly customized for this purpose (Android is also a highly customized version of Linux)
•   Serialization/Integration - Protocol Buffers (RPC) run at nano speed and are used internally for "everything"; JSON and RESTful interfaces are used for external APIs
•   Application-oriented programming languages - mainly Python, Java and C++
•   Data-oriented programming languages - Percolator, Sawzall, Dremel for various data processing tasks (so specialised tools for data!)
•   The business applications - Gmail, Search, App Engine etc. - are built upon the data center infrastructure, the data platform and the layers above

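One reason a binary format such as Protocol Buffers is compact is that its wire format stores integers as base-128 "varints": seven payload bits per byte, with the high bit marking that more bytes follow. The sketch below is a standalone illustration of that encoding, not the protobuf library itself, and the function names are hypothetical.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint, least significant group first."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F          # take the lowest 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # set the continuation bit: more bytes follow
        else:
            out.append(byte)         # last byte: continuation bit stays clear
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode a single varint from the start of data."""
    result = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << (7 * index)
        if not byte & 0x80:
            break
    return result
```

Small numbers take a single byte and the value 300 takes two, whereas the same number written as JSON text takes three characters plus the surrounding field name.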
#5 IDENTIFY AND KEEP TRACK OF YOUR USERS
•   You need a Google account to start Android properly
•   OpenSocial is a collaborative effort to compete against Facebook
•   OpenID is an identity standard and OAuth is a standard for authorizing services
•   Google is identifying and tracking every step you take within their domains
