The document discusses the evolution of computing models from clusters and grids to cloud computing. It describes how cluster computing tightly couples resources within a LAN, while grid computing shares resources across administrative domains. Utility computing changed the ownership model: users lease computing power rather than buying hardware. Finally, cloud computing makes services and data accessible from any internet-connected device through a browser.
6. The Next Step: Cloud Computing
Services and data are in the cloud, accessible with any device connected to the cloud with a browser.

7. The Next Step: Cloud Computing (cont.)
A key technical issue for developers: scalability.
8. Applications on the Web
Your user <-> Your Coolest Web Application
(internet splat map: http://flickr.com/photos/jurvetson/916142/ , CC-BY 2.0; baby picture: http://flickr.com/photos/cdharrison/280252512/ , CC-BY-SA 2.0)

9. Applications on the Web
Your user <-> The Cloud <-> Your Coolest Web Application
(image credits: same as slide 8)
11. How many users do you want to have?
The Cloud <-> Your Coolest Web Application
13. Google Growth
Nov. '98: 10,000 queries on 25 computers
Apr. '99: 500,000 queries on 300 computers
Sep. '99: 3,000,000 queries on 2,100 computers
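A quick back-of-the-envelope reading of those figures (a sketch, not from the slides): total query volume grew roughly 300x in ten months, yet per-machine load stayed within the same order of magnitude because the fleet grew with demand. That is the scale-out pattern the rest of the deck builds on.

```python
# Growth figures from the slide above: (date, queries, machines).
growth = [
    ("Nov '98", 10_000, 25),
    ("Apr '99", 500_000, 300),
    ("Sep '99", 3_000_000, 2100),
]

# Per-machine load barely moves while total volume explodes:
for date, queries, machines in growth:
    print(f"{date}: {queries / machines:,.0f} queries per machine")
```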
24. How to develop a web application that scales?

    Need             | Google's solution/replacement
    -----------------+------------------------------
    Storage          | Google File System
    Data Processing  | MapReduce
    Database         | BigTable
    Serving          | Google AppEngine

25. Same table, annotated: Google File System, MapReduce, and BigTable are published papers, with Hadoop as an open-source implementation; Google AppEngine opened on 2008/5/28.
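To make the "Data Processing" row concrete, here is a minimal sketch of the MapReduce programming model in plain Python. This is not the Hadoop or Google API; the mapper/reducer names and the in-memory "shuffle" are illustrative assumptions, standing in for a framework that runs these phases across thousands of machines.

```python
from collections import defaultdict

# Toy word count in the MapReduce style: a mapper emits (key, value)
# pairs, the framework groups them by key ("shuffle"), and a reducer
# folds each group down to a single result.

def mapper(line):
    for word in line.split():
        yield word.lower(), 1

def reducer(word, counts):
    return word, sum(counts)

def run_mapreduce(lines):
    groups = defaultdict(list)          # the "shuffle" step, in memory
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    return dict(reducer(k, v) for k, v in sorted(groups.items()))

print(run_mapreduce(["the cloud", "the web"]))
# {'cloud': 1, 'the': 2, 'web': 1}
```

The point of the abstraction: because map and reduce are pure functions over key/value pairs, the framework is free to partition the input, rerun failed tasks, and sort between phases without the programmer writing any distribution code.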
30. System Structure: a BigTable Cell
Bigtable master: performs metadata ops + load balancing
Bigtable tablet servers (many): serve data
Lock service: holds metadata, handles master-election
GFS: holds tablet data, logs
Cluster scheduling system: handles failover, monitoring

31. System Structure: adding clients
Bigtable client -> Bigtable client library: Open() goes through the lock service; read/write goes directly to the tablet servers; metadata ops go to the master.
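A sketch of the routing idea behind the diagram: because tablets partition a sorted key space, a client can locate the right tablet server with a binary search over tablet end keys and then read directly from that server, keeping the master off the data path. The index layout, server names, and end-key convention below are invented for illustration, not BigTable's actual metadata format.

```python
import bisect

# (end_row_key, server) pairs, sorted by end key; each tablet covers the
# row range up to and including its end key. "~" sorts after lowercase
# letters, so the last tablet catches everything remaining.
tablet_index = [("g", "ts1"), ("p", "ts2"), ("~", "ts3")]

def locate_tablet(row_key):
    """Route a row key to the tablet server responsible for it."""
    end_keys = [end for end, _ in tablet_index]
    i = bisect.bisect_left(end_keys, row_key)
    return tablet_index[i][1]

print(locate_tablet("apple"))   # ts1
print(locate_tablet("kiwi"))    # ts2
print(locate_tablet("zebra"))   # ts3
```

Clients can cache this index; only when a tablet splits or moves do they need to consult the metadata path again.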
34. Pseudo Codes for Phase 1 and 2

Phase 1:

    def findBucket(requestTime):
        # return minute of the week
        ...

    numRequest = zeros(1440*7)  # an array of 1440*7 zeros
    for filename in sys.argv[2:]:
        for line in open(filename):
            minuteBucket = findBucket(findTime(line))
            numRequest[minuteBucket] += 1
    outFile = open(sys.argv[1], 'w')
    for i in range(1440*7):
        outFile.write("%d %d\n" % (i, numRequest[i]))
    outFile.close()

Phase 2:

    numRequest = zeros(1440*7)  # an array of 1440*7 zeros
    for filename in sys.argv[2:]:
        for line in open(filename):
            col = line.split()
            [i, count] = [int(col[0]), int(col[1])]
            numRequest[i] += count
    # write out numRequest[] like phase 1
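The slide leaves findBucket and the log-line format unspecified. Below is one self-contained way the phase-1 bucketing could look, assuming (hypothetically) that each log line begins with an ISO-8601 timestamp; the minute-of-the-week array matches the 1440*7 buckets above.

```python
from datetime import datetime

MINUTES_PER_WEEK = 1440 * 7

def find_bucket(ts):
    """Map a datetime to its minute-of-the-week bucket (0..10079)."""
    return ts.weekday() * 1440 + ts.hour * 60 + ts.minute

def count_requests(lines):
    """Phase-1 counting over in-memory lines instead of argv files."""
    num_request = [0] * MINUTES_PER_WEEK
    for line in lines:
        # Assumed line format: "2008-05-28T14:03:07 GET /index.html"
        ts = datetime.fromisoformat(line.split()[0])
        num_request[find_bucket(ts)] += 1
    return num_request

log = ["2008-05-28T14:03:07 GET /", "2008-05-28T14:03:59 GET /about"]
counts = count_requests(log)
print(sum(counts))  # 2
```

Phase 2 then only needs to sum the per-bucket counts emitted by many phase-1 workers, which is exactly the shape MapReduce handles: phase 1 is the map, phase 2 the reduce.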
53. MapReduce: Adoption at Google
[chart: MapReduce programs in Google's source tree / new MapReduce programs per month, with visible "summer intern effect" spikes]
67. The era of Cloud Computing is here!
(Photo by mr.hero on panoramio: http://www.panoramio.com/photo/1127015)
Services in the cloud: news, people, book search, photo, product search, video, maps, e-mails, mobile, blogs, groups, calendar, scholar, Earth, Sky, web, desktop, translate, messages