BigData processing in the cloud – Guest Lecture - University of Applied Sciences Rapperswil - 29.4.14


  1. © 2013 IBM Corporation. BigData processing in the cloud – Guest Lecture - University of Applied Sciences Rapperswil - 29.4.14. Romeo Kienzler, IBM Innovation Center. Source: http://res.sys-con.com/story/oct12/2398990/Cloud_BigData_468.jpg
  2. What is BIG data?
  3. What is BIG data?
  4. What is BIG data? Big Data, Hadoop
  5. What is BIG data? Business Intelligence, Data Warehouse
  6. Map-Reduce → Hadoop → BigInsights
  7. BigData UseCases
     ● Google Index
       ● 40 × 10^9 = 40,000,000,000 pages indexed (40 billion)
       ● Will break the 100 PB barrier soon
       ● Derived from MapReduce
       ● Now “Caffeine”, based on “Percolator”
         ● Incremental vs. batch
         ● In-memory vs. disk
  8. BigData UseCases
     ● CERN LHC
       ● 25 petabytes per year
     ● Facebook
       ● Hive data warehouse
       ● 300 PB, growing by 600 TB per day
       ● > 100 k servers
     ● Genomics
     ● Enterprises
       ● Data center analytics (log files, OS/network monitors, ...)
       ● Predictive maintenance, cybersecurity
       ● Social media analytics
       ● DWH offload
       ● Call Detail Record (CDR) data preservation: http://www.balthasar-glaettli.ch/vorratsdaten/
  9. BigData Analytics
  10. BigData Analytics – Predictive Analytics
      "sometimes it's not who has the best algorithm that wins; it's who has the most data." (C) Google Inc., The Unreasonable Effectiveness of Data¹
      No sampling => work with the full dataset => no p-values/z-scores anymore
      ¹ http://www.csee.wvu.edu/~gidoretto/courses/2011-fall-cp/reading/TheUnreasonable%20EffectivenessofData_IEEE_IS2009.pdf
  11. Data Parallelism
  12. Aggregated Bandwidth between CPU, Main Memory and Hard Drive
      Time to scan 1 TB (at 10 GByte/s per node):
      - 1 node: 100 s
      - 10 nodes: 10 s
      - 100 nodes: 1 s
      - 1000 nodes: 100 ms
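The scan times on slide 12 follow from dividing the data size by the aggregate bandwidth of all nodes. A minimal sketch, assuming that the 10 GByte/s figure is per node, that bandwidth adds up linearly across nodes, and that decimal units apply (1 TB = 10^12 bytes). The function name `scan_time_seconds` is illustrative and not from the lecture:

```python
def scan_time_seconds(data_bytes, bandwidth_per_node, nodes):
    """Time to read data_bytes once when every node scans its own share in parallel."""
    aggregate_bandwidth = bandwidth_per_node * nodes  # assumes perfectly linear scale-out
    return data_bytes / aggregate_bandwidth


ONE_TB = 10**12          # decimal terabyte (assumption)
TEN_GB_PER_S = 10 * 10**9  # per-node bandwidth from the slide

for n in (1, 10, 100, 1000):
    print(n, "nodes:", scan_time_seconds(ONE_TB, TEN_GB_PER_S, n), "s")
```

Real clusters rarely scale perfectly linearly, so these values are best read as lower bounds.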
  13. Fault Tolerance / Commodity Hardware
      AMD Turion II Neo N40L (2x 1,5GHz / 2MB / 15W), 8 GB RAM, 3TB SEAGATE Barracuda 7200.14: < CHF 500
      ● 100 K => 200 X (2, 4, 3) => 400 Cores, 1,6 TB RAM, 200 TB HD
      ● MTBF ~ 365 d > 1,5 d
      Source: http://www.cloudcomputingpatterns.org/Watchdog
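One possible reading of the numbers on slide 13 can be sketched as follows. The assumptions are mine, not stated on the slide: a budget of CHF 100,000 at CHF 500 per node buys 200 nodes, each node fails on average once every 365 days, and failures are independent. Under these assumptions the cluster as a whole sees a failure roughly every 365 / 200 ≈ 1.8 days, which is consistent with the "> 1,5 d" on the slide. The function names are hypothetical:

```python
def nodes_for_budget(budget_chf, price_per_node_chf):
    """Number of whole nodes a budget buys."""
    return budget_chf // price_per_node_chf


def cluster_mtbf_days(node_mtbf_days, nodes):
    """Mean time between failures anywhere in the cluster.

    Assumes independent nodes with a constant failure rate, so the
    failure rates simply add up across nodes.
    """
    return node_mtbf_days / nodes


nodes = nodes_for_budget(100_000, 500)
print(nodes, "nodes, one failure about every", cluster_mtbf_days(365, nodes), "days")
```

With a failure expected every couple of days, the software layer has to tolerate node loss, which is the motivation for the fault tolerance the slide title refers to.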
  16. HDFS – Hadoop File System
  35. Map-Reduce. Source: http://www.cloudcomputingpatterns.org/Map_Reduce
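The Map-Reduce pattern named on slide 35 can be illustrated with the classic word-count example. This is a single-process sketch in plain Python, not Hadoop code and not taken from the lecture; the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are illustrative. In a real cluster the map and reduce calls would run in parallel on different nodes and the shuffle would move data over the network:

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


print(word_count(["big data", "big cloud"]))
```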
  77. What role is the cloud playing here?
  78. “Elastic” Scale-Out. Source: http://www.cloudcomputingpatterns.org/Continuously_Changing_Workload
  79. “Elastic” Scale-Out of
  80. “Elastic” Scale-Out of CPU Cores
  81. “Elastic” Scale-Out of CPU Cores, Storage
  82. “Elastic” Scale-Out of CPU Cores, Storage
  83. “Elastic” Scale-Out of CPU Cores, Storage, Memory
  84. “Elastic” Scale-Out of CPU Cores, Storage, Memory
  85. “Elastic” Scale-Out: linear. Source: http://www.cloudcomputingpatterns.org/Elastic_Platform
  86. “Elastic” Scale-Out: linear. Source: http://www.cloudcomputingpatterns.org/Elastic_Platform
  87. BigData Scale-Out: How do Databases Scale-Out?
  88. BigData Scale-Out: How do Databases Scale-Out?
  89. How do Databases Scale-Out? Shared Disk Architectures
  90. How do Databases Scale-Out? Shared Disk Architectures
  91. How do Databases Scale-Out? Shared Nothing Architectures
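In a shared-nothing architecture each node owns a disjoint part of the data and no storage is shared between nodes. A minimal sketch of one common way to decide ownership, hash partitioning, assuming string keys and a fixed number of nodes. The class `SharedNothingCluster` is hypothetical and stands in for a set of independent database nodes; it is not the design of any particular product:

```python
import zlib


class SharedNothingCluster:
    """Toy model: every node is a private dict that no other node can see."""

    def __init__(self, num_nodes):
        self.nodes = [dict() for _ in range(num_nodes)]

    def _owner(self, key):
        # A stable hash decides which single node owns the key.
        return zlib.crc32(key.encode("utf-8")) % len(self.nodes)

    def put(self, key, value):
        self.nodes[self._owner(key)][key] = value

    def get(self, key):
        return self.nodes[self._owner(key)].get(key)


cluster = SharedNothingCluster(4)
cluster.put("user:42", {"name": "Alice"})
print(cluster.get("user:42"))
```

Because every key lives on exactly one node, adding nodes adds CPU, memory and storage together. The simple modulo scheme shown here would, however, move most keys whenever the node count changes.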
  92. Born on the cloud Databases. Source: http://www.constructioncloudcomputing.com/wp-content/uploads/2010/10/dreamstime_7360880-480x300.jpg Source: http://www.cloudcomputingpatterns.org/Execution_Environment
  93. Google AppEngine
      Google App Engine is a Platform as a Service (PaaS) offering that lets you build and run applications on Google’s infrastructure. App Engine applications are easy to build, easy to maintain, and easy to scale as your traffic and data storage needs change. With App Engine, there are no servers for you to maintain. You simply upload your application and it’s ready to go.
      Source: http://www.cloudcomputingpatterns.org/Platform_as_a_Service_%28PaaS%29
  94. Google AppEngine Database Services
  96. IBM BlueMix
      BlueMix is a Platform as a Service cloud based on Cloud Foundry, employing enterprise-grade services enriched with IBM software and hosted at SoftLayer.
  97. IBM BlueMix, a Cloud Foundry runtime
      [Diagram labels: Linux VMs, containers, droplets (code + runtime + framework); services: SQL, Push, SSO, ...]
  98. Summary
      ● BigData is born on the cloud
      ● Cloud facilitates resource provisioning, configuration and deployment
      ● Highly innovative area
        ● Technology
        ● UseCases
      Links
      ● http://en.wikipedia.org/wiki/MapReduce
      ● http://www.se-radio.net/2013/12/episode-199-michael-stonebraker/
      ● Sign up for the free BlueMix beta: http://bluemix.net
      ● Come to the BlueMix Days: http://bit.ly/1lsIY8J
      ● Use our software: BigInsights: http://www.ibm.com/software/data/infosphere/biginsights/quick-start/