“Cloud Computing y Big Data, próxima frontera de la innovación”

Conference on Big Data and Cloud organized by the Fundación Areces on Thursday, March 21, in Madrid.
Attached are the slides from my talk, titled “Cloud Computing y Big Data, próxima frontera de la innovación” (“Cloud Computing and Big Data, the next frontier of innovation”).
More information at http://www.jorditorres.org/jornada-el-impacto-de-la-nube-y-el-big-data-en-la-ciencia/

  1. 1. Cloud Computing y Big Data, próxima frontera de la innovación / Cloud Computing and Big Data, the next frontier of innovation. Jordi Torres, UPC-BSC. Madrid, 21 March 2013
  2. 2. Resumen (Abstract). A major advance in science came centuries ago, when mathematical theory made it possible to formalize experimentation. Another fundamental step came with the appearance of computers. Thanks to them, today we have powerful supercomputers whose simulations let us create scenarios that are expensive, dangerous, or even impossible to reproduce in real life: a real advance for science and progress. Unfortunately, until now the power of supercomputing has not been within everyone's reach; it has been limited to a small set of research groups because of the cost of building and maintaining large infrastructures of this kind. This is changing with the arrival of what is known as Cloud Computing, which is already allowing many other fields of science that could not previously benefit from this technology to do so. However, since the data available for these calculations has now reached a very large scale, known as Big Data, current computing systems face new challenges that computer science itself has begun to address. This presentation discusses the characteristics of the new reality formed by the Cloud and Big Data, emphasizing the new challenges that must be addressed urgently to meet the needs of the advancement of science.
  3. 3. Summary. A major breakthrough in science occurred centuries ago, when mathematical theory enabled formalized experimentation. Another important step in the advancement of science came with the advent of computers. As a result, today we have powerful supercomputers that, through simulations, allow us to create scenarios that are expensive, dangerous, or even impossible to replicate in real life. A breakthrough for science and progress. Unfortunately, so far the power of supercomputing has not been available to everyone. It is restricted to a limited set of research groups, due to the costs of creating and maintaining large infrastructures like this. With the advent of what is known as cloud computing, this situation is changing. Today, many other fields of science that until now could not benefit from this technology are able to take advantage of it. However, the data available to perform calculations has now reached a very large scale, known as Big Data, and current computer systems face new challenges that computer science itself has begun to address. In this presentation we will discuss the characteristics of the new reality that the Cloud and Big Data make up. We will emphasize the new challenges that must be addressed urgently in order to respond to the needs of the advancement of science.
  4. 4. HOW DID SCIENCE START?
  5. 5. Source: Prof. Mateo Valero, BSC-CNS 2010
  6. 6. Source: Prof. Mateo Valero, BSC-CNS 2010
  7. 7. HOW IS SCIENCE ADVANCING TODAY?
  8. 8. Source: Prof. Mateo Valero, BSC-CNS 2010
  9. 9. Source: Prof. Mateo Valero, BSC-CNS 2010
  10. 10. MATHEMATICAL CALCULATIONS? WHERE?
  11. 11. MN3
      Compute: Cores/chip 8; Chips/node 2; Cores/node 16; Nodes 3028; Total cores 48448; Freq. (GHz) 2.6
      Performance: Gflops/core 20.8; Gflops/node 332.8; Total Tflops 1000.0
      Memory: GB/core 2; GB/node 32; Total (TB) 96.89
      Network: Latency (μs) 0.7; Bandwidth (Gb/s) 40
      Storage (TB): 2000
      Consumption (kW): 1080
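As a rough cross-check, the aggregate MN3 figures can be derived from the per-chip and per-node numbers on the slide. This is only a sketch: the value of 8 floating-point operations per core per cycle is an assumption (it is what 20.8 Gflops/core at 2.6 GHz implies, but the slide does not state it), and the variable names are illustrative.

```python
# Figures as given on the MN3 slide.
CORES_PER_CHIP = 8
CHIPS_PER_NODE = 2
NODES = 3028
FREQ_GHZ = 2.6
GB_PER_CORE = 2

# Assumption (not on the slide): 8 floating-point operations per core per cycle.
FLOPS_PER_CYCLE = 8

cores_per_node = CORES_PER_CHIP * CHIPS_PER_NODE
total_cores = cores_per_node * NODES
gflops_per_core = FREQ_GHZ * FLOPS_PER_CYCLE
gflops_per_node = gflops_per_core * cores_per_node
total_tflops = gflops_per_node * NODES / 1000      # Gflops -> Tflops
total_memory_tb = GB_PER_CORE * total_cores / 1000  # GB -> TB

print(f"cores/node   : {cores_per_node}")
print(f"total cores  : {total_cores}")
print(f"Gflops/node  : {gflops_per_node:.1f}")
print(f"total Tflops : {total_tflops:.1f}")
print(f"memory (TB)  : {total_memory_tb:.2f}")
```

Under these assumptions the peak comes out slightly above the 1000.0 Tflops on the slide, which appears to be a rounded figure.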
  12. 12. FOR SOME SPANISH RESEARCH GROUPS!
  13. 13. AND…FOR THE REST OF THE WORLD?
  14. 14. GOOD NEWS! Source: http://news.cnet.com/8301-13846_3-57349321-62/amazon-takes-supercomputing-to-the-cloud
  15. 15. CLOUD COMPUTING?
  16. 16. Source: http://www.wired.com/wiredenterprise/2011/12/nonexistent-supercomputer/all/1
  17. 17. 40 MW, 28,000 m². Source: http://www.facebook.com/media/set/?set=a.190842620965185.47008.140375289345252
  18. 18. Photo: Google
  19. 19. HUGE DATA CENTERS: larger than 4 football pitches. Photo: Google
  20. 20. Source: http://www.google.com/about/datacenters/gallery/images
  21. 21. Source: http://www.google.com/about/datacenters/gallery/images
  22. 22. Source: http://www.google.com/about/datacenters/gallery/images
  23. 23. Different IT production. Photo: J.T.
  24. 24. CLOUD COMPUTING: IT as a service. On-demand self-service; pay per use; rapid elasticity; ubiquitous access; ... Source: http://www.telegraph.co.uk/technology/reviews/9241719/Power-Ethernet-Sockets-review.html
  25. 25. Example of benefits (IaaS): 1 computer in a rack for 120 hours = 120 computers in three racks for 1 hour (the same 120 machine-hours). Idea: Tutorial SC2011, Robert Grossman
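The IaaS example above can be sketched as simple arithmetic: with pay-per-use pricing, the bill depends on machine-hours, not on how they are split between machines and time. The price below is a hypothetical value chosen only for illustration, and `rental_cost` is an illustrative helper, not any provider's API.

```python
import math

def rental_cost(machines: int, hours: float, price_per_machine_hour: float) -> float:
    """Cost of renting identical machines under pay-per-use pricing."""
    return machines * hours * price_per_machine_hour

PRICE = 0.10  # hypothetical price per machine-hour, for illustration only

one_machine = rental_cost(1, 120, PRICE)     # 1 computer for 120 hours
many_machines = rental_cost(120, 1, PRICE)   # 120 computers for 1 hour

print(f"1 machine x 120 h  : {one_machine:.2f}")
print(f"120 machines x 1 h : {many_machines:.2f}")
print("same cost:", math.isclose(one_machine, many_machines))
```

The cost is the same in both cases, but the second configuration returns the result 120 times sooner, assuming the work parallelizes perfectly.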
  26. 26. AND DATA?
  27. 27. Source: http://www.docuciencia.es/2009/05/lhc-el-acelerador-de-particulas/ “… the LHC produces 1 PetaByte of data every second, big data and lack of computing resources were becoming the European Organization for Nuclear Research’s biggest IT challenges…” Source: computerweekly.com/news/2240173897/CERN-adopts-OpenStack-private-cloud-to-solve-big-data-challenges
  28. 28. 1 Gigabyte (GB) = 1,000,000,000 bytes
      1 Terabyte (TB) = 1,000 Gigabytes (GB)
      1 Petabyte (PB) = 1,000,000 Gigabytes (GB)
      1 Exabyte (EB) = 1,000,000,000 Gigabytes (GB)
      1 Zettabyte (ZB) = 1,000,000,000,000 Gigabytes (GB)
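The decimal unit table above can be expressed as a small lookup. This is a minimal sketch using the powers of ten from the slide; the names `BYTES_PER_UNIT` and `to_gigabytes` are illustrative.

```python
# Decimal (SI) storage units, in bytes, as listed on the slide.
BYTES_PER_UNIT = {
    "GB": 10**9,
    "TB": 10**12,
    "PB": 10**15,
    "EB": 10**18,
    "ZB": 10**21,
}

def to_gigabytes(value: int, unit: str) -> int:
    """Convert a whole number of `unit` into gigabytes."""
    return value * BYTES_PER_UNIT[unit] // BYTES_PER_UNIT["GB"]

for unit in ("TB", "PB", "EB", "ZB"):
    print(f"1 {unit} = {to_gigabytes(1, unit):,} GB")
```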
  29. 29. Deluge of data created daily. Source: The Economist, Feb 25th, 2010, http://www.economist.com/node/15579717
  30. 30. Big Data? Definition?
  31. 31. BIG DATA? Big Data is data that exceeds the storing, processing and managing capacity of conventional systems.
  32. 32. BIG DATA? The reason is that the data is too big, moves too fast, or doesn’t fit the structures of our current systems’ architectures.
  33. 33. BIG DATA? Moreover, to gain value from this data, we must change the way we analyze it.
  34. 34. BIG DATA? Big Data is data that exceeds the storing, processing and managing capacity of conventional systems. The reason is that the data is too big, moves too fast, or doesn’t fit the structures of our current systems’ architectures. Moreover, to gain value from this data, we must change the way we analyze it.
  35. 35. NEW CHALLENGES that must be addressed urgently in order to respond to the needs of the advancement of science: 1. Storing 2. Managing 3. Processing 4. Analyzing
  36. 36. Affordable Storage
  37. 37. But scanning disks… assume 100 MB/s
  38. 38. But scanning disks… assume 100 MB/s: scanning a 2 TB disk takes more than 5 hours
  39. 39. Approach: massive parallelism. Assume 20,000 disks: scanning 2 TB takes 1 second. Source: http://www.google.com/about/datacenters/gallery/images/_2000/IDI_018.jpg
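The scan-time figures on the last three slides follow from a short calculation. The sketch below assumes the 2 TB are spread evenly over the disks and that all disks are read fully in parallel with no coordination overhead; the function name `scan_seconds` is illustrative.

```python
def scan_seconds(data_bytes: int, disks: int = 1, mb_per_sec_per_disk: int = 100) -> float:
    """Time to read `data_bytes` striped evenly across `disks` read in parallel."""
    bytes_per_disk = data_bytes / disks
    return bytes_per_disk / (mb_per_sec_per_disk * 10**6)

TWO_TB = 2 * 10**12  # decimal terabytes, as in the unit table

single_disk = scan_seconds(TWO_TB)                # one disk at 100 MB/s
many_disks = scan_seconds(TWO_TB, disks=20_000)   # 20,000 disks in parallel

print(f"1 disk       : {single_disk:.0f} s ({single_disk / 3600:.2f} hours)")
print(f"20,000 disks : {many_disks:.0f} s")
```

A single disk needs 20,000 seconds (about 5.6 hours), while 20,000 disks each read only 100 MB and finish in about one second.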
  40. 40. 1 Data processing challenges. Rethinking data processing is required: MapReduce, Storm, S4,… Source: http://www.google.com/about/datacenters/gallery/images/_2000/IDI_018.jpg
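To give a feel for the MapReduce model named above, here is a minimal single-process sketch of a word count. It only imitates the map, shuffle and reduce phases in plain Python; it is not the API of Hadoop or any other framework, and all function names are illustrative.

```python
from collections import defaultdict
from itertools import chain

def map_phase(document: str):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key: str, values):
    """Reduce: combine all values emitted for one key."""
    return key, sum(values)

def word_count(documents):
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

counts = word_count(["big data cloud", "cloud computing", "big cloud"])
print(counts)
```

In a real deployment each map call and each reduce call could run on a different machine, which is what makes the model fit the massive parallelism discussed on the previous slide.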
  41. 41. 2 Data storage challenges. New storage technologies are required. RAM vs HDD: HDD is 100 times cheaper than RAM, but 1000 times slower. Present solutions: solid-state drive (SSD), non-volatile. Research: Storage Class Memory (SCM)
  42. 42. 3 Data management challenges. Relational DB can’t support everything. Example: eventual consistency. Solution: “NoSQL systems”. Research: New management systems. Source: gigaom.com/cloud/big-data-and-nosql-march-to-the-enterprise/
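Eventual consistency, the example given above, can be illustrated with a toy model: replicas may briefly disagree after a write, but converge once they exchange updates. The sketch assumes a last-write-wins rule based on a logical timestamp, which is only one of several possible conflict-resolution strategies; `Replica` and `sync` are hypothetical names, not part of any NoSQL product.

```python
class Replica:
    """One copy of a key-value store; each value carries a logical timestamp."""

    def __init__(self):
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        # Last-write-wins (assumed rule): keep only the newest value per key.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None

def sync(a: Replica, b: Replica) -> None:
    """Exchange entries so that both replicas hold the newest value per key."""
    for key, (ts, value) in list(a.data.items()):
        b.write(key, value, ts)
    for key, (ts, value) in list(b.data.items()):
        a.write(key, value, ts)

r1, r2 = Replica(), Replica()
r1.write("x", "old", timestamp=1)
sync(r1, r2)
r2.write("x", "new", timestamp=2)   # the update reaches only one replica
stale = r1.read("x")                # r1 still serves the previous value
sync(r1, r2)                        # replicas exchange updates later
converged = (r1.read("x"), r2.read("x"))
print(stale, converged)
```

Between the write and the second `sync`, a reader of `r1` sees stale data; this temporary inconsistency is the trade-off such systems accept in exchange for availability and scale.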
  43. 43. 4 Obtaining value from data. Information is not actionable knowledge. Prediction using data mining and machine learning techniques. [Diagram: Data, Information, Knowledge, with + and - scales for Value and Volume.] Research: The majority of algorithms work well on thousands of records; however, at the moment they are impractical for thousands of millions.
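One way to read the research remark above is through how running time grows with the number of records. The sketch below compares three growth rates; the machine speed and the choice of a quadratic algorithm are illustrative assumptions, not claims about any specific data mining method.

```python
import math

OPS_PER_SECOND = 10**9  # hypothetical machine speed, for illustration only

def estimated_seconds(n: int, cost) -> float:
    """Estimated running time for n records, given a cost function in operations."""
    return cost(n) / OPS_PER_SECOND

def linear(n):
    return n

def n_log_n(n):
    return n * math.log2(n)

def quadratic(n):
    return n * n

for n in (10**3, 10**9):  # thousands vs thousands of millions of records
    print(f"n = {n:>13,}")
    for name, cost in (("linear", linear), ("n log n", n_log_n), ("quadratic", quadratic)):
        print(f"  {name:<9}: {estimated_seconds(n, cost):.3g} s")
```

Under these assumptions a quadratic algorithm finishes in about a millisecond on a thousand records but would need on the order of a billion seconds, roughly three decades, on a thousand million.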
  44. 44. Cloud Computing and Big Data: the next frontier of science and innovation
  45. 45. Thank you for your attention. www.JordiTorres.org - @JordiTorresBCN. www.smartcityexpo.com. www.bsc.es/eBusiness: Autonomic Systems and e-Business Platforms research line at BSC/UPC
