© 2013 IBM Corporation, axel.koester@de.ibm.com
IBM Big Data Usergroup, May 2013
Dr. Axel Koester – Storage Chief Technologist, European Storage Competence Center
Future perspectives: the new Era of Computing
BIG DATA from a storage point of view
BIG DATA ≠ MUCH DATA
MUCH DATA
Example
www.extremetech.com
press release August 2011
Details of the 120 PB Cluster
No "Replication Cluster" like Amazon EC2 or Google Cloud
– no 3 copies of each data block
Why?
– Replication would require 550,000 instead of 200,000 disk drives
– and produce 30 instead of 12 drive failures per day (at identical net capacity)
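The trade-off on this slide can be checked with a rough calculation. The sketch below compares the raw-capacity overhead of 3-way replication with that of an 8+3 erasure code (the scheme named on the following slides), and derives the annual drive failure rate implied by "12 failures per day among 200,000 drives". The function names are illustrative. The simple model gives lower absolute drive counts than the slide, so treat it as an order-of-magnitude check rather than a reconstruction of IBM's sizing.

```python
def raw_per_net_replication(copies: int) -> float:
    """Raw bytes stored per net byte when every block is kept `copies` times."""
    return float(copies)


def raw_per_net_erasure(data: int, parity: int) -> float:
    """Raw bytes stored per net byte for a data+parity erasure code (e.g. 8+3)."""
    return (data + parity) / data


def implied_annual_failure_rate(failures_per_day: float, drives: int) -> float:
    """Fraction of the drive population failing per year at the given daily rate."""
    return failures_per_day * 365 / drives


if __name__ == "__main__":
    net_pb = 120   # net capacity from the slide
    drive_tb = 1   # 1 TB drives, as stated in the deck

    for label, factor in [
        ("3-way replication", raw_per_net_replication(3)),
        ("8+3 erasure code", raw_per_net_erasure(8, 3)),
    ]:
        drives = net_pb * 1000 * factor / drive_tb
        print(f"{label}: {factor:.3f}x raw, about {drives:,.0f} drives")

    # Slide figures: 12 failures per day among 200,000 drives.
    afr = implied_annual_failure_rate(12, 200_000)
    print(f"Implied annual failure rate: {afr:.1%}")
```

Fewer drives for the same net capacity means proportionally fewer daily failures, which is the point of the slide.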
RAID technology won't work in a 200,000 disk cluster
1 TB disk drives × 200,000
8+3 software RAID cluster
[Figure: cluster of JBOD enclosures, no spare drives]
http://www.almaden.ibm.com/storagesystems/projects/perseus/
There is always something broken, somewhere
8+3 Reed-Solomon encoding = tolerates up to 3 faults
[Figure: cluster of JBOD enclosures, no spare drives]
http://www.almaden.ibm.com/storagesystems/projects/perseus/
One fault: low rebuild thread priority
Two faults: prioritize rebuild
Three faults: rebuild ASAP (< 4 min 20 sec in this state)
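The escalation rule on this slide can be sketched as a small lookup. This is only an illustration of the policy as stated (one, two, or three faults in an 8+3 stripe), not the actual GPFS or Perseus implementation. The enum and function names are hypothetical.

```python
from enum import Enum

PARITY_STRIPS = 3  # an 8+3 code tolerates up to 3 lost strips per stripe


class RebuildPriority(Enum):
    NONE = "no rebuild needed"
    LOW = "low-priority background rebuild"
    HIGH = "prioritized rebuild"
    CRITICAL = "rebuild as soon as possible"
    DATA_LOSS = "stripe unrecoverable"


def rebuild_priority(faults: int) -> RebuildPriority:
    """Map the number of failed strips in one stripe to a rebuild priority."""
    if faults < 0:
        raise ValueError("faults must be non-negative")
    if faults == 0:
        return RebuildPriority.NONE
    if faults > PARITY_STRIPS:
        return RebuildPriority.DATA_LOSS
    return {
        1: RebuildPriority.LOW,
        2: RebuildPriority.HIGH,
        3: RebuildPriority.CRITICAL,
    }[faults]


if __name__ == "__main__":
    for n in range(5):
        print(n, "fault(s):", rebuild_priority(n).value)
```

Keeping single-fault rebuilds at low priority leaves bandwidth for normal I/O. Urgency rises only as the remaining redundancy shrinks.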
120 PB storage grid: one of many p775 storage enclosures
Dense Disk Enclosure – 384 disks per unit (192 front + 192 back)
2% Flash SSD for metadata
Power 775 rack with 3 "Dense Disk Enclosures" (1152 drives)
IBM GSS*: "120 PB predictive technology" for everyone
2× IBM x3650 M4 (2U each), running GPFS
4 modules (4U each) – 240 disks
240 SAS disks @ 4 TB: 1 PB (currently 720 TB with 3 TB disks)
16 GB/s
(*) GSS = GPFS Storage Server, RAID-less (GPFS = General Parallel File System)
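The capacity figures on this slide follow from simple multiplication. The short sketch below reproduces them. It computes raw capacity only (before any parity overhead), and the function names are illustrative.

```python
def raw_capacity_tb(disks: int, disk_tb: float) -> float:
    """Raw capacity in TB of `disks` drives of `disk_tb` TB each."""
    return disks * disk_tb


def disks_per_module(disks: int, modules: int) -> int:
    """Number of drives per module, assuming an even split."""
    return disks // modules


if __name__ == "__main__":
    print(raw_capacity_tb(240, 3))   # the "currently 720 TB" configuration
    print(raw_capacity_tb(240, 4))   # 960 TB, which the slide rounds to 1 PB
    print(disks_per_module(240, 4))  # drives in each of the 4 modules
```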
Cost per 1 MB of sequenced genomic data (chart: predicted vs. actual; source: www.crops.org)
For whom?
» Statistical Medicine
Sometimes…
MUCH DATA » BIG DATA
BIG DATA without MUCH DATA
Watson answers complex "trivia" questions from any subject area.
Real money is at stake.
Jeopardy! all-time champions Ken Jennings & Brad Rutter
You don't have this hereditary lack of pigment. You just need a little more sun!
Watson's "knowledge" is collected, not coded
Knowledge base = English WWW
Multiple millions of analyses per second
200 million memorized book pages, with Wikipedia alone totaling 2.25 million
10 km
~2000 years to read
Watson holds all its knowledge in RAM
90 × 32-core IBM Power® 750 / 16 TB RAM, 1 TB data, 500 GB/sec
out of 4 TB GPFS disk space, with 16 TB ≈ 15 TiB RAM
"information aggregator"
*Parallel access to 100% of the data, no Internet access during games
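The unit conversion and the core count on this slide can be verified in a few lines. The sketch assumes the standard definitions 1 TB = 10^12 bytes and 1 TiB = 2^40 bytes. The function names are illustrative.

```python
TB = 10**12   # terabyte (decimal)
TIB = 2**40   # tebibyte (binary)


def tb_to_tib(tb: float) -> float:
    """Convert decimal terabytes to binary tebibytes."""
    return tb * TB / TIB


def total_cores(servers: int, cores_per_server: int) -> int:
    """Total core count of a homogeneous cluster."""
    return servers * cores_per_server


if __name__ == "__main__":
    print(f"16 TB = {tb_to_tib(16):.2f} TiB")
    print(f"Cores: {total_cores(90, 32)}")
```

The exact result is about 14.55 TiB, which the slide rounds up to 15 TiB.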
Watson's inventors surprised by amazingly correct answers
Dr. Jennifer Chu-Carroll, Watson algorithms
Why? Because Watson's knowledge is collected, not coded.
Only the rules are coded, and even they are subject to machine learning.
BIG DATA and ENERGY
20 Watt
IBM Watson: 200,000 Watt
Human brains are incredibly good at "low power"
http://www.ibm.com/smarterplanet/us/en/business_analytics/article/cognitive_computing.html
20 Watt
Recognize a face in a crowd – efficiently
Distinguish one's own sounds from outside sounds
Combine unrelated facts
Filter & distill information
How?
Brains are not 100% accurate. Bit errors don't bother them.
"Lower power" from abandoning 100% bit-accurate IT
Unreliable input data → Approximation
– Patient symptoms → Root cause: Coronary Syndrome 60%, Pneumonia 25%, Pulmonary Embolism 9%
– Road traffic sensors → Congestion prediction: Ring A99 in 2 hrs 95%, Feeder A8 in 1 hr 90%
– Wind & sun forecast → Energy production: line load estimation, production mix %
Approximations based on unreliable data should not require bit-accurate processing!
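A small experiment makes this argument concrete. The sketch below averages simulated noisy sensor readings twice: once at full double precision, and once after rounding every reading to 16-bit half precision. The sensor model (true value 50, Gaussian noise with standard deviation 5) is an assumption chosen for illustration and is not taken from the slides. Only the readings are quantized here. The summation itself still runs in double precision.

```python
import random
import statistics
import struct


def to_half_precision(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (16-bit) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def compare_means(n: int = 1000, true_value: float = 50.0,
                  noise_std: float = 5.0, seed: int = 42):
    """Return (mean of exact readings, mean of half-precision readings)."""
    rng = random.Random(seed)
    readings = [rng.gauss(true_value, noise_std) for _ in range(n)]
    exact = statistics.fmean(readings)
    coarse = statistics.fmean([to_half_precision(r) for r in readings])
    return exact, coarse


if __name__ == "__main__":
    exact, coarse = compare_means()
    print(f"double-precision mean: {exact:.4f}")
    print(f"half-precision mean:   {coarse:.4f}")
    print(f"difference:            {abs(exact - coarse):.4f}")
```

The error from the coarser number format stays far below the noise already present in the input, so the extra bits add cost without adding information.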
Projects SyNAPSE and BlueBrain: Simulating the brain
http://www.ibm.com/smarterplanet/us/en/business_analytics/article/cognitive_computing.html
Understand learning down to synapses
Explore large-scale brain simulations
Design a chip that "learns" at molecular level
The brain transistor
no longer science fiction, since March
March 2013: IBM publishes a liquid-based transistor that processes data like the human brain
by IBM Fellow Stuart Parkin, inventor of "Racetrack Memory"
Droplet that can be turned into "liquid metal" and switch currents
Stuart Parkin: “We turn this material into a metal and maintain it without any need to supply power.”
metalized ions liquid
The programmable liquid can be the information conveyor, not just a bit cell
IT energy problems? Start learning from nature!
copied nature « » optimized engineering
axel.koester@de.ibm.com

Technology Outlook - The new Era of computing

  • 1. 1 © 2013 IBM Corporationaxel.koester@de.ibm.com IBM Big Data Usergroup, Mai 2013 Dr. Axel Koester – Storage Chief Technologist, European Storage Competence Center Future perspectives: the new Era of Computing BIG DATA from a storage point of view
  • 2. 2 © 2013 IBM Corporationaxel.koester@de.ibm.com IBM Big Data Usergroup, Mai 2013 BIG DATA ≠≠≠≠ MUCH DATA
  • 3. 3 © 2013 IBM Corporationaxel.koester@de.ibm.com IBM Big Data Usergroup, Mai 2013 MUCH DATA Example
  • 4. 4 © 2013 IBM Corporationaxel.koester@de.ibm.com IBM Big Data Usergroup, Mai 2013 www.extremetech.com press release August 2011
  • 5. 5 © 2013 IBM Corporationaxel.koester@de.ibm.com IBM Big Data Usergroup, Mai 2013 Details of the 120 PB Cluster No "Replication Cluster" like Amazon EC2 or Google Cloud – no 3 copies of each data block Why? – Would require 550.000 instead of 200.000 disk drives – and produce 30 instead of 12 daily failures (at identical net capacity)
  • 6. RAID technology won't work in a 200,000 disk cluster. 1 TB disk drives × 200,000. 8+3 software RAID cluster. [Diagram: grid of JBOD enclosures, no spare] http://www.almaden.ibm.com/storagesystems/projects/perseus/
  • 7. There is always something broken, somewhere. 8+3 Reed-Solomon encoding = tolerates up to 3 faults. [Diagram: grid of JBOD enclosures, no spare] http://www.almaden.ibm.com/storagesystems/projects/perseus/ One fault: low rebuild thread priority. Two faults: prioritize rebuild. Three faults: rebuild ASAP. < 4 min 20 sec in this state.
  • 8. 120 PB storage grid: one of many p775 storage enclosures. Dense Disk Enclosure: 384 disks per unit (192 front & back). 2% Flash SSD for metadata.
  • 9. Power 775 rack with 3 "Dense Disk Enclosures" (1152 drives)
  • 10. IBM GSS*: "120 PB predictive technology" for everyone. 2x IBM x3650 M4 (2U each) running GPFS, plus 4 modules (4U each) holding 240 SAS disks @ 4 TB: 1 PB, 16 GB/s (currently 720 TB @ 3 TB disks). (*) GSS = GPFS Storage Server, RAID-less (GPFS = General Parallel File System)
  • 11. Cost per 1 MB sequenced genomic data (www.crops.org). [Chart: predicted vs. actual cost] For whom? » Statistical Medicine
  • 12. Sometimes… MUCH DATA » BIG DATA
  • 13. BIG DATA without MUCH DATA
  • 14. Watson answers complex "trivia" questions from any subject area. Real money is at stake.
  • 15. Jeopardy! all-time champions Ken Jennings & Brad Rutter. You don't have this hereditary lack of pigment. You just need a little more sun!
  • 16. Watson's "knowledge" is collected, not coded. Knowledge base = English WWW. Multiple millions of analyses per second. 200 million memorized book pages, with Wikipedia alone totaling 2.25 million. [Illustration: 10 km] ~2000 years to read
  • 17. Watson holds all its knowledge in RAM. 90 × 32-core IBM Power® 750 / 16 TB RAM, 1 TB data, 500 GB/sec out of 4 TB GPFS disk space, with 16 TB ≈ 15 TiB RAM. "information aggregator" *Parallel access to 100% of the data, no Internet access during games
  • 18. Watson's inventors surprised by amazing correct answers (Dr. Jennifer Chu-Carroll, Watson algorithms). Why? Because Watson's knowledge is collected, not coded. Only rules are coded, but subject to machine learning.
  • 19. BIG DATA and ENERGY
  • 20. 20 Watt. IBM Watson: 200,000 Watt
  • 21. Human brains are incredibly good at "low power" (20 Watt). Recognize a face in a crowd, efficiently. Distinguish own from outside sound. Combine unrelated facts. Filter & distill information. How? Brains are not 100% accurate. Bit errors don't bother them. http://www.ibm.com/smarterplanet/us/en/business_analytics/article/cognitive_computing.html
  • 22. "Lower power" from abandoning the 100% bit-accurate IT. Unreliable input data → approximation. Patient symptoms → root cause: Coronary Syndrome 60%, Pneumonia 25%, Pulmonary Embolism 9%. Road traffic sensors → congestion prediction: Ring A99 in 2 hrs 95%, Feeder A8 in 1 hr 90%. Wind & sun forecast → energy production: line load estimation, production mix %. Approximations based on unreliable data should not require bit-accurate processing!
  • 23. Projects SyNAPSE and BlueBrain: Simulating the brain. Understand learning down to synapses. Explore large-scale brain simulations. Design a chip that "learns" at molecular level. http://www.ibm.com/smarterplanet/us/en/business_analytics/article/cognitive_computing.html
  • 24. The brain transistor: no more science fiction, since March
  • 25. March 2013: IBM publishes a liquid-based transistor that processes data like the human brain
  • 26. By IBM Fellow Stuart Parkin, inventor of "Racetrack Memory": a droplet that can be turned into "liquid metal" and switch currents. Stuart Parkin: “We turn this material into a metal and maintain it without any need to supply power.” [Diagram labels: metalized ions, liquid] The programmable liquid can be the information conveyor, not just a bit cell
  • 27. IT energy problems? Start learning from nature! copied nature « » optimized engineering
  • 28. axel.koester@de.ibm.com
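
The storage slides (5 to 7) compare 3-way replication with 8+3 Reed-Solomon encoding and describe a rebuild priority that escalates with the number of faults. The following is a minimal Python sketch of that reasoning, not IBM's implementation. The function names and the priority labels are hypothetical. The drive counts quoted on slide 5 depend on further assumptions that the transcript does not state, so the sketch only computes the raw-to-net capacity overhead.

```python
from fractions import Fraction


def erasure_overhead(data_strips: int, parity_strips: int) -> Fraction:
    """Raw capacity needed per unit of net capacity for a k+m erasure code."""
    return Fraction(data_strips + parity_strips, data_strips)


def replication_overhead(copies: int) -> Fraction:
    """Raw capacity needed per unit of net capacity with full copies."""
    return Fraction(copies, 1)


def rebuild_priority(faults: int, parity_strips: int = 3) -> str:
    """Map the number of concurrent faults in one stripe to a rebuild priority.

    The labels are hypothetical; the slides only say "low priority",
    "prioritize rebuild" and "rebuild asap" for one, two and three faults.
    """
    if faults < 0:
        raise ValueError("faults must be non-negative")
    if faults > parity_strips:
        # More faults than parity strips: the stripe cannot be reconstructed.
        return "data loss"
    return {0: "none", 1: "low", 2: "high", 3: "critical"}[faults]


if __name__ == "__main__":
    rs = erasure_overhead(8, 3)      # 8+3 Reed-Solomon
    rep = replication_overhead(3)    # 3 copies of each block
    print(f"8+3 overhead: {float(rs):.3f}x raw per net byte")
    print(f"3-copy overhead: {float(rep):.3f}x raw per net byte")
    for n in range(5):
        print(n, "fault(s):", rebuild_priority(n))
```

Under these assumptions, 8+3 encoding needs 11/8 of the net capacity in raw disk space, against 3 times for triple replication, while both schemes in this sketch survive the loss of up to three strips or at least two copies respectively.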