Gaining business advantage from big data is moving beyond efficient storage and deep analytics on diverse data sources, toward applying AI methods and streaming analytics to capture insights and take action at the edge of the network.
https://hortonworks.com/webinar/accelerating-data-science-real-time-analytics-scale/
Accelerating Data Science and Real Time Analytics at Scale
1. Accelerating Data Science and Real-Time Analytics at Scale
Nadeem Asghar, Hortonworks, Field CTO and Global Head Partner Engineering
Steve Roberts, IBM, Big Data Offering Manager
2. [Figure: Data vs. Time, with labels Available Data, Understood Data, and Enterprise Amnesia]
• 80 million wearable health devices will be available by 2017.
• 2.5 quintillion bytes of data generated daily by connected machines.
• There will be 28 times more sensor-enabled devices than people by the year 2020.
• 25 gigabytes of data per hour is generated by a connected car. 90% of cars will be connected by 2020.
• 153 exabytes of healthcare data generated by devices in 2013. Increasing to 2,314 exabytes in 2020.
• 1.7 megabytes of data per second generated by every human being on the planet by 2020.
26. IBM Power Systems designed to deliver breakthrough performance for data
MORE vs. x86 + BETTER
• threads per core MEANS get more work done
• processor cache (L1 ↔ L4) MEANS fastest memory lives on cores
• memory bandwidth MEANS more data than ever is flowing
• open innovation (COMMUNITY) MEANS faster innovation and value
availability | scalability | reliability | serviceability
27. Accelerate Data Science with Power Systems
Test results are based on running a machine learning workload that uses the k-means clustering algorithm on data sets ranging in size from 1 GB to 15 GB. Test system details:
• Power Systems S822LC for HPC: 20 cores, 512 GB RAM and SSD
• Power Systems S822LC for Big Data: 20 cores, 512 GB, HDDs
• Intel server with Broadwell E5-2640 v4: 20 cores, 512 GB and SSD
• Intel server with Broadwell E5-2699 v4: 44 cores, 512 GB, HDD
• Increase Data Science Team productivity
• Reduce model training time
− 2.5X with S822LC for HPC vs E5-2640 v4 (with SSD)
− 1.5X with S822LC for Big Data vs E5-2699 v4 (with HDD)
• Leverage larger datasets for model training
− 2.5X larger dataset in the same time (1200 seconds: ~5 GB for x86 server E5-2640 v4 with SSD vs 13 GB for Power server S822LC for HPC with SSD)
[Chart: Elapsed time to form 5 clusters in 100 iterations using k-means clustering with one user. X-axis: Data Size (GB), 0 to 16. Y-axis: Elapsed Time (seconds), 0 to 3600. Series: S822LC HPC with SSD, S822LC BigData with HDD, E5 2699 v4 with HDD, E5 2640 v4 with SSD.]
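For readers unfamiliar with the workload being timed, k-means with 5 clusters and 100 iterations can be sketched in a few lines of plain Python. This is only an illustration of the algorithm (Lloyd's iteration), not the benchmark code: the slides do not say which software ran the test, and the names `make_blobs`, `squared_distance`, and `kmeans`, the synthetic 2-D data, and the seeds are all assumptions made for the example.

```python
import random


def make_blobs(n_per_blob=40, seed=1):
    """Synthetic 2-D points around 5 hypothetical centers (illustration only)."""
    rng = random.Random(seed)
    centers = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 5)]
    return [
        (cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
        for cx, cy in centers
        for _ in range(n_per_blob)
    ]


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k=5, iterations=100, seed=0):
    """Run a fixed number of Lloyd iterations; return (centroids, labels)."""
    rng = random.Random(seed)
    # Initialise centroids with k distinct input points chosen at random.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: label each point with its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return centroids, labels
```

Calling `centroids, labels = kmeans(make_blobs(), k=5, iterations=100)` returns five centroids and one label per point. The sketch always runs the full iteration count rather than stopping at convergence, which mirrors the fixed "100 iterations" framing of the chart. Every iteration scans the whole data set, which is consistent with elapsed time growing as the data size grows.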
28. The Perfect Blend of Data Science and an Enterprise Data Lake
Better Together
datascience.ibm.com
• Boost Data Science Team Productivity: model training in less than half the time versus x86
• Blazing Fast Insights for Line of Business: A 1.7x improvement in time to result
• Secure and Reliable Data Access at Scale: Open, comprehensive data lifecycle and security management on the most reliable servers.
For clients building a high performing Data Science practice with a fast, scalable, enterprise Data Lake
A complete solution of Data Science and Hadoop software, hardware and quick start services.
30. How to Get Started with Hortonworks on OpenPOWER Systems
• Learn more about the benefits of IBM Power Systems and OpenPOWER
• Join the Hortonworks Community: https://community.hortonworks.com/
• Learn more about the benefits of Hortonworks: http://hortonworks.com/training/
• Sign up for Free Data Science and Cognitive Computing courses: https://cognitiveclass.ai/
• Try the solution: IBM benchmark centers, on the cloud or on your premises