© Copyright IBM Corporation 2015. Technical University/Symposia materials may not be reproduced in whole or in part without the prior written permission of IBM.
What Is Big Data?
Architectures and Practical Use Cases
Tony Pearson
Master Inventor and Senior IT Specialist
IBM Corporation
IBM Systems Technical University, October 5-9 | Hilton Orlando
© Copyright IBM Corporation 2015. Technical University/Symposia materials
may not be reproduced in whole or in part without the prior written permission of
IBM.
Abstract
Do you understand the storage
implications of big data analytics?
This session will explain what big
data is, provide some practical use
cases, then explain the IBM
products that support big data
This week with Tony Pearson
Day        Time     Topic
Monday     10:15am  Opening Session – Storage
           01:45pm  IBM's Cloud Storage Options
Tuesday    11:30am  Software Defined Storage -- Why? What? How? (repeats Friday)
           03:15pm  The Pendulum Swings Back – Understanding Converged and Hyperconverged Environments
           04:30pm  New Generation of Storage Tiering: Less Management, Lower Cost and Increased Performance
Wednesday  09:00am  What Is Big Data? Architectures and Practical Use Cases
           01:45pm  Data Footprint Reduction – Understanding IBM Storage Efficiency Options
           03:15pm  IBM Spectrum Virtualize – SVC, Storwize and FlashSystem V9000 (repeats Friday)
Thursday   10:15am  IBM Spectrum Scale and Elastic Storage Offerings
           01:45pm  IBM Spectrum Scale for File and Object storage
           03:15pm  IBM Storage Integration with OpenStack
           05:45pm  Meet the Experts
Friday     09:00am  Software Defined Storage -- Why? What? How?
           10:15am  IBM Spectrum Virtualize – SVC, Storwize and FlashSystem V9000
Agenda
What is Big Data?
Big Data Use Cases
IBM Analytics Platform
IBM Spectrum Scale
What is Big Data?
Data sets so large and complex
that they become difficult to process
using relational databases
The challenges include capture,
curation, storage, search, sharing,
transfer, analysis and visualization
Analysis of a single large set of
related data allows correlations to
be found
Can be used to identify trends,
patterns and insights to make
better decisions
Source: Wikipedia
Traditional Decision Making Process
1. Gather data: Database Administrators maintain the System of Record (transaction and application data)
2. Analyze: Business Analysts use Extract, Transform, Load (ETL), OLAP cubes, batch processing and reports
3. Make decisions: Business Executives
– Strategic planning based on historical analysis and speculation
– Day-to-day operations based on reports, news, intuition
What has Changed in the Last Few Decades?
[Chart: analog data versus digital data, 1986 versus 2015; values shown: 6% and 99%]
Sources of data: transaction and application data, machine data, social media and email, enterprise content
20% structured data, 80% unstructured data
New Sources of Data to Analyze –
the Four V’s of big data
Volume
– Scale of data has grown beyond
relational database capabilities
Variety
– Machine data, enterprise content,
and social media and email
Velocity
– Computing has advanced to
receive and analyze real-time
data streams
Veracity
– How much can you trust that the data
is right and accurate?
Step 1: Gather and identify sources of data
[Diagram: System of Record, System of Engagement and System of Insight, with Database Administrators and Storage Administrators]
Data sources:
– Transaction and application data
– Machine data, log data
– Social media, photos, audio, video, email
– Enterprise content
Data is the New Oil
In its raw form, oil has little value…
Once processed and refined, it helps to power the world!
New Capabilities to Analyze the Data
Step 2: Analyze data
– Business Analyst: structured, repeatable, linear analysis
– Data Scientist: unstructured, exploratory, iterative analysis
Capabilities shown: Reports, OLAP cube, Visualization and Discovery, Hadoop, Data warehousing, Stream Computing, Text Analytics, Integration and Governance
What does a Data Scientist do?
“It’s no longer hard to find the answer to a
given question; the hard part is finding the
right question. And as questions evolve, we
gain better insight into our ecosystem and
our business.”
-- Kevin Weil, Lead Analyst at Twitter
A data scientist must have…
– Strong business acumen
– Modeling, statistics, analytics and math skills
– Ability to communicate findings, tell a story
from the data, to both business and IT leaders
Inquisitive: exploring, doing “what if?”
analyses, questioning existing assumptions
and processes to spot trends, patterns and
hidden insight.
Computers are useless.
They can only give you
answers.
– Pablo Picasso
Source: http://www-01.ibm.com/software/data/infosphere/data-scientist/
http://blog.cloudera.com/blog/2010/09/twitter-analytics-lead-kevin-weil-and-a-presenter-at-hadoop-world-interviewed/
Data Information Knowledge Wisdom (DIKW)
Wisdom (Applied): I better stop the car!
Knowledge (Context): The traffic light I am driving towards has turned red
Information (Meaning): South-facing light at corner of Pitt and George streets has turned red
Data (Raw): červený; 685 nm, 421 THz; #FF0000
Source: http://legoviews.com/2013/04/06/put-knowledge-into-action-and-enhance-organisational-wisdom-lsp-and-dikw/
Better Decisions for New Business Outcomes
Step 3: Make decisions and take action (Business Executive, Empowered Employees)
– Day-to-day operations based on real-time analytics
– Strategic planning based on science, trends, patterns and insight
Outcomes:
– Know everything about your customers
– Innovate new products at speed and scale
– Instant awareness of fraud and risk
– Exploit instrumented assets
– Run zero-latency operations
Decision Making Process in the Era of big data
1. Gather and identify sources of data: Database Administrators, Storage Administrators
2. Analyze data: Data Scientists, Business Analysts (System of Insight: statistical models, real-time analytics, dashboard)
3. Make decisions and take action: Business Executives, Empowered Employees
– Strategic planning based on science, trends, patterns and insight
– Day-to-day operations based on real-time analytics
Practical Use Cases – The Analytics Landscape
[Chart: Competitive Advantage versus Degree of Complexity]
Category      Technique                Question answered
Descriptive   Standard Reporting       What happened?
Descriptive   Ad hoc reporting         How many, how often, where?
Descriptive   Query/drill down         What exactly is the problem?
Descriptive   Alerts                   What actions are needed?
Predictive    Simulation               What could happen…?
Predictive    Forecasting              What if these trends continue?
Predictive    Predictive modeling      What will happen next if…?
Prescriptive  Optimization             How can we achieve the best outcome?
Prescriptive  Stochastic Optimization  How can we achieve the best outcome including the effects of variability?
Based on: Competing on Analytics, Davenport and Harris, 2007
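The three categories on this slide can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the sales figures, the candidate order sizes, the unit cost and price, and the function names are hypothetical and do not come from the presentation or from any IBM product.

```python
from statistics import mean

# Hypothetical monthly sales figures, invented for illustration only.
monthly_sales = [100.0, 110.0, 120.0, 130.0]


def describe(values):
    """Descriptive: summarize what already happened."""
    return {"total": sum(values), "average": mean(values)}


def forecast_next(values):
    """Predictive: fit a least-squares trend line and extrapolate one step."""
    n = len(values)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(values)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return intercept + slope * n


def best_order_quantity(expected_demand, candidates, unit_cost=6.0, unit_price=10.0):
    """Prescriptive: choose the candidate order size with the highest profit."""

    def profit(quantity):
        sold = min(quantity, expected_demand)
        return sold * unit_price - quantity * unit_cost

    return max(candidates, key=profit)


summary = describe(monthly_sales)                       # what happened?
predicted = forecast_next(monthly_sales)                # what if the trend continues?
order = best_order_quantity(predicted, [120, 140, 160])  # what should we do?
```

The prescriptive step consumes the output of the predictive step, which is one way to read the progression along the slide's complexity axis.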
Innovate New Products and Services at Speed and Scale
Vestas, the world’s largest wind energy company, was able to use
big data and IBM technology to increase wind power generation
through optimal turbine placement.
Reducing the time to analyze petabytes of data with
IBM BigInsights software and IBM Spectrum Scale
“Before, it could take us three
weeks to get a response to
some of our questions simply
because we had to process a
lot of data. We expect that we
can get answers for the same
questions now in 15 minutes.”
– Lars Christian Christensen
If You are Not Paying for it…
Then you are not the Customer,
… You are the Product Being Sold!
How much is each
user worth to Social
Media companies?
Sources: Geek & Poke comic,
“Let’s Talk about Data” by Neha Mehta
360 Degree View of the Customer – A Demographic of One
Customer profile: Amy Bearn, 32, married, mother of 3, accountant
– Telco Score: 91
– CPG Score: 76
– Fashion Score: 88
Retailer: How valuable is Amy to my retail sales? Who does she influence? What do they spend?
Telco company: How valuable is Amy to my mobile phone network? How likely is she to switch carriers? How many other customers will follow?
[Diagram: Social Network, Public Database and Calling Network combined into a Merged Network]
Run Zero-Latency Operations
Deep individual customer insight:
• Preferences
• Interests
• Likes
Real-time decision:
• Direct: initiate direct response
• Channel: initiate channel response
• Workflow: initiate process or workflow
• Enrich: enrich customer profile
How Target® Figured Out a Teen Girl Was Pregnant
Before Her Father Did
Every time you go shopping, you share intimate
details about your consumption patterns with
retailers.
Target has figured out how to data-mine whether
you have a baby on the way
Looked at historical buying data for all the ladies
who had signed up for Target baby registries
– Unscented soaps and lotions
– Calcium, magnesium and zinc supplements
About 25 products help generate “pregnancy
prediction” score and her “baby due date”
Target sends coupons timed to very specific
stages of her pregnancy
Source: http://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did/
“My daughter got this in the mail. She’s
still in high school, and you’re sending
her coupons for baby clothes and cribs?”
-- Angry father of teen girl
“I had a talk with my daughter,…She’s due
in August. I owe you an apology.”
-- Same father, 3 days later
Exploit Instrumented Assets
Doctors from University of Ontario apply big data to
neonatal infant monitoring to predict infection
Detect Neonatal Patient Symptoms
Up to 24 Hours sooner
Continuously correlate data
Thousands of events
each second
Signal Processing
and Data Cleansing
Heart Rate Variability
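The slide names heart rate variability as one of the monitored signals but does not say how it is computed. As a hedged sketch only, the following uses RMSSD (root mean square of successive differences), a common variability measure, over a sliding window of beat-to-beat intervals. The choice of RMSSD, the window size, the threshold and the idea of alerting on low variability are all assumptions for illustration, not the method used in the study, and the numbers have no clinical meaning.

```python
from collections import deque
from math import sqrt


def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between heartbeats (ms)."""
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    if not diffs:
        raise ValueError("need at least two RR intervals")
    return sqrt(sum(d * d for d in diffs) / len(diffs))


def low_variability_alerts(rr_stream, window=5, threshold_ms=10.0):
    """Yield (index, rmssd) whenever variability in the sliding window
    falls below the threshold. Window and threshold are illustrative."""
    recent = deque(maxlen=window)
    for index, rr in enumerate(rr_stream):
        recent.append(rr)
        if len(recent) == window:
            value = rmssd(list(recent))
            if value < threshold_ms:
                yield index, value


# Hypothetical RR intervals in milliseconds: variable at first, then nearly flat.
stream = [800, 850, 790, 840, 800, 801, 800, 801, 800, 801]
alerts = list(low_variability_alerts(stream))
```

Because the function is a generator over an iterable, it can be driven by a live feed one reading at a time, which matches the "continuously correlate data" idea on the slide.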
The IBM big data platform advantage
Analytic Applications: BI / Reporting, Exploration / Visualization, Functional App, Industry App, Predictive Analytics, Content Analytics
IBM big data platform:
– Visualization & Discovery, Application Development, Systems Management
– Accelerators
– Hadoop System, Stream Computing, Data Warehouse
– Information Integration & Governance
• The platform provides benefit
as you move from an entry
point to a second and third
project
• Shared components and
integration between systems
lowers deployment costs
• Key points of leverage
• Reuse text analytics across Streams and
BigInsights
• Hadoop connectors between Streams
and Information Integration
• Common integration, metadata and
governance across all engines
• Accelerators built across multiple engines
– common analytics, models, and
visualization
Simplify your data warehouse
Customer Need
– Business users are hampered by the poor
performance of analytics of a general-purpose
enterprise warehouse – queries take hours to
run
– Enterprise data warehouse is encumbered by
too much data for too many purposes
– Need to ingest huge volumes of structured data
and run multiple concurrent deep analytic
queries against it
– IT needs to reduce the cost of maintaining the
data warehouse
Value Statement
– Speed and Simplicity for deep analytics
– 100s to 1000s of users/second for operational
analytics
Customer examples
– Catalina Marketing – executing 10x the amount
of predictive workloads with the same staff
System for Transactions
System for Analytics
System for Operational Analytics
Get started with
IBM PureData Systems!
Ad-Hoc versus Operational Analytics
Analyze streaming data in Real time
Customer Need
– Harness and process streaming data
sources
– Select valuable data and insights to be
stored for further processing
– Quickly process and analyze perishable
data, and take timely action
Value Statement
– Significantly reduced processing time and
cost – process and then store what’s
valuable
– React in real-time to capture opportunities
before they expire
Customer examples
– Ufone – Telco Call Detail Record (CDR)
analytics for customer churn prevention
Get started with IBM Streams!
[Diagram: Streaming Data Sources → Source Adapters → Analytic Operators → Sink Adapters]
Components shown: Streams Studio IDE, Automated and Optimized Deployment, Streams Runtime Deployments, Visualization
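The "process and then store what's valuable" idea can be sketched as a chain of source adapter, analytic operator and sink adapter. This is plain Python using generators, not the IBM Streams API or its SPL language; the function names, the record layout and the dropped-call filter are hypothetical and chosen only to echo the call detail record example.

```python
from typing import Iterable, Iterator


def source_adapter(records: Iterable[dict]) -> Iterator[dict]:
    """Source adapter: emit records one at a time, as a live feed would."""
    for record in records:
        yield record


def dropped_call_filter(stream: Iterator[dict]) -> Iterator[dict]:
    """Analytic operator: keep only the records worth storing."""
    for record in stream:
        if record["dropped"]:
            yield record


def sink_adapter(stream: Iterator[dict], store: list) -> int:
    """Sink adapter: persist what survived the filter and return the count."""
    count = 0
    for record in stream:
        store.append(record)
        count += 1
    return count


# Hypothetical call detail records, invented for illustration.
cdrs = [
    {"caller": "A", "dropped": False},
    {"caller": "B", "dropped": True},
    {"caller": "A", "dropped": True},
]
stored: list = []
kept = sink_adapter(dropped_call_filter(source_adapter(cdrs)), stored)
```

Each record flows through the whole chain before the next one is read, so nothing is stored until it has been judged valuable.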
Dominant Players vs. Contender platforms
Category              Dominant Player      Contender platform      Supporters of Contender platform
OS                    Microsoft Windows    Linux                   IBM, RedHat, SUSE, Oracle and others
Tape                  Quantum DLT          Linear Tape Open (LTO)  IBM, HP, Certance and others
Cloud Management      Amazon Web Services  OpenStack               IBM, HP, Rackspace, RedHat, Dell, Cisco, VMware and others
Big Data & Analytics  Cloudera             Open Data Platform      IBM, Pivotal, Hortonworks and others
IBM InfoSphere BigInsights is a 100% standard Hadoop distribution
By default, open source components are always deployed
Elect to use proprietary capabilities depending on your needs
In some cases, proprietary capabilities offer significant benefits
Open standards first, but with freedom of choice
Open source components: HDFS, YARN, Hive, MapReduce, Pig
Proprietary capabilities: Spectrum Scale, Platform Symphony, Big SQL, Adaptive MapReduce, BigSheets
Benefits of the proprietary capabilities:
– Share data with non-Hadoop applications and simplify data management
– Re-use existing tools and expertise, avoid additional development costs
– Boost performance, support time-critical workloads, do more with less
– True multi-tenancy to boost service levels and avoid duplication of infrastructure
– Simplify access for end-users, minimize software development
IBM BigInsights for Apache Hadoop
Three new user-centric modules founded on an Open Data Platform
– IBM BigInsights Analyst (Business Analyst): Big SQL, BigSheets
– IBM BigInsights Data Scientist (Data Scientist): Text Analytics, System ML on Big R, Distributed R, Big SQL, BigSheets
– IBM BigInsights Enterprise Management (Administrator): Spectrum Scale, Platform Symphony
IBM Open Platform with Apache Hadoop is IBM’s own 100% open source Apache
Hadoop distribution. IBM will include the ODP common kernel when available.
Platform Symphony Integrates with Hadoop
YARN uses a pluggable architecture for schedulers.
– FIFO, Fair, and Capacity Schedulers implemented this way
– Symphony EGO is also implemented this way.
Therefore, the scheduler is completely transparent to YARN applications.
ISV Certification for Platform Symphony is not required.
As with other schedulers, queues and policies are defined in Platform Symphony EGO.
[Diagram: App1, App2 and App3 run on YARN (open source), which plugs in one of the FIFO, Fair, Capacity or Symphony EGO schedulers]
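The pluggable-scheduler idea can be sketched in a few lines. This is a toy illustration of the design pattern only: it is not YARN's Java scheduler interface and not Platform Symphony EGO, and the class names, the round-robin policy (a simplified stand-in for a fair policy) and the queue names are all hypothetical.

```python
from collections import deque
from typing import Optional, Protocol


class Scheduler(Protocol):
    """Hypothetical plug-in contract that every scheduler must satisfy."""

    def submit(self, queue: str, job: str) -> None: ...

    def next_job(self) -> Optional[str]: ...


class FifoScheduler:
    """Runs jobs strictly in arrival order, ignoring their queue."""

    def __init__(self) -> None:
        self._jobs: deque = deque()

    def submit(self, queue: str, job: str) -> None:
        self._jobs.append(job)

    def next_job(self) -> Optional[str]:
        return self._jobs.popleft() if self._jobs else None


class RoundRobinScheduler:
    """Simplified stand-in for a fair policy: alternates between queues."""

    def __init__(self) -> None:
        self._queues: dict = {}
        self._order: deque = deque()

    def submit(self, queue: str, job: str) -> None:
        if queue not in self._queues:
            self._queues[queue] = deque()
            self._order.append(queue)
        self._queues[queue].append(job)

    def next_job(self) -> Optional[str]:
        for _ in range(len(self._order)):
            queue = self._order[0]
            self._order.rotate(-1)  # move this queue to the back of the line
            if self._queues[queue]:
                return self._queues[queue].popleft()
        return None


class ResourceManager:
    """Application-facing side: unaware of which scheduler is plugged in."""

    def __init__(self, scheduler: Scheduler) -> None:
        self._scheduler = scheduler

    def submit(self, queue: str, job: str) -> None:
        self._scheduler.submit(queue, job)

    def drain(self) -> list:
        order = []
        while (job := self._scheduler.next_job()) is not None:
            order.append(job)
        return order


def run(scheduler: Scheduler) -> list:
    """Submit the same three applications and report the execution order."""
    manager = ResourceManager(scheduler)
    for queue, job in [("etl", "App1"), ("etl", "App2"), ("adhoc", "App3")]:
        manager.submit(queue, job)
    return manager.drain()
```

The applications call the same `submit` method in both runs; only the object passed to `ResourceManager` changes, which is the sense in which a scheduler can be transparent to applications.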
Spark, a Complement to Hadoop
• Spark complements Hadoop; it does not replace it
• Provides distributed memory abstractions for clusters to support applications that repeatedly use a
working set of data:
• Iterative algorithms (machine learning)
• Interactive data mining tools (R, Python, ...)
• Spark Programming Model – Resilient Distributed Datasets (RDDs)
• Immutable collections partitioned across cluster that can be rebuilt if a partition is lost
• Created by transforming data in stable storage using data flow operators (map, filter, group-by, …)
• Can be cached across parallel operations
• Spark uses HDFS or IBM Spectrum Scale
• Can use any Hadoop data source
• Use Hadoop InputFormats and OutputFormats
• Spark runs on YARN
• Can run on the same cluster with MapReduce
• Spark works with Hadoop ecosystem
• Flume, Sqoop, HBase
• Spark architectural considerations
• Keep dataset in memory
• Spark programs can be bottlenecked by any
resource in cluster: CPU, network bandwidth,
memory. Most often, if data fits in memory, the
bottleneck is network bandwidth.
HDFS or IBM Spectrum Scale YARN
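The RDD properties listed above (partitioned, built by transforming data in stable storage, cacheable, rebuildable when a partition is lost) can be illustrated with a toy model. This is deliberately NOT the Spark API: `ToyRDD` and its methods are hypothetical, everything runs in one process, and only per-partition transformations (map, filter) are modeled, with no shuffles such as group-by.

```python
class ToyRDD:
    """Toy model of a Resilient Distributed Dataset; not the Spark API.

    Each dataset remembers its parent and the function that derives it
    (its lineage), so a lost partition can be recomputed on demand."""

    def __init__(self, partitions=None, parent=None, transform=None):
        self._source = partitions    # set only for data in stable storage
        self._parent = parent
        self._transform = transform  # function applied to one partition
        self._cache = {}             # partition index -> computed rows

    @classmethod
    def from_storage(cls, partitions):
        return cls(partitions=[list(p) for p in partitions])

    def map(self, fn):
        return ToyRDD(parent=self, transform=lambda rows: [fn(r) for r in rows])

    def filter(self, predicate):
        return ToyRDD(
            parent=self, transform=lambda rows: [r for r in rows if predicate(r)]
        )

    def num_partitions(self):
        if self._source is not None:
            return len(self._source)
        return self._parent.num_partitions()

    def compute(self, index):
        """Return one partition from the cache, or rebuild it from lineage."""
        if index not in self._cache:
            if self._source is not None:
                rows = list(self._source[index])
            else:
                rows = self._transform(self._parent.compute(index))
            self._cache[index] = rows
        return self._cache[index]

    def lose_partition(self, index):
        """Simulate a node failure that discards a cached partition."""
        self._cache.pop(index, None)

    def collect(self):
        return [
            row
            for index in range(self.num_partitions())
            for row in self.compute(index)
        ]


base = ToyRDD.from_storage([[1, 2, 3], [4, 5, 6]])
evens_squared = base.filter(lambda x: x % 2 == 0).map(lambda x: x * x)
before = evens_squared.collect()
evens_squared.lose_partition(1)   # pretend the node holding partition 1 died
after = evens_squared.collect()   # partition 1 is rebuilt from its lineage
```

Recovery here costs recomputation rather than extra stored copies, which is the trade-off the lineage approach makes.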
IBM InfoSphere BigInsights – Big SQL
[Diagram: SQL-based application → Big SQL (optimized SQL MPP run-time) → native Hadoop data sources]
Native Hadoop data sources: CSV, SEQ, Parquet, RC, AVRO, ORC, JSON, Custom
IBM’s SQL for Hadoop
• Makes Hadoop data accessible to a
wider audience
• Familiar, widely known syntax
• Leverage native Hadoop data sources
Complements the Data Warehouse
• Exploratory analytics
• Sandbox, Data Lake
Included in IBM BigInsights
Use familiar SQL tools
• Cognos, SPSS, Tableau, MicroStrategy
Architecture Pattern for big data Implementation
Data sources: application, transaction, machine data, social media and email, enterprise content (data at rest and streams)
Information Ingestion and Operational Information: Stream Processing, Data Integration, Master Data
Real-time Analytics: Video/Audio, Network/Sensor, Entity Analytics, Predictive
Landing Area, Analytics Zone and Archive: Raw Data, Structured Data, Text Analytics, Data Mining, Entity Analytics, Machine Learning
Exploration, Integrated Warehouse, and Mart Zones: Discovery, Deep Reflection, Operational, Predictive
Decision Management, BI and Predictive Analytics, Navigation and Discovery, Intelligence Analysis
Information Governance, Security and Business Continuity
Why use IBM Spectrum Scale™
Extreme Scalability
Add or Remove nodes and
storage, without disruption or
performance impact to
applications
Universal Access to Data
All servers and clients have access to
data through a variety of file and object
protocols
High Performance
Parallel access with no hot spots
Proven Reliability
Used by over 200 of the top 500 Supercomputers
Survive any node or storage failure with Distributed
RAID and redundant components
Hadoop Analytics – HDFS vs IBM Spectrum Scale™
HDFS: save results, discard rest [diagram also shows JFS2, NTFS and EXT4 file systems]
IBM Spectrum Scale: application data stored on IBM Spectrum Scale is readily available for analytics; save results
IBM Hadoop Connector allows Map/Reduce programs to process data without application changes
Data Sources: mashup of structured and unstructured data from a variety of sources
Actionable Insights: provides answers to the Who, What, Where, When, Why and How
Business Intelligence & Predictive Analytics:
> Competitive Advantages
> New Threats and Fraud
> Changing Needs and Forecasting
> And More!
HDFS versus IBM Spectrum Scale™
Hadoop HDFS
– NameNode HA added in version 2.0, in active/passive configuration
– Difficult to ingest data – special tools required
– Single-purpose, Hadoop MapReduce only
– Lacking enterprise readiness
– Large block sizes – poor support for small files
– Compute and storage tightly coupled – leading to very low CPU utilization
– Non-POSIX file system – obscure commands. Does not support in-place updates.
IBM Spectrum Scale
– No single point of failure, distributed metadata in active/active configuration since 1998
– Ingest data using policies for data placement
– Versatile, multi-purpose, hybrid storage (locality and shared)
– Enterprise ready with support for advanced storage features (encryption, DR, replication, SW RAID, etc.)
– Variable block sizes – suited to multiple types of data and metadata access patterns
– Scale compute and storage independently (policy-based ILM)
– POSIX file system – easy to use and manage
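To make the "in-place updates" point concrete, here is a minimal sketch of what an in-place update looks like through ordinary POSIX-style file calls. It runs against a local temporary directory as a stand-in, not against Spectrum Scale or HDFS; the file name, record layout and helper function are hypothetical.

```python
import os
import tempfile


def update_in_place(path, offset, new_bytes):
    """Overwrite bytes at an offset without rewriting the rest of the file."""
    with open(path, "r+b") as handle:  # read/write, no truncation
        handle.seek(offset)
        handle.write(new_bytes)


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "record.dat")
    with open(path, "wb") as handle:
        handle.write(b"status=PENDING;id=42")

    # Replace the 7-byte status value that starts right after "status=".
    update_in_place(path, offset=7, new_bytes=b"SHIPPED")

    with open(path, "rb") as handle:
        result = handle.read()
    size = os.path.getsize(path)
```

On a file system without in-place updates, the same change would mean writing a new copy of the file, which is the contrast the comparison draws.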
IBM Spectrum Scale™ – File Placement Optimization
File Placement Optimization (FPO) creates a “shared nothing” cluster similar to HDFS in Hadoop environments
• Spectrum Scale avoids the need for a central namenode, a common failure point in HDFS
• Avoids long recovery times in the event of namenode failure
• Spectrum Scale can intermix FPO with standard NSD server and client nodes in the same cluster
• POSIX compliance, which is key to avoiding data islands
• Robustness and performance at massive scale, and maturity
[Diagram: HDFS with Namenode and Secondary Namenode; Spectrum Scale nodes with SAN or internal, direct-attach storage over a TCP/IP or RDMA network]
Shared-Nothing versus Shared-Disk Deployments
– Spectrum Scale FPO can keep 1, 2 or 3 replicas of the data
– Spectrum Scale and Elastic Storage Server reduce storage to one RAID-protected copy of the data
– Scale compute and storage capacity separately
– Need more compute? Add another node!
– Need more storage capacity? Add another node!
– 3x versus 1.3x
[Diagram: nodes connected over TCP/IP or RDMA; data with copies versus data with parity]
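One way to read "3x versus 1.3x" is as raw capacity needed per unit of usable data: three full replicas versus a single copy protected by parity. The arithmetic below is a sketch under that reading. The stripe geometries (8 data strips with 2 or 3 parity strips) are assumptions chosen because they bracket 1.3x; the slide does not state which geometry it has in mind, and the function names are hypothetical.

```python
def replication_overhead(copies):
    """Raw capacity per unit of usable data when keeping full copies."""
    return float(copies)


def parity_overhead(data_strips, parity_strips):
    """Raw capacity per unit of usable data for a data-plus-parity stripe."""
    return (data_strips + parity_strips) / data_strips


def raw_capacity_tb(usable_tb, overhead):
    """Raw terabytes to provision for a given usable capacity."""
    return usable_tb * overhead


three_way = replication_overhead(3)        # three replicas
eight_plus_two = parity_overhead(8, 2)     # assumed 8 data + 2 parity stripe
eight_plus_three = parity_overhead(8, 3)   # assumed 8 data + 3 parity stripe

# Raw capacity needed to hold 100 TB of usable data under each scheme.
replicated_tb = raw_capacity_tb(100, three_way)
parity_tb = raw_capacity_tb(100, eight_plus_three)
```

Under these assumptions the parity-protected layouts need between 1.25 and 1.375 times the usable capacity, compared with 3 times for triple replication.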
IBM Spectrum Scale™ –
Software, Systems or Cloud Services
Software
• Install software on your
own choice of Industry
standard x86 or
POWER servers
Pre-built Systems
• Elastic Storage Server with
distributed RAID
• Storwize V7000 Unified
Cloud Services
• Spectrum Scale can be
deployed on any Cloud
Session summary
Big data is being generated by
everything around us
– Every digital process and social
media exchange produces it
– Systems, sensors and mobile
devices transmit it
Big data is arriving from multiple
sources at amazing velocities,
volumes and varieties
To extract meaningful value from
big data, you need optimal
processing power, storage,
analytics capabilities, and skills
Sources: The Economist, and special thanks to
Dr. Bob Sutor, IBM VP, Business Solutions & Mathematical Sciences
Session Evaluations
YOUR OPINION MATTERS!
Submit four or more session
evaluations by 5:30pm Wednesday
to be eligible for drawings!
*Winners will be notified Thursday morning. Prizes must be picked up at
registration desk, during operating hours, by the conclusion of the event.
Big Data & Analytics
Building Big Data and Analytics Solutions in the Cloud
http://www.redbooks.ibm.com/abstracts/redp5085.html?Open
o IBM BigInsights
o IBM PureData System for Hadoop
o IBM PureData System for Analytics
o IBM PureData System for Operational Analytics
o IBM InfoSphere Warehouse
o IBM Streams
o IBM InfoSphere Data Explorer (Watson Explorer)
o IBM InfoSphere Data Architect
o IBM InfoSphere Information Analyzer
o IBM InfoSphere Information Server
o IBM InfoSphere Information Server for Data Quality
o IBM InfoSphere Master Data Management Family
o IBM InfoSphere Optim Family
o IBM InfoSphere Guardium Family
“Analytics is about examining data to derive interesting and relevant
trends and patterns, which can be used to inform decisions, optimize
processes, and even drive new business models.”
Research Paper
“In this paper, we revisit the debate on the need of a new non-POSIX storage stack for cloud analytics and argue, based on an initial evaluation, that it can be built on traditional POSIX-based cluster filesystems.”
Hadoop for the Enterprise
http://www.ibm.com/software/data/infosphere/hadoop/enterprise.html
IBM BigInsights for Apache Hadoop provides a 100% open source platform and offers
analytic and enterprise capabilities for Hadoop.
IBM Tucson Executive Briefing Center
Tucson, Arizona is home for
storage hardware and software
design and development
IBM Tucson Executive
Briefing Center offers:
–Technology briefings
–Product demonstrations
–Solution workshops
Take a video tour!
– http://youtu.be/CXrpoCZAazg
About the Speaker
Tony Pearson is a Master Inventor and Senior managing consultant for the IBM System Storage™ product line. Tony joined
IBM Corporation in 1986 in Tucson, Arizona, USA, and has lived there ever since. In his current role, Tony presents briefings
on storage topics covering the entire System Storage product line, Tivoli storage software products, and topics related to Cloud
Computing. He interacts with clients, speaks at conferences and events, and leads client workshops to help clients with
strategic planning for IBM’s integrated set of storage management software, hardware, and virtualization products.
Tony writes the “Inside System Storage” blog, which is read by hundreds of clients, IBM sales reps and IBM Business Partners
every week. This blog was rated one of the top 10 blogs for the IT storage industry by “Networking World” magazine, and the #1
most-read IBM blog on IBM’s developerWorks. The blog has been published in a series of books, Inside System Storage:
Volume I through V.
Over the years, Tony has worked in development, marketing and customer care positions for various storage hardware
and software products. Tony has a Bachelor of Science degree in Software Engineering, and a Master of Science degree in
Electrical Engineering, both from the University of Arizona. Tony holds 19 IBM patents for inventions on storage hardware and
software products.
9000 S. Rita Road
Bldg 9032 Floor 1
Tucson, AZ 85744
+1 520-799-4309 (Office)
tpearson@us.ibm.com
Tony Pearson
Master Inventor,
Senior IT Specialist
IBM System Storage™
Additional Resources from Tony Pearson
Email:
tpearson@us.ibm.com
Twitter:
twitter.com/az990tony
Blog:
ibm.co/Pearson
Books:
www.lulu.com/spotlight/990_tony
IBM Expert Network on Slideshare:
www.slideshare.net/az990tony
Facebook:
www.facebook.com/tony.pearson.16121
Linkedin:
www.linkedin.com/profile/view?id=103718598
Continue growing your IBM skills
ibm.com/training
provides a comprehensive
portfolio of skills and career
accelerators that are
designed to meet all your
training needs.
If you can’t find the training that is right for
you with our Global Training Providers, we
can help.
Contact IBM Training at dpmc@us.ibm.com
Global Skills Initiative
Trademarks and Disclaimers
Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other
countries. IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency which is now part of the Office of Government Commerce.
Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks
of Intel Corporation or its subsidiaries in the United States and other countries. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. ITIL is a registered trademark, and a
registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office. UNIX is a registered trademark of The Open
Group in the United States and other countries. Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates. Cell Broadband
Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom. Linear Tape-Open, LTO, the LTO
Logo, Ultrium, and the Ultrium logo are trademarks of HP, IBM Corp. and Quantum in the U.S. and other countries.
Other product and service names might be trademarks of IBM or other companies. Information is provided "AS IS" without warranty of any kind.
The customer examples described are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual environmental
costs and performance characteristics may vary by customer.
Information concerning non-IBM products was obtained from a supplier of these products, published announcement material, or other publicly available sources and does not
constitute an endorsement of such products by IBM. Sources for non-IBM list prices and performance numbers are taken from publicly available information, including vendor
announcements and vendor worldwide homepages. IBM has not tested these products and cannot confirm the accuracy of performance, capability, or any other claims related to
non-IBM products. Questions on the capability of non-IBM products should be addressed to the supplier of those products.
All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
Some information addresses anticipated future capabilities. Such information is not intended as a definitive statement of a commitment to specific levels of performance, function or
delivery schedules with respect to any future products. Such commitments are only made in IBM product announcements. The information is presented here to communicate IBM's
current investment and development activities as a good faith effort to help with our customers' future planning.
Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will
experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the
workload processed. Therefore, no assurance can be given that an individual user will achieve throughput or performance improvements equivalent to the ratios stated here.
Prices are suggested U.S. list prices and are subject to change without notice. Starting price may not include a hard drive, operating system or other features. Contact your IBM
representative or Business Partner for the most current pricing in your geography.
Photographs shown may be engineering prototypes. Changes may be incorporated in production models.
© IBM Corporation 2015. All rights reserved.
References in this document to IBM products or services do not imply that IBM intends to make them available in every country.
Trademarks of International Business Machines Corporation in the United States, other countries, or both can be found on the
World Wide Web at http://www.ibm.com/legal/copytrade.shtml.
ZSP03490-USEN-00

IBM Big Data Analytics Concepts and Use Cases

  • 1. © Copyright IBM Corporation 2015. Technical University/Symposia materials may not be reproduced in whole or in part without the prior written permission of IBM. What Is Big Data? Architectures and Practical Use Cases Tony Pearson Master Inventor and Senior IT Specialist IBM Corporation
  • 2. IBM Systems Technical University, October 5-9 | Hilton Orlando. Abstract: Do you understand the storage implications of big data analytics? This session explains what big data is, provides some practical use cases, and then describes the IBM products that support big data.
  • 3. This week with Tony Pearson (Day, Time, Topic). Monday: 10:15am Opening Session – Storage; 01:45pm IBM's Cloud Storage Options. Tuesday: 11:30am Software Defined Storage -- Why? What? How? (repeats Friday); 03:15pm The Pendulum Swings Back – Understanding Converged and Hyperconverged Environments; 04:30pm New Generation of Storage Tiering: Less Management Lower Cost and Increased Performance. Wednesday: 09:00am What Is Big Data? Architectures and Practical Use Cases; 01:45pm Data Footprint Reduction – Understanding IBM Storage Efficiency Options; 03:15pm IBM Spectrum Virtualize – SVC, Storwize and FlashSystem V9000 (repeats Friday). Thursday: 10:15am IBM Spectrum Scale and Elastic Storage Offerings; 01:45pm IBM Spectrum Scale for File and Object storage; 03:15pm IBM Storage Integration with OpenStack; 05:45pm Meet the Experts. Friday: 09:00am Software Defined Storage -- Why? What? How?; 10:15am IBM Spectrum Virtualize – SVC, Storwize and FlashSystem V9000.
  • 4. Agenda: What is Big Data?; Big Data Use Cases; IBM Analytics Platform; IBM Spectrum Scale.
  • 5. What is Big Data? Data sets so large and complex that they become difficult to process using relational databases. The challenges include capture, curation, storage, search, sharing, transfer, analysis and visualization. Analysis of a single large set of related data allows correlations to be found, which can be used to identify trends, patterns and insights to make better decisions. Source: Wikipedia
  • 6. Traditional Decision Making Process. Step 1, Gather data: Database Administrators; System of Record; transaction and application data. Step 2, Analyze: Business Analysts. Step 3, Make decisions: Business Executives; strategic planning based on historical analysis and speculation; day-to-day operations based on reports, news, intuition. Other diagram labels: Extract Transform Load (ETL), batch processing, OLAP cube, reports.
  • 7. What has Changed in the Last Few Decades? Chart labels: 1986 and 2015; 6% and 99%; analog data and digital data. Data sources shown: transaction and application data; machine data; social media, email; enterprise content. Split: 20% structured data, 80% unstructured data.
  • 8. New Sources of Data to Analyze – the Four V’s of big data. Volume: scale of data has grown beyond relational database capabilities. Variety: machine data, enterprise content, and social media and email. Velocity: computing has advanced to receive and analyze real-time data streams. Veracity: how much can you trust the data is right and accurate? Step 1, Gather and identify sources of data: Database Administrators and Storage Administrators. Diagram labels: System of Record (transaction and application data); System of Engagement; System of Insight; machine data, log data; social media, photos, audio, video, email; enterprise content.
  • 9. Data is the New Oil. In its raw form, oil has little value… Once processed and refined, it helps to power the world!
  • 10. New Capabilities to Analyze the Data. Step 2, Analyze data: Business Analyst and Data Scientist. Two styles of analysis: structured, repeatable, linear versus unstructured, exploratory, iterative. Capabilities shown: OLAP cube, reports, visualization and discovery, Hadoop, data warehousing, stream computing, integration and governance, text analytics.
  • 11. What does a Data Scientist do? “It’s no longer hard to find the answer to a given question; the hard part is finding the right question. And as questions evolve, we gain better insight into our ecosystem and our business.” -- Kevin Weil, Lead Analyst at Twitter. A data scientist must have: strong business acumen; modeling, statistics, analytics and math skills; the ability to communicate findings, to tell a story from the data, to both business and IT leaders. Inquisitive: exploring, doing “what if?” analyses, questioning existing assumptions and processes to spot trends, patterns and hidden insight. “Computers are useless. They can only give you answers.” -- Pablo Picasso. Sources: http://www-01.ibm.com/software/data/infosphere/data-scientist/ and http://blog.cloudera.com/blog/2010/09/twitter-analytics-lead-kevin-weil-and-a-presenter-at-hadoop-world-interviewed/
  • 12. Data Information Knowledge Wisdom (DIKW). Data (raw): červený; 685 nm; 421 THz; #FF0000. Information (meaning): South-facing light at corner of Pitt and George streets has turned red. Knowledge (context): The traffic light I am driving towards has turned red. Wisdom (applied): I better stop the car! Source: http://legoviews.com/2013/04/06/put-knowledge-into-action-and-enhance-organisational-wisdom-lsp-and-dikw/
  • 13. Better Decisions for New Business Outcomes. Step 3, Make Decisions and Take Action: Business Executive and Empowered Employees. Day-to-day operations based on real-time analytics; strategic planning based on science, trends, patterns and insight. Outcomes: Know Everything about your Customers; Innovate new products at Speed and Scale; Instant Awareness of Fraud and Risk; Exploit Instrumented Assets; Run Zero-latency Operations.
  • 14. Decision Making Process in the Era of big data. Step 1, Gather and identify sources of data: Database Administrators and Storage Administrators. Step 2, Analyze data: Data Scientists and Business Analysts. Step 3, Make decisions and take action: Business Executives and Empowered Employees; day-to-day operations based on real-time analytics; strategic planning based on science, trends, patterns and insight. Other diagram labels: System of Insight, statistical models, real-time analytics, dashboard.
  • 15. Agenda: What is Big Data?; Big Data Use Cases; IBM Analytics Platform; IBM Spectrum Scale.
  • 16. Practical Use Cases – The Analytics Landscape. Axes: degree of complexity versus competitive advantage. Descriptive: Standard Reporting (What happened?); Ad hoc reporting (How many, how often, where?); Query/drill down (What exactly is the problem?); Alerts (What actions are needed?). Predictive: Simulation (What could happen…?); Forecasting (What if these trends continue?); Predictive modeling (What will happen next if?). Prescriptive: Optimization (How can we achieve the best outcome?); Stochastic Optimization (How can we achieve the best outcome including the effects of variability?). Based on: Competing on Analytics, Davenport and Harris, 2007
  • 17. Innovate New Products and Services at Speed and Scale. Vestas, the world’s largest wind energy company, used big data and IBM technology to increase wind power generation through optimal turbine placement, reducing the time to analyze petabytes of data with IBM BigInsights software and IBM Spectrum Scale. “Before, it could take us three weeks to get a response to some of our questions simply because we had to process a lot of data. We expect that we can get answers for the same questions now in 15 minutes.” -- Lars Christian Christensen
  • 18. If You are Not Paying for it… Then you are not the Customer… You are the Product Being Sold! How much is each user worth to Social Media companies? Sources: Geek & Poke comic, “Let’s Talk about Data” by Neha Mehta
  • 19. 360 Degree View of the Customer – A Demographic of One. Example customer: Amy Bearn, 32, married, mother of 3, accountant. Telco Score: 91; CPG Score: 76; Fashion Score: 88. Retailer: How valuable is Amy to my retail sales? Who does she influence? What do they spend? Telco company: How valuable is Amy to my mobile phone network? How likely is she to switch carriers? How many other customers will follow? Data sources shown: social network, public database, calling network, merged network.
  • 20. Run Zero-Latency Operations. Deep individual customer insight (preferences, interests, likes) feeds a real-time decision with four possible outcomes. Direct: initiate direct response. Channel: initiate channel response. Workflow: initiate process or workflow. Enrich: enrich customer profile.
  • 21. How Target® Figured Out a Teen Girl Was Pregnant Before Her Father Did. Every time you go shopping, you share intimate details about your consumption patterns with retailers. Target has figured out how to data-mine whether you have a baby on the way. It looked at historical buying data for all the ladies who had signed up for Target baby registries, for example unscented soaps and lotions, and calcium, magnesium and zinc supplements. About 25 products help generate a “pregnancy prediction” score and her “baby due date”. Target sends coupons timed to very specific stages of her pregnancy. “My daughter got this in the mail. She’s still in high school, and you’re sending her coupons for baby clothes and cribs?” -- Angry father of teen girl. “I had a talk with my daughter,…She’s due in August. I owe you an apology.” -- Same father, 3 days later. Source: http://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did/
  • 22. Exploit Instrumented Assets. Doctors from University of Ontario apply big data to neonatal infant monitoring to predict infection. Detect neonatal patient symptoms up to 24 hours sooner. Continuously correlate data: thousands of events each second. Techniques shown: signal processing and data cleansing; heart rate variability.
  • 23. Agenda: What is Big Data?; Big Data Use Cases; IBM Analytics Platform; IBM Spectrum Scale.
  • 24. The IBM big data platform advantage. Analytic applications: BI / Reporting; Exploration / Visualization; Functional App; Industry App; Predictive Analytics; Content Analytics. IBM big data platform components: Systems Management; Application Development; Visualization & Discovery; Accelerators; Information Integration & Governance; Hadoop System; Stream Computing; Data Warehouse. The platform provides benefit as you move from an entry point to a second and third project. Shared components and integration between systems lower deployment costs. Key points of leverage: reuse text analytics across Streams and BigInsights; Hadoop connectors between Streams and Information Integration; common integration, metadata and governance across all engines; accelerators built across multiple engines (common analytics, models, and visualization).
  • 25. Simplify your data warehouse. Customer need: business users are hampered by the poor analytics performance of a general-purpose enterprise warehouse, where queries take hours to run; the enterprise data warehouse is encumbered by too much data for too many purposes; need to ingest huge volumes of structured data and run multiple concurrent deep analytic queries against it; IT needs to reduce the cost of maintaining the data warehouse. Value statement: speed and simplicity for deep analytics; 100s to 1000s of users/second for operational analytics. Customer example: Catalina Marketing, executing 10x the amount of predictive workloads with the same staff. Get started with IBM PureData Systems: System for Transactions, System for Analytics, System for Operational Analytics.
  • 26. Ad-Hoc versus Operational Analytics
  • 27. Analyze streaming data in real time. Customer need: harness and process streaming data sources; select valuable data and insights to be stored for further processing; quickly process and analyze perishable data, and take timely action. Value statement: significantly reduced processing time and cost (process first, then store what’s valuable); react in real time to capture opportunities before they expire. Customer example: Ufone, Telco Call Detail Record (CDR) analytics for customer churn prevention. Get started with IBM Streams! Diagram labels: Streaming Data Sources; Source Adapters; Analytic Operators; Sink Adapters; Streams Runtime; Deployments; Automated and Optimized Deployment; Streams Studio IDE; Visualization.
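The "process first, then store what's valuable" idea on slide 27 can be sketched with plain Python generators. This is an illustration only and does not use the IBM Streams API or its SPL language; the record fields (`caller`, `dropped`), the window size and the threshold are all hypothetical choices made for the example.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, Iterator, List


def source_adapter(records: Iterable[Dict]) -> Iterator[Dict]:
    """Stand-in for a streaming source: yields call detail records one at a time."""
    for record in records:
        yield record


def churn_risk_operator(stream: Iterable[Dict], window: int = 3,
                        threshold: int = 2) -> Iterator[Dict]:
    """Emit an alert when a caller has `threshold` or more dropped calls
    among their last `window` calls. Records that raise no alert are discarded."""
    recent = defaultdict(lambda: deque(maxlen=window))  # caller -> recent dropped flags
    for cdr in stream:
        calls = recent[cdr["caller"]]
        calls.append(cdr["dropped"])
        if sum(calls) >= threshold:
            yield {"caller": cdr["caller"], "dropped_in_window": sum(calls)}


def sink_adapter(stream: Iterable[Dict]) -> List[Dict]:
    """Stand-in for a sink: only what reaches this point is stored."""
    return list(stream)


# Hypothetical call detail records, in arrival order.
cdrs = [
    {"caller": "A", "dropped": False},
    {"caller": "B", "dropped": True},
    {"caller": "A", "dropped": True},
    {"caller": "B", "dropped": True},
    {"caller": "A", "dropped": False},
    {"caller": "A", "dropped": True},
]

alerts = sink_adapter(churn_risk_operator(source_adapter(cdrs)))
print(f"{len(cdrs)} records processed, {len(alerts)} alerts stored")
```

Because each stage is a generator, records flow through one at a time and nothing is retained except the small per-caller window and the alerts that reach the sink.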
  • 28. Dominant Players vs. Contender platforms. OS: dominant player Microsoft Windows; contender platform Linux; supporters IBM, RedHat, SUSE, Oracle and others. Tape: dominant player Quantum DLT; contender platform Linear Tape Open (LTO); supporters IBM, HP, Certance and others. Cloud Management: dominant player Amazon Web Services; contender platform OpenStack; supporters IBM, HP, Rackspace, RedHat, Dell, Cisco, VMware and others. Big Data & Analytics: dominant player Cloudera; contender platform Open Data Platform; supporters IBM, Pivotal, Hortonworks and others.
  • 29. Open standards first, but with freedom of choice. IBM InfoSphere BigInsights is a 100% standard Hadoop distribution. By default, open source components are always deployed. Elect to use proprietary capabilities depending on your needs; in some cases, proprietary capabilities offer significant benefits. Open source components and their proprietary alternatives: HDFS and Spectrum Scale; YARN and Platform Symphony; HIVE and Big SQL; MapReduce and Adaptive MapReduce; PIG and BigSheets. Benefits listed: share data with non-Hadoop applications and simplify data management; re-use existing tools and expertise, avoid additional development costs; boost performance, support time-critical workloads, do more with less; true multi-tenancy to boost service levels and avoid duplication on infrastructure; simplify access for end-users, minimize software development.
  • 30. IBM BigInsights for Apache Hadoop: three new user-centric modules founded on an Open Data Platform. IBM BigInsights Analyst (for the Business Analyst): Big SQL, BigSheets. IBM BigInsights Data Scientist (for the Data Scientist): Big SQL, BigSheets, Text Analytics, System ML on Big R, Distributed R. IBM BigInsights Enterprise Management (for the Administrator): Spectrum Scale, Platform Symphony. All three build on IBM Open Platform with Apache Hadoop, which is IBM’s own 100% open source Apache Hadoop distribution. IBM will include the ODP common kernel when available.
  • 31. Platform Symphony Integrates with Hadoop. YARN (open source) uses a pluggable architecture for schedulers: the FIFO, Fair, and Capacity Schedulers are implemented this way, and Symphony EGO is also implemented this way. Therefore, the scheduler is completely transparent to YARN applications (App1, App2, App3). ISV certification for Platform Symphony is not required. Like other schedulers, queues and policies are defined in Platform Symphony EGO.
  • 32. Spark, a Complement to Hadoop. Spark complements Hadoop; it does not replace it. It provides distributed memory abstractions for clusters to support applications that repeatedly use a working set of data: iterative algorithms (machine learning) and interactive data mining tools (R, Python, ..). Spark programming model, Resilient Distributed Datasets (RDDs): immutable collections partitioned across the cluster that can be rebuilt if a partition is lost; created by transforming data in stable storage using data flow operators (map, filter, group-by, …); can be cached across parallel operations. Spark uses HDFS or IBM Spectrum Scale: it can use any Hadoop data source and uses Hadoop InputFormats and OutputFormats. Spark runs on YARN and can run on the same cluster with MapReduce. Spark works with the Hadoop ecosystem: Flume, Sqoop, HBase. Spark architectural considerations: keep the dataset in memory; Spark programs can be bottlenecked by any resource in the cluster (CPU, network bandwidth, memory). Most often, if data fits in memory, the bottleneck is network bandwidth.
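The RDD properties listed on slide 32 (immutable, partitioned, built by transformations, cacheable, rebuildable after a lost partition) can be illustrated with a toy single-process class. `MiniRDD` and all of its methods are hypothetical names invented for this sketch; it is not the Spark API and leaves out distribution, shuffles and fault detection entirely.

```python
from typing import Callable, Dict, List


class MiniRDD:
    """Toy stand-in for an RDD: each partition is described by a recipe
    (a function of the partition index), not by stored data."""

    def __init__(self, num_partitions: int, compute: Callable[[int], List]):
        self.num_partitions = num_partitions
        self._compute = compute            # lineage: how to (re)build partition i
        self._cached = False
        self._cache: Dict[int, List] = {}  # partition index -> materialized rows

    @classmethod
    def from_list(cls, data: List, num_partitions: int) -> "MiniRDD":
        # The source list plays the role of stable storage.
        chunks = [data[i::num_partitions] for i in range(num_partitions)]
        return cls(num_partitions, lambda i: list(chunks[i]))

    def map(self, fn: Callable) -> "MiniRDD":
        # Transformations return a new MiniRDD; the parent is never modified.
        return MiniRDD(self.num_partitions,
                       lambda i: [fn(x) for x in self.partition(i)])

    def filter(self, pred: Callable) -> "MiniRDD":
        return MiniRDD(self.num_partitions,
                       lambda i: [x for x in self.partition(i) if pred(x)])

    def cache(self) -> "MiniRDD":
        self._cached = True
        return self

    def partition(self, i: int) -> List:
        if not self._cached:
            return self._compute(i)
        if i not in self._cache:           # first use, or the partition was lost
            self._cache[i] = self._compute(i)
        return self._cache[i]

    def lose_partition(self, i: int) -> None:
        """Simulate losing the in-memory copy of one partition."""
        self._cache.pop(i, None)

    def collect(self) -> List:
        rows: List = []
        for i in range(self.num_partitions):
            rows.extend(self.partition(i))
        return rows


source = MiniRDD.from_list(list(range(10)), num_partitions=3)
even_squares = source.filter(lambda x: x % 2 == 0).map(lambda x: x * x).cache()

first = even_squares.collect()
even_squares.lose_partition(1)    # drop one cached partition
second = even_squares.collect()   # rebuilt from lineage, same result
print(sorted(second))
```

The point of the sketch is that a lost partition costs only recomputation of that one partition from its recipe, which is why RDDs do not need to replicate in-memory data to survive failures.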
  • 33. IBM InfoSphere BigInsights – Big SQL. IBM’s SQL for Hadoop: makes Hadoop data accessible to a wider audience; familiar, widely known syntax; leverages native Hadoop data sources. Complements the data warehouse: exploratory analytics; sandbox, data lake. Included in IBM BigInsights. Use familiar SQL tools: Cognos, SPSS, Tableau, MicroStrategy. Diagram labels: SQL-based application; Big SQL; optimized SQL MPP run-time; native Hadoop data sources (CSV, SEQ, Parquet, RC, AVRO, ORC, JSON, custom).
  • 34. Architecture Pattern for big data Implementation. Data sources: application, transaction, machine data, social media, email, enterprise content; data at rest and streams. Diagram labels: Information Ingestion and Operational Information; Landing Area, Analytics Zone and Archive (raw data, structured data, text analytics, data mining, entity analytics, machine learning); Real-time Analytics (video/audio, network/sensor, entity analytics, predictive); Exploration, Integrated Warehouse, and Mart Zones (discovery, deep reflection, operational, predictive); Stream Processing; Data Integration; Master Data; Decision Management; BI and Predictive Analytics; Navigation and Discovery; Intelligence Analysis; Information Governance, Security and Business Continuity.
  • 35. Agenda: What is Big Data?; Big Data Use Cases; IBM Analytics Platform; IBM Spectrum Scale.
  • 36. Why use IBM Spectrum Scale™. Extreme Scalability: add or remove nodes and storage, without disruption or performance impact to applications. Universal Access to Data: all servers and clients have access to data through a variety of file and object protocols. High Performance: parallel access with no hot spots. Proven Reliability: used by over 200 of the top 500 supercomputers; survive any node or storage failure with Distributed RAID and redundant components.
  • 37. Hadoop Analytics – HDFS vs IBM Spectrum Scale™. Data sources: a mashup of structured and unstructured data from a variety of sources (JFS2, NTFS, EXT4). HDFS path: save results, discard the rest. IBM Spectrum Scale path: application data stored on IBM Spectrum Scale is readily available for analytics; the IBM Hadoop Connector allows Map/Reduce programs to process data without application changes; save results. Actionable insights: provides answers to the Who, What, Where, When, Why and How. Business Intelligence & Predictive Analytics: competitive advantages; new threats and fraud; changing needs and forecasting; and more!
  • 38. HDFS versus IBM Spectrum Scale™. Hadoop HDFS: NameNode HA added in version 2.0, in active/passive configuration; difficult to ingest data, special tools required; lacking enterprise readiness; large block sizes, poor support for small files; compute and storage tightly coupled, leading to very low CPU utilization; single-purpose, Hadoop MapReduce only; non-POSIX file system with obscure commands that does not support in-place updates. IBM Spectrum Scale: no single point of failure, distributed metadata in active/active configuration since 1998; ingest data using policies for data placement; enterprise ready with support for advanced storage features (encryption, DR, replication, SW RAID etc); variable block sizes, suited to multiple types of data and metadata access pattern; scale compute and storage independently (policy-based ILM); versatile, multi-purpose, hybrid storage (locality and shared); POSIX file system, easy to use and manage.
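The POSIX point on slide 38 is easiest to see with an in-place update, which the slide says HDFS does not support. The sketch below overwrites a few bytes in the middle of an existing file using only ordinary file calls from the Python standard library. It runs against a local temporary directory as a stand-in; that the same calls would work unchanged on a Spectrum Scale mount is an assumption based on the slide's POSIX claim, and the file name and record layout are hypothetical.

```python
import os
import tempfile


def update_in_place(path: str, offset: int, new_bytes: bytes) -> None:
    """Overwrite bytes at `offset` without rewriting the rest of the file."""
    with open(path, "r+b") as f:   # read/write mode, no truncation
        f.seek(offset)
        f.write(new_bytes)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "record.dat")
    with open(path, "wb") as f:
        f.write(b"status=PENDING;id=42")
    size_before = os.path.getsize(path)

    # Replace the 7-byte status value where it sits in the file.
    update_in_place(path, len(b"status="), b"SHIPPED")

    with open(path, "rb") as f:
        result = f.read()
    size_after = os.path.getsize(path)

print(result, size_before, size_after)
```

On a write-once file system the same change would mean writing a new copy of the file, which is the practical cost behind the slide's "does not support in-place updates" entry.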
  • 39. IBM Spectrum Scale™ – File Placement Optimization. File Placement Optimization (FPO) creates a “shared nothing” cluster similar to HDFS in Hadoop environments. Spectrum Scale avoids the need for a central namenode, a common failure point in HDFS, and so avoids long recovery times in the event of namenode failure. Spectrum Scale can intermix FPO with standard NSD server and client nodes in the same cluster. POSIX compliance is key to avoiding data islands. Robustness and performance at massive scale and maturity. Diagram labels: HDFS namenode and secondary namenode; SAN; internal, direct-attach storage; TCP/IP or RDMA network.
  • 40. Share-Nothing versus Shared-Disk Deployments. Share-nothing: Spectrum Scale FPO can keep 1, 2 or 3 replicas of the data. Need more compute? Add another node! Need more storage capacity? Add another node! Shared-disk: Spectrum Scale and Elastic Storage Server reduce storage to one RAID-protected copy of the data; scale compute and storage capacity separately. Raw capacity needed: 3x versus 1.3x. Diagram labels: data and copies (share-nothing); data, data, data, parity (shared-disk); TCP/IP or RDMA network on both sides.
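The "3x versus 1.3x" figure on slide 40 follows from simple arithmetic: replication stores every byte once per replica, while a parity-protected stripe stores (data + parity) / data. The sketch below works through both. The 3 data + 1 parity layout is an assumption read off the diagram labels (data, data, data, parity), not a statement of how Elastic Storage Server is actually configured, and the function names are hypothetical.

```python
def replication_overhead(replicas: int) -> float:
    """Raw bytes stored per usable byte when keeping `replicas` full copies."""
    if replicas < 1:
        raise ValueError("need at least one copy")
    return float(replicas)


def parity_overhead(data_strips: int, parity_strips: int) -> float:
    """Raw bytes stored per usable byte for a stripe of data plus parity strips."""
    if data_strips < 1 or parity_strips < 0:
        raise ValueError("invalid stripe layout")
    return (data_strips + parity_strips) / data_strips


def raw_capacity_tb(usable_tb: float, overhead: float) -> float:
    """Raw capacity to buy for a given usable capacity."""
    return usable_tb * overhead


three_replicas = replication_overhead(3)   # 3.0
three_plus_one = parity_overhead(3, 1)     # 4/3, roughly 1.33

for label, overhead in [("3 replicas", three_replicas),
                        ("3 data + 1 parity", three_plus_one)]:
    print(f"{label}: {overhead:.2f}x, "
          f"{raw_capacity_tb(100, overhead):.0f} TB raw for 100 TB usable")
```

Other stripe widths give different ratios (for example, 8 data + 2 parity gives 1.25x), so the slide's 1.3x should be read as an approximate figure.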
  • 41. IBM Spectrum Scale™ – Software, Systems or Cloud Services. Software: install software on your own choice of industry standard x86 or POWER servers. Pre-built Systems: Elastic Storage Server with distributed RAID; Storwize V7000 Unified. Cloud Services: Spectrum Scale can be deployed on any cloud.
  • 42. Session summary. Big data is being generated by everything around us: every digital process and social media exchange produces it; systems, sensors and mobile devices transmit it. Big data is arriving from multiple sources at amazing velocities, volumes and varieties. To extract meaningful value from big data, you need optimal processing power, storage, analytics capabilities, and skills. Sources: The Economist, and special thanks to Dr. Bob Sutor, IBM VP, Business Solutions & Mathematical Sciences
  • 43. Session Evaluations. YOUR OPINION MATTERS! Submit four or more session evaluations by 5:30pm Wednesday to be eligible for drawings! Winners will be notified Thursday morning. Prizes must be picked up at registration desk, during operating hours, by the conclusion of the event.
  • 45. Big Data & Analytics. Redbook: Building Big Data and Analytics Solutions in the Cloud, http://www.redbooks.ibm.com/abstracts/redp5085.html?Open. Products: IBM BigInsights; IBM PureData System for Hadoop; IBM PureData System for Analytics; IBM PureData System for Operational Analytics; IBM InfoSphere Warehouse; IBM Streams; IBM InfoSphere Data Explorer (Watson Explorer); IBM InfoSphere Data Architect; IBM InfoSphere Information Analyzer; IBM InfoSphere Information Server; IBM InfoSphere Information Server for Data Quality; IBM InfoSphere Master Data Management Family; IBM InfoSphere Optim Family; IBM InfoSphere Guardium Family. “Analytics is about examining data to derive interesting and relevant trends and patterns, which can be used to inform decisions, optimize processes, and even drive new business models.”
  • 46. Research Paper. “In this paper, we revisit the debate on the need of a new non-POSIX storage stack for cloud analytics and argue, based on an initial evaluation, that it can be built on traditional POSIX-based cluster filesystems.”
  • 47. Hadoop for the Enterprise http://www.ibm.com/software/data/infosphere/hadoop/enterprise.html IBM BigInsights for Apache Hadoop provides a 100% open source platform and offers analytic and enterprise capabilities for Hadoop.
  • 48. IBM Tucson Executive Briefing Center Tucson, Arizona is home for storage hardware and software design and development IBM Tucson Executive Briefing Center offers: –Technology briefings –Product demonstrations –Solution workshops Take a video tour! – http://youtu.be/CXrpoCZAazg
  • 49. About the Speaker Tony Pearson is a Master Inventor and Senior managing consultant for the IBM System Storage™ product line. Tony joined IBM Corporation in 1986 in Tucson, Arizona, USA, and has lived there ever since. In his current role, Tony presents briefings on storage topics covering the entire System Storage product line, Tivoli storage software products, and topics related to Cloud Computing. He interacts with clients, speaks at conferences and events, and leads client workshops to help clients with strategic planning for IBM’s integrated set of storage management software, hardware, and virtualization products. Tony writes the “Inside System Storage” blog, which is read by hundreds of clients, IBM sales reps and IBM Business Partners every week. This blog was rated one of the top 10 blogs for the IT storage industry by “Networking World” magazine, and #1 most read IBM blog on IBM’s developerWorks. The blog has been published in a series of books, Inside System Storage: Volume I through V. Over the past years, Tony has worked in development, marketing and customer care positions for various storage hardware and software products. Tony has a Bachelor of Science degree in Software Engineering, and a Master of Science degree in Electrical Engineering, both from the University of Arizona. Tony holds 19 IBM patents for inventions on storage hardware and software products. 9000 S. Rita Road Bldg 9032 Floor 1 Tucson, AZ 85744 +1 520-799-4309 (Office) tpearson@us.ibm.com Tony Pearson Master Inventor, Senior IT Specialist IBM System Storage™
  • 50. Additional Resources from Tony Pearson Email: tpearson@us.ibm.com Twitter: twitter.com/az990tony Blog: ibm.co/Pearson Books: www.lulu.com/spotlight/990_tony IBM Expert Network on Slideshare: www.slideshare.net/az990tony Facebook: www.facebook.com/tony.pearson.16121 Linkedin: www.linkedin.com/profile/view?id=103718598
  • 51. Continue growing your IBM skills ibm.com/training provides a comprehensive portfolio of skills and career accelerators that are designed to meet all your training needs. If you can’t find the training that is right for you with our Global Training Providers, we can help. Contact IBM Training at dpmc@us.ibm.com Global Skills Initiative
  • 52. Trademarks and Disclaimers Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries. IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency which is now part of the Office of Government Commerce. Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office. UNIX is a registered trademark of The Open Group in the United States and other countries. Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates. Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom. Linear Tape-Open, LTO, the LTO Logo, Ultrium, and the Ultrium logo are trademarks of HP, IBM Corp. and Quantum in the U.S. and other countries. Other product and service names might be trademarks of IBM or other companies. Information is provided "AS IS" without warranty of any kind.
The customer examples described are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics may vary by customer. Information concerning non-IBM products was obtained from a supplier of these products, published announcement material, or other publicly available sources and does not constitute an endorsement of such products by IBM. Sources for non-IBM list prices and performance numbers are taken from publicly available information, including vendor announcements and vendor worldwide homepages. IBM has not tested these products and cannot confirm the accuracy of performance, capability, or any other claims related to non-IBM products. Questions on the capability of non-IBM products should be addressed to the supplier of those products. All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. Some information addresses anticipated future capabilities. Such information is not intended as a definitive statement of a commitment to specific levels of performance, function or delivery schedules with respect to any future products. Such commitments are only made in IBM product announcements. The information is presented here to communicate IBM's current investment and development activities as a good faith effort to help with our customers' future planning. Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput or performance improvements equivalent to the ratios stated here. 
Prices are suggested U.S. list prices and are subject to change without notice. Starting price may not include a hard drive, operating system or other features. Contact your IBM representative or Business Partner for the most current pricing in your geography. Photographs shown may be engineering prototypes. Changes may be incorporated in production models. © IBM Corporation 2015. All rights reserved. References in this document to IBM products or services do not imply that IBM intends to make them available in every country. Trademarks of International Business Machines Corporation in the United States, other countries, or both can be found on the World Wide Web at http://www.ibm.com/legal/copytrade.shtml. ZSP03490-USEN-00