The document discusses big data concepts including what big data is, how the amount and types of data have changed over time, and the four V's of big data - volume, variety, velocity and veracity. It provides examples of practical big data use cases from companies like Vestas and Target. The document also outlines IBM's big data analytics platform and how it can help with tasks like simplifying the data warehouse, analyzing streaming data in real time, and exploiting instrumented assets.
This document discusses using high-level data modeling to facilitate communication between business and IT stakeholders. It provides examples of high-level data models and discusses best practices for building high-level models, including getting input from all relevant parties, choosing an intuitive notation, and using the model to achieve consensus on key business concepts and definitions. The document also describes how modeling tools from CA like ERwin can help manage technical data sources from multiple systems and databases, and share information with various audiences.
Data Governance Program Powerpoint Presentation Slides (SlideTeam)
The document discusses the need for data governance programs in companies. It outlines why companies suffer without effective data governance, such as different groups being unable to communicate and coordinate. It then contrasts manual versus automated approaches to data governance. The rest of the document provides details on key aspects of establishing a successful data governance program, including defining a framework, roles and responsibilities, and developing a roadmap for continuous improvement.
The document provides an overview of key concepts in data warehousing and business intelligence, including:
1) It defines data warehousing concepts such as the characteristics of a data warehouse (subject-oriented, integrated, time-variant, non-volatile), grain/granularity, and the differences between OLTP and data warehouse systems.
2) It discusses the evolution of business intelligence and key components of a data warehouse such as the source systems, staging area, presentation area, and access tools.
3) It covers dimensional modeling concepts like star schemas, snowflake schemas, and slowly and rapidly changing dimensions.
This document describes a data warehouse and business intelligence project for analyzing Starbucks store data. It discusses extracting data from various structured, semi-structured, and unstructured sources, transforming the data using SQL and R, and loading it into a star schema data warehouse with fact and dimension tables. The data warehouse is then used for business queries and analysis in Tableau, with case studies examining city revenue, visitor and beverage sales by city, and city ratings based on food and beverage counts. The analysis finds that New York City generally has the highest revenue, visitor counts, and ratings.
Embarking on building a modern data warehouse in the cloud can be an overwhelming experience due to the sheer number of products that can be used, especially when the use cases of many products overlap with one another. In this talk I will cover the use cases of many of the Microsoft products that you can use when building a modern data warehouse, broken down into four areas: ingest, store, prep, and model & serve. It’s a complicated story that I will try to simplify, giving blunt opinions on when to use which products and the pros/cons of each.
The document discusses data mesh vs data fabric architectures. It defines data mesh as a decentralized data processing architecture with microservices and event-driven integration of enterprise data assets across multi-cloud environments. The key aspects of data mesh are that it is decentralized, processes data at the edge, uses immutable event logs and streams for integration, and can move all types of data reliably. The document then provides an overview of how data mesh architectures have evolved from hub-and-spoke models to more distributed designs using techniques like kappa architecture and describes some use cases for event streaming and complex event processing.
Azure data analytics platform - A reference architecture (Rajesh Kumar)
This document provides an overview of Azure data analytics architecture using the Lambda architecture pattern. It covers Azure data and services, including ingestion, storage, processing, analysis and interaction services. It provides a brief overview of the Lambda architecture including the batch layer for pre-computed views, speed layer for real-time views, and serving layer. It also discusses Azure data distribution, SQL Data Warehouse architecture and design best practices, and data modeling guidance.
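As a rough illustration of the Lambda pattern described above, here is a minimal Python sketch of the three layers, assuming plain in-memory dicts stand in for the batch view store and the speed layer; all names and data are invented for the example.

from collections import defaultdict

def batch_layer(master_events):
    """Recompute the batch view from the full, immutable event log."""
    view = defaultdict(int)
    for user, clicks in master_events:
        view[user] += clicks
    return dict(view)

class SpeedLayer:
    """Incrementally maintains a real-time view covering events that
    arrived after the last batch recomputation."""
    def __init__(self):
        self.delta = defaultdict(int)

    def ingest(self, user, clicks):
        self.delta[user] += clicks

def serving_layer(batch_view, speed, user):
    """Answer a query by merging the pre-computed and real-time views."""
    return batch_view.get(user, 0) + speed.delta.get(user, 0)

# Usage: the nightly batch covers the log so far; the speed layer covers the tail.
batch_view = batch_layer([("alice", 3), ("bob", 1), ("alice", 2)])
speed = SpeedLayer()
speed.ingest("alice", 1)                           # event arriving after the batch run
print(serving_layer(batch_view, speed, "alice"))   # -> 6

The point of the pattern shows up in serving_layer: queries never wait on a full recomputation, because the slow batch view and the small real-time delta are merged at read time.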
Is the traditional data warehouse dead? (James Serra)
With new technologies such as Hive LLAP or Spark SQL, do I still need a data warehouse or can I just put everything in a data lake and report off of that? No! In the presentation I’ll discuss why you still need a relational data warehouse and how to use a data lake and a RDBMS data warehouse to get the best of both worlds. I will go into detail on the characteristics of a data lake and its benefits and why you still need data governance tasks in a data lake. I’ll also discuss using Hadoop as the data lake, data virtualization, and the need for OLAP in a big data solution. And I’ll put it all together by showing common big data architectures.
This document provides an overview of measures in Power BI Desktop and includes a tutorial for creating basic measures. It discusses automatic measures, creating measures using DAX functions, and common measure examples like sums, averages, and counts. The tutorial guides the reader through understanding measures and creating their own basic measures in the Power BI Desktop model.
Data Privacy in the DMBOK - No Need to Reinvent the Wheel (DATAVERSITY)
Worldwide, data privacy laws are increasing. Customers are increasingly aware of, and concerned about, how their data is processed. The Chief Privacy Officer is (or should be) a key stakeholder for many Data Governance initiatives, and new terms like “Privacy by Design” and “Privacy Engineering” are entering our conversations with peers. Non-EU organizations selling into the EU will soon have to comply with EU data privacy laws. Data professionals who take a structured, principles-based approach to building their Data Privacy capabilities stand a better chance of sustainable success than those who don’t. Rather than reinventing the wheel, organizations should look at how the DMBOK framework, in conjunction with other approaches and methods, can provide a robust platform for Data Privacy initiatives in their organizations.
The document discusses data architecture solutions for solving real-time, high-volume data problems with low latency response times. It recommends a data platform capable of capturing, ingesting, streaming, and optionally storing data for batch analytics. The solution should provide fast data ingestion, real-time analytics, fast action, and quick time to value. Multiple data sources like logs, social media, and internal systems would be ingested using Apache Flume and Kafka and analyzed with Spark/Storm streaming. The processed data would be stored in HDFS, Cassandra, S3, or Hive. Kafka, Spark, and Cassandra are identified as key technologies for real-time data pipelines, stream analytics, and high availability persistent storage.
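As a hedged sketch of the streaming leg of such a platform, the PySpark job below consumes a Kafka topic and maintains per-minute event counts; the broker address and topic name are assumptions, and the console sink stands in for the HDFS/Cassandra/S3 sinks named above.

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, window

spark = SparkSession.builder.appName("realtime-ingest").getOrCreate()

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker1:9092")  # assumed broker address
       .option("subscribe", "app-logs")                    # assumed topic name
       .load())

# Kafka delivers bytes; cast the payload and bucket events by arrival time.
counts = (raw.selectExpr("CAST(value AS STRING) AS line", "timestamp")
          .groupBy(window(col("timestamp"), "1 minute"))
          .count())

query = (counts.writeStream
         .outputMode("complete")   # windows are re-emitted as they update
         .format("console")        # a real job would write to a durable sink
         .start())
query.awaitTermination()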
ADV Slides: Strategies for Fitting a Data Lake into a Modern Data Architecture (DATAVERSITY)
Whether to take data ingestion cycles off the ETL tool and the Data Warehouse or to facilitate competitive Data Science and building algorithms in the organization, the Data Lake — a place for unmodeled and vast data — will be provisioned widely in 2019.
Though it doesn’t have to be complicated, the Data Lake has a few key design points that are critical, and it does need to follow some principles for success. Build the Data Lake, but avoid building the Data Swamp! The tool ecosystem is building up around the Data Lake, and soon many organizations will have both a robust Lake and a Data Warehouse. We will discuss policy to keep them straight, send “horses to courses,” and keep up users’ confidence in the data platforms.
As for platform, although Hadoop received the early majority of Data Lakes, organizations are now weighing in that the Data Lake will be built in Cloud object storage. We’ll discuss these options as well.
Get this data point for your Data Lake journey.
Basic Introduction of Data Warehousing from Adiva Consulting (adivasoft)
This document provides an overview of Hyperion Essbase & Planning Training. It discusses key concepts like raw data transformation into information, online transaction processing (OLTP) systems, challenges with current data management, the purpose of data warehousing and data marts. It also covers dimensional modeling best practices, types of fact and dimension tables, and how Essbase is tuned for analysis and provides advantages over traditional databases for analytics.
Data Lakes are meant to support many of the same analytics capabilities of Data Warehouses while overcoming some of the core problems. Yet Data Lakes have a distinctly different technology base. This webinar will provide an overview of the standard architecture components of Data Lakes.
This will include:
The Lab and the factory
The base environment for batch analytics
Critical governance components
Additional components necessary for real-time analytics and ingesting streaming data
Snowflake + Power BI: Cloud Analytics for Everyone (Angel Abundez)
This document discusses architectures for using Snowflake and Power BI together. It begins by describing the benefits of each technology. It then outlines several architectural scenarios for connecting Snowflake to Power BI, including using a Power BI gateway, without a gateway, and connecting to Analysis Services. The document also provides examples of usage scenarios and developer best practices. It concludes with a section on data governance considerations for architectures with and without a Power BI gateway.
Big Data, IoT, data lake, unstructured data, Hadoop, cloud, and massively parallel processing (MPP) are all just fancy words unless you can find use cases for all this technology. Join me as I talk about the many use cases I have seen, from streaming data to advanced analytics, broken down by industry. I’ll show you how all this technology fits together by discussing various architectures and the most common approaches to solving data problems, and hopefully set off light bulbs in your head on how big data can help your organization make better business decisions.
Big Data: The 4 Layers Everyone Must Know (Bernard Marr)
The document discusses the 4 key layers of a big data system:
1. The data source layer where data arrives from various sources like sales records, social media, etc.
2. The data storage layer where big data is stored using systems like Hadoop or Google File System. It also requires a database system.
3. The data processing/analysis layer where tools like MapReduce are used to select, analyze, and format the data to glean insights.
4. The data output layer is how the insights are communicated to decision makers through reports, charts and recommendations to take action.
A work by Zhamak Dehghani, Principal Consultant, ThoughtWorks
https://martinfowler.com/articles/data-monolith-to-mesh.html
How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh
Many enterprises are investing in their next-generation data lake in the hope of democratizing data at scale to provide business insights and, ultimately, make automated intelligent decisions. Data platforms based on the data lake architecture have common failure modes that lead to unfulfilled promises at scale. To address these failure modes we need to shift from the centralized paradigm of the lake, or its predecessor, the data warehouse, to a paradigm that draws from modern distributed architecture: treating domains as the first-class concern, applying platform thinking to create self-serve data infrastructure, and treating data as a product.
1) Airlines and stock exchanges generate large amounts of data, with airlines collecting 10 terabytes per 30 minutes of flight time and the NYSE generating 1 terabyte of trade data daily.
2) Big data refers to a firm's ability to store, process, and access large amounts of data to make effective decisions and serve customers. What constitutes "big data" can be measured in bytes, kilobytes, megabytes, terabytes and larger units.
3) Hadoop is an open-source software framework for distributed storage and processing of large datasets across clusters of computer servers using simple programming models. It allows for the distributed processing of large datasets in a reliable, fault-tolerant manner.
Lessons in Data Modeling: Data Modeling & MDM (DATAVERSITY)
Master Data Management (MDM) can create a 360 view of core business assets such as Customer, Product, Vendor, and more. Data modeling is a core component of MDM in both creating the technical integration between disparate systems and, perhaps more importantly, aligning business definitions & rules.
Join this webcast to learn how to effectively apply a data model in your MDM implementation.
Data warehousing involves assembling and managing data from various sources to provide an integrated view of enterprise information. A data warehouse contains consolidated, historical data used to support management decision making. It differs from operational databases by containing aggregated, non-volatile data optimized for queries rather than updates. The extract, transform, load (ETL) process migrates data from source systems to the warehouse, transforming it as needed. Process managers oversee loading, maintaining, and querying the warehouse data.
A star schema is a data warehouse design that represents multidimensional data with one or more fact tables referencing any number of dimension tables. It consists of a central fact table surrounded by dimension tables that describe the facts. To design a star schema, business processes are identified, measures or facts are selected, dimensions for the facts are determined, dimension columns are listed, and the lowest level of summary in the fact table is defined. Star schemas have advantages like simpler queries, simplified business reporting, query performance gains, and fast aggregations. The ERDPlus tool can be used to implement star schemas.
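To make the design concrete, here is a toy star schema built with Python's standard sqlite3 module; the table and column names are invented, not taken from the document. It shows a central fact table at store/day grain, two dimension tables, and the kind of aggregate star-join query the schema is optimized for.

import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_date  (date_key INTEGER PRIMARY KEY, year INT, month INT);
CREATE TABLE dim_store (store_key INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE fact_sales (
    date_key  INTEGER REFERENCES dim_date(date_key),
    store_key INTEGER REFERENCES dim_store(store_key),
    revenue   REAL          -- the measure, at the chosen grain (store/day)
);
INSERT INTO dim_date  VALUES (1, 2024, 1), (2, 2024, 2);
INSERT INTO dim_store VALUES (10, 'New York'), (20, 'Seattle');
INSERT INTO fact_sales VALUES (1, 10, 500.0), (1, 20, 200.0), (2, 10, 300.0);
""")

# Star join: constrain and group by dimension attributes, sum the fact measure.
for row in con.execute("""
    SELECT s.city, d.year, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_date d  ON f.date_key  = d.date_key
    JOIN dim_store s ON f.store_key = s.store_key
    GROUP BY s.city, d.year
"""):
    print(row)   # e.g. ('New York', 2024, 800.0)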
Data Mesh at CMC Markets: Past, Present and Future (Lorenzo Nicora)
This document discusses CMC Markets' implementation of a data mesh to improve data management and sharing. It provides an overview of CMC Markets, the challenges of their existing decentralized data landscape, and their goals in adopting a data mesh. The key sections describe what data is included in the data mesh, how they are using cloud infrastructure and tools to enable self-service, their implementation of a data discovery tool to make data findable, and how they are making on-premise data natively accessible in the cloud. Adopting the data mesh framework requires organizational changes, but enables autonomy, innovation and using data to power new products.
This document reviews several existing data management maturity models to identify characteristics of an effective model. It discusses maturity models in general and how they aim to measure the maturity of processes. The document reviews ISO/IEC 15504, the original maturity model standard, outlining its defined structure and relationship between the reference model and assessment model. It discusses how maturity levels and capability levels are used to characterize process maturity. The document also looks at issues with maturity models and how they can be improved.
Intuit's Data Mesh - Data Mesh Learning Community meetup 5.13.2021 (Tristan Baker)
Past, present, and future of data mesh at Intuit. This deck describes a vision and strategy for improving data worker productivity through a Data Mesh approach to organizing data and holding data producers accountable. Delivered at the inaugural Data Mesh Learning meetup on 5/13/2021.
This document discusses data warehousing, including its definition, importance, components, strategies, ETL processes, and considerations for success and pitfalls. A data warehouse is a collection of integrated, subject-oriented, non-volatile data used for analysis. It allows more effective decision making through consolidated historical data from multiple sources. Key components include summarized and current detailed data, as well as transformation programs. Common strategies are enterprise-wide and data mart approaches. ETL processes extract, transform and load the data. Clean data and proper implementation, training and maintenance are important for success.
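A minimal, illustrative ETL sketch in Python, under the assumption that the source is a CSV export and the target is a relational table; the file and column names are hypothetical.

import csv, sqlite3

def extract(path):
    with open(path, newline="") as f:
        yield from csv.DictReader(f)

def transform(rows):
    for r in rows:
        # Typical cleanup: trim whitespace, standardize case, cast types,
        # and drop rows that fail validation.
        try:
            yield (r["customer_id"].strip(), r["country"].strip().upper(),
                   float(r["amount"]))
        except (KeyError, ValueError):
            continue   # a real pipeline would route rejects to an error file

def load(rows, con):
    con.execute("CREATE TABLE IF NOT EXISTS sales(customer_id, country, amount)")
    con.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    con.commit()

con = sqlite3.connect("warehouse.db")
load(transform(extract("source_system_export.csv")), con)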
Strategic imperative: the enterprise data model (DATAVERSITY)
With today's increasingly complex data ecosystems, the Enterprise Data Model (EDM) is a strategic imperative that every organization should adopt. An Enterprise Data Model provides context and consistency for all organizational data assets, as well as a classification framework for data governance. Enterprise modeling is also totally consistent with agile workflows, evolving incrementally to keep pace with changing organizational factors. In this session, IDERA’s Ron Huizenga will discuss the increasing importance of the EDM, how it serves as a framework for all enterprise data assets, and provides a foundation for data governance.
This document outlines the City of Dallas' data management strategy for 2019-2022. The strategy aims to establish a standard, City-wide approach to collecting, storing, managing, and processing data. It establishes a data governance structure and framework to help the City gain benefits from its data assets by controlling, monitoring, and protecting data use. The data management strategy is tightly coupled with IT governance and project management to create a well-planned approach to managing the City's data.
This document discusses open source tools for big data analytics. It introduces Hadoop, HDFS, MapReduce, HBase, and Hive as common tools for working with large and diverse datasets. It provides overviews of what each tool is used for, its architecture and components. Examples are given around processing log and word count data using these tools. The document also discusses using Pentaho Kettle for ETL and business intelligence projects with big data.
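The word-count example mentioned above is the canonical MapReduce illustration. Here is a hedged Python version in the Hadoop Streaming style, where the mapper emits (word, 1) pairs, an external sort groups equal keys, and the reducer sums each group.

import sys
from itertools import groupby

def mapper(lines):
    for line in lines:
        for word in line.split():
            print(f"{word.lower()}\t1")

def reducer(sorted_pairs):
    # Input arrives sorted by word, so equal keys are adjacent.
    keyed = (line.rstrip("\n").split("\t") for line in sorted_pairs)
    for word, group in groupby(keyed, key=lambda kv: kv[0]):
        print(f"{word}\t{sum(int(count) for _, count in group)}")

if __name__ == "__main__":
    # Run as:  cat input.txt | python wc.py map | sort | python wc.py reduce
    if sys.argv[1] == "map":
        mapper(sys.stdin)
    else:
        reducer(sys.stdin)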
Slim Baltagi, director of Enterprise Architecture at Capital One, gave a presentation at Hadoop Summit on major trends in big data analytics. He discussed 1) increasing portability between execution engines using Apache Beam, 2) the emergence of stream analytics to enable real-time insights, and 3) leveraging in-memory technologies. He also covered 4) rapid application development tools, 5) open-sourcing of machine learning systems, and 6) hybrid cloud deployments of big data applications across on-premise and cloud environments.
Big Data: Introducing BigInsights, IBM's Hadoop- and Spark-based analytical p... (Cynthia Saracco)
This document provides an overview of IBM's BigInsights product for analyzing big data. It discusses how BigInsights uses the open source Apache Hadoop and Spark platforms as its core with additional IBM technologies and features added on. BigInsights allows users to analyze both structured and unstructured data at large volumes and in real-time. It also integrates with other IBM analytics and data management products to provide a full big data analytics solution.
The document discusses big data and big data analytics in banking. It defines big data as large, complex datasets that are difficult to process and store using traditional databases. Sources of big data include social media, sensors, transportation services, online shopping, and mobile apps. Characteristics of big data include volume, velocity, and variety. Hadoop is presented as an open source framework for analyzing big data using HDFS for storage and MapReduce for processing. The benefits of big data analytics in banking include fraud detection, risk management, customer segmentation, churn analysis, and sentiment analysis to improve customer experience.
The document provides an overview of IBM's big data and analytics capabilities. It discusses what big data is, the characteristics of big data including volume, velocity, variety and veracity. It then covers IBM's big data platform which includes products like InfoSphere Data Explorer, InfoSphere BigInsights, IBM PureData Systems and InfoSphere Streams. Example use cases of big data are also presented.
This is a run-through at a 200 level of the Microsoft Azure Big Data Analytics for the Cloud data platform based on the Cortana Intelligence Suite offerings.
A look at what is driving Big Data: market projections to 2017, plus customer and infrastructure priorities. What drove Big Data in 2013 and what the barriers were. An introduction to business analytics: its types, how to build an analytics approach, and ten steps to building your analytics platform within your company, plus key takeaways.
Big Data - The 5 Vs Everyone Must Know (Bernard Marr)
This slide deck, by Big Data guru Bernard Marr, outlines the 5 Vs of big data. It describes in simple language what big data is, in terms of Volume, Velocity, Variety, Veracity and Value.
This document provides an overview of big data. It defines big data as large volumes of diverse data that are growing rapidly and require new techniques to capture, store, distribute, manage, and analyze. The key characteristics of big data are volume, velocity, and variety. Common sources of big data include sensors, mobile devices, social media, and business transactions. Tools like Hadoop and MapReduce are used to store and process big data across distributed systems. Applications of big data include smarter healthcare, traffic control, and personalized marketing. The future of big data is promising with the market expected to grow substantially in the coming years.
This presentation, by big data guru Bernard Marr, outlines in simple terms what Big Data is and how it is used today. It covers the 5 V's of Big Data as well as a number of high value use cases.
S ba0881 big-data-use-cases-pearson-edge2015-v7 (Tony Pearson)
IBM is a market leader in big data and analytics solutions. This session explains the basics of Big Data, with actual use cases of clients who have benefited from IBM solutions in this space, followed by architectures with IBM BigInsights, BigSQL, Platform Symphony and Spectrum Scale.
IBM Academy of Technology & Cognitive Computing (Nico Chillemi)
I delivered this presentation at the University of Chieti-Pescara in Abruzzo (Italy) in September 2015, introducing the IBM Academy of Technology and talking about Cognitive Computing and Analytics with IBM Watson and IBM IT Operations Analytics Log Analysis (ITOA). The video in Italian is available on YouTube; please contact me if you are interested. Thanks to Amanda Tenedini for the help with social media and to Piero Leo for the help with IBM Watson.
2019 Top IT Trends - Understanding the fundamentals of the next generation ... (Tony Pearson)
This session covers six major IT trends for 2019: Internet of Things (IoT), Big Data Analytics, Artificial Intelligence (AI), Containers and Orchestration, Blockchain, and Hybrid Multicloud. Presented at IBM TechU in Johannesburg, South Africa September 2019
This document provides an overview and agenda for the 2019 Top IT Trends presented at the 2019 IBM Systems Technical University. The agenda covers emerging technologies including Internet of Things (IoT), big data analytics, artificial intelligence, containers and orchestration, blockchain, and hybrid multicloud. For each technology, key concepts and considerations are discussed at a high level.
This document provides a summary of the key IT trends discussed at the 2019 IBM Systems Technical University. The topics covered include Internet of Things (IoT), big data analytics, artificial intelligence, blockchain, hybrid multicloud, containers, and Docker. For each trend, the document outlines some of the important concepts, technologies, and considerations discussed in the corresponding presentation session. The document aims to help attendees understand these emerging trends that are shaping modern IT.
Robert Lecklin - BigData is making a difference (IBM Sverige)
What can big data do for your company? Be inspired by Robert Lecklin, who has helped several customers implement their big data strategy. In doing so, they have managed to turn worthless data into valuable insights. In this session he will share experiences from customer cases where a big data strategy has made a decisive difference...
Industry and academic partnerships July 2015 final (Steven Miller)
The document discusses building skills to address the growing demand for data professionals through partnerships between IBM and academia, including providing free access to IBM's Bluemix platform and Watson cognitive services for students and faculty to develop skills in areas such as data science, data engineering, and data policy. It also outlines programs and competitions IBM sponsors to engage students in building data skills and foster collaboration between universities and IBM researchers.
Infrastructure Designed for Cognitive Workloads: Why is it Crucial? - Xavier ... (WithTheBest)
In the IT infrastructure for the cognitive era that we live in today, you have to think differently about how you design, build, and deliver services. Artificial Intelligence can help you improve your designs for your cognitive business. Discover how you can deliver through cloud platforms. This infrastructure sets up your business with new computing frontiers.
Xavier Vasques, Technical Director, Systems Hardware, IBM France
ICP for Data - Enterprise platform for AI, ML and Data Science (Karan Sachdeva)
IBM Cloud Private for Data is the ultimate platform for AI, ML, and data science workloads: an integrated analytics platform based on containers and microservices. It works with Kubernetes and Docker, even with Red Hat OpenShift, and delivers a variety of business use cases across industries: financial services, telco, retail, manufacturing, and more.
S sy0883 smarter-storage-strategy-edge2015-v4 (Tony Pearson)
IBM Smarter Storage Strategy explains IBM's direction for its IBM System Storage product line. This includes support for Big Data analytics, optimizing for traditional workloads, and helping clients transition to Cloud.
Preparing the next generation for the cognitive era (Steven Miller)
Short version of my latest presentation used during a panel session at the ASA Research Symposium at Southern Illinois University Carbondale on November 21st 2015
The document provides an overview of analyzing big data using IBM technologies. It discusses how big data is growing rapidly from various sources and the challenges of handling large volumes, varieties, velocities, and veracities of data. It then summarizes IBM's approach to big data analytics using their software stack and platforms like Hadoop and Power Systems. The future of analytics is discussed with the OpenPOWER Foundation and POWER8's Coherent Accelerator Processor Interface (CAPI) which allows custom hardware to participate directly in application memory spaces.
20150702 - Strategy and Business Value for connected appliances public version (Thorsten Schroeer)
Thorsten Schroeer discusses the opportunities for appliance manufacturers in the Internet of Things. He outlines the key elements of a connected appliance strategy, including engaging consumers, integrating appliances into smart home experiences, enabling remote diagnostics and control, and ensuring security and compliance. Schroeer recommends appliance companies partner with proven IoT providers to help define architectures, design platforms, conduct testing, and deploy solutions that unlock business value from IoT data. He cautions companies to enter IoT carefully by building strong platforms, focusing on consumer and business needs, adopting open standards, and taking a global approach with local deployments.
Digital transformation with data science and AI: implementing AI at scale with IBM Cloud Pak for Data, an end-to-end cloud-native platform easily implemented in a private cloud, public cloud, or hybrid cloud. Combining the power of open source tools with the enterprise support of IBM helps organizations realize value fast and accelerate their efforts to become digital companies.
High Value Business Intelligence for IBM Platform compute environmentsGabor Samu
IBM Platform Analytics is an advanced analysis and visualization tool for analyzing workload data from IBM Platform LSF and IBM Platform Symphony clusters. It allows organizations to correlate workload, resource and license data from multiple clusters for data-driven decision making.
The document discusses IBM's Big Data and analytics solutions, including Watson Explorer which provides a single interface to access both structured and unstructured data. It also outlines several common use cases for big data such as customer analytics, security intelligence, and operations analysis. The final section provides contact information for an IBM sales manager to discuss these big data solutions.
Digital Transformation: How to Run Best-in-Class IT Operations in a World of ... (Precisely)
IT leaders looking to move beyond reactive and ad hoc troubleshooting need to find the intersection of maintaining existing systems while still driving innovation - solving for the present while preparing for the future. Identifying ways to bring existing infrastructure and legacy systems into the modern world can create the business advantage you need.
View the conversation with Splunk’s Chief Technology Advocate, Andi Mann and Syncsort’s Chief Product Officer, David Hodgson where we discuss the digital transformation taking place in IT and how machine learning and AI are helping IT leaders create a more business-centric view of their world including:
• The importance of data sharing and collaboration between mainframe and distributed IT
• The value of integrating legacy data sources and existing infrastructure into the modern world
• Achieving an end-to-end view of IT operations and application performance with machine learning
In this deck from the 2019 UK HPC Conference, Glyn Bowden from HPE presents: The Eco-System of AI and How to Use It.
"This presentation walks through HPE's current view on AI applications, where it is driving outcomes and innovation, and where the challenges lay. We look at the eco-system that sits around an AI project and look at ways this can impact the success of the endeavor."
Watch the video: https://wp.me/p3RLHQ-kVS
Learn more: https://www.hpe.com/us/en/solutions/artificial-intelligence.html
and
http://hpcadvisorycouncil.com/events/2019/uk-conference/agenda.php
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
WHY DO SO MANY ANALYTICS PROJECTS STILL FAIL? (Haluk Demirkan)
“KEY CONSIDERATIONS FOR DEEP ANALYTICS ON BIG DATA FOR DEEP LEARNING”
What is Big Data? Big Data, which means many things to many people, is not a new technological fad. In addition to providing innovative solutions and operational insights to enduring challenges and opportunities, big data with deep analytics instigates new ways to transform processes, organizations, entire industries, and even society altogether. Pushing the boundaries of deep data analytics uncovers new opportunities.
Big Data is not just “big.” The exponentially growing volume of the data is only one of many characteristics that are often associated with Big Data, such as variety, velocity, veracity and others (6Vs).
By now, we should already have the knowledge and experience to build successful data- and analytics-enabled decision support systems. So why do these projects still fail, and why are executives and users still so unhappy? While there are many reasons for this high failure rate, the biggest is that companies still treat these projects as just another IT project. Big data analytics is neither a product nor a computer system. It is, rather, a constantly evolving strategy, vision, and architecture that continuously seeks to align an organization's operations and direction with its strategic business goals through strategic, tactical, and operational decisions.
Similar to IBM Big Data Analytics Concepts and Use Cases
Global Situational Awareness of A.I. and where it's headed (vikram sood)
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be unleashed, and before long, The Project will be on. If we're lucky, we'll be in an all-out race with the CCP; if we're unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake (Walaa Eldin Moustafa)
Dynamic policy enforcement is becoming an increasingly important topic in today’s world where data privacy and compliance is a top priority for companies, individuals, and regulators alike. In these slides, we discuss how LinkedIn implements a powerful dynamic policy enforcement engine, called ViewShift, and integrates it within its data lake. We show the query engine architecture and how catalog implementations can automatically route table resolutions to compliance-enforcing SQL views. Such views have a set of very interesting properties: (1) They are auto-generated from declarative data annotations. (2) They respect user-level consent and preferences (3) They are context-aware, encoding a different set of transformations for different use cases (4) They are portable; while the SQL logic is only implemented in one SQL dialect, it is accessible in all engines.
#SQL #Views #Privacy #Compliance #DataLake
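This is not ViewShift's actual code, but the core idea (routing reads through an auto-generated, compliance-enforcing SQL view) can be sketched with Python's sqlite3; the schema and consent rule below are invented for illustration.

import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE members (id INT, email TEXT, country TEXT, ads_consent INT);
INSERT INTO members VALUES
  (1, 'a@example.com', 'DE', 0),
  (2, 'b@example.com', 'US', 1);

-- The "compliance view": mask PII unless the member consented. In a real
-- engine, this view body would be generated from declarative annotations.
CREATE VIEW members_for_ads AS
SELECT id,
       CASE WHEN ads_consent = 1 THEN email ELSE 'REDACTED' END AS email,
       country
FROM members;
""")

# Analysts query the view, not the raw table; consent is enforced in one place.
print(con.execute("SELECT * FROM members_for_ads").fetchall())
# -> [(1, 'REDACTED', 'DE'), (2, 'b@example.com', 'US')]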
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataKiwi Creative
Harness the power of AI-backed reports, benchmarking and data analysis to predict trends and detect anomalies in your marketing efforts.
Peter Caputa, CEO at Databox, reveals how you can discover the strategies and tools to increase your growth rate (and margins!).
From metrics to track to data habits to pick up, enhance your reporting for powerful insights to improve your B2B tech company's marketing.
- - -
This is the webinar recording from the June 2024 HubSpot User Group (HUG) for B2B Technology USA.
Watch the video recording at https://youtu.be/5vjwGfPN9lw
Sign up for future HUG events at https://events.hubspot.com/b2b-technology-usa/
Codeless Generative AI Pipelines
(GenAI with Milvus)
https://ml.dssconf.pl/user.html#!/lecture/DSSML24-041a/rate
Discover the potential of real-time streaming in the context of GenAI as we delve into the intricacies of Apache NiFi and its capabilities. Learn how this tool can significantly simplify the data engineering workflow for GenAI applications, allowing you to focus on the creative aspects rather than the technical complexities. I will guide you through practical examples and use cases, showing the impact of automation on prompt building. From data ingestion to transformation and delivery, witness how Apache NiFi streamlines the entire pipeline, ensuring a smooth and hassle-free experience.
Timothy Spann
https://www.youtube.com/@FLaNK-Stack
https://medium.com/@tspann
https://www.datainmotion.dev/
milvus, unstructured data, vector database, zilliz, cloud, vectors, python, deep learning, generative ai, genai, nifi, kafka, flink, streaming, iot, edge
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag... (sameer shah)
"Join us for STATATHON, a dynamic 2-day event dedicated to exploring statistical knowledge and its real-world applications. From theory to practice, participants engage in intensive learning sessions, workshops, and challenges, fostering a deeper understanding of statistical methodologies and their significance in various fields."
Analysis insight about a Flyball dog competition team's performance (roli9797)
Insights from my analysis of a Flyball dog competition team's performance over the past year. Find more: https://github.com/rolandnagy-ds/flyball_race_analysis/tree/main
End-to-end pipeline agility - Berlin Buzzwords 2024 (Lars Albertsson)
We describe how we achieve high change agility in data engineering by eliminating the fear of breaking downstream data pipelines through end-to-end pipeline testing, and by using schema metaprogramming to safely eliminate boilerplate involved in changes that affect whole pipelines.
A quick poll on agility in changing pipelines from end to end indicated a huge span in capabilities. For the question "How long time does it take for all downstream pipelines to be adapted to an upstream change," the median response was 6 months, but some respondents could do it in less than a day. When quantitative data engineering differences between the best and worst are measured, the span is often 100x-1000x, sometimes even more.
A long time ago, we suffered at Spotify from fear of changing pipelines due to not knowing what the impact might be downstream. We made plans for a technical solution to test pipelines end-to-end to mitigate that fear, but the effort failed for cultural reasons. We eventually solved this challenge, but in a different context. In this presentation we will describe how we test full pipelines effectively by manipulating workflow orchestration, which enables us to make changes in pipelines without fear of breaking downstream.
Making schema changes that affect many jobs also involves a lot of toil and boilerplate. Using schema-on-read mitigates some of it, but has drawbacks since it makes it more difficult to detect errors early. We will describe how we have rejected this tradeoff by applying schema metaprogramming, eliminating boilerplate but keeping the protection of static typing, thereby further improving agility to quickly modify data pipelines without fear.
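As a loose sketch of the end-to-end testing idea (not the speakers' actual tooling), the test below runs a tiny two-job pipeline on fixture data in a scratch directory and asserts on the final artifact; real setups drive the workflow orchestrator instead, and all job and file names here are invented.

import json, tempfile, pathlib

def ingest(raw_path, out_path):        # upstream job
    rows = [json.loads(l) for l in pathlib.Path(raw_path).read_text().splitlines()]
    pathlib.Path(out_path).write_text(json.dumps(rows))

def aggregate(in_path, out_path):      # downstream job that must not break
    rows = json.loads(pathlib.Path(in_path).read_text())
    total = sum(r["amount"] for r in rows)
    pathlib.Path(out_path).write_text(json.dumps({"total": total}))

def test_pipeline_end_to_end():
    # Run the whole chain on fixture data, then assert on the final output:
    # an upstream change that breaks a downstream job fails here, before it ships.
    with tempfile.TemporaryDirectory() as tmp:
        d = pathlib.Path(tmp)
        (d / "raw.jsonl").write_text('{"amount": 2}\n{"amount": 3}\n')
        ingest(d / "raw.jsonl", d / "staged.json")
        aggregate(d / "staged.json", d / "report.json")
        assert json.loads((d / "report.json").read_text()) == {"total": 5}

test_pipeline_end_to_end()
print("pipeline test passed")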