IBM Software
Understanding big data
so you can act with confidence
More data, more problems? Not if you have an agile, automated
information integration and governance program in place
1 Introduction
2 The four Vs: Challenges inherent to big data
3 Big data, big opportunities
4 Understanding: The key to successfully leveraging big data
5 Why InfoSphere?
Introduction
Business leaders are eager to harness
the power of big data. However, as the
opportunity increases, it becomes
exponentially more difficult to ensure that
source information is trustworthy and
protected. If this trustworthiness issue is
not addressed directly, end users may lose
confidence in the insights generated from
their data—which can result in a failure to
act on opportunities or against threats.
Take, for example, the case of a large
multinational corporation whose CFO
asked a very simple question every month:
How did sales this month compare to
the same month last year? The answer
turned out to be quite complicated.
Sales were calculated and reported
in various systems: material planning,
financial, rent and royalty, real estate and
so forth. Each system reported a different
number because each system defined
a “sale” slightly differently, and therefore
extracted dissimilar data from the millions
of daily transactions generated around
the world. So what was the truth? No
one actually knew.
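To make the problem concrete, the sketch below totals the same handful of transactions under three different definitions of a "sale". The transactions, field names and definitions are hypothetical and chosen only for illustration; they are not taken from the company described above.

```python
# Hypothetical transactions; every field name and value is an illustrative assumption.
transactions = [
    {"amount": 1000, "kind": "invoice", "status": "paid"},
    {"amount": 400, "kind": "invoice", "status": "open"},
    {"amount": -150, "kind": "return", "status": "paid"},
    {"amount": 250, "kind": "royalty", "status": "paid"},
]

# Three plausible but incompatible definitions of a "sale" (all assumed).
definitions = {
    # Anything that has been settled, net of returns.
    "financial": lambda t: t["status"] == "paid",
    # Every invoiced order, whether or not it has been paid.
    "material_planning": lambda t: t["kind"] == "invoice",
    # Paid invoices and paid royalties, ignoring returns.
    "royalty": lambda t: t["kind"] in ("invoice", "royalty") and t["status"] == "paid",
}


def total_sales(rows, is_sale):
    """Sum the amounts of the rows that a given definition counts as sales."""
    return sum(t["amount"] for t in rows if is_sale(t))


report = {name: total_sales(transactions, rule) for name, rule in definitions.items()}
print(report)
```

Each system is internally consistent, yet the three totals disagree, which is the situation the CFO's question exposed.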
The sheer volume and complexity of big
data mean that the traditional method
of discovering, governing and correcting
information using manual stewardship
may not apply. Information integration and
governance must be implemented within
big data applications, providing appropriate
governance and rapid integration from the
start. By automating information integration
and governance and employing it at the
point of data creation, organizations can
boost confidence in their big data.
A solid information integration and
governance program should include
automated discovery, profiling and
understanding of diverse datasets to
provide context and enable employees
to make informed decisions. It must be
agile to accommodate a wide variety of
data and seamlessly integrate with various
technologies, from data marts to Apache
Hadoop systems. And it must automatically
discover, protect and monitor sensitive
information as part of big data applications.
IBM® InfoSphere® is designed to do all
of these things by evolving information
integration and governance to meet the
challenges presented by big data.
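As a rough illustration of what automated profiling means in practice, the sketch below computes a few basic statistics for each column of a small dataset. It is a generic example in plain Python, not InfoSphere functionality; the function names, sample rows and chosen statistics are assumptions.

```python
from collections import Counter


def profile_column(values):
    """Return simple profile statistics for one column of raw values."""
    non_null = [v for v in values if v not in (None, "")]
    type_counts = Counter(type(v).__name__ for v in non_null)
    return {
        "count": len(values),
        "null_rate": 1 - len(non_null) / len(values) if values else 0.0,
        "distinct": len(set(non_null)),
        "types": dict(type_counts),
    }


def profile_table(rows):
    """Profile every column that appears in a list of dict records."""
    columns = sorted({key for row in rows for key in row})
    return {col: profile_column([row.get(col) for row in rows]) for col in columns}


# Hypothetical sample data with a mixed-type ID column and missing countries.
rows = [
    {"customer_id": 1, "country": "US"},
    {"customer_id": 2, "country": ""},
    {"customer_id": "3", "country": "US"},
    {"customer_id": 4},
]

profile = profile_table(rows)
for column, stats in profile.items():
    print(column, stats)
```

Even this small profile surfaces two issues a steward would want to know about: half of the country values are missing, and the ID column mixes integers with strings.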
The four Vs: Challenges inherent to big data
Every second of every day, businesses
generate more data. Researchers at
IDC estimate that by the end of 2020,
the amount of data created and copied
annually will exceed 44 zettabytes, or
44 trillion gigabytes. That’s a 40 percent
annual data growth rate through the end
of the decade.¹ All of that big data
represents a big opportunity
for organizations.
Companies recognize that their big data
contains valuable information. They are
eager to analyze it to obtain actionable
insights that could help them deepen
customer relationships, prevent threats
and fraud, optimize operations and identify
new revenue opportunities.
Unfortunately, extracting valuable
information from big data isn’t as easy
as it sounds. Big data amplifies any
existing problems in your infrastructure,
processes or even the data itself. To make
the most of the information in their systems,
companies must successfully deal with the
“four Vs” that distinguish big data:
variety, volume, velocity and veracity.
The first three—variety, volume and velocity—
define big data; when you have a large
volume of data coming in from a wide
variety of applications and formats and it’s
moving and changing at a rapid velocity,
that’s when you know you have big data.
The fourth V, veracity, is a measure of the
accuracy and trustworthiness of your data.
Veracity is a goal—one that the variety,
volume and velocity of big data make
harder to achieve.
Your company can take advantage of the
opportunities available in big data only when
you have processes and solutions that can
handle all four Vs.
According to IBM, “Every day, we create 2.5 quintillion bytes of data—so
much that 90 percent of the data in the world today has been created in the
last two years alone.”²
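The scale figures quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below assumes decimal (SI) units and a constant growth rate; both are simplifying assumptions rather than anything stated by IDC or IBM, and the function names are illustrative.

```python
import math

BYTES_PER_GB = 10**9    # SI gigabyte (assumption: decimal, not binary, units)
BYTES_PER_ZB = 10**21   # SI zettabyte


def zettabytes_to_gigabytes(zb):
    """Convert whole zettabytes to gigabytes using SI prefixes."""
    return zb * BYTES_PER_ZB // BYTES_PER_GB


def doubling_time_years(annual_growth):
    """Years needed for volume to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


def daily_bytes_to_zb_per_year(bytes_per_day, days=365):
    """Scale a daily byte count to zettabytes per year."""
    return bytes_per_day * days / BYTES_PER_ZB


gigabytes = zettabytes_to_gigabytes(44)               # 44 followed by 12 zeros
doubling = doubling_time_years(0.40)                  # a little over two years
yearly_zb = daily_bytes_to_zb_per_year(25 * 10**17)   # 2.5 quintillion bytes per day

print(gigabytes, round(doubling, 2), round(yearly_zb, 2))
```

Under these assumptions, 44 zettabytes is indeed 44 trillion gigabytes, a 40 percent annual growth rate roughly doubles volume every two years, and 2.5 quintillion bytes per day adds up to a little under one zettabyte per year.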
Volume
Increasing data volume
is at the heart of the
big data challenge.
Large data volumes can
cause many obvious
technical problems,
such as excessive batch processing times,
bottlenecks and so on.
However, the problem goes deeper than
that. What data should an organization care
about? How will it find that data? How
will it know how to leverage that data for
business use? All of these questions point
to a fundamental need to understand
the data—and manual techniques of
examining data as it comes in are simply
unequal to the task of working with big data.
Variety
Enterprise data
comes from many
sources. Customers
and salespeople
generate data every
time they make a sale.
Accountants produce data every time they
create a spreadsheet. Marketers create
data every time they write a report.
In addition, machine data—from sources
such as meter readings and IT system
logs—can increase data volumes
enormously. Organizations also get
constant streams of data from external
sources such as suppliers, consultants,
research firms and news feeds.
All of this data is stored in a dizzying array
of formats. Some gets stored in structured
formats in databases or enterprise resource
planning (ERP) applications, but much of it
is in documents, email messages or even
photos and videos—unstructured formats
that are challenging to manage.
And data doesn’t rest once it is in storage.
It must be moved from application to
application and from system to system so
managers and executives can interpret the
data and come to meaningful conclusions.
Veracity
The first three Vs—
variety, volume and
velocity—define big
data. In contrast, the
fourth V—veracity—
is a desirable aspect
of data that organizations want, but
don’t always get.
Veracity is directly related to the
trustworthiness of an organization’s data.
The first step toward building trust into data
is to understand which data is needed and
for what business purpose. Can knowledge
workers be sure that the data is accurate,
current and complete—and therefore be
confident in any decisions that are based
on that data?
Velocity
In almost every industry,
data is coming in
faster than ever. In
activities such as
stock trades, fraud
detection or patient
health monitoring, seconds—sometimes
microseconds—can be critically important.
Organizations must be able to understand
data as it is streaming in. In most cases,
getting that all-important information
requires moving data quickly from one
application or repository to another,
where it can be processed and analyzed.
Unfortunately, many data integration
solutions lack the high performance that
big data projects require. There is not
enough time to collect the data and
process it in batches—particularly if
a real-time or near–real-time response
is warranted.
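The difference between batch and streaming processing can be sketched in a few lines. The generator below examines each event as it arrives and flags unusual amounts immediately, instead of waiting for a complete batch. The threshold rule, field names and sample events are illustrative assumptions and do not describe any particular product.

```python
def flag_unusual(events, factor=3.0, warmup=3):
    """Yield events whose amount exceeds `factor` times the running mean so far.

    Events are handled one at a time, so a suspicious event can be acted on
    as soon as it arrives rather than after a batch has been collected.
    """
    count, total = 0, 0.0
    for event in events:
        # Only start flagging once a few events have established a baseline.
        if count >= warmup and event["amount"] > factor * (total / count):
            yield event
        count += 1
        total += event["amount"]


# Hypothetical stream of card transactions.
stream = [
    {"id": 1, "amount": 10},
    {"id": 2, "amount": 12},
    {"id": 3, "amount": 11},
    {"id": 4, "amount": 95},
    {"id": 5, "amount": 13},
]

flagged_ids = [event["id"] for event in flag_unusual(stream)]
print(flagged_ids)
```

Because `flag_unusual` is a generator, it works equally well on a list, a file being read line by line or a live message queue consumer.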
Big data, big opportunities
Big data presents significant opportunities
for increased growth and profitability, and
forward-thinking organizations are already
realizing some of these benefits:
• A retailer actively uses social media
data to analyze customer sentiment
and improve loyalty, helping to drive
revenue growth
• A transportation company uses
machine data to optimize logistics,
thereby reducing shipping costs
• A telecommunications company
slashed developer costs by over
50 percent by articulating clear terms
and policies for the data it needed
• A manufacturing company enhanced
the trustworthiness of its reports after
it discovered and eliminated 37 unique
definitions of “customer” across its
enterprise, and agreed to a single,
standard definition
Capturing these sorts of benefits from big
data requires knowing what the business
needs and being able to find key items
within the larger mass of big data.
Begin by articulating the goals of the
business. Determine the analytics and
reports necessary to support those
business objectives. Now you can make
decisions about the data needed to inform
the reports.
Standard terminology is a critical piece of this
process. For example, there should be no
misunderstanding about what constitutes a
“sale” or a “customer.” With standard terms,
organizations can create standard policies,
such as defining the data that makes up a
complete customer record.
Armed with objectives and standard terms
and policies, teams can create data
processing flows that automate the extraction
of relevant data from the mass of big data.
Importantly, this understanding also includes
knowing at all times where data is coming
from, how it is being manipulated and where
it is going. This chain of understanding builds
trust into information—and promotes
confident decision making.
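A policy such as "what makes up a complete customer record" becomes enforceable once it is written down as data. The sketch below is one minimal way to do that; the required fields, function names and sample records are hypothetical.

```python
# Hypothetical policy: the fields that make up a "complete customer record".
REQUIRED_CUSTOMER_FIELDS = ("customer_id", "name", "email", "country")


def missing_fields(record, required=REQUIRED_CUSTOMER_FIELDS):
    """Return the required fields that are absent or blank in a record."""
    missing = []
    for field in required:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            missing.append(field)
    return missing


def split_by_policy(records):
    """Separate records into those that satisfy the policy and those that do not."""
    complete, incomplete = [], []
    for record in records:
        (incomplete if missing_fields(record) else complete).append(record)
    return complete, incomplete


records = [
    {"customer_id": 1, "name": "Ada", "email": "ada@example.com", "country": "UK"},
    {"customer_id": 2, "name": "Bob", "email": " ", "country": "US"},
    {"customer_id": 3, "name": "Cy", "country": "FR"},
]

complete, incomplete = split_by_policy(records)
print(len(complete), "complete,", len(incomplete), "incomplete")
```

Keeping the policy in a single named constant means a change to the agreed definition is made in one place and applies to every processing flow that uses it.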
You can’t govern what you don’t understand
It’s an unfortunately common refrain across
business and IT users when it comes to
discussing data: “That’s not what I meant!”
Organizations, departments and sectors often
use different terms or labels for the same
information, leading to confusion over details
such as what constitutes a customer (is it a
household or an individual?).
In one case, a large manufacturer had 37 different
definitions of the term “Employee ID” across
multiple divisions. Using spreadsheets to record
and compare the representations, the company
discovered at least 15 different data types
and formats, with different characteristics and
usages—a result of multiple legacy systems,
acquired companies with their own systems
and homegrown applications. This meant that
IT spent extra time and used undocumented
knowledge to determine which version to use
in which reports. Developers and analysts then
spent more time determining which data to
use in which tests, examples and tasks. And
business users were hampered by data quality
issues coming from duplicated or missed data.
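The spreadsheet comparison described above can be automated. One common profiling technique is to reduce each value to a format mask and count the distinct masks. The sketch below assumes that approach; the sample IDs are invented and are not the manufacturer's data.

```python
import re
from collections import Counter


def format_mask(value):
    """Reduce a value to its shape: digits become 9, letters become A."""
    text = re.sub(r"[0-9]", "9", str(value))
    return re.sub(r"[A-Za-z]", "A", text)


# Hypothetical "Employee ID" values gathered from several systems.
employee_ids = ["E10023", "E10024", "10-0045", 10046, "emp_77"]

formats = Counter(format_mask(value) for value in employee_ids)
for mask, count in formats.most_common():
    print(mask, count)
```

Four distinct masks in five values is an immediate signal that the field is not governed by a single standard.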
A large European telecom company faced
similar issues when it discovered not just
terminology problems but also an inability to
trace data back to its source. Using InfoSphere
Information Governance Catalog, the company
standardized business terms and policies,
which helped increase confidence in reports.
Next, it redesigned integration processes and
built a clear blueprint, reducing time required
to write actual integration jobs by 50 percent.
Finally, it used the InfoSphere tool to establish a
clear metadata lineage—from source to target,
including all transformations—that uncovered
redundant data. Consolidating and archiving
that data led to a 75 percent reduction in
storage space.
Understanding: The key to successfully leveraging big data
Successful businesses depend on
trusted information. IBM InfoSphere
Information Governance Catalog helps
organizations create an understanding of
big data that enables it to be converted
into trusted information. It allows business
users to play an active role in information-
centric projects and to collaborate with
technical teams—all without the need
for technical training. The end result:
an organization with a consistent
understanding of information, what
it means, how it is used and why it
can be trusted. Decisions are more
accurate and business opportunities
can be readily captured.
Defining “truth”
In a philosophy class, students might
consider the idea that truth is ultimately
unknowable or that truth is constantly
changing. Most organizations do not have
that luxury. Enterprises must have agreed-
upon definitions for important terms so they
can monitor and act on key metrics.
Consider a global financial institution that
has grown rapidly through mergers and
acquisitions. Because each of the
company’s various lines of business grew
independently, each has its own unique
processes, IT systems and definitions for
important terms. It isn’t uncommon for
the CIO, CFO and CMO to use different
definitions, which can lead to confusion
when it comes to analytics and reports.
InfoSphere Information Governance Catalog
includes a business glossary that enables
business and IT to create and agree on
definitions, rules and policies. InfoSphere
Information Governance Catalog also
includes data modeling capabilities that
allow data architects to determine where
each piece of data will come from and
where it will go. This capability helps ensure
that everyone involved in the big data
project knows exactly what key metrics
mean and where the data should originate—
establishing the “truth” as it relates to their
business data.
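The idea of a business glossary can be illustrated with a small data structure: one agreed definition per term, an accountable steward, and links to the technical assets that implement the term. This is a generic sketch and not the InfoSphere Information Governance Catalog API; the class names, fields and sample values are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    name: str
    definition: str
    steward: str
    assets: list = field(default_factory=list)  # linked technical artifacts


class Glossary:
    """Holds exactly one agreed definition per business term."""

    def __init__(self):
        self._terms = {}

    def define(self, term):
        key = term.name.lower()
        if key in self._terms:
            # A second, competing definition is rejected rather than silently stored.
            raise ValueError(f"'{term.name}' is already defined")
        self._terms[key] = term

    def link(self, name, asset):
        self._terms[name.lower()].assets.append(asset)

    def lookup(self, name):
        return self._terms[name.lower()]


glossary = Glossary()
glossary.define(Term("Sale", "A paid invoice, net of returns", steward="Finance"))
glossary.link("sale", "warehouse.fact_sales.net_amount")

print(glossary.lookup("SALE"))
```

Rejecting a duplicate definition forces the conversation between business and IT to happen when the term is created, not when two reports disagree.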
Defining “trustworthy”
Unfortunately, it isn’t enough just to
establish policies and definitions and hope
that people will follow them. To be truly
confident that their data is trustworthy,
organizations must be able to trace its
path through their systems, see where
it came from and understand how it
was manipulated.
InfoSphere Information Governance Catalog
provides this level of transparency. It
includes data lineage capabilities that
enable users to track data back to its
original source and to see every calculation
performed on it along the way, increasing
the data’s accuracy and trustworthiness.
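Data lineage is essentially a graph that can be walked from a report back to its sources. The sketch below shows the idea in plain Python; the dataset names, transformations and dictionary layout are hypothetical and do not reflect how any IBM product stores lineage.

```python
# Each derived dataset maps to (transformation applied, list of input datasets).
# All names and transformations here are hypothetical.
LINEAGE = {
    "monthly_sales_report": ("sum net_amount by month", ["sales_clean"]),
    "sales_clean": ("drop duplicates, convert to USD", ["pos_raw", "web_orders_raw"]),
}


def trace(dataset, lineage=LINEAGE):
    """Walk from a dataset back to its original sources.

    Returns (steps, sources): the transformations met along the way, and the
    datasets that have no recorded upstream inputs.
    """
    steps, sources, stack, seen = [], [], [dataset], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current not in lineage:
            sources.append(current)  # nothing upstream: an original source
            continue
        transformation, inputs = lineage[current]
        steps.append((current, transformation))
        stack.extend(inputs)
    return steps, sorted(sources)


steps, sources = trace("monthly_sales_report")
for name, transformation in steps:
    print(f"{name}: {transformation}")
print("original sources:", sources)
```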
Defining “good”
The “goodness” of data depends upon
its usefulness in analysis, reporting and
decision making. Therefore, developing
an understanding of big data requires
companies to separate useful “good” data
from unhelpful “bad” data. In other words,
organizations must be able to extract only
those bits of data necessary to support a
particular business objective, and set aside
the rest. Filtering data in this way keeps
unnecessary data out of data
warehouses and Hadoop file systems,
creating a more efficient processing
environment and lessening hardware
and software costs.
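As a small illustration of this kind of filtering, the sketch below keeps only the records and fields needed for one business objective before anything is loaded downstream. The objective, field names and sample records are assumptions made for the example.

```python
# Hypothetical objective: analyse completed orders by region.
NEEDED_FIELDS = ("order_id", "region", "amount")


def is_relevant(record):
    """Keep only orders that were actually completed."""
    return record.get("status") == "completed"


def select_for_objective(records, fields=NEEDED_FIELDS, keep=is_relevant):
    """Yield relevant records, reduced to the fields the objective needs."""
    for record in records:
        if keep(record):
            yield {name: record.get(name) for name in fields}


raw = [
    {"order_id": 1, "region": "EMEA", "amount": 120,
     "status": "completed", "user_agent": "Mozilla/5.0"},
    {"order_id": 2, "region": "APAC", "amount": 80,
     "status": "cancelled", "user_agent": "Mozilla/5.0"},
]

selected = list(select_for_objective(raw))
print(selected)
```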
The data quality and data governance
capabilities of InfoSphere Information
Governance Catalog help companies know
for certain that their data is good,
trustworthy and true. That certainty allows
them to be more confident in the results of
their analytics activities and to take action
quickly and assertively. When they have a
solid foundation of good data, business
leaders can build a more flexible and agile
company that will be better able to identify
and capitalize on business opportunities.
InfoSphere Information Governance Catalog:
The foundation for understanding big data
The first step toward proper information
governance involves establishing correct
data definitions that the entire organization
can use to better understand information.
While a full understanding of business context
and meaning resolves ambiguity and leads
to more accurate decisions, users often require
more detail behind their data.
InfoSphere Information Governance Catalog
offers a solution to this problem. It enables
organizations to create and then link business
terms to technical artifacts.
Here are some key features:
•	Web-based management of business terms,
definitions and categories enables the creation
of an authoritative and common business
vocabulary for technical and business users
•	Integration with InfoSphere Information
Server metadata helps ensure that technical and
business information is always connected
and consistent
•	Security permissions help protect sensitive
business terms and definitions from
unauthorized users
•	Customizable features and attributes enable
business users to define unique parameters
for their specific organizational and business
environment
•	Collaborative environment and feedback
mechanisms encourage organic growth and
allow different glossary users to jointly develop
or improve the glossary content
•	Powerful glossary import and export
capabilities enable administrators to combine
existing fragmented and home-grown glossaries
into a single enterprise glossary for use by a
wider business audience
•	Data stewardship empowers ownership of
business-term integrity and its governance
•	Terms can be linked in an easy-to-use
web interface to create policies that govern
information objects
•	Metadata lineage is captured and maintained
so that information contained in reports and
applications can be easily traced back to original
sources for validation (a critical step in meeting
compliance requirements for regulations such
as Basel II and the Sarbanes-Oxley Act)
•	Globalization and translation support
for simplified Chinese, traditional Chinese,
Japanese, Korean, French, German, Italian,
Spanish and Brazilian Portuguese allows
customers around the world to use InfoSphere
Information Governance Catalog in their
native language
To make matters even simpler for businesses
embarking on a data governance and business
glossary solution, IBM provides packaged
business terms and definitions for six major
industries: banking, financial services, retail,
telecommunications, healthcare and insurance.
These content offerings help accelerate the
implementation and deployment of your business
glossary by immediately providing rich, industry-
specific business terms and definitions.
Why InfoSphere?
The IBM InfoSphere platform for information
integration and governance combines the
capabilities necessary for creating trusted
information from traditional and new
sources of big data. It provides an
enterprise-class foundation for information-
intensive projects, offering the performance,
scalability, reliability and acceleration
needed to deliver trusted information faster.
These capabilities include:
•	 Metadata, business glossary and
policy management: Defining both
metadata and governance policies with a
common component used by all integration
and governance engines is a critical task.
InfoSphere Information Governance Catalog
contains capabilities for metadata
management, governance policy definition
and management, and governance project
blueprint design, as well as a business
glossary of terms and definitions.
•	 Data integration: The InfoSphere platform
offers multiple integration capabilities for
batch data transformation and movement
(InfoSphere Information Server), real-time
replication (InfoSphere Data Replication)
and data federation (InfoSphere
Federation Server).
•	 Data quality: InfoSphere Information
Server for Data Quality has the ability to
parse, standardize, validate and match
enterprise data.
•	 Master data management (MDM):
InfoSphere MDM manages multiple data
domains, including customer, product,
account, location, reference data and
more. It handles any domain or style and
offers the flexibility to define custom
domains as required.
•	 Data lifecycle management: IBM
InfoSphere Optim™ manages the data
lifecycle from test data creation through
the retirement and archiving of data from
enterprise systems.
•	 Data security and privacy: IBM
InfoSphere Guardium® activity monitoring
solutions continuously monitor data
access and protect repositories to prevent
data breaches and support compliance.
The InfoSphere Optim Data Masking
solution masks data in applications to help
protect sensitive data.
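Three of the data quality steps named in the list above (standardize, validate and match) can be sketched generically as follows. This is not InfoSphere code; the rules, the exact-match key and the sample records are simplifying assumptions, and production matching is usually fuzzier than this.

```python
import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def standardize(record):
    """Normalise casing, whitespace and phone formatting."""
    return {
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
        "phone": re.sub(r"\D", "", record["phone"]),  # keep digits only
    }


def is_valid(record):
    """Very rough validity rules (assumed): plausible email, at least 7 phone digits."""
    return bool(EMAIL_RE.match(record["email"])) and len(record["phone"]) >= 7


def match_key(record):
    """Records with the same email and phone are treated as the same person."""
    return (record["email"], record["phone"])


def deduplicate(records):
    """Standardize, drop invalid records, and keep the first record per match key."""
    seen = {}
    for raw in records:
        clean = standardize(raw)
        if is_valid(clean):
            seen.setdefault(match_key(clean), clean)
    return list(seen.values())


records = [
    {"name": "  ada   LOVELACE", "email": "Ada@Example.com ", "phone": "(555) 010-0199"},
    {"name": "Ada Lovelace", "email": "ada@example.com", "phone": "555-010-0199"},
    {"name": "Bob", "email": "not-an-email", "phone": "555 010 0123"},
]

customers = deduplicate(records)
print(customers)
```

The first two records only match after standardization, which is why the steps are applied in that order.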
As a critical element of IBM Watson™
Foundations, the IBM big data and
analytics platform, InfoSphere Information
Integration and Governance (IIG) provides
market-leading functionality to handle the
challenges of big data. InfoSphere provides
optimal scalability and performance for
massive data volumes, agile and right-
sized integration and governance for the
increasing velocity of data, and support
and protection for a wide variety of data
types and big data systems. InfoSphere
helps make big data and analytics projects
successful by delivering business users the
confidence to act on insight.
IMM14123-USEN-01
© Copyright IBM Corporation 2014
IBM Corporation
Software Group
Route 100
Somers, NY 10589
Produced in the United States of America
June 2014
IBM, the IBM logo, ibm.com, Guardium, IBM Watson, InfoSphere,
and Optim are trademarks of International Business Machines Corp.,
registered in many jurisdictions worldwide. Other product and
service names might be trademarks of IBM or other companies. A
current list of IBM trademarks is available on the web at “Copyright
and trademark information” at ibm.com/legal/copytrade.shtml
This document is current as of the initial date of publication and
may be changed by IBM at any time. Not all offerings are available
in every country in which IBM operates.
The performance data discussed herein is presented as derived
under specific operating conditions. Actual results may vary.
THE INFORMATION IN THIS DOCUMENT IS PROVIDED
“AS IS” WITHOUT ANY WARRANTY, EXPRESS OR
IMPLIED, INCLUDING WITHOUT ANY WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR
PURPOSE AND ANY WARRANTY OR CONDITION OF
NON-INFRINGEMENT. IBM products are warranted according
to the terms and conditions of the agreements under which they
are provided.
The client is responsible for ensuring compliance with laws and
regulations applicable to it. IBM does not provide legal advice or
represent or warrant that its services or products will ensure that the
client is in compliance with any law or regulation.
Actual available storage capacity may be reported for both
uncompressed and compressed data and will vary and may be less
than stated.
¹ IDC. “The Digital Universe of Opportunities: Rich Data and the
Increasing Value of the Internet of Things.” April 2014.
² ibm.com/software/data/bigdata
Additional resources
To learn more about the IBM approach to information integration and governance and
the IBM InfoSphere platform, please contact your IBM representative or IBM Business
Partner, or visit: ibm.com/software/data/information-integration-governance
Please also check out the other e-books in this series:
•	 Mastering big data with MDM
•	 Integrating trusted and protected information for big data
•	 Managing the lifecycle of big data
•	 Protecting and monitoring sensitive big data

More Related Content

What's hot

Modernizing the Analytics and Data Science Lifecycle for the Scalable Enterpr...
Modernizing the Analytics and Data Science Lifecycle for the Scalable Enterpr...Modernizing the Analytics and Data Science Lifecycle for the Scalable Enterpr...
Modernizing the Analytics and Data Science Lifecycle for the Scalable Enterpr...Data Con LA
 
Building the Enterprise Data Lake - Important Considerations Before You Jump In
Building the Enterprise Data Lake - Important Considerations Before You Jump InBuilding the Enterprise Data Lake - Important Considerations Before You Jump In
Building the Enterprise Data Lake - Important Considerations Before You Jump InSnapLogic
 
Hybrid Cloud Essential for Success
Hybrid Cloud Essential for SuccessHybrid Cloud Essential for Success
Hybrid Cloud Essential for SuccessNetApp
 
Modern Data Architecture
Modern Data ArchitectureModern Data Architecture
Modern Data ArchitectureEd Thewlis
 
Flash session -streaming--ses1243-lon
Flash session -streaming--ses1243-lonFlash session -streaming--ses1243-lon
Flash session -streaming--ses1243-lonJeffrey T. Pollock
 
The Importance of DataOps in a Multi-Cloud World
The Importance of DataOps in a Multi-Cloud WorldThe Importance of DataOps in a Multi-Cloud World
The Importance of DataOps in a Multi-Cloud WorldDATAVERSITY
 
Traditional BI vs. Business Data Lake – A Comparison
Traditional BI vs. Business Data Lake – A ComparisonTraditional BI vs. Business Data Lake – A Comparison
Traditional BI vs. Business Data Lake – A ComparisonCapgemini
 
Postgres Vision 2018: The Pragmatic Cloud
Postgres Vision 2018:  The Pragmatic CloudPostgres Vision 2018:  The Pragmatic Cloud
Postgres Vision 2018: The Pragmatic CloudEDB
 
Exploring the Wider World of Big Data- Vasalis Kapsalis
Exploring the Wider World of Big Data- Vasalis KapsalisExploring the Wider World of Big Data- Vasalis Kapsalis
Exploring the Wider World of Big Data- Vasalis KapsalisNetAppUK
 
A Big Data Journey
A Big Data JourneyA Big Data Journey
A Big Data JourneyPaul Boal
 
Case Study - Spotad: Rebuilding And Optimizing Real-Time Mobile Adverting Bid...
Case Study - Spotad: Rebuilding And Optimizing Real-Time Mobile Adverting Bid...Case Study - Spotad: Rebuilding And Optimizing Real-Time Mobile Adverting Bid...
Case Study - Spotad: Rebuilding And Optimizing Real-Time Mobile Adverting Bid...Vasu S
 
Predictive and Prescriptive Analytics Expert Session Webinar
Predictive  and Prescriptive Analytics Expert Session Webinar Predictive  and Prescriptive Analytics Expert Session Webinar
Predictive and Prescriptive Analytics Expert Session Webinar ibi
 
How Data Saves Time
How Data Saves TimeHow Data Saves Time
How Data Saves TimeNetApp
 
The Path to Digital Transformation
The Path to Digital TransformationThe Path to Digital Transformation
The Path to Digital TransformationPrecisely
 
EMC World 2014 Breakout: Move to the Business Data Lake – Not as Hard as It S...
EMC World 2014 Breakout: Move to the Business Data Lake – Not as Hard as It S...EMC World 2014 Breakout: Move to the Business Data Lake – Not as Hard as It S...
EMC World 2014 Breakout: Move to the Business Data Lake – Not as Hard as It S...Capgemini
 
Postgres Vision 2018: AI Needs IA
Postgres Vision 2018: AI Needs IAPostgres Vision 2018: AI Needs IA
Postgres Vision 2018: AI Needs IAEDB
 
Data science tips for data engineers
Data science tips for data engineersData science tips for data engineers
Data science tips for data engineersIBM Analytics
 
You are not Facebook or Google? Why you should still care about Big Data and ...
You are not Facebook or Google? Why you should still care about Big Data and ...You are not Facebook or Google? Why you should still care about Big Data and ...
You are not Facebook or Google? Why you should still care about Big Data and ...Kai Wähner
 

What's hot (20)

Modernizing the Analytics and Data Science Lifecycle for the Scalable Enterpr...
Modernizing the Analytics and Data Science Lifecycle for the Scalable Enterpr...Modernizing the Analytics and Data Science Lifecycle for the Scalable Enterpr...
Modernizing the Analytics and Data Science Lifecycle for the Scalable Enterpr...
 
Understanding Big Data so you can act with confidence

  • 1. IBM Software Understanding big data so you can act with confidence More data, more problems? Not if you have an agile, automated information integration and governance program in place
  • 2. Understanding big data so you can act with confidence: 1 Introduction; 2 The four Vs: Challenges inherent to big data; 3 Big data, big opportunities; 4 Understanding: The key to successfully leveraging big data; 5 Why InfoSphere?
  • 3. Introduction. Business leaders are eager to harness the power of big data. However, as the opportunity increases, it becomes exponentially more difficult to ensure that source information is trustworthy and protected. If this trustworthiness issue is not addressed directly, end users may lose confidence in the insights generated from their data, which can result in a failure to act on opportunities or against threats. Take, for example, the case of a large multinational corporation whose CFO asked a very simple question every month: How did sales this month compare to the same month last year? The answer turned out to be quite complicated. Sales were calculated and reported in various systems: material planning, financial, rent and royalty, real estate and so forth. Each system reported a different number because each system defined a “sale” slightly differently, and therefore extracted dissimilar data from the millions of daily transactions generated around the world. So what was the truth? No one actually knew.
  • 4. The sheer volume and complexity of big data mean that the traditional method of discovering, governing and correcting information using manual stewardship may not apply. Information integration and governance must be implemented within big data applications, providing appropriate governance and rapid integration from the start. By automating information integration and governance and employing it at the point of data creation, organizations can boost confidence in their big data. A solid information integration and governance program should include automated discovery, profiling and understanding of diverse datasets to provide context and enable employees to make informed decisions. It must be agile to accommodate a wide variety of data and seamlessly integrate with various technologies, from data marts to Apache Hadoop systems. And it must automatically discover, protect and monitor sensitive information as part of big data applications. IBM® InfoSphere® is designed to do all of these things by evolving information integration and governance to meet the challenges presented by big data.
  • 5. The four Vs: Challenges inherent to big data. Every second of every day, businesses generate more data. Researchers at IDC estimate that by the end of 2020, the amount of data created and copied annually will exceed 44 zettabytes, or 44 trillion gigabytes. That’s a 40 percent annual data growth rate through the end of the decade.1 All of that big data represents a big opportunity for organizations. Companies recognize that their big data contains valuable information. They are eager to analyze it to obtain actionable insights that could help them deepen customer relationships, prevent threats and fraud, optimize operations and identify new revenue opportunities. Unfortunately, extracting valuable information from big data isn’t as easy as it sounds. Big data amplifies any existing problems in your infrastructure, processes or even the data itself. To make the most of the information in their systems, companies must successfully deal with the “four Vs” that distinguish big data: variety, volume, velocity and veracity. The first three (variety, volume and velocity) define big data; when you have a large volume of data coming in from a wide variety of applications and formats and it’s moving and changing at a rapid velocity, that’s when you know you have big data. The fourth V, veracity, is a measure of the accuracy and trustworthiness of your data. Veracity is a goal, one that the variety, volume and velocity of big data make harder to achieve. Your company can take advantage of the opportunities available in big data only when you have processes and solutions that can handle all four Vs.
    According to IBM, “Every day, we create 2.5 quintillion bytes of data, so much that 90 percent of the data in the world today has been created in the last two years alone.”2
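As a rough check of the figures on this slide, the Python sketch below converts zettabytes to gigabytes using decimal (SI) units and works out the doubling time implied by a steady 40 percent annual growth rate. The function names are illustrative and do not come from the paper, and the constant-rate assumption is a simplification.

```python
import math

# Decimal (SI) unit sizes in bytes.
ZETTABYTE = 10**21
GIGABYTE = 10**9


def zettabytes_to_gigabytes(zettabytes: int) -> int:
    """Convert a whole number of zettabytes to gigabytes (SI units)."""
    return zettabytes * ZETTABYTE // GIGABYTE


def doubling_time_years(annual_growth_rate: float) -> float:
    """Years needed for a quantity to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth_rate)


if __name__ == "__main__":
    # 44 zettabytes expressed in gigabytes: 44 followed by 12 zeros.
    print(f"{zettabytes_to_gigabytes(44):,} GB")
    # Doubling time at a steady 40 percent per year.
    print(f"{doubling_time_years(0.40):.2f} years")
```

Under these assumptions, 44 zettabytes is 44 × 10^12 gigabytes, which matches the "44 trillion gigabytes" wording, and data growing at 40 percent a year doubles roughly every two years.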
  • 6. Volume: Increasing data volume is at the heart of the big data challenge. Large data volumes can cause many obvious technical problems, such as excessive batch processing times, bottlenecks and so on. However, the problem goes deeper than that. What data should an organization care about? How will it find that data? How will it know how to leverage that data for business use? All of these questions point to a fundamental need to understand the data, and manual techniques of examining data as it comes in are simply unequal to the task of working with big data. Variety: Enterprise data comes from many sources. Customers and salespeople generate data every time they make a sale. Accountants produce data every time they create a spreadsheet. Marketers create data every time they write a report. In addition, machine data (from sources such as meter readings and IT system logs) can increase data volumes enormously. Organizations also get constant streams of data from external sources such as suppliers, consultants, research firms and news feeds. All of this data is stored in a dizzying array of formats. Some gets stored in structured formats in databases or enterprise resource planning (ERP) applications, but much of it is in documents, email messages or even photos and videos, unstructured formats that are challenging to manage. And data doesn’t rest once it is in storage. It must be moved from application to application and from system to system so managers and executives can interpret the data and come to meaningful conclusions.
  • 7. Veracity: The first three Vs (variety, volume and velocity) define big data. In contrast, the fourth V, veracity, is a desirable aspect of data that organizations want, but don’t always get. Veracity is directly related to the trustworthiness of an organization’s data. The first step toward building trust into data is to understand which data is needed and for what business purpose. Can knowledge workers be sure that the data is accurate, current and complete, and therefore be confident in any decisions that are based on that data? Velocity: In almost every industry, data is coming in faster than ever. In activities such as stock trades, fraud detection or patient health monitoring, seconds (sometimes microseconds) can be critically important. Organizations must be able to understand data as it is streaming in. In most cases, getting that all-important information requires moving data quickly from one application or repository to another, where it can be processed and analyzed. Unfortunately, many data integration solutions lack the high performance that big data projects require. There is not enough time to collect the data and process it in batches, particularly if a real-time or near–real-time response is warranted.
  • 8. Understanding big data so you can act with confidence 8 1 Introduction 2 The four Vs: Challenges inherent to big data 3 Big data, big opportunities 4 Understanding: The key to successfully leveraging big data 5 Why InfoSphere? Big data presents significant opportunities for increased growth and profitability—and forward-thinking organizations are already realizing some of these benefits: • A retailer actively uses social media data to analyze customer sentiment and improve loyalty, helping to drive revenue growth • A transportation company uses machine data to optimize logistics, thereby reducing shipping costs • A telecommunications company slashed developer costs by over 50 percent by articulating clear terms and policies for the data it needed • A manufacturing company enhanced the trustworthiness of its reports after it discovered and eliminated 37 unique definitions of “customer” across its enterprise, and agreed to a single, standard definition Capturing these sorts of benefits from big data requires knowing what the business needs and being able to find key items within the larger mass of big data. Begin by articulating the goals of the business. Determine the analytics and reports necessary to support those business objectives. Now you can make decisions about the data needed to inform the reports. Big data, big opportunities 3 Big data, big opportunities Standard terminology is a critical piece of this process. For example, there should be no misunderstanding about what constitutes a “sale” or a “customer.” With standard terms, organizations can create standard policies, such as defining the data that comprises a complete customer record. Armed with objectives and standard terms and policies, teams can create data processing flows that automate the extraction of relevant data from the mass of big data. Importantly, this understanding also includes knowing at all times where data is coming from, how it is being manipulated and where it is going. 
This chain of understanding builds trust into information—and promotes confident decision making.
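A standard policy for a “complete customer record” can be written down once and then applied automatically to incoming records. The following is a minimal sketch of that idea. The field names and the helpers `is_complete` and `extract_relevant` are hypothetical, chosen only for illustration; they are not part of any InfoSphere product.

```python
# Hypothetical policy: the fields a record must carry to count as a
# "complete customer record". These names are illustrative assumptions.
REQUIRED_CUSTOMER_FIELDS = ("customer_id", "name", "country")


def is_complete(record):
    """Return True when every required field is present and non-empty."""
    return all(
        record.get(field) not in (None, "")
        for field in REQUIRED_CUSTOMER_FIELDS
    )


def extract_relevant(records):
    """Keep only the records that satisfy the policy."""
    return [r for r in records if is_complete(r)]


incoming = [
    {"customer_id": "C1", "name": "Ada", "country": "UK"},
    {"customer_id": "C2", "name": "", "country": "DE"},  # empty name
    {"name": "Lin", "country": "SG"},                    # no customer_id
]
complete = extract_relevant(incoming)
```

Keeping the policy in one place (here, a single tuple) means that changing the agreed definition changes every flow that depends on it.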
  • 9. You can’t govern what you don’t understand: It’s an unfortunately common refrain across business and IT users when it comes to discussing data: “That’s not what I meant!” Organizations, departments and sectors often use different terms or labels for the same information, leading to confusion over details such as what constitutes a customer (is it a household or an individual?). In one case, a large manufacturer had 37 different definitions of the term “Employee ID” across multiple divisions. Using spreadsheets to record and compare the representations, the company discovered at least 15 different data types and formats, with different characteristics and usages, a result of multiple legacy systems, acquired companies with their own systems and homegrown applications. This meant that IT spent extra time and used undocumented knowledge to determine which version to use in which reports. Developers and analysts then spent more time determining which data to use in which tests, examples and tasks. And business users were hampered by data quality issues coming from duplicated or missed data. A large European telecom company faced similar issues when it discovered not just terminology problems but also an inability to trace data back to its source. Using InfoSphere Information Governance Catalog, the company standardized business terms and policies, which helped increase confidence in reports. Next, it redesigned integration processes and built a clear blueprint, reducing time required to write actual integration jobs by 50 percent. Finally, it used the InfoSphere tool to establish a clear metadata lineage, from source to target and including all transformations, that uncovered redundant data.
Consolidating and archiving that data led to a 75 percent reduction in storage space.
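The spreadsheet comparison described for “Employee ID” can be sketched in a few lines: list how each system represents the term, then group the systems by data type and format. The system names, types and formats below are invented for illustration and do not come from the case described.

```python
from collections import defaultdict

# Hypothetical inventory of how each system represents "Employee ID".
# System names, data types, and formats are invented for illustration.
representations = [
    ("payroll", "string", "E-000000"),
    ("badge_system", "integer", "000000"),
    ("hr_portal", "string", "E-000000"),
    ("acquired_co_erp", "string", "AAA000"),
]

# Group systems that share the same (data type, format) pair.
systems_by_format = defaultdict(list)
for system, data_type, fmt in representations:
    systems_by_format[(data_type, fmt)].append(system)

# More than one distinct pair means the term is not represented consistently.
distinct_formats = len(systems_by_format)
```

Any group containing a single system is a candidate for standardization against the agreed definition.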
  • 10. Understanding: The key to successfully leveraging big data
    Successful businesses depend on trusted information. IBM InfoSphere Information Governance Catalog helps organizations create an understanding of big data that enables it to be converted into trusted information. It allows business users to play an active role in information-centric projects and to collaborate with technical teams, all without the need for technical training. The end result: an organization with a consistent understanding of information, what it means, how it is used and why it can be trusted. Decisions are more accurate and business opportunities can be readily captured.
    Defining “truth”: In a philosophy class, students might consider the idea that truth is ultimately unknowable or that truth is constantly changing. Most organizations do not have that luxury. Enterprises must have agreed-upon definitions for important terms so they can monitor and act on key metrics. Consider a global financial institution that has grown rapidly through mergers and acquisitions. Because each of the company’s various lines of business grew independently, each has its own unique processes, IT systems and definitions for important terms. It isn’t uncommon for the CIO, CFO and CMO to use different definitions, which can lead to confusion when it comes to analytics and reports. InfoSphere Information Governance Catalog includes a business glossary that enables business and IT to create and agree on definitions, rules and policies. InfoSphere Information Governance Catalog also includes data modeling capabilities that allow data architects to determine where each piece of data will come from and where it will go.
This capability helps ensure that everyone involved in the big data project knows exactly what key metrics mean and where the data should originate, establishing the “truth” as it relates to their business data.
  • 11. Defining “trustworthy”: Unfortunately, it isn’t enough just to establish policies and definitions and hope that people will follow them. To be truly confident that their data is trustworthy, organizations must be able to trace its path through their systems, see where it came from and understand how it was manipulated. InfoSphere Information Governance Catalog provides this level of transparency. It includes data lineage capabilities that enable users to track data back to its original source and to see every calculation performed on it along the way, increasing the data’s accuracy and trustworthiness. Defining “good”: The “goodness” of data depends upon its usefulness in analysis, reporting and decision making. Therefore, developing an understanding of big data requires companies to separate useful “good” data from unhelpful “bad” data. In other words, organizations must be able to extract only those bits of data necessary to support a particular business objective, and set aside the rest. Filtering data in this way keeps unnecessary data out of data warehouses and Hadoop file systems, creating a more efficient processing environment and lessening hardware and software costs. The data quality and data governance capabilities of InfoSphere Information Governance Catalog help companies know for certain that their data is good, trustworthy and true. That certainty allows them to be more confident in the results of their analytics activities and to take action quickly and assertively. When they have a solid foundation of good data, business leaders can build a more flexible and agile company that will be better able to identify and capitalize on business opportunities.
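Data lineage, in the sense used above, means that a reported figure can be traced back to its source along with every calculation applied on the way. The sketch below illustrates the concept only. The `TracedValue` class, the source name `orders_db.sales` and the two transformations are hypothetical and do not reflect how InfoSphere Information Governance Catalog stores lineage.

```python
from dataclasses import dataclass, field


@dataclass
class TracedValue:
    """A value that remembers its source and every step applied to it."""
    value: float
    source: str
    steps: list = field(default_factory=list)

    def apply(self, description, func):
        """Return a new TracedValue with func applied and the step recorded."""
        return TracedValue(
            func(self.value), self.source, self.steps + [description]
        )


# Hypothetical source system and transformations, for illustration only.
sale = TracedValue(1200.0, source="orders_db.sales")
net = sale.apply("subtract 200.0 returns", lambda v: v - 200.0)
reported = net.apply("convert to thousands", lambda v: v / 1000)
```

Because each step returns a new object, the original value and its empty history stay untouched, and `reported.steps` lists the calculations in the order they were applied.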
  • 12. InfoSphere Information Governance Catalog: The foundation for understanding big data
    The first step toward proper information governance involves establishing correct data definitions that the entire organization can use to better understand information. While a full understanding of business context and meaning resolves ambiguity and leads to more accurate decisions, users often require more detail behind their data. InfoSphere Information Governance Catalog offers a solution to this problem. It enables organizations to create business terms and then link them to technical artifacts. Here are some key features:
      • Web-based management of business terms, definitions and categories enables the creation of an authoritative and common business vocabulary for technical and business users
      • Integration with InfoSphere Information Server metadata helps ensure that technical and business information is always connected and consistent
      • Security permissions help protect sensitive business terms and definitions from unauthorized users
      • Customizable features and attributes enable business users to define unique parameters for their specific organizational and business environment
  • 13. (Key features, continued)
      • Collaborative environment and feedback mechanisms encourage organic growth and allow different glossary users to jointly develop or improve the glossary content
      • Powerful glossary import and export capabilities enable administrators to combine existing fragmented and home-grown glossaries into a single enterprise glossary for use by a wider business audience
      • Data stewardship empowers ownership of business-term integrity and its governance
      • Terms can be linked in an easy-to-use web interface to create policies that govern information objects
      • Metadata lineage is captured and maintained so that information contained in reports and applications can be easily traced back to original sources for validation (a critical step in meeting compliance requirements for regulations such as Basel II and the Sarbanes-Oxley Act)
      • Globalization and translation support for simplified Chinese, traditional Chinese, Japanese, Korean, French, German, Italian, Spanish and Brazilian Portuguese allows customers around the world to use InfoSphere Information Governance Catalog in their native language
    To make matters even simpler for businesses embarking on a data governance and business glossary solution, IBM provides packaged business terms and definitions for six major industries: banking, financial services, retail, telecommunications, healthcare and insurance. These content offerings help accelerate the implementation and deployment of your business glossary by immediately providing rich, industry-specific business terms and definitions.
  • 14. Why InfoSphere? The IBM InfoSphere platform for information integration and governance combines the capabilities necessary for creating trusted information from traditional and new sources of big data. It provides an enterprise-class foundation for information-intensive projects, offering the performance, scalability, reliability and acceleration needed to deliver trusted information faster. These capabilities include:
      • Metadata, business glossary and policy management: Defining both metadata and governance policies with a common component used by all integration and governance engines is a critical task. InfoSphere Information Governance Catalog contains capabilities for metadata management, governance policy definition and management, and governance project blueprint design, as well as a business glossary of terms and definitions.
      • Data integration: The InfoSphere platform offers multiple integration capabilities for batch data transformation and movement (InfoSphere Information Server), real-time replication (InfoSphere Data Replication) and data federation (InfoSphere Federation Server).
      • Data quality: InfoSphere Information Server for Data Quality has the ability to parse, standardize, validate and match enterprise data.
      • Master data management (MDM): InfoSphere MDM manages multiple data domains, including customer, product, account, location, reference data and more. It handles any domain or style and offers the flexibility to define custom domains as required.
      • Data lifecycle management: IBM InfoSphere Optim™ manages the data lifecycle from test data creation through the retirement and archiving of data from enterprise systems.
  • 15. (Capabilities, continued)
      • Data security and privacy: IBM InfoSphere Guardium® activity monitoring solutions continuously monitor data access and protect repositories to prevent data breaches and support compliance. The InfoSphere Optim Data Masking solution masks data in applications to help protect sensitive data.
    As a critical element of IBM Watson™ Foundations, the IBM big data and analytics platform, InfoSphere Information Integration and Governance (IIG) provides market-leading functionality to handle the challenges of big data. InfoSphere provides optimal scalability and performance for massive data volumes, agile and right-sized integration and governance for the increasing velocity of data, and support and protection for a wide variety of data types and big data systems. InfoSphere helps make big data and analytics projects successful by giving business users the confidence to act on insight.
  • 16. IMM14123-USEN-01 © Copyright IBM Corporation 2014 IBM Corporation Software Group Route 100 Somers, NY 10589 Produced in the United States of America June 2014 IBM, the IBM logo, ibm.com, Guardium, IBM Watson, InfoSphere, and Optim are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the web at “Copyright and trademark information” at ibm.com/legal/copytrade.shtml This document is current as of the initial date of publication and may be changed by IBM at any time. Not all offerings are available in every country in which IBM operates. The performance data discussed herein is presented as derived under specific operating conditions. Actual results may vary. THE INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS” WITHOUT ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING WITHOUT ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND ANY WARRANTY OR CONDITION OF NON-INFRINGEMENT. IBM products are warranted according to the terms and conditions of the agreements under which they are provided. The client is responsible for ensuring compliance with laws and regulations applicable to it. IBM does not provide legal advice or represent or warrant that its services or products will ensure that the client is in compliance with any law or regulation. Actual available storage capacity may be reported for both uncompressed and compressed data and will vary and may be less than stated. 1 IDC. “The Digital Universe of Opportunities: Rich Data and the Increasing Value of the Internet of Things.” April 2014. 
2 ibm.com/software/data/bigdata
    Additional resources: To learn more about the IBM approach to information integration and governance and the IBM InfoSphere platform, please contact your IBM representative or IBM Business Partner, or visit: ibm.com/software/data/information-integration-governance
    Please also check out the other e-books in this series:
      • Mastering big data with MDM
      • Integrating trusted and protected information for big data
      • Managing the lifecycle of big data
      • Protecting and monitoring sensitive big data