WHITE PAPER
Dealing with Dark Data
We’re in the difficult middle years of the
information age, where factors like cheap
storage, rich HD media, ubiquitous
connectivity and more sophisticated SaaS
products are generating more data than we
can affordably store or meaningfully process.
Why are we growing so much?
Data is flooding in from a multitude of sources
– some known and some invisible – which
organizations today have neither the time nor
the resources to effectively manage, let alone
benefit from.
The trouble is, whilst big data and analytics
remain in vogue, neither the volume of
data produced, nor the impulse to store it
all, will change. In the pursuit of business
intelligence, many organizations are hoarding
– often unconsciously – useless data with
the expectation that its potential value will
eventually offset the costs of a bloated and
unnavigable storage environment.
Dark Data
The main culprit behind this trend is something
Gartner has called “dark data” – data which
accumulates through automatic and manual
processes, but which remains invisible to the
business: idle, unanalyzed and without a clear
owner. Because it is invisible, quantifying
exactly how much dark data organizations are
struggling with is difficult, but that hasn’t
stopped the major analysts from trying.
First, a 2013 survey conducted by IDG
Research Services found that only 28% of
stored data presents any value to the day-
to-day operations of a business, suggesting a
massive 72% is non-essential.
Second, IDC’s “Top 10 predictions for CMOs in
2014” broadly supports this picture, suggesting
that organizations will fail to realize any value
from 80% of the customer data they hold
because of “immature enterprise value chains”.
Just in case you don’t speak analyst, that
means current data management practices
aren’t capable of locating and extracting the
supposedly valuable information hidden
amongst terabytes of collected data.
It’s expensive to maintain that much unused
data, as Gartner rightly points out: “…
organizations that fail to optimize the way they
manage and retain their data will be forced
to deal with constant increases in storage
costs”. But financial cost is only a part of the
reason dark data is so damaging. Perhaps
more importantly, dark data has become so
ubiquitous that it obscures the useful stuff.
It’s not just that organizations don’t have an
adequate tool to sift through the data heap;
it’s that in worshipping at the altar of analytics
prematurely, we are actively hoarding
useless data in the hope of one day extracting
enormous value from it.
As IDC’s CMO of Advisory Services put it,
whilst big data analytics is a hot topic, most of
this collected data “[is] garbage. IDC’s data
group researchers say that some 80% of data
collected has no meaning whatsoever.” Or at
least it won’t, until organizations are “smart
enough [to have] a tool be able to differentiate
between the signal and the noise.”
What does dark data look like?
Before we consider what these tools might
look like, we should think about the scale
of the problem we expect them to fix. We must
categorize the types of dark data organizations
possess and, for each category, weigh its
potential value against the cost of its storage.
For instance, server log files are individually
small and unobtrusive, and may contain
useful insights into customer behaviour when
processed together. Even if they’re dark, they
don’t represent a significant burden on the
storage environment.
Unstructured data, on the other hand, is
the single biggest driver of dark data
growth. It’s a broad category of data,
which can include almost anything that
exists outside of semantically tagged form
fields and databases, and is estimated
to constitute around 70-80% of all data in an
average organization.
It’s often human-generated information in the
form of documents, presentations, reports,
graphics, videos and audio that all begin as
potentially valuable, but end up as half-finished
ideas, discarded early drafts or simply assets
that have served their purpose and are no
longer useful.
Why is there so much of it?
The answer to the spiraling growth of
unstructured data is the same as its cause –
data management practices (or rather, the
lack of them). We’ll go on to look at the way
tools can encourage better policy-based
management of the data lifecycle shortly, but it
is worth briefly reiterating that the solution to
dark data is not technology – it is management.
There’s no single cause behind the volume and
variety of unstructured data organizations
produce. Some of it is just a symptom of
technological progress. We are using,
producing and sharing more stuff – whether
that is documents, presentations, emails or
media – because the tools (and therefore their
output) have become more sophisticated and
the connectivity between us has become faster
and more reliable.
There is one common thread though:
standards of data management have not kept
up with the pace of data growth. Not by a long
shot.
One of the most common problems is poorly
maintained folder structures. In organizations
where users are free to create data and
folders within shared file stores, duplication
of both content and the effort required to
create it is incredibly common. Users become
less productive because they can’t find the
information they need, and the file stores
become a tangled mess of non-standardized
naming conventions, leading to massive
amounts of redundant data that put great
strain on storage.
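
To make the duplication problem concrete, the following is a minimal
sketch of how byte-identical copies in a shared file store could be
found. It is an illustration only, not a description of any particular
product: the function names file_digest and find_duplicates are
hypothetical, and it assumes the file store is reachable as an ordinary
directory tree. Files are first grouped by size, so that only files
that could possibly match are hashed.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group files under root whose content is byte-for-byte identical."""
    by_size = defaultdict(list)
    for p in Path(root).rglob("*"):
        if p.is_file():
            by_size[p.stat().st_size].append(p)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a file with a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for p in same_size:
            by_hash[file_digest(p)].append(p)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Each returned group lists the paths that hold the same content, which
gives a starting point for deciding which copy to keep.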
Another common problem is that old and
unused file data is not actively retired once
it has been superseded or has become
irrelevant. In the Databarracks Data Health
Check 2014, 49% of 401 respondents did not
actively distinguish between unused and
recently accessed file data despite it being
the largest cause of storage growth.
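
As a rough illustration of what distinguishing unused from recently
accessed file data can mean in practice, the sketch below totals the
bytes held by files that have not been read or modified within a chosen
period. The function name stale_report and the 365-day default are
assumptions made for the example. Access times are not recorded
reliably on every file system, so the later of the access and
modification times is used as an approximation.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86_400


def stale_report(root, max_idle_days=365, now=None):
    """Return (active_bytes, stale_bytes, stale_paths) for files under root."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    active_bytes = 0
    stale_bytes = 0
    stale_paths = []
    for p in Path(root).rglob("*"):
        if not p.is_file():
            continue
        st = p.stat()
        # Approximate "last used" as the later of last access and last change.
        last_used = max(st.st_atime, st.st_mtime)
        if last_used < cutoff:
            stale_bytes += st.st_size
            stale_paths.append(p)
        else:
            active_bytes += st.st_size
    return active_bytes, stale_bytes, sorted(stale_paths)
```

Comparing stale_bytes with active_bytes gives a first estimate of how
much primary storage is occupied by data nobody is using.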
What do we do about it?
There is an appetite for tools able to shed some light on dark data. IDG’s report found that whilst
77% of enterprises expressed interest in a single platform solution that automatically manages
data, only 10% actually had a completely automated process in place.
Of course, organizations struggling with dark data (which, to be clear, is everyone) must first
identify what they hope to achieve in finding it. Is it that there may be hidden value in documents
long forgotten about, or that they hope to retire useless data to enable more cost-effective
storage?
In truth, this is a bit of a false dilemma – the answer is probably a combination of the two.
However, it remains a useful distinction, if only because it helps organizations make a more informed decision
about the capabilities they require from their chosen solution.
Prospective data analytics tools must offer three core capabilities to reveal the location and
condition of dark data, and minimize preventable growth in future.
Search
First, organizations need a strong search capability that indexes
both metadata and the actual content of unstructured data.
This increases visibility into the dark areas of your storage
environment and connects users to the information they need
more quickly.
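
One way to picture a search that covers both metadata and content is a
small inverted index, sketched below. This is a simplified illustration
under stated assumptions, not how any specific tool works: the names
tokens, build_index and search are hypothetical, "metadata" is reduced
to the folder path and file name, and only files that decode as UTF-8
text have their content indexed.

```python
import re
from collections import defaultdict
from pathlib import Path

TOKEN = re.compile(r"[a-z0-9]+")


def tokens(text):
    """Split text into a set of lowercase alphanumeric terms."""
    return set(TOKEN.findall(text.lower()))


def build_index(root):
    """Map each term to the files whose path or text content mentions it."""
    root = Path(root)
    index = defaultdict(set)
    for p in root.rglob("*"):
        if not p.is_file():
            continue
        rel = p.relative_to(root).as_posix()
        words = tokens(rel)  # metadata: folder names, file name, extension
        try:
            words |= tokens(p.read_text(encoding="utf-8"))  # content
        except (UnicodeDecodeError, OSError):
            pass  # binary or unreadable file: index its metadata only
        for w in words:
            index[w].add(rel)
    return index


def search(index, query):
    """Return the files that match every term in the query."""
    terms = tokens(query)
    if not terms:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in terms))
    return sorted(hits)
```

A query such as search(index, "marketing budget") would then match a
file that sits in a Marketing folder and mentions a budget in its text,
even though neither term alone identifies it.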
Analyze
Second, organizations need powerful analytics and reporting
capabilities in order to extract actionable intelligence from large
volumes of dark data. This is a twofold challenge: half technical
and half design. The analytics must be accurate, responsive
and exhaustive, but they must also be beautifully visualized to
increase usability, comprehension and insight.
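
A very small example of the kind of report this implies is a breakdown
of storage by file type, which could then feed a chart. The sketch
below is an assumption about what such a report might contain rather
than a specification, and the function name usage_by_extension is
hypothetical.

```python
from collections import Counter
from pathlib import Path


def usage_by_extension(root):
    """Return (extension, bytes, share_of_total) rows, largest first."""
    totals = Counter()
    for p in Path(root).rglob("*"):
        if p.is_file():
            # Treat ".MP4" and ".mp4" as the same type; label files with no suffix.
            totals[p.suffix.lower() or "(none)"] += p.stat().st_size
    grand_total = sum(totals.values())
    return [
        (ext, size, size / grand_total if grand_total else 0.0)
        for ext, size in totals.most_common()
    ]
```

Rows ordered this way show at a glance which file types account for
most of the space.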
Archive
Finally, to address the problem of dark data in the long term,
data analytics tools must facilitate the transfer of old and unused
data to cheaper archive storage platforms. Cloud-based object
storage is a cheap and highly scalable alternative to costly
primary storage, and with the creation of management policies
based on usage-rates and compliance obligations, organizations
can automate the process of retiring inactive data.
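
A policy of this kind can be expressed as a simple rule, sketched
below. Everything in it is hypothetical and chosen for illustration:
the FileRecord fields, the plan function and the 180-day threshold are
assumptions, not part of any named product. Usage decides whether a
file leaves primary storage, and the compliance retention period
decides how much longer the archived copy must be kept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    path: str
    idle_days: int       # days since the file was last read or modified
    age_days: int        # days since the file was created
    retention_days: int  # compliance retention period; 0 if none applies


def plan(record, archive_after_days=180):
    """Return (tier, remaining_retention_days) for one file.

    tier is "primary" for files still in use and "archive" for inactive
    ones. remaining_retention_days is None for files left on primary
    storage, otherwise the days the archived copy must still be kept.
    """
    if record.idle_days < archive_after_days:
        return ("primary", None)
    remaining = max(record.retention_days - record.age_days, 0)
    return ("archive", remaining)
```

For example, an invoice created 800 days ago, untouched for 700 days
and subject to a 2,555-day retention period would be moved to the
archive tier with 1,755 days of retention still to run.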
To find out more visit www.kazoup.com.
Kazoup brings unstructured file data back under control in 3 steps: search, analyze and archive. Leveraging
beautiful data visualization, policy-based lifecycle management and cheap cloud object storage, Kazoup
helps you realize more value from your data whilst lowering the cost of storage.

Dealing with Dark Data

  • 2. We’re in the difficult middle years of the information age, where a nexus of factors like cheap storage, rich HD media, ubiquitous connectivity and more sophisticated SaaS products are generating more data than we can affordably store or meaningfully process. Why are we growing so much? Data is flooding in from a multitude of sources – some known and some invisible – which organizations today have neither the time nor the resources to effectively manage, let alone benefit from. The trouble is, whilst big data and analytics remain in vogue, neither the volume of data produced, nor the impulse to store it all, will change. In the pursuit of business intelligence, many organizations are hoarding – often unconsciously - useless data with the expectation that its potential value will eventually offset the costs of a bloated and unnavigable storage environment. Dark Data The main culprit behind this trend is something Gartner has called “dark data” – data which accumulates through automatic and manual processes, but which remains invisible to the business: idle, unanalyzed and without a clear owner. Being invisible, quantifying exactly how much dark data organizations are struggling with is problematic, but that hasn’t stopped the major analysts trying. First, a 2013 survey conducted by IDG Research Services found that only 28% of stored data presents any value to the day- to-day operations of a business, suggesting a massive 72% is non-essential. Second, IDC’s “Top 10 predictions for CMOs in 2014” corroborates these figures, suggesting that organizations will fail to realize any value from 80% of the customer data they hold because of “immature enterprise value chains”. Just in case you don’t speak analyst, that means current data management practices aren’t capable of locating and extracting the supposedly valuable information hidden amongst terabytes of collected data. 
    It’s expensive to maintain that much unused data, as Gartner rightly points out: “… organizations that fail to optimize the way they manage and retain their data will be forced to deal with constant increases in storage costs”.

    But financial cost is only part of the reason dark data is so damaging. Perhaps more importantly, dark data has become so ubiquitous that it obscures the useful stuff. It’s not just that organizations don’t have an adequate tool to sift through the data heap; it’s that in worshipping at the altar of analytics prematurely, we are actively hoarding useless data in the hope of one day extracting enormous value from it.

    As IDC’s CMO of Advisory Services put it, whilst big data analytics is a hot topic, most of this collected data “[is] garbage. IDC’s data group researchers say that some 80% of data collected has no meaning whatsoever.” Or at least it won’t, until organizations are “smart enough [to have] a tool be able to differentiate between the signal and the noise.”
  • 3. What does dark data look like?

    Before we go on to look at what these tools might look like, we should think about the scale of the problem we expect them to fix. We must categorize the types of dark data organizations possess and, for each category, weigh its potential value against the cost of its storage.

    For instance, server log files are individually small and unobtrusive, and may contain useful insights into customer behaviour when processed together. Even if they’re dark, they don’t represent a significant burden on the storage environment.

    Unstructured data, on the other hand, is without exception the single biggest driver of dark data growth. It’s a broad category of data, which can include almost anything that exists outside of semantically tagged form fields and databases, and is estimated to constitute around 70-80% of all data in an average organization. It’s often human-generated information in the form of documents, presentations, reports, graphics, videos and audio that all begin as potentially valuable, but end up as half-finished ideas, discarded early drafts or simply assets that have served their purpose and are no longer useful.

    Why is there so much of it?

    The answer to the spiraling growth of unstructured data is the same as its cause – data management practices (or rather, the lack of them). We’ll go on to look at the way tools can encourage better policy-based management of the data lifecycle shortly, but it is worth briefly reiterating that the solution to dark data is not technology – it is management.

    There’s no single cause behind the volume and variety of unstructured data organizations produce. Some of it is just a symptom of technological progress. We are using, producing and sharing more stuff – whether that is documents, presentations, emails or media – because the tools (and therefore their output) have become more sophisticated and the connectivity between us has become faster and more reliable.
    There is one common thread though: standards of data management have not kept up with the pace of data growth. Not by a long shot.

    One of the most common problems is poorly maintained folder structures. In organizations where users are free to create data and folders within shared file stores, duplication of both content and the effort required to create it is incredibly common. Users become less productive because they can’t find the information they need, and the file stores become a tangled mess of non-standardized naming conventions, leading to massive amounts of erroneous data, which puts a great strain on storage.

    Another common problem is that old and unused file data is not actively retired once it has been superseded or has become irrelevant. In the Databarracks Data Health Check 2014, 49% of 401 respondents did not actively distinguish between unused and recently accessed file data, despite file data being the largest cause of storage growth.
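    The two problems described above, duplicated content and unused file data that is never retired, can be made visible with a fairly small script. The following is a minimal Python sketch, not a description of any particular product: the function name `scan_file_store`, the 365-day threshold and the choice of SHA-256 content hashes to detect duplicates are all assumptions made for illustration.

```python
import hashlib
import os
import time
from collections import defaultdict


def scan_file_store(root, stale_after_days=365, now=None):
    """Walk `root` and return (stale_paths, duplicate_groups).

    Hypothetical helper: a file counts as "stale" when neither its access
    time nor its modification time falls within the last `stale_after_days`
    days. Files whose bytes hash to the same SHA-256 digest are grouped as
    duplicates.
    """
    now = time.time() if now is None else now
    cutoff = now - stale_after_days * 86400
    stale = []
    by_digest = defaultdict(list)

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Read the timestamps before opening the file, because
                # reading can itself update the access time.
                info = os.stat(path)
                digest = hashlib.sha256()
                with open(path, "rb") as handle:
                    for chunk in iter(lambda: handle.read(1 << 20), b""):
                        digest.update(chunk)
            except OSError:
                continue  # unreadable file: skip it rather than abort the scan

            if max(info.st_atime, info.st_mtime) < cutoff:
                stale.append(path)
            by_digest[digest.hexdigest()].append(path)

    duplicates = [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
    return sorted(stale), duplicates
```

    Calling `scan_file_store("/path/to/share")` returns the list of stale paths and a list of duplicate groups. Access times are only a rough signal: many systems update them lazily or not at all, so a real audit would need to check how the file store in question is mounted before trusting them.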
  • 4. What do we do about it?

    There is an appetite for tools able to shed some light on dark data. IDG’s report found that whilst 77% of enterprises expressed interest in a single platform solution that automatically manages data, only 10% actually had a completely automated process in place.

    Of course, organizations struggling with dark data (which, to be clear, is everyone) must first identify what they hope to achieve in finding it. Is it that there may be hidden value in documents long forgotten about, or that they hope to retire useless data to enable more cost-effective storage? In truth, this is a bit of a false dilemma – the answer is probably a combination of the two. However, it remains a useful distinction to make, if only so that organizations can make a more informed decision about the capabilities they require from their chosen solution.

    Prospective data analytics tools must offer three core capabilities to reveal the location and condition of dark data, and to minimize preventable growth in future.

    Search
    First, organizations need a strong search capability that scrapes both metadata and the actual content of unstructured data. This increases visibility into the dark areas of your storage environment and connects users to the information they need more quickly.

    Analyze
    Second, organizations need powerful analytics and reporting capabilities in order to extract actionable intelligence from large volumes of dark data. This is a twofold challenge: half technical and half design. The analytics must be accurate, responsive and exhaustive, but they must also be beautifully visualized to increase usability, comprehension and insight.

    Archive
    Finally, to address the problem of dark data in the long term, data analytics tools must facilitate the transfer of old and unused data to cheaper archive storage platforms.
    Cloud-based object storage is a cheap and highly scalable alternative to costly primary storage, and by creating management policies based on usage rates and compliance obligations, organizations can automate the process of retiring inactive data.

    Kazoup brings unstructured file data back under control in 3 steps: search, analyze and archive. Leveraging beautiful data visualization, policy-based lifecycle management and cheap cloud object storage, Kazoup helps you realize more value from your data whilst lowering the cost of storage.

    To find out more visit www.kazoup.com.
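    The idea of a management policy based on usage rates and compliance obligations can be illustrated in a few lines of Python. This is a sketch under stated assumptions, not how Kazoup or any other tool implements it: `FileRecord`, `ArchivePolicy`, `plan_archive`, the 180-day idle threshold and the `must_stay_on_primary` flag (standing in for a compliance obligation) are all hypothetical names and values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    """One file as seen by a hypothetical inventory scan."""
    path: str
    size_bytes: int
    days_since_access: int
    must_stay_on_primary: bool = False  # assumed stand-in for a compliance rule


@dataclass(frozen=True)
class ArchivePolicy:
    """Illustrative thresholds; real values depend on the organization."""
    max_idle_days: int = 180
    min_size_bytes: int = 0


def plan_archive(records, policy):
    """Split records into (to_archive, to_keep) according to the policy."""
    to_archive, to_keep = [], []
    for record in records:
        idle = record.days_since_access > policy.max_idle_days
        large_enough = record.size_bytes >= policy.min_size_bytes
        if idle and large_enough and not record.must_stay_on_primary:
            to_archive.append(record)
        else:
            to_keep.append(record)
    return to_archive, to_keep


def bytes_reclaimed(records):
    """Total primary storage freed if the given records are moved."""
    return sum(record.size_bytes for record in records)
```

    Keeping the policy as plain data, separate from the function that applies it, means the thresholds can be reviewed and changed without touching the code that moves files. The actual transfer to object storage is deliberately left out, since it depends on the storage provider.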