
Dark Data Revelation and its Potential Benefits

This presentation covers the benefits, use cases, practical examples, and potential risks of dark data, a largely untapped strategic asset in the big data realm, along with an approach to harnessing it.



  1. Dark Data Revelation and its Potential Benefits
  2. What is dark data? "The information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes." (Gartner IT Glossary)
  3. In simple terms, dark data is all the useful data an organization possesses but does not meaningfully use or analyze to improve the business.
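  4. The enormous digital universe. 2013: total size of digital universe 4.4 ZB; data useful if analyzed 22%; data from mobile devices 17%; data from embedded systems 2%. 2020: total size of digital universe 44 ZB; data useful if analyzed 37%; data from mobile devices 27%; data from embedded systems 10%.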
  5. According to IDC (a research firm), up to 90 percent of the digital universe is unstructured data.
  6. Traditional sources of dark data: server log files; networking machine data; point-of-sale feeds; customer queries recorded in calls, emails, and forms; underused employee data; meeting notes; unstructured information arising from business emails and presentations; unused data resulting from business research and surveys.
  7. Why is it important? Businesses invest heavily in collecting data; however, tangible value can be derived only once companies understand their dark data and how it can be applied.
  8. It is also a sensible step for any company that is getting started with big data and building a data warehouse. In this case, dark data can be a reliable source of historical data.
  9. 3 facets of dark data: (01) existing unstructured data; (02) nontraditional unstructured data; (03) data in the deep web.
  10. Existing unstructured data: Many businesses already have large collections of both structured and unstructured data.
  11. Unstructured data such as emails, notes, messages, documents, logs, and notifications (including from IoT devices) is confined to the organization and remains largely unused, either for lack of tools and techniques or because it is absent from the database. These data assets could hold valuable insights into competitors, pricing, and consumer behavior.
  12. Nontraditional unstructured data: Data present in web pages, audio and video files, and still images is largely untapped. It can be mined via data extraction solutions, computer vision, advanced pattern recognition, and video and sound analytics.
  13. This can help businesses perform advanced analytics on data in nontraditional formats to better understand their customers, employees, operations, and markets.
  14. Data present in the deep web: The deep web presents the largest pool of unused information: data curated by academics, consortia, government agencies, communities, and other third-party domains.
  15. Companies can potentially curate competitive intelligence using emerging search tools developed to help users target scientific research, activist data, or even hobbyist threads found in the deep web.
  16. An example of such a tool is Stanford University’s search engine, Hidden Web Exposer, which scrapes the deep web for information using a task-specific, human-assisted approach.
  17. Potential risks associated with dark data
  18. Legal and regulatory issues: If the stored data is covered by legal regulations, as credit card data is, its exposure could subject companies to financial and legal liabilities.
  19. Intelligence risk: Companies could intentionally or unintentionally disclose proprietary or sensitive data on business operations, products, financial status, and business plans.
  20. PR disaster: Companies are regarded as protectors of the data they collect, so any loss of data, especially sensitive and confidential data, can lead to loss of reputation.
  21. Opportunity costs: If a company avoids analyzing and processing dark data but its competitors do not, those competitors will be better positioned to capture market share by leveraging the insights from dark data.
  22. Practical applications of dark data
  23. Personalization in retail: Stitch Fix, an online subscription shopping service, uses images from social media and other sources to track emerging fashion trends and evolving customer preferences. The flow: questionnaire filled in by the client; customer’s Pinterest board and social media scanned; data augmentation; deeper insight into the customer’s style preferences; appropriate clothing shipped to the customer.
  24. Fraud detection: A financial services firm wanted to gain insight from its trading terminal data to find correlations between trading patterns and abuses such as money laundering and other fraudulent activities. Most of the data was dark owing to its volume and geographically scattered storage. Once the firm was able to use this previously underutilized data, it completed data preparation and analysis to identify suspect patterns in transactional records. It then used the analyzed data to build predictive models that identify activities indicating potential fraud, so that measures can be taken before fraud occurs.
  25. Approaching dark data
  26. Getting the right data: Instead of attempting to discover and collect all of the dark data hidden within and outside your organization, work with the business team to find answers to specific business problems.
  27. Being open to third-party data: Source data from the web to augment your own data with publicly available demographic, location, and statistical information.
  28. Building data talent: Data scientists are valuable resources, especially those who have the skills to combine deep modeling and statistical techniques with industry- or function-specific insights.
  29. Utilizing advanced visualization tools: Advanced visualization software can boost business intelligence by repackaging big data into smaller, more meaningful chunks, delivering value to users much faster. This is crucial because information is more easily consumed when presented as an infographic, a dashboard, or another type of visual representation.
  30. Future of dark data: Most companies will learn to tap into their dark data more effectively; that is the direction in which a connected and measurable world is heading. The real value will go to those businesses that open up their data sources internally in a secure and responsible manner, so that the workforce is empowered to become problem solvers in their own right.
  31. Looking to augment data assets with web data? Reach out to PromptCloud, a pioneer in custom, managed, and cloud-based web extraction services. https://www.promptcloud.com | sales@promptcloud.com
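
As a rough worked example of the figures on slide 4, the percentage shares can be converted into absolute volumes. This is only a sketch: it assumes each percentage is a share of the total digital universe for that year and that the values map to the column labels in the order they are listed on the slide. The dictionary keys and the function name are illustrative, not from the presentation.

```python
# Figures as listed on slide 4. Assumption: each share is a fraction of the
# total digital universe for that year, in the order the labels appear.
DIGITAL_UNIVERSE = {
    2013: {"total_zb": 4.4, "useful_if_analyzed": 0.22, "mobile": 0.17, "embedded": 0.02},
    2020: {"total_zb": 44.0, "useful_if_analyzed": 0.37, "mobile": 0.27, "embedded": 0.10},
}


def absolute_volumes(year):
    """Convert each percentage share into zettabytes for the given year."""
    row = DIGITAL_UNIVERSE[year]
    total = row["total_zb"]
    return {
        name: round(total * share, 3)
        for name, share in row.items()
        if name != "total_zb"
    }


if __name__ == "__main__":
    for year in sorted(DIGITAL_UNIVERSE):
        print(year, absolute_volumes(year))
```

Under these assumptions, the data that would be useful if analyzed grows from just under 1 ZB in 2013 to roughly 16 ZB in 2020, which is the gap the presentation is pointing at.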

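
Slide 6 names server log files as a traditional source of dark data. A minimal sketch of putting such a file to use might look like the following. It assumes the logs follow the Common Log Format, which real deployments may not; the sample lines, the `summarize` function, and the pattern name are invented for illustration.

```python
import re
from collections import Counter

# Pattern for the Common Log Format (an assumption; real logs may differ).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def summarize(lines):
    """Count requests per status code and per path, skipping unparseable lines."""
    statuses, paths = Counter(), Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # ignore lines that do not fit the assumed format
        statuses[match["status"]] += 1
        paths[match["path"]] += 1
    return statuses, paths


# Hypothetical sample lines, invented for illustration only.
SAMPLE = [
    '203.0.113.5 - - [10/Oct/2017:13:55:36 +0000] "GET /pricing HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2017:13:56:01 +0000] "GET /pricing HTTP/1.1" 200 2326',
    '198.51.100.7 - - [10/Oct/2017:13:57:12 +0000] "GET /checkout HTTP/1.1" 500 512',
    'not a log line',
]

if __name__ == "__main__":
    statuses, paths = summarize(SAMPLE)
    print(statuses.most_common())
    print(paths.most_common())
```

Even a summary this simple shows which pages draw attention and where errors cluster, which is the kind of question slide 26 suggests starting from rather than collecting everything.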