SlideShare a Scribd company logo
1 of 42
Download to read offline
Amir Sedighi
February 2017
Dark Data
Risks and Opportunities
@amirsedighi
Speaker
Amir Sedighi
Software Engineer

Data Solutions Architect
Founder at recommender.ir
twitter: @amirsedighi
By even the most conservative estimates, the amount
of data in the world doubles every two years.
Data Era
May Venn Diagram helps us!
Big Data
May Venn Diagram helps us!
Tabular/
Relational/
RDBMS
Data
Big Data
May Venn Diagram helps us!
Dark Data
Tabular/
Relational/
RDBMS
Data
Big Data
May Venn Diagram helps us!
Dark Data
Tabular/
Relational/
RDBMS
Data
(Structured/Unstructured)
(Almost Unstructured)
(Structured)
Big Data
May Venn Diagram helps us!
Dark Data
Tabular/
Relational/
RDBMS
Data
(Structured/Unstructured)
(Almost Unstructured)
(Structured)
Big Data
Almost can’t be
processed or analyzed
Gartner defines dark data as the information assets
organizations collect, process and store during
regular business activities, but generally fail to use
for other purposes (for example, analytics, business
relationships and direct monetizing).
Dark Data Definition by Gartner
Gartner defines dark data as the information assets
organizations collect, process and store during
regular business activities, but generally fail to use
for other purposes (for example, analytics, business
relationships and direct monetizing).
Similar to dark matter in physics, dark data often
comprises most organizations’ universe of
information assets.
Dark Data Definition by Gartner
Gartner defines dark data as the information assets
organizations collect, process and store during
regular business activities, but generally fail to use
for other purposes (for example, analytics, business
relationships and direct monetizing).
Similar to dark matter in physics, dark data often
comprises most organizations’ universe of
information assets.
Thus, organizations often retain dark data for
compliance purposes only. Storing and securing
data typically incurs more expense (and sometimes
greater risk) than value.
Dark Data Definition by Gartner
Gartner defines dark data as the information assets
organizations collect, process and store during
regular business activities, but generally fail to use
for other purposes (for example, analytics, business
relationships and direct monetizing).
Similar to dark matter in physics, dark data often
comprises most organizations’ universe of
information assets.
Thus, organizations often retain dark data for
compliance purposes only. Storing and securing
data typically incurs more expense (and sometimes
greater risk) than value.
Dark Data Definition by Gartner
Dark Data - A more Sensible Definition
Dark Data - A more Sensible Definition
Organizations Generate
and Gather Data
Dark Data - A more Sensible Definition
Organizations Generate
and Gather Data
A large portion of the
collected data are
never even analyzed!
Dark Data - A more Sensible Definition
Organizations Generate
and Gather Data
A large portion of the
collected data are
never even analyzed!
90% of the data are
never analyzed
Dark Data - A more Sensible Definition
Organizations Generate
and Gather Data
A large portion of the
collected data are
never even analyzed!
90% of the data are
never analysed.
• Customer Information
• Log Files
• Previous Employee Information
• Previous Webpages
• Sensor Data
• Email Correspondences
• Account Information
• Notes or Presentations
• Old Versions of Relevant
Documents
80%..90% is Dark Data
Does Your Org have any Dark Data?
I am just going to
check if we have
any dark data in
the cellar…
Brining Dark Data into Light
1. Gathering
2. Storing/Processing
3. Analyzing and Bringing it into decisions
Brining Dark Data into Light
Brining Dark Data into Light
Brining Dark Data into Light
Brining Dark Data into Light
Brining Dark Data into Light
Brining Dark Data into Light
Question
All companies know data is going to provide value.
Why there is so much of dark data?
Why there is so much of dark data?
• Lack of insight about data
• Lack of ambitions to improve
• Disconnect among departments
• Lopsided priorities
• Lack of technologies to Capture and Store
• Lack of resources/infrastructures to make it available
• Lack of CPU and technics to analyze the data
The issues you face with Dark Data
• Legal and Regulatory Issues
• Loss of Reputation
• Intelligence Risk
• Operation Costs
• Opportunity Costs
Some essential questions
• What can we gather?
• What may we extract from it?
• How we may prune it?
• How long should we keep it?
• What are the storage options?
• What are the processing options?
• How much is the value of each block of data
(Approximately)
• Running limited boundary scenarios
Software Tools & Frameworks on DD
Software Tools & Frameworks on DD
Software Tools & Frameworks on DD
Log Management
Software Tools & Frameworks on DD
Indexing and Search
Software Tools & Frameworks on DD
Data Streaming
Software Tools & Frameworks on DD
Software Tools & Frameworks on DD
Software Tools & Frameworks on DD
Machine Learning and Graph Processing
• Mahout
• MLLib
• FlinkMK
• Theano
• Torch
• TensorFlow
• GraphX
• Gelly
A common Pipeline
Machine
Learning
Steam Processing
Query
Already Processed Data
Real World RT Events
A common Pipeline
Machine
Learning
Steam Processing
Query
Already Processed Data
Real World RT Events
New Pipeline
Questions?
Keep in touch:
twitter: @amirsedighi
1. http://www.gartner.com/it-glossary/dark-data/
2. http://www.itproportal.com/2016/03/07/5-benefits-of-putting-dark-data-to-work/
3. http://www.kdnuggets.com/2015/11/importance-dark-data-big-data-world.html
4. https://www.youtube.com/watch?v=_fBMmQo-Z4E
5. http://confluent.io
6. https://www.ecmconnection.com/doc/the-various-shades-of-dark-data-0001
7. https://www.datanami.com/2015/11/30/spark-streaming-what-is-it-and-whos-using-it/
References

More Related Content

What's hot

Cloud Journey Roadmap: Capgemini's Cloud Readiness Assessment
Cloud Journey Roadmap: Capgemini's Cloud Readiness AssessmentCloud Journey Roadmap: Capgemini's Cloud Readiness Assessment
Cloud Journey Roadmap: Capgemini's Cloud Readiness AssessmentCapgemini
 
Should I move my database to the cloud?
Should I move my database to the cloud?Should I move my database to the cloud?
Should I move my database to the cloud?James Serra
 
Azure Data Explorer deep dive - review 04.2020
Azure Data Explorer deep dive - review 04.2020Azure Data Explorer deep dive - review 04.2020
Azure Data Explorer deep dive - review 04.2020Riccardo Zamana
 
Amazon Elastic MapReduce Deep Dive and Best Practices (BDT404) | AWS re:Inven...
Amazon Elastic MapReduce Deep Dive and Best Practices (BDT404) | AWS re:Inven...Amazon Elastic MapReduce Deep Dive and Best Practices (BDT404) | AWS re:Inven...
Amazon Elastic MapReduce Deep Dive and Best Practices (BDT404) | AWS re:Inven...Amazon Web Services
 
Cloud computing simple ppt
Cloud computing simple pptCloud computing simple ppt
Cloud computing simple pptAgarwaljay
 
Cyber Physical System: Architecture, Applications and Research Challenges
Cyber Physical System: Architecture, Applicationsand Research ChallengesCyber Physical System: Architecture, Applicationsand Research Challenges
Cyber Physical System: Architecture, Applications and Research ChallengesSyed Hassan Ahmed
 
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...Amazon Web Services
 
Introduction to snowflake
Introduction to snowflakeIntroduction to snowflake
Introduction to snowflakeSunil Gurav
 
Apache Spark and the Hadoop Ecosystem on AWS
Apache Spark and the Hadoop Ecosystem on AWSApache Spark and the Hadoop Ecosystem on AWS
Apache Spark and the Hadoop Ecosystem on AWSAmazon Web Services
 
Getting Started with Amazon QuickSight
Getting Started with Amazon QuickSightGetting Started with Amazon QuickSight
Getting Started with Amazon QuickSightAmazon Web Services
 
Splunk Overview
Splunk OverviewSplunk Overview
Splunk OverviewSplunk
 
Secure your Access to Cloud Apps using Microsoft Defender for Cloud Apps
Secure your Access to Cloud Apps using Microsoft Defender for Cloud AppsSecure your Access to Cloud Apps using Microsoft Defender for Cloud Apps
Secure your Access to Cloud Apps using Microsoft Defender for Cloud AppsVignesh Ganesan I Microsoft MVP
 
12 Ways to Manage Cloud Costs and Optimize Cloud Spend
12 Ways to Manage Cloud Costs and Optimize Cloud Spend12 Ways to Manage Cloud Costs and Optimize Cloud Spend
12 Ways to Manage Cloud Costs and Optimize Cloud SpendRightScale
 
Big Data Architectural Patterns and Best Practices on AWS
Big Data Architectural Patterns and Best Practices on AWSBig Data Architectural Patterns and Best Practices on AWS
Big Data Architectural Patterns and Best Practices on AWSAmazon Web Services
 

What's hot (20)

Big data ppt
Big data pptBig data ppt
Big data ppt
 
Cloud Journey Roadmap: Capgemini's Cloud Readiness Assessment
Cloud Journey Roadmap: Capgemini's Cloud Readiness AssessmentCloud Journey Roadmap: Capgemini's Cloud Readiness Assessment
Cloud Journey Roadmap: Capgemini's Cloud Readiness Assessment
 
Should I move my database to the cloud?
Should I move my database to the cloud?Should I move my database to the cloud?
Should I move my database to the cloud?
 
Cost Optimization on AWS
Cost Optimization on AWSCost Optimization on AWS
Cost Optimization on AWS
 
Azure Data Explorer deep dive - review 04.2020
Azure Data Explorer deep dive - review 04.2020Azure Data Explorer deep dive - review 04.2020
Azure Data Explorer deep dive - review 04.2020
 
Amazon Elastic MapReduce Deep Dive and Best Practices (BDT404) | AWS re:Inven...
Amazon Elastic MapReduce Deep Dive and Best Practices (BDT404) | AWS re:Inven...Amazon Elastic MapReduce Deep Dive and Best Practices (BDT404) | AWS re:Inven...
Amazon Elastic MapReduce Deep Dive and Best Practices (BDT404) | AWS re:Inven...
 
Data Lifecycle Management
Data Lifecycle ManagementData Lifecycle Management
Data Lifecycle Management
 
Cloud computing simple ppt
Cloud computing simple pptCloud computing simple ppt
Cloud computing simple ppt
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
Cyber Physical System: Architecture, Applications and Research Challenges
Cyber Physical System: Architecture, Applicationsand Research ChallengesCyber Physical System: Architecture, Applicationsand Research Challenges
Cyber Physical System: Architecture, Applications and Research Challenges
 
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
 
Introduction to snowflake
Introduction to snowflakeIntroduction to snowflake
Introduction to snowflake
 
Big Data
Big DataBig Data
Big Data
 
Apache Spark and the Hadoop Ecosystem on AWS
Apache Spark and the Hadoop Ecosystem on AWSApache Spark and the Hadoop Ecosystem on AWS
Apache Spark and the Hadoop Ecosystem on AWS
 
Getting Started with Amazon QuickSight
Getting Started with Amazon QuickSightGetting Started with Amazon QuickSight
Getting Started with Amazon QuickSight
 
Splunk Overview
Splunk OverviewSplunk Overview
Splunk Overview
 
Secure your Access to Cloud Apps using Microsoft Defender for Cloud Apps
Secure your Access to Cloud Apps using Microsoft Defender for Cloud AppsSecure your Access to Cloud Apps using Microsoft Defender for Cloud Apps
Secure your Access to Cloud Apps using Microsoft Defender for Cloud Apps
 
12 Ways to Manage Cloud Costs and Optimize Cloud Spend
12 Ways to Manage Cloud Costs and Optimize Cloud Spend12 Ways to Manage Cloud Costs and Optimize Cloud Spend
12 Ways to Manage Cloud Costs and Optimize Cloud Spend
 
Big Data Architectural Patterns and Best Practices on AWS
Big Data Architectural Patterns and Best Practices on AWSBig Data Architectural Patterns and Best Practices on AWS
Big Data Architectural Patterns and Best Practices on AWS
 
Snowflake Overview
Snowflake OverviewSnowflake Overview
Snowflake Overview
 

Viewers also liked

Big Data Processing in Cloud Computing Environments
Big Data Processing in Cloud Computing EnvironmentsBig Data Processing in Cloud Computing Environments
Big Data Processing in Cloud Computing EnvironmentsFarzad Nozarian
 
آشنایی با داده‌های بزرگ و تکنیک‌های برنامه‌سازی برای پردازش داده‌های بزرگ
آشنایی با داده‌های بزرگ و تکنیک‌های برنامه‌سازی برای پردازش داده‌های بزرگآشنایی با داده‌های بزرگ و تکنیک‌های برنامه‌سازی برای پردازش داده‌های بزرگ
آشنایی با داده‌های بزرگ و تکنیک‌های برنامه‌سازی برای پردازش داده‌های بزرگAmir Sedighi
 
Big Data and Machine Learning Workshop - Day 7 @ UTACM
Big Data and Machine Learning Workshop - Day 7 @ UTACM Big Data and Machine Learning Workshop - Day 7 @ UTACM
Big Data and Machine Learning Workshop - Day 7 @ UTACM Amir Sedighi
 
تحلیل احساسات در شبکه های اجتماعی
تحلیل احساسات در شبکه های اجتماعیتحلیل احساسات در شبکه های اجتماعی
تحلیل احساسات در شبکه های اجتماعیHamed Azizi
 
تحلیل با رویکرد یادگیری ژرف بر بستر کلان‌داده (2)
تحلیل با رویکرد یادگیری ژرف بر بستر کلان‌داده (2)تحلیل با رویکرد یادگیری ژرف بر بستر کلان‌داده (2)
تحلیل با رویکرد یادگیری ژرف بر بستر کلان‌داده (2)khalooei
 
تحلیل احساسات شبکه اجتماعی متن کاوی نظرکاوی حامد عزیزی تهران جنوب
تحلیل احساسات شبکه اجتماعی متن کاوی نظرکاوی حامد عزیزی تهران جنوبتحلیل احساسات شبکه اجتماعی متن کاوی نظرکاوی حامد عزیزی تهران جنوب
تحلیل احساسات شبکه اجتماعی متن کاوی نظرکاوی حامد عزیزی تهران جنوبHamed Azizi
 

Viewers also liked (6)

Big Data Processing in Cloud Computing Environments
Big Data Processing in Cloud Computing EnvironmentsBig Data Processing in Cloud Computing Environments
Big Data Processing in Cloud Computing Environments
 
آشنایی با داده‌های بزرگ و تکنیک‌های برنامه‌سازی برای پردازش داده‌های بزرگ
آشنایی با داده‌های بزرگ و تکنیک‌های برنامه‌سازی برای پردازش داده‌های بزرگآشنایی با داده‌های بزرگ و تکنیک‌های برنامه‌سازی برای پردازش داده‌های بزرگ
آشنایی با داده‌های بزرگ و تکنیک‌های برنامه‌سازی برای پردازش داده‌های بزرگ
 
Big Data and Machine Learning Workshop - Day 7 @ UTACM
Big Data and Machine Learning Workshop - Day 7 @ UTACM Big Data and Machine Learning Workshop - Day 7 @ UTACM
Big Data and Machine Learning Workshop - Day 7 @ UTACM
 
تحلیل احساسات در شبکه های اجتماعی
تحلیل احساسات در شبکه های اجتماعیتحلیل احساسات در شبکه های اجتماعی
تحلیل احساسات در شبکه های اجتماعی
 
تحلیل با رویکرد یادگیری ژرف بر بستر کلان‌داده (2)
تحلیل با رویکرد یادگیری ژرف بر بستر کلان‌داده (2)تحلیل با رویکرد یادگیری ژرف بر بستر کلان‌داده (2)
تحلیل با رویکرد یادگیری ژرف بر بستر کلان‌داده (2)
 
تحلیل احساسات شبکه اجتماعی متن کاوی نظرکاوی حامد عزیزی تهران جنوب
تحلیل احساسات شبکه اجتماعی متن کاوی نظرکاوی حامد عزیزی تهران جنوبتحلیل احساسات شبکه اجتماعی متن کاوی نظرکاوی حامد عزیزی تهران جنوب
تحلیل احساسات شبکه اجتماعی متن کاوی نظرکاوی حامد عزیزی تهران جنوب
 

Similar to Dark data

INTRODUCTION TO BIG DATA AND HADOOP
INTRODUCTION TO BIG DATA AND HADOOPINTRODUCTION TO BIG DATA AND HADOOP
INTRODUCTION TO BIG DATA AND HADOOPDr Geetha Mohan
 
Discovering Big Data in the Fog: Why Catalogs Matter
 Discovering Big Data in the Fog: Why Catalogs Matter Discovering Big Data in the Fog: Why Catalogs Matter
Discovering Big Data in the Fog: Why Catalogs MatterEric Kavanagh
 
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)mark madsen
 
Thinkful DC - Intro to Data Science
Thinkful DC - Intro to Data Science Thinkful DC - Intro to Data Science
Thinkful DC - Intro to Data Science TJ Stalcup
 
Introduction to Data Mining
Introduction to Data Mining Introduction to Data Mining
Introduction to Data Mining Sushil Kulkarni
 
Gse uk-cedrinemadera-2018-shared
Gse uk-cedrinemadera-2018-sharedGse uk-cedrinemadera-2018-shared
Gse uk-cedrinemadera-2018-sharedcedrinemadera
 
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?Albert Hoitingh
 
Dealing with Dark Data
Dealing with Dark DataDealing with Dark Data
Dealing with Dark DataKazoup
 
Understanding Dark Data
Understanding Dark DataUnderstanding Dark Data
Understanding Dark DataAhmed Banafa
 
Intro to Data Science Big Data
Intro to Data Science Big DataIntro to Data Science Big Data
Intro to Data Science Big DataIndu Khemchandani
 
Big datarevealed hadoop catalog
Big datarevealed hadoop catalogBig datarevealed hadoop catalog
Big datarevealed hadoop catalogSteven Meister
 
EPF-datagov-part1-1.pdf
EPF-datagov-part1-1.pdfEPF-datagov-part1-1.pdf
EPF-datagov-part1-1.pdfcedrinemadera
 
Incorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic ArchitectureIncorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic ArchitectureCaserta
 
(SACON) Ramkumar Narayanan - Personal Data Discovery & Mapping - Challenges f...
(SACON) Ramkumar Narayanan - Personal Data Discovery & Mapping - Challenges f...(SACON) Ramkumar Narayanan - Personal Data Discovery & Mapping - Challenges f...
(SACON) Ramkumar Narayanan - Personal Data Discovery & Mapping - Challenges f...Priyanka Aash
 
Extract the Analyzed Information from Dark Data
Extract the Analyzed Information from Dark DataExtract the Analyzed Information from Dark Data
Extract the Analyzed Information from Dark Dataijtsrd
 
2017 06-14-getting started with data science
2017 06-14-getting started with data science2017 06-14-getting started with data science
2017 06-14-getting started with data scienceThinkful
 
What Data Do You Have and Where is It?
What Data Do You Have and Where is It? What Data Do You Have and Where is It?
What Data Do You Have and Where is It? Caserta
 
Thinkful - Intro to Data Science - Washington DC
Thinkful - Intro to Data Science - Washington DCThinkful - Intro to Data Science - Washington DC
Thinkful - Intro to Data Science - Washington DCTJ Stalcup
 

Similar to Dark data (20)

INTRODUCTION TO BIG DATA AND HADOOP
INTRODUCTION TO BIG DATA AND HADOOPINTRODUCTION TO BIG DATA AND HADOOP
INTRODUCTION TO BIG DATA AND HADOOP
 
Discovering Big Data in the Fog: Why Catalogs Matter
 Discovering Big Data in the Fog: Why Catalogs Matter Discovering Big Data in the Fog: Why Catalogs Matter
Discovering Big Data in the Fog: Why Catalogs Matter
 
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
 
Thinkful DC - Intro to Data Science
Thinkful DC - Intro to Data Science Thinkful DC - Intro to Data Science
Thinkful DC - Intro to Data Science
 
Introduction to Data Mining
Introduction to Data Mining Introduction to Data Mining
Introduction to Data Mining
 
Gse uk-cedrinemadera-2018-shared
Gse uk-cedrinemadera-2018-sharedGse uk-cedrinemadera-2018-shared
Gse uk-cedrinemadera-2018-shared
 
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?
ExpertsLive NL 2022 - Microsoft Purview - What's in it for my organization?
 
Dealing with Dark Data
Dealing with Dark DataDealing with Dark Data
Dealing with Dark Data
 
Understanding Dark Data
Understanding Dark DataUnderstanding Dark Data
Understanding Dark Data
 
Intro to Data Science Big Data
Intro to Data Science Big DataIntro to Data Science Big Data
Intro to Data Science Big Data
 
Big datarevealed hadoop catalog
Big datarevealed hadoop catalogBig datarevealed hadoop catalog
Big datarevealed hadoop catalog
 
EPF-datagov-part1-1.pdf
EPF-datagov-part1-1.pdfEPF-datagov-part1-1.pdf
EPF-datagov-part1-1.pdf
 
Incorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic ArchitectureIncorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic Architecture
 
How to build a successful Data Lake
How to build a successful Data LakeHow to build a successful Data Lake
How to build a successful Data Lake
 
(SACON) Ramkumar Narayanan - Personal Data Discovery & Mapping - Challenges f...
(SACON) Ramkumar Narayanan - Personal Data Discovery & Mapping - Challenges f...(SACON) Ramkumar Narayanan - Personal Data Discovery & Mapping - Challenges f...
(SACON) Ramkumar Narayanan - Personal Data Discovery & Mapping - Challenges f...
 
Extract the Analyzed Information from Dark Data
Extract the Analyzed Information from Dark DataExtract the Analyzed Information from Dark Data
Extract the Analyzed Information from Dark Data
 
Thilga
ThilgaThilga
Thilga
 
2017 06-14-getting started with data science
2017 06-14-getting started with data science2017 06-14-getting started with data science
2017 06-14-getting started with data science
 
What Data Do You Have and Where is It?
What Data Do You Have and Where is It? What Data Do You Have and Where is It?
What Data Do You Have and Where is It?
 
Thinkful - Intro to Data Science - Washington DC
Thinkful - Intro to Data Science - Washington DCThinkful - Intro to Data Science - Washington DC
Thinkful - Intro to Data Science - Washington DC
 

More from Amir Sedighi

Big Data and Machine Learning Workshop - Day 6 @ UTACM
Big Data and Machine Learning Workshop - Day 6 @ UTACMBig Data and Machine Learning Workshop - Day 6 @ UTACM
Big Data and Machine Learning Workshop - Day 6 @ UTACMAmir Sedighi
 
Big Data and Machine Learning Workshop - Day 5 @ UTACM
Big Data and Machine Learning Workshop - Day 5 @ UTACMBig Data and Machine Learning Workshop - Day 5 @ UTACM
Big Data and Machine Learning Workshop - Day 5 @ UTACMAmir Sedighi
 
Big Data and Machine Learning Workshop - Day 4 @ UTACM
Big Data and Machine Learning Workshop - Day 4 @ UTACM Big Data and Machine Learning Workshop - Day 4 @ UTACM
Big Data and Machine Learning Workshop - Day 4 @ UTACM Amir Sedighi
 
Big Data and Machine Learning Workshop - Day 3 @ UTACM
Big Data and Machine Learning Workshop - Day 3 @ UTACMBig Data and Machine Learning Workshop - Day 3 @ UTACM
Big Data and Machine Learning Workshop - Day 3 @ UTACMAmir Sedighi
 
Big Data and Machine Learning Workshop - Day 2 @ UTACM
Big Data and Machine Learning Workshop - Day 2 @ UTACMBig Data and Machine Learning Workshop - Day 2 @ UTACM
Big Data and Machine Learning Workshop - Day 2 @ UTACMAmir Sedighi
 
Big Data and Machine Learning Workshop - Day 1 @ UTACM
Big Data and Machine Learning Workshop - Day 1 @ UTACMBig Data and Machine Learning Workshop - Day 1 @ UTACM
Big Data and Machine Learning Workshop - Day 1 @ UTACMAmir Sedighi
 
Two Case Studies Big-Data and Machine Learning at Scale Solutions in Iran
Two Case Studies Big-Data and Machine Learning at Scale Solutions in IranTwo Case Studies Big-Data and Machine Learning at Scale Solutions in Iran
Two Case Studies Big-Data and Machine Learning at Scale Solutions in IranAmir Sedighi
 
Helio, a Continues Real-Time Fraud Detection and Monitoring Solution
Helio, a Continues Real-Time Fraud Detection and Monitoring SolutionHelio, a Continues Real-Time Fraud Detection and Monitoring Solution
Helio, a Continues Real-Time Fraud Detection and Monitoring SolutionAmir Sedighi
 
Big Data Processing Utilizing Open-source Technologies - May 2015
Big Data Processing Utilizing Open-source Technologies - May 2015Big Data Processing Utilizing Open-source Technologies - May 2015
Big Data Processing Utilizing Open-source Technologies - May 2015Amir Sedighi
 
Case Studies on Big-Data Processing and Streaming - Iranian Java User Group
Case Studies on Big-Data Processing and Streaming - Iranian Java User GroupCase Studies on Big-Data Processing and Streaming - Iranian Java User Group
Case Studies on Big-Data Processing and Streaming - Iranian Java User GroupAmir Sedighi
 
Opensource Frameworks and BigData Processing
Opensource Frameworks and BigData ProcessingOpensource Frameworks and BigData Processing
Opensource Frameworks and BigData ProcessingAmir Sedighi
 
Elasticsearch 1.x Cluster Installation (VirtualBox)
Elasticsearch 1.x Cluster Installation (VirtualBox)Elasticsearch 1.x Cluster Installation (VirtualBox)
Elasticsearch 1.x Cluster Installation (VirtualBox)Amir Sedighi
 
Hadoop 2.x HDFS Cluster Installation (VirtualBox)
Hadoop 2.x  HDFS Cluster Installation (VirtualBox)Hadoop 2.x  HDFS Cluster Installation (VirtualBox)
Hadoop 2.x HDFS Cluster Installation (VirtualBox)Amir Sedighi
 
An Introduction to Apache Kafka
An Introduction to Apache KafkaAn Introduction to Apache Kafka
An Introduction to Apache KafkaAmir Sedighi
 
An introduction To Apache Spark
An introduction To Apache SparkAn introduction To Apache Spark
An introduction To Apache SparkAmir Sedighi
 
Distributed Data Processing Workshop - SBU
Distributed Data Processing Workshop - SBUDistributed Data Processing Workshop - SBU
Distributed Data Processing Workshop - SBUAmir Sedighi
 
An introduction to Big-Data processing applying hadoop
An introduction to Big-Data processing applying hadoopAn introduction to Big-Data processing applying hadoop
An introduction to Big-Data processing applying hadoopAmir Sedighi
 
An Introduction to Elasticsearch for Beginners
An Introduction to Elasticsearch for BeginnersAn Introduction to Elasticsearch for Beginners
An Introduction to Elasticsearch for BeginnersAmir Sedighi
 

More from Amir Sedighi (18)

Big Data and Machine Learning Workshop - Day 6 @ UTACM
Big Data and Machine Learning Workshop - Day 6 @ UTACMBig Data and Machine Learning Workshop - Day 6 @ UTACM
Big Data and Machine Learning Workshop - Day 6 @ UTACM
 
Big Data and Machine Learning Workshop - Day 5 @ UTACM
Big Data and Machine Learning Workshop - Day 5 @ UTACMBig Data and Machine Learning Workshop - Day 5 @ UTACM
Big Data and Machine Learning Workshop - Day 5 @ UTACM
 
Big Data and Machine Learning Workshop - Day 4 @ UTACM
Big Data and Machine Learning Workshop - Day 4 @ UTACM Big Data and Machine Learning Workshop - Day 4 @ UTACM
Big Data and Machine Learning Workshop - Day 4 @ UTACM
 
Big Data and Machine Learning Workshop - Day 3 @ UTACM
Big Data and Machine Learning Workshop - Day 3 @ UTACMBig Data and Machine Learning Workshop - Day 3 @ UTACM
Big Data and Machine Learning Workshop - Day 3 @ UTACM
 
Big Data and Machine Learning Workshop - Day 2 @ UTACM
Big Data and Machine Learning Workshop - Day 2 @ UTACMBig Data and Machine Learning Workshop - Day 2 @ UTACM
Big Data and Machine Learning Workshop - Day 2 @ UTACM
 
Big Data and Machine Learning Workshop - Day 1 @ UTACM
Big Data and Machine Learning Workshop - Day 1 @ UTACMBig Data and Machine Learning Workshop - Day 1 @ UTACM
Big Data and Machine Learning Workshop - Day 1 @ UTACM
 
Two Case Studies Big-Data and Machine Learning at Scale Solutions in Iran
Two Case Studies Big-Data and Machine Learning at Scale Solutions in IranTwo Case Studies Big-Data and Machine Learning at Scale Solutions in Iran
Two Case Studies Big-Data and Machine Learning at Scale Solutions in Iran
 
Helio, a Continues Real-Time Fraud Detection and Monitoring Solution
Helio, a Continues Real-Time Fraud Detection and Monitoring SolutionHelio, a Continues Real-Time Fraud Detection and Monitoring Solution
Helio, a Continues Real-Time Fraud Detection and Monitoring Solution
 
Big Data Processing Utilizing Open-source Technologies - May 2015
Big Data Processing Utilizing Open-source Technologies - May 2015Big Data Processing Utilizing Open-source Technologies - May 2015
Big Data Processing Utilizing Open-source Technologies - May 2015
 
Case Studies on Big-Data Processing and Streaming - Iranian Java User Group
Case Studies on Big-Data Processing and Streaming - Iranian Java User GroupCase Studies on Big-Data Processing and Streaming - Iranian Java User Group
Case Studies on Big-Data Processing and Streaming - Iranian Java User Group
 
Opensource Frameworks and BigData Processing
Opensource Frameworks and BigData ProcessingOpensource Frameworks and BigData Processing
Opensource Frameworks and BigData Processing
 
Elasticsearch 1.x Cluster Installation (VirtualBox)
Elasticsearch 1.x Cluster Installation (VirtualBox)Elasticsearch 1.x Cluster Installation (VirtualBox)
Elasticsearch 1.x Cluster Installation (VirtualBox)
 
Hadoop 2.x HDFS Cluster Installation (VirtualBox)
Hadoop 2.x  HDFS Cluster Installation (VirtualBox)Hadoop 2.x  HDFS Cluster Installation (VirtualBox)
Hadoop 2.x HDFS Cluster Installation (VirtualBox)
 
An Introduction to Apache Kafka
An Introduction to Apache KafkaAn Introduction to Apache Kafka
An Introduction to Apache Kafka
 
An introduction To Apache Spark
An introduction To Apache SparkAn introduction To Apache Spark
An introduction To Apache Spark
 
Distributed Data Processing Workshop - SBU
Distributed Data Processing Workshop - SBUDistributed Data Processing Workshop - SBU
Distributed Data Processing Workshop - SBU
 
An introduction to Big-Data processing applying hadoop
An introduction to Big-Data processing applying hadoopAn introduction to Big-Data processing applying hadoop
An introduction to Big-Data processing applying hadoop
 
An Introduction to Elasticsearch for Beginners
An Introduction to Elasticsearch for BeginnersAn Introduction to Elasticsearch for Beginners
An Introduction to Elasticsearch for Beginners
 

Recently uploaded

2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSINGmarianagonzalez07
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxUnduhUnggah1
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 

Recently uploaded (20)

2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docx
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 

Dark data

  • 1. Amir Sedighi February 2017 Dark Data Risks and Opportunities @amirsedighi
  • 2. Speaker Amir Sedighi Software Engineer
 Data Solutions Architect Founder at recommender.ir twitter: @amirsedighi
  • 3. By even the most conservative estimates, the amount of data in the world doubles every two years. Data Era
  • 4. May Venn Diagram helps us! Big Data
  • 5. May Venn Diagram helps us! Tabular/ Relational/ RDBMS Data Big Data
  • 6. May Venn Diagram helps us! Dark Data Tabular/ Relational/ RDBMS Data Big Data
  • 7. May Venn Diagram helps us! Dark Data Tabular/ Relational/ RDBMS Data (Structured/Unstructured) (Almost Unstructured) (Structured) Big Data
  • 8. May Venn Diagram helps us! Dark Data Tabular/ Relational/ RDBMS Data (Structured/Unstructured) (Almost Unstructured) (Structured) Big Data Almost can’t be processed or analyzed
  • 9. Gartner defines dark data as the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing). Dark Data Definition by Gartner
  • 10. Gartner defines dark data as the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing). Similar to dark matter in physics, dark data often comprises most organizations’ universe of information assets. Dark Data Definition by Gartner
  • 11. Gartner defines dark data as the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing). Similar to dark matter in physics, dark data often comprises most organizations’ universe of information assets. Thus, organizations often retain dark data for compliance purposes only. Storing and securing data typically incurs more expense (and sometimes greater risk) than value. Dark Data Definition by Gartner
  • 12. Gartner defines dark data as the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing). Similar to dark matter in physics, dark data often comprises most organizations’ universe of information assets. Thus, organizations often retain dark data for compliance purposes only. Storing and securing data typically incurs more expense (and sometimes greater risk) than value. Dark Data Definition by Gartner
  • 13. Dark Data - A more Sensible Definition
  • 14. Dark Data - A more Sensible Definition Organizations Generate and Gather Data
  • 15. Dark Data - A more Sensible Definition Organizations Generate and Gather Data A large portion of the collected data are never even analyzed!
  • 16. Dark Data - A more Sensible Definition Organizations Generate and Gather Data A large portion of the collected data are never even analyzed! 90% of the data are never analyzed
  • 17. Dark Data - A more Sensible Definition Organizations Generate and Gather Data A large portion of the collected data are never even analyzed! 90% of the data are never analysed. • Customer Information • Log Files • Previous Employee Information • Previous Webpages • Sensor Data • Email Correspondences • Account Information • Notes or Presentations • Old Versions of Relevant Documents
  • 19. Does Your Org have any Dark Data? I am just going to check if we have any dark data in the cellar…
  • 20. Brining Dark Data into Light 1. Gathering 2. Storing/Processing 3. Analyzing and Bringing it into decisions
  • 21. Brining Dark Data into Light
  • 22. Brining Dark Data into Light
  • 23. Brining Dark Data into Light
  • 24. Brining Dark Data into Light
  • 25. Brining Dark Data into Light
  • 26. Brining Dark Data into Light
  • 27. Question All companies know data is going to provide value. Why there is so much of dark data?
  • 28. Why there is so much of dark data? • Lack of insight about data • Lack of ambitions to improve • Disconnect among departments • Lopsided priorities • Lack of technologies to Capture and Store • Lack of resources/infrastructures to make it available • Lack of CPU and technics to analyze the data
  • 29. The issues you face with Dark Data • Legal and Regulatory Issues • Loss of Reputation • Intelligence Risk • Operation Costs • Opportunity Costs
  • 30. Some essential questions • What can we gather? • What may we extract from it? • How we may prune it? • How long should we keep it? • What are the storage options? • What are the processing options? • How much is the value of each block of data (Approximately) • Running limited boundary scenarios
  • 31. Software Tools & Frameworks on DD
  • 32. Software Tools & Frameworks on DD
  • 33. Software Tools & Frameworks on DD Log Management
  • 34. Software Tools & Frameworks on DD Indexing and Search
  • 35. Software Tools & Frameworks on DD Data Streaming
  • 36. Software Tools & Frameworks on DD
  • 37. Software Tools & Frameworks on DD
  • 38. Software Tools & Frameworks on DD Machine Learning and Graph Processing • Mahout • MLLib • FlinkMK • Theano • Torch • TensorFlow • GraphX • Gelly
  • 39. A common Pipeline Machine Learning Steam Processing Query Already Processed Data Real World RT Events
  • 40. A common Pipeline Machine Learning Steam Processing Query Already Processed Data Real World RT Events New Pipeline
  • 42. 1. http://www.gartner.com/it-glossary/dark-data/ 2. http://www.itproportal.com/2016/03/07/5-benefits-of-putting-dark-data-to-work/ 3. http://www.kdnuggets.com/2015/11/importance-dark-data-big-data-world.html 4. https://www.youtube.com/watch?v=_fBMmQo-Z4E 5. http://confluent.io 6. https://www.ecmconnection.com/doc/the-various-shades-of-dark-data-0001 7. https://www.datanami.com/2015/11/30/spark-streaming-what-is-it-and-whos-using-it/ References