Towards a Big Data Taxonomy
Bill Mandrick, PhD
Data Tactics
Version 26_August_2013
Scientific Taxonomies Represent
• Types of Processes
• Types of Objects
– Physical Objects
– Information Artifacts
• Types of Characteristics
– Qualities
– Roles
• Relationships
– Between Processes
– Between Objects
– Between Characteristics
Big Data Taxonomy
• Big Data Related Processes
• Big Data Characteristics
• Big Data Information Artifacts
• Big Data Information Bearers
• Relationships between Big Data Elements
• Mapping Instances to the Taxonomy
• Creating Situational Awareness
Relations Between Processes
• Process A <relation> Process B
– Complex Process <has part> Sub-Process
– Sub-Process <part of> Complex Process
– Process A <precedes> Process B
– Process A <follows> Process B
Examples:
Data Curation Process <has part> Data Selection Process
Data Curation Process <has part> Data Collection Process
Data Curation Process <has part> Data Archiving Process
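The relations on this slide can be read as subject-relation-object triples. The Python below is a minimal sketch of that reading, not something taken from the slides: the names `INVERSE`, `ASSERTED`, `with_inverses`, and `parts_of` are hypothetical, and it assumes that <has part>/<part of> and <precedes>/<follows> are inverse pairs, as their pairing on the slide suggests.

```python
# Illustrative sketch only; names and structure are assumptions, not the author's implementation.

# Assumed inverse pairs among the four relations listed on the slide.
INVERSE = {
    "has part": "part of",
    "part of": "has part",
    "precedes": "follows",
    "follows": "precedes",
}

# The three example statements from the slide, written as triples.
ASSERTED = [
    ("Data Curation Process", "has part", "Data Selection Process"),
    ("Data Curation Process", "has part", "Data Collection Process"),
    ("Data Curation Process", "has part", "Data Archiving Process"),
]


def with_inverses(triples):
    """Return the asserted triples plus the inverse statement of each one."""
    result = set(triples)
    for subject, relation, obj in triples:
        result.add((obj, INVERSE[relation], subject))
    return result


def parts_of(process, triples):
    """List the direct sub-processes of a process, sorted by name."""
    return sorted(o for s, r, o in triples if s == process and r == "has part")


ALL_TRIPLES = with_inverses(ASSERTED)
```

Stating only one direction of each pair and deriving the other keeps the asserted set small while still letting a query start from either the complex process or the sub-process.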
Information Artifact Lifecycle Processes
Common Labels                     Taxonomy Labels
• Collecting                      • Data Collection Process
• Curating                        • Data Curation Process
• Representing                    • Data Representation Process
• Storing                         • Data Storing Process
  – Cluster Storing                 – Cluster Storing Process
• Managing                        • Data Management Process
  – Processing                      – Processing
    • Distributed Processing          • Distributed Data Process
      – Map Reduce                      – Map Reduce Process
• Analyzing                       • Data Analytics Process
  – Data Mining                     – Data Mining Process
  – Causal Analysis                 – Causal Analysis Process
  – Probabilistic Analysis          – Probabilistic Analysis Process
  – Correlation Analysis            – Correlation Analysis Process
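The pairing of common labels with taxonomy labels on this slide can be sketched as a lookup table. The dictionary below restates the slide's two columns in order; the helper `to_taxonomy_label` and its case-insensitive matching are assumptions added for illustration.

```python
# Illustrative sketch: the dictionary restates the slide's two columns;
# the helper function is a hypothetical convenience, not part of the slides.
COMMON_TO_TAXONOMY = {
    "Collecting": "Data Collection Process",
    "Curating": "Data Curation Process",
    "Representing": "Data Representation Process",
    "Storing": "Data Storing Process",
    "Cluster Storing": "Cluster Storing Process",
    "Managing": "Data Management Process",
    "Processing": "Processing",
    "Distributed Processing": "Distributed Data Process",
    "Map Reduce": "Map Reduce Process",
    "Analyzing": "Data Analytics Process",
    "Data Mining": "Data Mining Process",
    "Causal Analysis": "Causal Analysis Process",
    "Probabilistic Analysis": "Probabilistic Analysis Process",
    "Correlation Analysis": "Correlation Analysis Process",
}


def to_taxonomy_label(common_label):
    """Look up a taxonomy label, ignoring case and surrounding whitespace.

    Returns None when the slide gives no mapping for the label.
    """
    key = common_label.strip().lower()
    for common, taxonomy in COMMON_TO_TAXONOMY.items():
        if common.lower() == key:
            return taxonomy
    return None
```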
Big Data Processes
Relating Processes to Products
Big Data Processes can be decomposed and related to other (sub)processes, as well as to their outputs (Information Artifacts).
Big Data Information Artifacts
Information Content Entities
Use Case
Data Characteristics
Information Bearers
Partial Taxonomy
Human Genome Data
Terms from Human Genome Data Use Case
Use Case Term:                                    Taxonomical Term:
Genomic Measurements                              Genomic Measurement Result (Measurement Result)
Reference Materials                               Reference Material Role
Reference Data                                    Reference Data Role
Reference Methods                                 Reference Method
Assess Performance                                Performance Assessment Process
Genome Sequencing                                 Genome Sequencing Process
Integrate Data                                    Data Integration Process
Sequencing Technologies                           Data Sequencing Technology (Tool)
Sequencing Methods                                Sequencing Method (Process)
Characterization                                  Characterization (Data Characterization, IA or ICE)
Whole Human Genomes                               Whole Human Genome Characterization (IA or ICE?)
Assess Performance                                Performance Assessment Process
Genome Sequencing Run                             Genome Sequencing Run
Computer System                                   Computer System
Storage                                           Data Storage Process
Networking                                        Computer Networking Process
Processing                                        Data Processing Process
Software                                          Software (IAO placement?)
Open Source Sequencing Bioinformatics Software    Bioinformatics Sequencing Software
Data Source                                       Data Source Role
Sequencer                                         Sequencer
Volume                                            Data Volume (Characteristic)
Variety                                           Data Variety (Characteristic)
Variability                                       Data Variability (Characteristic)
Veracity                                          Data Veracity (Characteristic)
Visualization                                     Data Visualization Process
Data Quality                                      Data Quality (Characteristic)
Data Types                                        Data Type
Data Analytics                                    Data Analytics Process
Information Artifacts:
Human Genome Data Measurement Result
Characterization (Data Characterization, IA or ICE)
Whole Human Genome Characterization (IA or ICE?)
Performance Assessment
Genome Sequence
Software (IAO placement?)
Data Visualization
Processes:
Human Genome Data Measurement Process
Reference Method
Performance Assessment Process
Genome Sequencing Process
Data Integration Process
Sequencing Method (Process)
Data Characterization Process
Performance Assessment Process
Genome Sequencing Run
Data Storage Process
Computer Networking Process
Data Processing Process
Data Visualization Process
Data Analytics Process
Roles and Characteristics:
Reference Material Role
Reference Data Role
Data Source Role
Data Volume (Characteristic)
Data Variety (Characteristic)
Data Variability (Characteristic)
Data Veracity (Characteristic)
Data Visualization Process
Data Quality (Characteristic)
Artifacts/Tools:
Data Sequencing Technology (Tool)
Computer System
Computer Network
Software (IAO placement?)
Bioinformatics Sequencing Software
Sequencer
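The grouping above leans on a naming convention: many terms carry their category as a suffix ("… Process", "… Role", "(Characteristic)", "(Tool)"). The sketch below assumes that convention can serve as a first-pass sorter. It is only a heuristic, and the function names are hypothetical. Terms that do not follow the convention, such as "Genome Sequencing Run", fall through as unclassified and still need a manual decision.

```python
from collections import defaultdict

# Illustrative heuristic only: it reads the category from the term's suffix,
# which is an assumption about the naming convention, not the author's method.


def guess_category(term):
    """Guess a taxonomy category from a term's suffix, or return None."""
    if term.endswith("(Characteristic)"):
        return "Characteristic"
    if term.endswith("(Tool)"):
        return "Artifact/Tool"
    if term.endswith("Role"):
        return "Role"
    if term.endswith(("Process", "(Process)")):
        return "Process"
    return None


def group_terms(terms):
    """Group terms by guessed category, keeping unmatched terms as 'Unclassified'."""
    groups = defaultdict(list)
    for term in terms:
        groups[guess_category(term) or "Unclassified"].append(term)
    return dict(groups)


# A handful of terms copied from the slide.
TERMS = [
    "Reference Material Role",
    "Data Volume (Characteristic)",
    "Data Integration Process",
    "Sequencing Method (Process)",
    "Data Sequencing Technology (Tool)",
    "Genome Sequencing Run",
    "Software (IAO placement?)",
]

GROUPS = group_terms(TERMS)
```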
Terms from Human Genome Data Use Case
Genomic Research Organizations
Instances
DNA Data Sets
Instances
DNA Organizational Roles
Instances
Agent Roles
DNA Visualization
Instances
Conclusion
• This method can be applied to any part of the Big Data Taxonomy
• Need subject-matter expert (SME) input for various areas/domains
• Need to add definitions in OWL
• Need to expand the set of standardized relations
• Need to link instances to the taxonomy (e.g. actual data sets, organizations, etc.)
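Linking instances to the taxonomy, the last point above, can be sketched as a check that every instance names a class the taxonomy defines. Everything in the example is assumed for illustration: the instance identifiers are invented placeholders, `TAXONOMY_CLASSES` is a small excerpt of terms from the use case, and `unlinked_instances` is a hypothetical helper.

```python
# Illustrative sketch only; instance identifiers are invented placeholders,
# not real data sets or devices.

# A small excerpt of taxonomy terms from the Human Genome Data use case.
TAXONOMY_CLASSES = {
    "Genome Sequence",
    "Genome Sequencing Process",
    "Sequencer",
    "Data Source Role",
}

# Hypothetical instances, each mapped to the class it is claimed to instantiate.
INSTANCES = {
    "example-genome-sequence-001": "Genome Sequence",
    "example-sequencer-A": "Sequencer",
    "example-run-42": "Genome Sequencing Run",  # class missing from the excerpt above
}


def unlinked_instances(instances, classes):
    """Return the names of instances whose class is not in the taxonomy, sorted."""
    return sorted(name for name, cls in instances.items() if cls not in classes)


UNLINKED = unlinked_instances(INSTANCES, TAXONOMY_CLASSES)
```

An instance reported as unlinked signals either a missing taxonomy term or a mislabeled instance, which is the kind of gap the SME review would resolve.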
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 

Big Data Taxonomy 8/26/2013

  • 1. Towards a Big Data Taxonomy
      Bill Mandrick, PhD
      Data Tactics
      Version 26_August_2013
  • 2. Scientific Taxonomies Represent
      • Types of Processes
      • Types of Objects
          – Physical Objects
          – Information Artifacts
      • Types of Characteristics
          – Qualities
          – Roles
      • Relationships
          – Between Processes
          – Between Objects
          – Between Characteristics
  • 3. Big Data Taxonomy
      • Big Data Related Processes
      • Big Data Characteristics
      • Big Data Information Artifacts
      • Big Data Information Bearers
      • Relationships between Big Data Elements
      • Mapping Instances to the Taxonomy
      • Creating Situational Awareness
  • 4. Relations Between Processes
      • Process A <relation> Process B
          – Complex Process <has part> Sub-Process
          – Sub-Process <part of> Complex Process
          – Process A <precedes> Process B
          – Process A <follows> Process B
      Examples:
          Data Curation Process <has part> Data Selection Process
          Data Curation Process <has part> Data Collection Process
          Data Curation Process <has part> Data Archiving Process
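
The relations on this slide can be sketched as subject/relation/object triples. The following is a minimal illustration, not the author's implementation: the relation names and example triples come from the slide, while the `INVERSE` table, `with_inverses`, and `sub_processes` are hypothetical helpers. It assumes that `has part`/`part of` and `precedes`/`follows` are inverse pairs, as the paired bullets suggest.

```python
# Hypothetical sketch: relation names mirror the slide; the helpers are assumptions.
INVERSE = {
    "has part": "part of",
    "part of": "has part",
    "precedes": "follows",
    "follows": "precedes",
}

# Triples copied from the slide's examples.
triples = {
    ("Data Curation Process", "has part", "Data Selection Process"),
    ("Data Curation Process", "has part", "Data Collection Process"),
    ("Data Curation Process", "has part", "Data Archiving Process"),
}


def with_inverses(statements):
    """Return the statements plus the inverse statement implied by each one."""
    closed = set(statements)
    for subject, relation, obj in statements:
        closed.add((obj, INVERSE[relation], subject))
    return closed


def sub_processes(process, statements):
    """List the direct sub-processes of a process, sorted by name."""
    return sorted(o for s, r, o in statements if s == process and r == "has part")
```

For example, `sub_processes("Data Curation Process", triples)` lists the three sub-processes named on the slide, and `with_inverses(triples)` adds the matching `part of` statements.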
  • 5. Information Artifact Lifecycle Processes
      Common Labels
      • Collecting
      • Curating
      • Representing
      • Storing
          – Cluster Storing
      • Managing
          – Processing
              • Distributed Processing
                  – Map Reduce
      • Analyzing
          – Data Mining
          – Causal Analysis
          – Probabilistic Analysis
          – Correlation Analysis
      Taxonomy Labels
      • Data Collection Process
      • Data Curation Process
      • Data Representation Process
      • Data Storing Process
          – Cluster Storing Process
      • Data Management Process
          – Processing
              • Distributed Data Process
                  – Map Reduce Process
      • Data Analytics Process
          – Data Mining Process
          – Causal Analysis Process
          – Probabilistic Analysis Process
          – Correlation Analysis Process
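
The two label columns on this slide line up item for item, so the correspondence can be sketched as a lookup table. This is a hedged illustration: the label pairs are taken from the slide by position (an assumption about how the columns align), and `to_taxonomy_label` is a hypothetical helper, not part of the taxonomy itself.

```python
# Label pairs read off the slide by position; the pairing is an assumption.
COMMON_TO_TAXONOMY = {
    "Collecting": "Data Collection Process",
    "Curating": "Data Curation Process",
    "Representing": "Data Representation Process",
    "Storing": "Data Storing Process",
    "Cluster Storing": "Cluster Storing Process",
    "Managing": "Data Management Process",
    "Processing": "Processing",  # unchanged on the slide
    "Distributed Processing": "Distributed Data Process",
    "Map Reduce": "Map Reduce Process",
    "Analyzing": "Data Analytics Process",
    "Data Mining": "Data Mining Process",
    "Causal Analysis": "Causal Analysis Process",
    "Probabilistic Analysis": "Probabilistic Analysis Process",
    "Correlation Analysis": "Correlation Analysis Process",
}


def to_taxonomy_label(common_label):
    """Return the taxonomy label for a common label.

    Matching ignores case and surrounding whitespace (a hypothetical convenience).
    """
    lookup = {key.lower(): value for key, value in COMMON_TO_TAXONOMY.items()}
    key = common_label.strip().lower()
    if key not in lookup:
        raise KeyError(f"no taxonomy label recorded for {common_label!r}")
    return lookup[key]
```

Raising on an unknown label, rather than guessing one, keeps unmapped terms visible so a subject-matter expert can place them.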
  • 6. Big Data Processes
      Big Data Processes can be decomposed and related to other (sub)processes …as well as to their outputs (Information Artifacts).
  • 8. Big Data Information Artifacts
  • 16. Terms from Human Genome Data Use Case
      Use Case Term → Taxonomical Term
          – Genomic Measurements → Genomic Measurement Result (Measurement Result)
          – Reference Materials → Reference Material Role
          – Reference Data → Reference Data Role
          – Reference Methods → Reference Method
          – Assess Performance → Performance Assessment Process
          – Genome Sequencing → Genome Sequencing Process
          – Integrate Data → Data Integration Process
          – Sequencing Technologies → Data Sequencing Technology (Tool)
          – Sequencing Methods → Sequencing Method (Process)
          – Characterization → Characterization (Data Characterization, IA or ICE)
          – Whole Human Genomes → Whole Human Genome Characterization (IA or ICE?)
          – Assess Performance → Performance Assessment Process
          – Genome Sequencing Run → Genome Sequencing Run
          – Computer System → Computer System
          – Storage → Data Storage Process
          – Networking → Computer Networking Process
          – Processing → Data Processing Process
          – Software → Software (IAO placement?)
          – Open Source Sequencing Bioinformatics Software → Bioinformatics Sequencing Software
          – Data Source → Data Source Role
          – Sequencer → Sequencer
          – Volume → Data Volume (Characteristic)
          – Variety → Data Variety (Characteristic)
          – Variability → Data Variability (Characteristic)
          – Veracity → Data Veracity (Characteristic)
          – Visualization → Data Visualization Process
          – Data Quality → Data Quality (Characteristic)
          – Data Types → Data Type
          – Data Analytics → Data Analytics Process
  • 17. Terms from Human Genome Data Use Case
      Information Artifacts:
          – Human Genome Data Measurement Result
          – Characterization (Data Characterization, IA or ICE)
          – Whole Human Genome Characterization (IA or ICE?)
          – Performance Assessment
          – Genome Sequence
          – Software (IAO placement?)
          – Data Visualization
      Processes:
          – Human Genome Data Measurement Process
          – Reference Method
          – Performance Assessment Process
          – Genome Sequencing Process
          – Data Integration Process
          – Sequencing Method (Process)
          – Data Characterization Process
          – Performance Assessment Process
          – Genome Sequencing Run
          – Data Storage Process
          – Computer Networking Process
          – Data Processing Process
          – Data Visualization Process
          – Data Analytics Process
      Roles and Characteristics:
          – Reference Material Role
          – Reference Data Role
          – Data Source Role
          – Data Volume (Characteristic)
          – Data Variety (Characteristic)
          – Data Variability (Characteristic)
          – Data Veracity (Characteristic)
          – Data Visualization Process
          – Data Quality (Characteristic)
      Artifacts/Tools:
          – Data Sequencing Technology (Tool)
          – Computer System
          – Computer Network
          – Software (IAO placement?)
          – Bioinformatics Sequencing Software
          – Sequencer
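
Many of the taxonomical terms on this slide carry their category in the name (a trailing "Process", "Role", or "(Characteristic)"). The sketch below shows how a first-pass grouping could exploit that naming convention. It is a hypothetical heuristic, not the method used to build the slide: `classify` and `group_terms` are illustrative names, and terms without a recognised suffix (for example "Reference Method" or "Sequencer") are deliberately left unclassified for manual placement.

```python
from collections import defaultdict


def classify(term):
    """Guess a category from the term's suffix (an assumed naming convention)."""
    if term.endswith("(Characteristic)"):
        return "Characteristic"
    if term.endswith("Role"):
        return "Role"
    if term.endswith("Process") or term.endswith("(Process)"):
        return "Process"
    return "Unclassified"  # needs a human decision


def group_terms(terms):
    """Group terms by guessed category, preserving input order within each group."""
    groups = defaultdict(list)
    for term in terms:
        groups[classify(term)].append(term)
    return dict(groups)


# A few terms copied from the slide.
sample = [
    "Genome Sequencing Process",
    "Sequencing Method (Process)",
    "Reference Data Role",
    "Data Volume (Characteristic)",
    "Sequencer",
]
grouped = group_terms(sample)
```

A suffix rule like this only proposes a category. Entries the slide itself marks as open questions, such as "IA or ICE?", still need subject-matter review.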
  • 23. Conclusion
      • This method can be applied to any part of the Big Data Taxonomy
      • Need SME input for various areas/domains
      • Need to add definitions in OWL
      • Need to expand the set of standardized relations
      • Link instances to the taxonomy (e.g. actual data sets, organizations, etc.)