Kaloyan Haralampiev
Angel Marchev 2.0
Relationship between research tasks and data structure
Research tasks | Data structure
Descriptive statistics | Variables only
Segmentation | Variables only
Dimension reduction | Variables only
Measures of association | Dependent variable(s) and independent variable(s)
Decision making under risk | Dependent variable(s) and independent variable(s)
Time series analysis | Time as independent variable
Examples for different data structures
• Variables only
• Dependent variable(s) and independent variable(s)
Data: European social survey (ESS), Round 6, 2016, Bulgaria (http://www.europeansocialsurvey.org/)
Software: Orange (https://orange.biolab.si/)
Descriptive statistics
Scale | Statistical method
Nominal | Percent
Ordinal, rank or score | + Cumulative frequencies; + Cumulative percent
Interval | + Central tendency (mean, median, mode); + Dispersion (variance, standard deviation, range); + Skewness; + Kurtosis; + Quartiles, deciles, percentiles
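The table above can be illustrated with a minimal sketch in plain Python. This is an assumption for illustration only: the slides use Orange, and the toy values and the function names (percent_table, cumulative_percent, interval_summary) are hypothetical, not taken from the ESS file. Skewness and kurtosis are left out because the standard library has no ready-made function for them.

```python
from collections import Counter
import statistics

def percent_table(values):
    """Percent of each category (suitable for a nominal scale)."""
    counts = Counter(values)
    n = len(values)
    return {category: 100 * count / n for category, count in counts.items()}

def cumulative_percent(values, order):
    """Cumulative percent along a given category order (ordinal scale)."""
    shares = percent_table(values)
    running, result = 0.0, {}
    for category in order:
        running += shares.get(category, 0.0)
        result[category] = running
    return result

def interval_summary(values):
    """Central tendency, dispersion and quartiles (interval scale)."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "std_dev": statistics.stdev(values),
        "range": max(values) - min(values),
        "quartiles": statistics.quantiles(values, n=4),
    }

# Hypothetical toy data, not taken from the ESS file.
answers = ["low", "high", "medium", "low", "medium", "medium", "high", "low"]
ages = [23, 35, 35, 41, 52, 58, 64, 70]

print(percent_table(answers))
print(cumulative_percent(answers, ["low", "medium", "high"]))
print(interval_summary(ages))
```

Each function adds to the previous scale level, mirroring the "+" entries in the table.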
Example for nominal and ordinal scales
Data: European social survey (ESS), Round 6, 2016, Bulgaria (http://www.europeansocialsurvey.org/)
Software: Orange (https://orange.biolab.si/)
Example for interval scales
Data: European social survey (ESS), Round 6, 2016, Bulgaria (http://www.europeansocialsurvey.org/)
Software: Orange (https://orange.biolab.si/)
Segmentation
Scale | Statistical method
Nominal, binary, ordinal, rank, score, interval | TwoStep cluster
Binary, score, interval | + Hierarchical cluster; + K-means cluster; + Multidimensional scaling
Clustering
Example for hierarchical cluster
Data: PISA (http://www.oecd.org/pisa/)
Software: Orange (https://orange.biolab.si/)
Example for K-means cluster
Data: PISA (http://www.oecd.org/pisa/)
Software: Orange (https://orange.biolab.si/)
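A minimal sketch of the K-means idea, assuming plain Python and Lloyd's algorithm with hand-picked starting centroids. It is not Orange's implementation, and the function name k_means and the toy points are hypothetical, not PISA data.

```python
import math

def k_means(points, centroids, iterations=20):
    """Lloyd's algorithm: assign points to the nearest centroid, then move centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids

# Hypothetical two-dimensional scores forming two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = k_means(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

The result depends on the starting centroids, which is why tools usually offer several initialisation options or repeated runs.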
Example for multidimensional scaling
Data: PISA (http://www.oecd.org/pisa/)
Software: Orange (https://orange.biolab.si/)
Dimension reduction
Scale | Statistical method
Binary | Factor analysis
Likert scale | Factor analysis + Cronbach’s Alpha
Score | Factor analysis
Interval | Factor analysis
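Cronbach's Alpha for a set of Likert items can be sketched as follows, assuming the usual formula alpha = k / (k - 1) * (1 - sum of item variances / variance of the total score). The function name cronbach_alpha and the toy answers are hypothetical, not ESS data.

```python
import statistics

def cronbach_alpha(items):
    """items: one list of scores per item, with the same respondents in each list."""
    k = len(items)
    item_variances = sum(statistics.variance(column) for column in items)
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - item_variances / statistics.variance(totals))

# Hypothetical answers of five respondents to three 5-point Likert items.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

Values close to 1 suggest that the items measure the same underlying construct.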
Example for factor analysis
Data: European social survey (ESS), Round 6, 2016, Bulgaria (http://www.europeansocialsurvey.org/)
Software: Orange (https://orange.biolab.si/)
Measures of association and/or decision making under risk
Independent variable(s) | Dependent variable(s): nominal or ordinal | Dependent variable(s): interval
Nominal or ordinal | Contingency tables; Logistic regression; Classification trees | ANOVA; Classification trees
Interval | Logistic regression; Classification trees | Regression; Correlation; Classification trees
Mix of nominal and/or ordinal and/or interval | Logistic regression; Classification trees | Classification trees
Example for contingency tables
Data: European social survey (ESS), Round 6, 2016, Bulgaria (http://www.europeansocialsurvey.org/)
Software: Orange (https://orange.biolab.si/)
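A minimal sketch of what a contingency-table analysis computes, assuming Pearson's chi-square statistic and Cramér's V as the measure of association. The slide does not say which measure was used, so this choice, the function names and the counts are all hypothetical.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof

def cramers_v(table):
    """Cramér's V: chi-square rescaled to the range 0..1."""
    statistic, _ = chi_square(table)
    n = sum(sum(row) for row in table)
    smaller_dim = min(len(table), len(table[0])) - 1
    return (statistic / (n * smaller_dim)) ** 0.5

# Hypothetical 2 x 2 table of counts (rows: group, columns: answer).
table = [[30, 10], [20, 40]]
statistic, dof = chi_square(table)
print(statistic, dof, cramers_v(table))
```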
Example for ANOVA
Data: European social survey (ESS), Round 6, 2016, Bulgaria (http://www.europeansocialsurvey.org/)
Software: Orange (https://orange.biolab.si/)
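The core of one-way ANOVA is the F statistic, sketched here in plain Python under the assumption of one nominal grouping variable and one interval target. The function name one_way_anova_f and the toy groups are hypothetical, not ESS data.

```python
import statistics

def one_way_anova_f(groups):
    """F statistic: between-group mean square divided by within-group mean square."""
    all_values = [value for group in groups for value in group]
    grand_mean = statistics.mean(all_values)
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2 for group in groups
    )
    ss_within = sum(
        (value - statistics.mean(group)) ** 2 for group in groups for value in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical interval-scale answers in three groups.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_value = one_way_anova_f(groups)
print(f_value)
```

A large F value means the group means differ more than the spread inside the groups would suggest.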
Advantages
• Quick method (relatively short computing time)
• Coefficients easy to interpret
Disadvantages
• Uses only one possible function
• Does not work with a quantitative target
• Ignores cases with missing values
Applications
• Analysis of associations
• Classification
• Pattern recognition
• etc.
Example for regression and correlation
Data: European social survey (ESS), Round 6, 2016, Bulgaria (http://www.europeansocialsurvey.org/)
Software: Orange (https://orange.biolab.si/)
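A minimal sketch of simple linear regression and Pearson correlation, assuming one interval predictor and ordinary least squares. The function names and the toy numbers are hypothetical, not ESS data.

```python
import statistics

def pearson_r(x, y):
    """Pearson correlation coefficient between two interval variables."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def least_squares_line(x, y):
    """Slope and intercept of the ordinary least squares line y = intercept + slope * x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical predictor and target values.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = least_squares_line(x, y)
print(slope, intercept, pearson_r(x, y))
```

The slope is the easily interpreted coefficient: the expected change in the target for a one-unit change in the predictor.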
Advantages
• Quick method (relatively short computing time)
• Coefficients easy to interpret
• Works with different functions
Disadvantages
• Does not work with a qualitative target
• Ignores cases with missing values
Applications
• Analysis of associations
• Pattern recognition
• etc.
Example for logistic regression
Data: European social survey (ESS), Round 6, 2016, Bulgaria (http://www.europeansocialsurvey.org/)
Software: Orange (https://orange.biolab.si/)
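A minimal sketch of logistic regression with one predictor, assuming a binary target and fitting by plain gradient ascent on the log-likelihood. Real software, Orange included, typically uses faster optimisers and regularisation. The function names, the learning rate and the toy data are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic function, the single functional form the model uses."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, y, learning_rate=0.1, epochs=5000):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by gradient ascent."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(epochs):
        # Residuals between observed outcomes and predicted probabilities.
        errors = [yi - sigmoid(b0 + b1 * xi) for xi, yi in zip(x, y)]
        b0 += learning_rate * sum(errors) / n
        b1 += learning_rate * sum(e * xi for e, xi in zip(errors, x)) / n
    return b0, b1

# Hypothetical predictor and binary target (the classes overlap, so a finite fit exists).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1 = fit_logistic(x, y)
print(b0, b1, math.exp(b1))  # exp(b1) is the odds ratio per unit of x
```

The coefficient is read through its odds ratio: exp(b1) above 1 means the odds of the positive outcome rise as x grows.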
Advantages
• Quick method (relatively short computing time)
• Coefficients easy to interpret
Disadvantages
• Uses only one possible function
• Does not work with a quantitative target
• Ignores cases with missing values
Applications
• Analysis of associations
• Classification
• Pattern recognition
• etc.
Example for classification trees
Data: European social survey (ESS), Round 6, 2016, Bulgaria (http://www.europeansocialsurvey.org/)
Software: Orange (https://orange.biolab.si/)
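A classification tree is grown by repeatedly choosing the split that makes the resulting groups purest. A minimal sketch of one such step, assuming Gini impurity as the criterion and a single numeric predictor, follows. Orange's tree widget may use a different criterion, and the function names and toy data are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0 for a pure group, larger for mixed groups."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(values, labels):
    """Try every midpoint between sorted distinct values; return (threshold, weighted Gini)."""
    best = (None, gini(labels))  # no split: impurity of the whole node
    distinct = sorted(set(values))
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [label for value, label in zip(values, labels) if value <= threshold]
        right = [label for value, label in zip(values, labels) if value > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best

# Hypothetical predictor (age) and qualitative target.
ages = [22, 25, 31, 38, 45, 52, 60, 67]
target = ["no", "no", "no", "no", "yes", "yes", "no", "yes"]
threshold, impurity = best_split(ages, target)
print(threshold, impurity)
```

Applying the same step again inside each resulting group produces the tree, which is why unpruned trees can grow very large.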
Advantages
• Non-parametric method that fits the data as they are
• Describes in detail the positive and the negative targets
• Works with qualitative and quantitative targets
• Works with missing values
Disadvantages
• Slow method (relatively long computing time)
• Generates very large trees, so pruning is needed
Applications
• Analysis of associations
• Classification
• Pattern recognition
• etc.
Another example for classification trees
Data: European social survey (ESS), Round 6, 2016, Bulgaria (http://www.europeansocialsurvey.org/)
Software: Orange (https://orange.biolab.si/)
Advantages
• Non-parametric method that fits the data as they are
• Describes in detail the positive and the negative targets
• Works with qualitative and quantitative targets
• Works with missing values
Disadvantages
• Slow method (relatively long computing time)
Applications
• Analysis of associations
• Classification
• Pattern recognition
• etc.

 
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
 
Richard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlandsRichard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlands
 
Chapter 12 - climate change and the energy crisis
Chapter 12 - climate change and the energy crisisChapter 12 - climate change and the energy crisis
Chapter 12 - climate change and the energy crisis
 

Relationships between research tasks and data structure (basic methods and application with free software)

  • 2. Relationship “Research tasks Data structure”. Descriptive statistics, segmentation, dimension reduction: variables only. Measures of association, decision making under risk: dependent variable(s) and independent variable(s). Time series analysis: time as independent variable.
  • 3. Examples for different data structures: variables only; dependent variable(s) and independent variable(s). Data: European social survey (ESS), Round 6, 2016, Bulgaria (http://www.europeansocialsurvey.org/). Software: Orange (https://orange.biolab.si/).
  • 4. Descriptive statistics, statistical method by scale. Nominal: percent. Ordinal, rank or score: + cumulative frequencies, + cumulative percent. Interval: + central tendency (mean, median, mode), + dispersion (variance, standard deviation, range), + skewness, + kurtosis, + quartiles, deciles, percentiles.
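The measures listed for each scale can be computed with a few lines of code. The following is a minimal sketch using only the Python standard library, not the Orange workflow used in the slides; the function names are illustrative, and skewness and kurtosis are assumed to be the moment-based population versions (other definitions exist).

```python
import statistics
from collections import Counter


def describe_nominal(values):
    """Percent of cases in each category (nominal scale)."""
    counts = Counter(values)
    n = len(values)
    return {category: 100 * count / n for category, count in counts.items()}


def describe_ordinal(values, order):
    """Cumulative frequency and cumulative percent, following the given category order."""
    counts = Counter(values)
    n = len(values)
    running = 0
    rows = []
    for category in order:
        running += counts[category]
        rows.append((category, running, 100 * running / n))
    return rows


def describe_interval(values):
    """Central tendency, dispersion, shape and quartiles (interval scale)."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    # Moment-based skewness and excess kurtosis (population formulas, an assumption).
    skewness = sum((x - mean) ** 3 for x in values) / (n * sd ** 3)
    excess_kurtosis = sum((x - mean) ** 4 for x in values) / (n * sd ** 4) - 3
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "variance": statistics.pvariance(values),
        "std_dev": sd,
        "range": max(values) - min(values),
        "skewness": skewness,
        "excess_kurtosis": excess_kurtosis,
        "quartiles": statistics.quantiles(values, n=4),
    }
```

Deciles and percentiles follow from the same call with `n=10` or `n=100` in `statistics.quantiles`.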
  • 5. Example for nominal and ordinal scales. Data: European social survey (ESS), Round 6, 2016, Bulgaria (http://www.europeansocialsurvey.org/). Software: Orange (https://orange.biolab.si/).
  • 6. Example for interval scales. Data: European social survey (ESS), Round 6, 2016, Bulgaria (http://www.europeansocialsurvey.org/). Software: Orange (https://orange.biolab.si/).
  • 7. Segmentation, statistical method by scale. Nominal, binary, ordinal, rank, score, interval: TwoStep cluster. Binary, score, interval: + hierarchical cluster, + K-means cluster, + multidimensional scaling.
  • 9. Example for hierarchical cluster. Data: PISA (http://www.oecd.org/pisa/). Software: Orange (https://orange.biolab.si/).
  • 10. Example for K-means cluster. Data: PISA (http://www.oecd.org/pisa/). Software: Orange (https://orange.biolab.si/).
  • 11. Example for multidimensional scaling. Data: PISA (http://www.oecd.org/pisa/). Software: Orange (https://orange.biolab.si/).
  • 12. Dimension reduction. Scale: binary; Likert scale (+ Cronbach’s Alpha); score; interval. Statistical method: factor analysis.
  • 13. Example for factor analysis. Data: European social survey (ESS), Round 6, 2016, Bulgaria (http://www.europeansocialsurvey.org/). Software: Orange (https://orange.biolab.si/).
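  • 14. Measures of association and/or decision making under risk, by scale of the independent and dependent variable(s). Independent nominal or ordinal, dependent nominal or ordinal: contingency tables, logistic regression, classification trees. Independent nominal or ordinal, dependent interval: ANOVA, classification trees. Independent interval, dependent nominal or ordinal: logistic regression, classification trees. Independent interval, dependent interval: regression, correlation, classification trees. Independent mix of nominal and/or ordinal and/or interval, dependent nominal or ordinal: logistic regression, classification trees. Independent mix, dependent interval: classification trees.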
  • 15. Example for contingency tables. Data: European social survey (ESS), Round 6, 2016, Bulgaria (http://www.europeansocialsurvey.org/). Software: Orange (https://orange.biolab.si/).
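One common way to measure association in a contingency table is Pearson's chi-square statistic together with Cramér's V. The slides do not state which measure was used, so the sketch below is an assumption for illustration; `chi_square` and `cramers_v` are hypothetical helper names, and the table is a list of rows of observed counts.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def cramers_v(table):
    """Cramér's V: chi-square rescaled to the range 0 (no association) to 1."""
    stat, _ = chi_square(table)
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return (stat / (n * k)) ** 0.5
```

For example, `cramers_v([[10, 20], [20, 10]])` summarizes a 2×2 table in which the two rows have opposite distributions across the columns.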
  • 16. Example for ANOVA. Data: European social survey (ESS), Round 6, 2016, Bulgaria (http://www.europeansocialsurvey.org/). Software: Orange (https://orange.biolab.si/).
  • 17. Advantages: quick method (relatively short computing time); coefficients easy to interpret. Disadvantages: uses only one possible function; does not work with a quantitative target; ignores cases with missing values. Applications: analysis of associations, classification, pattern recognition, etc.
  • 18. Example for regression and correlation. Data: European social survey (ESS), Round 6, 2016, Bulgaria (http://www.europeansocialsurvey.org/). Software: Orange (https://orange.biolab.si/).
  • 19. Advantages: quick method (relatively short computing time); coefficients easy to interpret; works with different functions. Disadvantages: does not work with a qualitative target; ignores cases with missing values. Applications: analysis of associations, pattern recognition, etc.
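For two interval variables, correlation and simple linear regression reduce to a few sums. The sketch below assumes Pearson's correlation coefficient and an ordinary least squares straight line, which is only one of the "different functions" the slide refers to; `pearson_r` and `fit_line` are illustrative names.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (
        sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        / sum((a - mean_x) ** 2 for a in x)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

The slope is the easily interpreted coefficient mentioned among the advantages: the expected change in the target for a one-unit change in the predictor.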
  • 20. Example for logistic regression. Data: European social survey (ESS), Round 6, 2016, Bulgaria (http://www.europeansocialsurvey.org/). Software: Orange (https://orange.biolab.si/).
  • 21. Advantages: quick method (relatively short computing time); coefficients easy to interpret. Disadvantages: uses only one possible function; does not work with a quantitative target; ignores cases with missing values. Applications: analysis of associations, classification, pattern recognition, etc.
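The "one possible function" in logistic regression is the logistic (sigmoid) curve, which maps a linear score to a probability for a binary target. The following is a minimal sketch for a single predictor, fitted by plain gradient descent; the learning rate, number of epochs and function names are assumptions for illustration, and real software typically uses faster optimizers.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, learning_rate=0.1, epochs=2000):
    """Fit P(y = 1 | x) = sigmoid(w * x + b) by gradient descent on the log-loss."""
    w, b = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        # Prediction errors for the current coefficients.
        errors = [sigmoid(w * xi + b) - yi for xi, yi in zip(x, y)]
        grad_w = sum(e * xi for e, xi in zip(errors, x)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def predict_probability(w, b, x_value):
    """Predicted probability that the target equals 1 for a given predictor value."""
    return sigmoid(w * x_value + b)
```

A positive `w` means the odds of the target being 1 rise with the predictor, and `math.exp(w)` is the corresponding odds ratio per unit of the predictor.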
  • 22. Example for classification trees. Data: European social survey (ESS), Round 6, 2016, Bulgaria (http://www.europeansocialsurvey.org/). Software: Orange (https://orange.biolab.si/).
  • 23. Advantages: non-parametric method which fits the data as they are; describes in detail the positive and the negative targets; works with qualitative and quantitative targets; works with missing values. Disadvantages: slow method (relatively long computing time); generates very large trees, so pruning is needed. Applications: analysis of associations, classification, pattern recognition, etc.
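A classification tree is grown by repeatedly choosing the split that makes the resulting groups as pure as possible. The sketch below shows one such step, assuming Gini impurity as the criterion and a single numeric predictor; the slides do not say which criterion was configured in Orange, and `gini` and `best_split` are illustrative names.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a group of class labels (0 means the group is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Threshold on one numeric predictor that minimizes weighted Gini impurity.

    Returns (threshold, impurity); threshold is None if no split improves purity.
    """
    n = len(values)
    best = (None, gini(labels))
    candidates = sorted(set(values))
    # Try the midpoint between each pair of neighbouring distinct values.
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        left = [label for v, label in zip(values, labels) if v <= threshold]
        right = [label for v, label in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best
```

Applying `best_split` recursively to each resulting group until the groups are pure is what produces the very large trees mentioned among the disadvantages, which is why pruning or a depth limit is needed.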
  • 24. Another example for classification trees. Data: European social survey (ESS), Round 6, 2016, Bulgaria (http://www.europeansocialsurvey.org/). Software: Orange (https://orange.biolab.si/).
  • 25. Advantages: non-parametric method which fits the data as they are; describes in detail the positive and the negative targets; works with qualitative and quantitative targets; works with missing values. Disadvantages: slow method (relatively long computing time). Applications: analysis of associations, classification, pattern recognition, etc.