Methodological Principles in
Dealing with Big Data
Reijo Sund
University of Helsinki, Centre for Research Methods, Faculty of Social Sciences
Big Data seminar
Statistics Finland, Helsinki 2.6.2014
1 June 2014
Big Data
Data have been produced for hundreds of years
The reasons for such production were originally
administrative in nature
There was a need for systematically collected numerical facts on a
particular subject
Advances in information technology have made it possible to
more effectively collect and store larger and larger data sets
From data to information
As long as there have been data, there has been a challenge to
transform them into useful information
Too much data in an unusable form has always been a common complaint
Well known hierarchy:
Data - Information - Knowledge -
Wisdom - Intelligence
Secondary data
There are more and more "big data", but the emphasis has
been on technical aspects and not on the information itself
Data without explanations are useless
Big Data are often secondary data
Not tailored to the specific research question at hand
More (detailed) data would not solve the basic problems
More background information is required for utilization
Fundamental problem
The belief that big data consist of autonomous, atom-like
building blocks is fundamentally erroneous
Raw register data as such are of little value
No simple magic tricks to overcome problems arising from
the fundamental limitations of empirical research
More general aspects of scientific research are needed in order to
understand the related methodological challenges
Knowledge discovery process
Process consists of several main phases:
Understanding the phenomenon, Understanding the problem,
Understanding data, Data preprocessing, Modeling,
Evaluation, Reporting
The main difference from the "traditional" research process is
the additional interpretation-operationalization phase
[Figure: research process diagram with the elements Context, Debate, Idea, Theory, Problem, Data, Analysis, Question, Answer, Perspective]
Prerequisites
Effective use of big data presumes skills in various areas:
Measurement
Data modeling (information sciences)
Statistical computing (statistics)
Theory of the subject matter
Principles of measurement
Reality can be confronted by recording observations that
reflect the phenomenon of interest
Measurement aims to create data as symbolic
representations of the observations
Operationalization determines how the phenomenon P, which becomes
visible via observations O, is mapped to data D
Successful if it becomes possible to make valid interpretations I of
symbolic data D in regard to the phenomenon P
Infological equation
Information is something that has to be produced from the
data and the pre-knowledge
Infological equation:
I = i(D,S,t)
Information I is produced from the data D and the pre-knowledge S
(at time t using the interpretation process i)
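A minimal Python sketch of the infological equation, for illustration only. The function name `interpret`, the code-book form of the pre-knowledge S, and the "known since" years are assumptions made for this example and are not part of the talk. It shows that the same data D yield different information I depending on S and t.

```python
def interpret(data, pre_knowledge, t):
    """Sketch of i(D, S, t): map codes in D to the meanings S holds at time t."""
    information = []
    for code in data:
        entry = pre_knowledge.get(code)
        if entry is None:
            continue  # code not covered by the pre-knowledge: no information
        meaning, known_since = entry
        if known_since <= t:  # meaning is only usable once it is known
            information.append(meaning)
    return information


# Hypothetical data D and pre-knowledge S (code -> (meaning, year known))
D = ["A1", "B2", "Z9"]
S = {"A1": ("hospital admission", 1990), "B2": ("surgery", 2000)}

info_1995 = interpret(D, S, t=1995)
info_2005 = interpret(D, S, t=2005)
```

The code "Z9" never produces information here, because nothing in S explains it.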
Data modeling
Data modeling can be used to construct (computer-based) symbol structures which
capture the meaning of data and organize it in ways that make it understandable
Only what is (or can be) represented is considered to exist
[Figure: data model diagram. Phenomenon ⇓ Concept ⇓ Object; labels: Host, Attributes, Time, Place, Realized observation; Data component, Knowledge component, Logical component; Taxonomy, Partonomy, Theoretical measurement properties]
Data preprocessing
Data cleaning and reduction
Correction of “global” deficiencies in the data
Dropping of “uninteresting” data
Data abstraction
“Intelligent enrichment” of data using background knowledge
This kind of preprocessing resembles qualitative analysis much more than
quantitative analysis
Each rule reflects the instability of the concept and is a step further from
the "objectivity" of the study
Preprocessing in practice
Need for conceptual representation of each object
Two main classes for concept-data relation:
Factual = minimal background knowledge
Abstracted = cognitive fit acceptable
A sophisticated (and subjective) preprocessing aiming to
scale matters down to a size more suitable for specific
analyses is the most important and time-consuming part of
the (big) data analysis
Greater statistics
Statistics offers not only a set of tools for problem-solving,
but also a formal way of thinking about the modeling of the
actual problem
Rather than trying to squeeze the data into a predefined
model or saying too much on what can and cannot be done,
data analysis should work to achieve an appropriate
compromise between the practical problems and the data
Challenges
How to analyze massive data effectively when manual
management is unfeasible?
How to avoid ‘snooping/dredging/fishing/shopping’ without
assuming that data are automatically in concordance with the
theory?
How to deal with data that include total populations without
traditional meaning for sampling error and statistical
significance?
Thank you!
For more information:
http://www.helsinki.fi/~sund
How to calculate the annual number of
hip fractures in Finland?
Background knowledge: All hip fractures in Hospital Discharge
Register
Data challenge: Difficult to separate new admissions from the care of
old fractures
Change of theory: Consider only first hip fractures instead of all hip
fractures
Solution in terms of data: Easy to determine the number of first
hip fractures from the register if enough old data are available and
deterministic record linkage can be used
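The "first hip fracture" solution can be sketched in Python as follows. This is an illustration under stated assumptions, not the method used with the actual register: the record layout, the person identifiers, and the use of ICD-10 codes S72.0 to S72.2 as the hip fracture definition are all hypothetical. Deterministic linkage is represented simply by grouping records on an exact person identifier, and the older records serve only to rule out earlier fractures.

```python
from collections import Counter
from datetime import date

# Hypothetical discharge records: (person_id, admission_date, diagnosis_code)
records = [
    ("p1", date(1998, 3, 1), "S72.0"),
    ("p1", date(1998, 4, 15), "S72.0"),  # care of the old fracture
    ("p2", date(1999, 1, 10), "S72.1"),
    ("p1", date(2000, 6, 2), "S72.0"),   # later fracture, not a first one
    ("p3", date(1999, 12, 30), "I21"),   # not a hip fracture
    ("p4", date(1996, 5, 5), "S72.0"),   # old data before the study period
    ("p4", date(1999, 2, 1), "S72.0"),
]

# Assumed hip fracture definition (illustrative code list)
HIP_FRACTURE_PREFIXES = ("S72.0", "S72.1", "S72.2")


def first_hip_fractures_per_year(records, first_study_year):
    """Count each person's first hip fracture, by calendar year."""
    first_by_person = {}
    for person_id, admitted, code in records:
        if not code.startswith(HIP_FRACTURE_PREFIXES):
            continue
        earlier = first_by_person.get(person_id)
        if earlier is None or admitted < earlier:
            first_by_person[person_id] = admitted
    counts = Counter(
        d.year for d in first_by_person.values() if d.year >= first_study_year
    )
    return dict(sorted(counts.items()))


annual = first_hip_fractures_per_year(records, first_study_year=1998)
```

Person p4 is not counted in 1999 because the old data show a fracture in 1996, which is why enough historical data are needed.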
Are there more hip fractures during
winter? How to define winter?
Based on the data, "winter" is from November to April
[Figure: two time series panels, "Institutionalized" and "Over 50 years old"; x-axis from 1/98 to 1/03, y-axis from 0 to 20]
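The data-driven winter definition (November to April) can be applied to dated events as in the sketch below. The event dates are invented for illustration, and the names `WINTER_MONTHS` and `season` are assumptions of this example.

```python
from collections import Counter
from datetime import date

# Winter as defined from the data in the slide: November to April
WINTER_MONTHS = {11, 12, 1, 2, 3, 4}


def season(d):
    """Classify a date as 'winter' or 'summer' by calendar month."""
    return "winter" if d.month in WINTER_MONTHS else "summer"


# Hypothetical fracture dates
fracture_dates = [
    date(1998, 1, 15),
    date(1998, 7, 3),
    date(1998, 11, 20),
    date(1999, 4, 30),
    date(1999, 5, 1),
]

by_season = Counter(season(d) for d in fracture_dates)
```

Both seasons cover six calendar months, so the raw counts are roughly comparable; a careful comparison would still adjust for the slightly different number of days and for the population at risk.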
Data abstracted outcomes
Commonly used outcomes measuring effectiveness of (hip
fracture) surgery are death and complication
These are medical concepts, but must be abstracted from
individual level register-based data by using some ‘rules’,
such as a list of some particular diagnosis codes recorded in
the data
Stable and complex outcomes
It is typically straightforward to extract the event of
death from the data by using a "one-line rule"
Extraction of complications may require tens of
different rules, which are justified by domain
knowledge and by evaluating the rules against concrete data until
a saturation point is reached
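The contrast between the two kinds of outcome can be sketched in Python. Everything specific here is an assumption for illustration: the record fields, the three example rules, the diagnosis code prefixes, and the 90-day reoperation window. A real rule set would be built and revised with clinical experts against actual data.

```python
def died(record):
    """'One-line rule': death is a stable outcome, directly recorded."""
    return record.get("death_date") is not None


# Complications need many rules; these three are illustrative only.
COMPLICATION_RULES = [
    ("wound infection",
     lambda r: any(c.startswith("T81.4") for c in r["diagnoses"])),
    ("prosthesis complication",
     lambda r: any(c.startswith("T84") for c in r["diagnoses"])),
    ("reoperation",
     lambda r: r.get("reoperation_within_days") is not None
     and r["reoperation_within_days"] <= 90),
]


def complications(record):
    """Return the names of all rules that fire for this record."""
    return [name for name, rule in COMPLICATION_RULES if rule(record)]


# Hypothetical patient record
patient = {
    "death_date": None,
    "diagnoses": ["S72.0", "T84.0"],
    "reoperation_within_days": 30,
}

patient_died = died(patient)
patient_complications = complications(patient)
```

Keeping the rules in a named list makes it easy to add, drop, or re-evaluate a rule while checking its effect on concrete data.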
1. kesäkuuta 14
1. kesäkuuta 14

More Related Content

What's hot

"Data as the Fuel and Analytics as the Engine of the Digital Transformation -...
"Data as the Fuel and Analytics as the Engine of the Digital Transformation -..."Data as the Fuel and Analytics as the Engine of the Digital Transformation -...
"Data as the Fuel and Analytics as the Engine of the Digital Transformation -...
Prof. Dr. Diego Kuonen
 
Big Data, Data-Driven Decision Making and Statistics Towards Data-Informed Po...
Big Data, Data-Driven Decision Making and Statistics Towards Data-Informed Po...Big Data, Data-Driven Decision Making and Statistics Towards Data-Informed Po...
Big Data, Data-Driven Decision Making and Statistics Towards Data-Informed Po...
Prof. Dr. Diego Kuonen
 
Big Data as the Fuel and Visual Analytics as the Engine Mount of the Digital ...
Big Data as the Fuel and Visual Analytics as the Engine Mount of the Digital ...Big Data as the Fuel and Visual Analytics as the Engine Mount of the Digital ...
Big Data as the Fuel and Visual Analytics as the Engine Mount of the Digital ...
Prof. Dr. Diego Kuonen
 
Big Data, Data Science, Machine Intelligence and Learning: Demystification, C...
Big Data, Data Science, Machine Intelligence and Learning: Demystification, C...Big Data, Data Science, Machine Intelligence and Learning: Demystification, C...
Big Data, Data Science, Machine Intelligence and Learning: Demystification, C...
Prof. Dr. Diego Kuonen
 
A Swiss Statistician's 'Big Tent' Overview of Big Data and Data Science in Ph...
A Swiss Statistician's 'Big Tent' Overview of Big Data and Data Science in Ph...A Swiss Statistician's 'Big Tent' Overview of Big Data and Data Science in Ph...
A Swiss Statistician's 'Big Tent' Overview of Big Data and Data Science in Ph...
Prof. Dr. Diego Kuonen
 
Big Data as the Fuel and Analytics as the Engine of the Digital Transformation
Big Data as the Fuel and Analytics as the Engine of the Digital TransformationBig Data as the Fuel and Analytics as the Engine of the Digital Transformation
Big Data as the Fuel and Analytics as the Engine of the Digital Transformation
Prof. Dr. Diego Kuonen
 
A Statistician's 'Big Tent' View on Big Data and Data Science (Version 5)
A Statistician's 'Big Tent' View on Big Data and Data Science (Version 5)A Statistician's 'Big Tent' View on Big Data and Data Science (Version 5)
A Statistician's 'Big Tent' View on Big Data and Data Science (Version 5)
Prof. Dr. Diego Kuonen
 
A Statistician's View on Big Data and Data Science (Version 3)
A Statistician's View on Big Data and Data Science (Version 3)A Statistician's View on Big Data and Data Science (Version 3)
A Statistician's View on Big Data and Data Science (Version 3)
Prof. Dr. Diego Kuonen
 
S4 pn
S4 pnS4 pn
Overview of Big Data, Data Science and Statistics, along with Digitalisation,...
Overview of Big Data, Data Science and Statistics, along with Digitalisation,...Overview of Big Data, Data Science and Statistics, along with Digitalisation,...
Overview of Big Data, Data Science and Statistics, along with Digitalisation,...
Prof. Dr. Diego Kuonen
 
A Statistician's View on Big Data and Data Science in Pharmaceutical Developm...
A Statistician's View on Big Data and Data Science in Pharmaceutical Developm...A Statistician's View on Big Data and Data Science in Pharmaceutical Developm...
A Statistician's View on Big Data and Data Science in Pharmaceutical Developm...
Prof. Dr. Diego Kuonen
 
A Statistician's View on Big Data and Data Science (Version 2)
A Statistician's View on Big Data and Data Science (Version 2)A Statistician's View on Big Data and Data Science (Version 2)
A Statistician's View on Big Data and Data Science (Version 2)
Prof. Dr. Diego Kuonen
 
A Statistician's Introductory View on Big Data and Data Science (Version 7)
A Statistician's Introductory View on Big Data and Data Science (Version 7)A Statistician's Introductory View on Big Data and Data Science (Version 7)
A Statistician's Introductory View on Big Data and Data Science (Version 7)
Prof. Dr. Diego Kuonen
 

What's hot (13)

"Data as the Fuel and Analytics as the Engine of the Digital Transformation -...
"Data as the Fuel and Analytics as the Engine of the Digital Transformation -..."Data as the Fuel and Analytics as the Engine of the Digital Transformation -...
"Data as the Fuel and Analytics as the Engine of the Digital Transformation -...
 
Big Data, Data-Driven Decision Making and Statistics Towards Data-Informed Po...
Big Data, Data-Driven Decision Making and Statistics Towards Data-Informed Po...Big Data, Data-Driven Decision Making and Statistics Towards Data-Informed Po...
Big Data, Data-Driven Decision Making and Statistics Towards Data-Informed Po...
 
Big Data as the Fuel and Visual Analytics as the Engine Mount of the Digital ...
Big Data as the Fuel and Visual Analytics as the Engine Mount of the Digital ...Big Data as the Fuel and Visual Analytics as the Engine Mount of the Digital ...
Big Data as the Fuel and Visual Analytics as the Engine Mount of the Digital ...
 
Big Data, Data Science, Machine Intelligence and Learning: Demystification, C...
Big Data, Data Science, Machine Intelligence and Learning: Demystification, C...Big Data, Data Science, Machine Intelligence and Learning: Demystification, C...
Big Data, Data Science, Machine Intelligence and Learning: Demystification, C...
 
A Swiss Statistician's 'Big Tent' Overview of Big Data and Data Science in Ph...
A Swiss Statistician's 'Big Tent' Overview of Big Data and Data Science in Ph...A Swiss Statistician's 'Big Tent' Overview of Big Data and Data Science in Ph...
A Swiss Statistician's 'Big Tent' Overview of Big Data and Data Science in Ph...
 
Big Data as the Fuel and Analytics as the Engine of the Digital Transformation
Big Data as the Fuel and Analytics as the Engine of the Digital TransformationBig Data as the Fuel and Analytics as the Engine of the Digital Transformation
Big Data as the Fuel and Analytics as the Engine of the Digital Transformation
 
A Statistician's 'Big Tent' View on Big Data and Data Science (Version 5)
A Statistician's 'Big Tent' View on Big Data and Data Science (Version 5)A Statistician's 'Big Tent' View on Big Data and Data Science (Version 5)
A Statistician's 'Big Tent' View on Big Data and Data Science (Version 5)
 
A Statistician's View on Big Data and Data Science (Version 3)
A Statistician's View on Big Data and Data Science (Version 3)A Statistician's View on Big Data and Data Science (Version 3)
A Statistician's View on Big Data and Data Science (Version 3)
 
S4 pn
S4 pnS4 pn
S4 pn
 
Overview of Big Data, Data Science and Statistics, along with Digitalisation,...
Overview of Big Data, Data Science and Statistics, along with Digitalisation,...Overview of Big Data, Data Science and Statistics, along with Digitalisation,...
Overview of Big Data, Data Science and Statistics, along with Digitalisation,...
 
A Statistician's View on Big Data and Data Science in Pharmaceutical Developm...
A Statistician's View on Big Data and Data Science in Pharmaceutical Developm...A Statistician's View on Big Data and Data Science in Pharmaceutical Developm...
A Statistician's View on Big Data and Data Science in Pharmaceutical Developm...
 
A Statistician's View on Big Data and Data Science (Version 2)
A Statistician's View on Big Data and Data Science (Version 2)A Statistician's View on Big Data and Data Science (Version 2)
A Statistician's View on Big Data and Data Science (Version 2)
 
A Statistician's Introductory View on Big Data and Data Science (Version 7)
A Statistician's Introductory View on Big Data and Data Science (Version 7)A Statistician's Introductory View on Big Data and Data Science (Version 7)
A Statistician's Introductory View on Big Data and Data Science (Version 7)
 

Viewers also liked

Research on data journalism: What is there to investigate? Insights from a st...
Research on data journalism: What is there to investigate? Insights from a st...Research on data journalism: What is there to investigate? Insights from a st...
Research on data journalism: What is there to investigate? Insights from a st...
Julian Ausserhofer
 
Doing Digital Methods: Some Recent Highlights from Winter and Summer Schools
Doing Digital Methods: Some Recent Highlights from Winter and Summer SchoolsDoing Digital Methods: Some Recent Highlights from Winter and Summer Schools
Doing Digital Methods: Some Recent Highlights from Winter and Summer Schools
Liliana Bounegru
 
Redistributing journalism: Journalism as a data public and the politics of qu...
Redistributing journalism: Journalism as a data public and the politics of qu...Redistributing journalism: Journalism as a data public and the politics of qu...
Redistributing journalism: Journalism as a data public and the politics of qu...
Liliana Bounegru
 
C6 deploying applications to your private cloud 7 to 10 times faster
C6   deploying applications to your private cloud 7 to 10 times fasterC6   deploying applications to your private cloud 7 to 10 times faster
C6 deploying applications to your private cloud 7 to 10 times faster
Dr. Wilfred Lin (Ph.D.)
 
Git Internals
Git InternalsGit Internals
Git Internals
Pedro Melo
 
Big Data: Implications for Marketing and Strategy
Big Data: Implications for Marketing and StrategyBig Data: Implications for Marketing and Strategy
Big Data: Implications for Marketing and Strategy
C.K. Kumar
 
Chapter 02
Chapter 02Chapter 02
Chapter 02
mcastro284
 
A7 getting value from big data how to get there quickly and leverage your c...
A7   getting value from big data how to get there quickly and leverage your c...A7   getting value from big data how to get there quickly and leverage your c...
A7 getting value from big data how to get there quickly and leverage your c...
Dr. Wilfred Lin (Ph.D.)
 
Large-scale digitisation options at the Natural History Museum, London.
Large-scale digitisation options at the Natural History Museum, London.Large-scale digitisation options at the Natural History Museum, London.
Large-scale digitisation options at the Natural History Museum, London.
Vince Smith
 
Privacy in a digital world
Privacy in a digital worldPrivacy in a digital world
Privacy in a digital world
robkitchin
 
Chapter 02 The Internet
Chapter 02 The InternetChapter 02 The Internet
Chapter 02 The Internet
xtin101
 
Chapter 06 Inside Computers and Mobile Devices
Chapter 06 Inside Computers and Mobile DevicesChapter 06 Inside Computers and Mobile Devices
Chapter 06 Inside Computers and Mobile Devices
xtin101
 
Big Search with Big Data Principles
Big Search with Big Data PrinciplesBig Search with Big Data Principles
Big Search with Big Data Principles
OpenSource Connections
 
Chapter 05 Digital Safety and Security
Chapter 05 Digital Safety and SecurityChapter 05 Digital Safety and Security
Chapter 05 Digital Safety and Security
xtin101
 
»Big data – small problems?« - Ethische Perspektiven auf Forschung unter Zuhi...
»Big data – small problems?« - Ethische Perspektiven auf Forschung unter Zuhi...»Big data – small problems?« - Ethische Perspektiven auf Forschung unter Zuhi...
»Big data – small problems?« - Ethische Perspektiven auf Forschung unter Zuhi...
Nele Heise
 

Viewers also liked (15)

Research on data journalism: What is there to investigate? Insights from a st...
Research on data journalism: What is there to investigate? Insights from a st...Research on data journalism: What is there to investigate? Insights from a st...
Research on data journalism: What is there to investigate? Insights from a st...
 
Doing Digital Methods: Some Recent Highlights from Winter and Summer Schools
Doing Digital Methods: Some Recent Highlights from Winter and Summer SchoolsDoing Digital Methods: Some Recent Highlights from Winter and Summer Schools
Doing Digital Methods: Some Recent Highlights from Winter and Summer Schools
 
Redistributing journalism: Journalism as a data public and the politics of qu...
Redistributing journalism: Journalism as a data public and the politics of qu...Redistributing journalism: Journalism as a data public and the politics of qu...
Redistributing journalism: Journalism as a data public and the politics of qu...
 
C6 deploying applications to your private cloud 7 to 10 times faster
C6   deploying applications to your private cloud 7 to 10 times fasterC6   deploying applications to your private cloud 7 to 10 times faster
C6 deploying applications to your private cloud 7 to 10 times faster
 
Git Internals
Git InternalsGit Internals
Git Internals
 
Big Data: Implications for Marketing and Strategy
Big Data: Implications for Marketing and StrategyBig Data: Implications for Marketing and Strategy
Big Data: Implications for Marketing and Strategy
 
Chapter 02
Chapter 02Chapter 02
Chapter 02
 
A7 getting value from big data how to get there quickly and leverage your c...
A7   getting value from big data how to get there quickly and leverage your c...A7   getting value from big data how to get there quickly and leverage your c...
A7 getting value from big data how to get there quickly and leverage your c...
 
Large-scale digitisation options at the Natural History Museum, London.
Large-scale digitisation options at the Natural History Museum, London.Large-scale digitisation options at the Natural History Museum, London.
Large-scale digitisation options at the Natural History Museum, London.
 
Privacy in a digital world
Privacy in a digital worldPrivacy in a digital world
Privacy in a digital world
 
Chapter 02 The Internet
Chapter 02 The InternetChapter 02 The Internet
Chapter 02 The Internet
 
Chapter 06 Inside Computers and Mobile Devices
Chapter 06 Inside Computers and Mobile DevicesChapter 06 Inside Computers and Mobile Devices
Chapter 06 Inside Computers and Mobile Devices
 
Big Search with Big Data Principles
Big Search with Big Data PrinciplesBig Search with Big Data Principles
Big Search with Big Data Principles
 
Chapter 05 Digital Safety and Security
Chapter 05 Digital Safety and SecurityChapter 05 Digital Safety and Security
Chapter 05 Digital Safety and Security
 
»Big data – small problems?« - Ethische Perspektiven auf Forschung unter Zuhi...
»Big data – small problems?« - Ethische Perspektiven auf Forschung unter Zuhi...»Big data – small problems?« - Ethische Perspektiven auf Forschung unter Zuhi...
»Big data – small problems?« - Ethische Perspektiven auf Forschung unter Zuhi...
 

Similar to Methodological principles in dealing with Big Data, Reijo Sund

Data; Data manipulation, sorting, grouping, rearranging. Plotting the data. D...
Data; Data manipulation, sorting, grouping, rearranging. Plotting the data. D...Data; Data manipulation, sorting, grouping, rearranging. Plotting the data. D...
Data; Data manipulation, sorting, grouping, rearranging. Plotting the data. D...
jybufgofasfbkpoovh
 
Data Science definition
Data Science definitionData Science definition
Data Science definition
CarloLauro1
 
Let's talk about Data Science
Let's talk about Data ScienceLet's talk about Data Science
Let's talk about Data Science
Carlo Lauro
 
Principles of data_science
Principles of data_sciencePrinciples of data_science
Principles of data_science
tvk66866
 
Mind the Gap: Reflections on Data Policies and Practice
Mind the Gap: Reflections on Data Policies and PracticeMind the Gap: Reflections on Data Policies and Practice
Mind the Gap: Reflections on Data Policies and Practice
LizLyon
 
50 Years of Data Science
50 Years of Data Science50 Years of Data Science
50 Years of Data Science
Nafiseh Navabpour
 
20080719 Esof Open Data Voegler
20080719 Esof Open Data Voegler20080719 Esof Open Data Voegler
S2
S2S2
S2
kimsee
 
Luciano uvi hackfest.28.10.2020
Luciano uvi hackfest.28.10.2020Luciano uvi hackfest.28.10.2020
Luciano uvi hackfest.28.10.2020
Joanne Luciano
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
ANOOP V S
 
Statistical thinking and development planning
Statistical thinking and development planningStatistical thinking and development planning
Statistical thinking and development planning
Southern Range, Berhampur, Odisha
 
CODATA International Training Workshop in Big Data for Science for Researcher...
CODATA International Training Workshop in Big Data for Science for Researcher...CODATA International Training Workshop in Big Data for Science for Researcher...
CODATA International Training Workshop in Big Data for Science for Researcher...
Johann van Wyk
 
A Survey on Big Data Analytics: Challenges
A Survey on Big Data Analytics: ChallengesA Survey on Big Data Analytics: Challenges
A Survey on Big Data Analytics: Challenges
Dr. Amarjeet Singh
 
Research Data Management for Econometrics
Research Data Management for EconometricsResearch Data Management for Econometrics
Research Data Management for Econometrics
Peter Löwe
 
Meaning and uses of statistics
Meaning and uses of statisticsMeaning and uses of statistics
Meaning and uses of statistics
RekhaChoudhary24
 
Lecture 1 PPT.ppt
Lecture 1 PPT.pptLecture 1 PPT.ppt
Lecture 1 PPT.ppt
RAJKAMAL282
 
Digital Data Sharing: Opportunities and Challenges of Opening Research
Digital Data Sharing: Opportunities and Challenges of Opening ResearchDigital Data Sharing: Opportunities and Challenges of Opening Research
Digital Data Sharing: Opportunities and Challenges of Opening Research
Martin Donnelly
 
Lecture 1 PPT.pdf
Lecture 1 PPT.pdfLecture 1 PPT.pdf
Lecture 1 PPT.pdf
RAJKAMAL282
 
Elsevier CWTS Open Data Report Presentation at RDA meeting in Barcelona
Elsevier CWTS Open Data Report Presentation at RDA meeting in Barcelona Elsevier CWTS Open Data Report Presentation at RDA meeting in Barcelona
Elsevier CWTS Open Data Report Presentation at RDA meeting in Barcelona
Elsevier
 
Martin Donnelly - Digital Data Curation at the Digital Curation Centre (DH2016)
Martin Donnelly - Digital Data Curation at the Digital Curation Centre (DH2016)Martin Donnelly - Digital Data Curation at the Digital Curation Centre (DH2016)
Martin Donnelly - Digital Data Curation at the Digital Curation Centre (DH2016)
dri_ireland
 

Similar to Methodological principles in dealing with Big Data, Reijo Sund (20)

Data; Data manipulation, sorting, grouping, rearranging. Plotting the data. D...
Data; Data manipulation, sorting, grouping, rearranging. Plotting the data. D...Data; Data manipulation, sorting, grouping, rearranging. Plotting the data. D...
Data; Data manipulation, sorting, grouping, rearranging. Plotting the data. D...
 
Data Science definition
Data Science definitionData Science definition
Data Science definition
 
Let's talk about Data Science
Let's talk about Data ScienceLet's talk about Data Science
Let's talk about Data Science
 
Principles of data_science
Principles of data_sciencePrinciples of data_science
Principles of data_science
 
Mind the Gap: Reflections on Data Policies and Practice
Mind the Gap: Reflections on Data Policies and PracticeMind the Gap: Reflections on Data Policies and Practice
Mind the Gap: Reflections on Data Policies and Practice
 
50 Years of Data Science
50 Years of Data Science50 Years of Data Science
50 Years of Data Science
 
20080719 Esof Open Data Voegler
20080719 Esof Open Data Voegler20080719 Esof Open Data Voegler
20080719 Esof Open Data Voegler
 
S2
S2S2
S2
 
Luciano uvi hackfest.28.10.2020
Luciano uvi hackfest.28.10.2020Luciano uvi hackfest.28.10.2020
Luciano uvi hackfest.28.10.2020
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Statistical thinking and development planning
Statistical thinking and development planningStatistical thinking and development planning
Statistical thinking and development planning
 
CODATA International Training Workshop in Big Data for Science for Researcher...
CODATA International Training Workshop in Big Data for Science for Researcher...CODATA International Training Workshop in Big Data for Science for Researcher...
CODATA International Training Workshop in Big Data for Science for Researcher...
 
A Survey on Big Data Analytics: Challenges
A Survey on Big Data Analytics: ChallengesA Survey on Big Data Analytics: Challenges
A Survey on Big Data Analytics: Challenges
 
Research Data Management for Econometrics
Research Data Management for EconometricsResearch Data Management for Econometrics
Research Data Management for Econometrics
 
Meaning and uses of statistics
Meaning and uses of statisticsMeaning and uses of statistics
Meaning and uses of statistics
 
Lecture 1 PPT.ppt
Lecture 1 PPT.pptLecture 1 PPT.ppt
Lecture 1 PPT.ppt
 
Digital Data Sharing: Opportunities and Challenges of Opening Research
Digital Data Sharing: Opportunities and Challenges of Opening ResearchDigital Data Sharing: Opportunities and Challenges of Opening Research
Digital Data Sharing: Opportunities and Challenges of Opening Research
 
Lecture 1 PPT.pdf
Lecture 1 PPT.pdfLecture 1 PPT.pdf
Lecture 1 PPT.pdf
 
Elsevier CWTS Open Data Report Presentation at RDA meeting in Barcelona
Elsevier CWTS Open Data Report Presentation at RDA meeting in Barcelona Elsevier CWTS Open Data Report Presentation at RDA meeting in Barcelona
Elsevier CWTS Open Data Report Presentation at RDA meeting in Barcelona
 
Martin Donnelly - Digital Data Curation at the Digital Curation Centre (DH2016)
Martin Donnelly - Digital Data Curation at the Digital Curation Centre (DH2016)Martin Donnelly - Digital Data Curation at the Digital Curation Centre (DH2016)
Martin Donnelly - Digital Data Curation at the Digital Curation Centre (DH2016)
 

More from Tilastokeskus

4.6.2024 Tilastotietoa hyvinvointialueiden tueksi, Tilastokeskus
4.6.2024 Tilastotietoa hyvinvointialueiden tueksi, Tilastokeskus4.6.2024 Tilastotietoa hyvinvointialueiden tueksi, Tilastokeskus
4.6.2024 Tilastotietoa hyvinvointialueiden tueksi, Tilastokeskus
Tilastokeskus
 
4.6.2024 Tilastotietoa hyvinvointialueiden tueksi, Tilastokeskus
4.6.2024 Tilastotietoa hyvinvointialueiden tueksi, Tilastokeskus4.6.2024 Tilastotietoa hyvinvointialueiden tueksi, Tilastokeskus
4.6.2024 Tilastotietoa hyvinvointialueiden tueksi, Tilastokeskus
Tilastokeskus
 
4.6.2024 Tilastotietoa hyvinvointialueiden tueksi, Tilastokeskus
4.6.2024 Tilastotietoa hyvinvointialueiden tueksi, Tilastokeskus4.6.2024 Tilastotietoa hyvinvointialueiden tueksi, Tilastokeskus
4.6.2024 Tilastotietoa hyvinvointialueiden tueksi, Tilastokeskus
Tilastokeskus
 
4.6.2024 Tilastotietoa hyvinvointialueiden tueksi, Tilastokeskus
4.6.2024 Tilastotietoa hyvinvointialueiden tueksi, Tilastokeskus4.6.2024 Tilastotietoa hyvinvointialueiden tueksi, Tilastokeskus
4.6.2024 Tilastotietoa hyvinvointialueiden tueksi, Tilastokeskus
Tilastokeskus
 
13.5.2024 Yrityksen digitalous -hanke: Säästöä hallinnollisiin kuluihin yhtei...
13.5.2024 Yrityksen digitalous -hanke: Säästöä hallinnollisiin kuluihin yhtei...13.5.2024 Yrityksen digitalous -hanke: Säästöä hallinnollisiin kuluihin yhtei...
13.5.2024 Yrityksen digitalous -hanke: Säästöä hallinnollisiin kuluihin yhtei...
Tilastokeskus
 
13.5.2024 Yrityksen digitalous -hanke: Säästöä hallinnollisiin kuluihin yhtei...
13.5.2024 Yrityksen digitalous -hanke: Säästöä hallinnollisiin kuluihin yhtei...13.5.2024 Yrityksen digitalous -hanke: Säästöä hallinnollisiin kuluihin yhtei...
13.5.2024 Yrityksen digitalous -hanke: Säästöä hallinnollisiin kuluihin yhtei...
Methodological principles in dealing with Big Data, Reijo Sund

  • 1. Methodological Principles in Dealing with Big Data. Reijo Sund, University of Helsinki, Centre for Research Methods, Faculty of Social Sciences. Big Data seminar, Statistics Finland, Helsinki, 2.6.2014. 1 June 2014
  • 2. Big Data: Data have been produced for hundreds of years. The reasons for such production were originally administrative: there was a need for systematically collected numerical facts on a particular subject. Advances in information technology have made it possible to collect and store ever larger data sets more effectively.
  • 3. From data to information: As long as there have been data, there has been the challenge of transforming them into useful information. Too much data in an unusable form has always been a common complaint. Well-known hierarchy: Data - Information - Knowledge - Wisdom - Intelligence.
  • 4. Secondary data: There are more and more "big data", but the emphasis has been on technical aspects and not on the information itself. Data without explanations are useless. Big Data are often secondary data, not tailored to the specific research question at hand. More (detailed) data would not solve the basic problems; more background information is required for utilization.
  • 5. Fundamental problem: The belief that big data consist of autonomous, atom-like building blocks is fundamentally erroneous. Raw register data as such are of little value. There are no simple magic tricks for overcoming problems that arise from the fundamental limitations of empirical research. More general aspects of scientific research are needed in order to understand the related methodological challenges.
  • 6. Knowledge discovery process: The process consists of several main phases: understanding the phenomenon, understanding the problem, understanding the data, data preprocessing, modeling, evaluation, and reporting. The main difference from the "traditional" research process is the additional interpretation-operationalization phase. [Diagram labels: Context, Debate, Idea, Theory, Problem, Data, Analysis, Question, Answer, Perspective]
  • 7. Prerequisites: Effective use of big data presumes skills in several areas: measurement; data modeling (information sciences); statistical computing (statistics); theory of the subject matter.
  • 8. Principles of measurement: Reality can be confronted by recording observations that reflect the phenomenon of interest. Measurement aims to create data as symbolic representations of the observations. Operationalization determines how the phenomenon P, which becomes visible via observations O, is mapped to data D. It is successful if valid interpretations I of the symbolic data D can be made with regard to the phenomenon P.
  • 9. Infological equation: Information is something that has to be produced from the data and the pre-knowledge. Infological equation: I = i(D, S, t). Information I is produced from the data D and the pre-knowledge S, at time t, using the interpretation process i.
  • 10. Data modeling: Data modeling can be used to construct (computer-based) symbol structures that capture the meaning of data and organize it in ways that make it understandable. Only what is (or can be) represented is considered to exist. [Diagram: Phenomenon ⇓ Concept ⇓ Object; labels: Host, Attributes, Time, Place, Realized observation, Data component, Knowledge component, Logical component, Taxonomy, Partonomy, Theoretical measurement properties]
  • 11. Data preprocessing: Data cleaning and reduction: correction of "global" deficiencies in the data and dropping of "uninteresting" data. Data abstraction: "intelligent enrichment" of data using background knowledge. This kind of preprocessing resembles qualitative analysis much more than quantitative analysis. Each rule reflects the instability of the concept and is a step further from the "objectivity" of the study.
  • 12. Preprocessing in practice: There is a need for a conceptual representation of each object. Two main classes of concept-data relation: factual (minimal background knowledge) and abstracted (cognitive fit acceptable). Sophisticated (and subjective) preprocessing that aims to scale matters down to a size more suitable for specific analyses is the most important and time-consuming part of (big) data analysis.
  • 13. Greater statistics: Statistics offers not only a set of tools for problem-solving, but also a formal way of thinking about the modeling of the actual problem. Rather than trying to squeeze the data into a predefined model or saying too much about what can and cannot be done, data analysis should work to achieve an appropriate compromise between the practical problems and the data.
  • 14. Challenges: How to analyze massive data effectively when manual management is infeasible? How to avoid "snooping/dredging/fishing/shopping" without assuming that data are automatically in concordance with the theory? How to deal with data that cover total populations, for which sampling error and statistical significance lose their traditional meaning?
  • 15. Thank you! For more information: http://www.helsinki.fi/~sund
  • 16. How to calculate the annual number of hip fractures in Finland? Background knowledge: all hip fractures are in the Hospital Discharge Register. Data challenge: it is difficult to separate new admissions from the care of old fractures. Change of theory: consider only first hip fractures instead of all hip fractures. Solution in terms of data: it is easy to determine the number of first hip fractures from the register if enough old data are available and deterministic record linkage can be used.
  • 17. Are there more hip fractures during winter? How should winter be defined? Based on the data, "winter" runs from November to April. [Figure: two monthly time-series panels, "Institutionalized" and "Over 50 years old"; x-axis months from 1/98 to 1/03, y-axis 0 to 20]
  • 18. Data-abstracted outcomes: Commonly used outcomes for measuring the effectiveness of (hip fracture) surgery are death and complications. These are medical concepts, but they must be abstracted from individual-level register-based data by using some "rules", such as a list of particular diagnosis codes recorded in the data.
  • 19. Stable and complex outcomes: It is typically straightforward to extract the event of death from the data by using a "one-line rule". Extracting complications may require tens of different rules, which are justified using domain knowledge and evaluated against concrete data until a saturation point is reached.
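
The hip fracture examples on slides 16 and 17 can be illustrated with a minimal Python sketch. It is not the author's implementation: the record layout, the person identifiers, the invented sample rows, and the use of ICD-10 codes S72.0 to S72.2 as the hip fracture definition are all assumptions made for illustration. The sketch links records deterministically on a person identifier, keeps each person's earliest hip fracture admission, counts first fractures per year, and flags "winter" as November to April. As slide 16 notes, this only works if enough old data are available; a fracture that occurred before the start of the register window would be wrongly treated as a first one.

```python
from collections import Counter
from datetime import date

# Hypothetical discharge records: (person_id, admission_date, diagnosis_code).
# All rows are invented for illustration.
records = [
    ("A", date(1998, 1, 15), "S72.0"),  # first hip fracture of person A
    ("A", date(1998, 3, 2), "S72.0"),   # later care of the same fracture
    ("B", date(1998, 7, 10), "S72.1"),
    ("C", date(1999, 12, 5), "S72.2"),
    ("C", date(2000, 2, 1), "S72.2"),   # readmission, not a new fracture
    ("D", date(1999, 6, 1), "I21.9"),   # not a hip fracture
]

# Assumed abstraction rule: a record is a hip fracture if its code starts
# with one of these ICD-10 prefixes.
HIP_FRACTURE_PREFIXES = ("S72.0", "S72.1", "S72.2")

# "Winter" as defined on slide 17: November to April.
WINTER_MONTHS = {11, 12, 1, 2, 3, 4}


def first_hip_fractures(rows):
    """Return {person_id: date of earliest hip fracture admission}."""
    first = {}
    for person_id, admitted, code in rows:
        if not code.startswith(HIP_FRACTURE_PREFIXES):
            continue
        # Deterministic linkage: records sharing person_id are one person.
        if person_id not in first or admitted < first[person_id]:
            first[person_id] = admitted
    return first


def is_winter(day):
    """True if the date falls in the November to April season."""
    return day.month in WINTER_MONTHS


first = first_hip_fractures(records)
annual_counts = Counter(day.year for day in first.values())
winter_share = sum(is_winter(day) for day in first.values()) / len(first)

for year in sorted(annual_counts):
    print(year, annual_counts[year])
print("share of first fractures in winter:", round(winter_share, 2))
```

Each constant in the sketch is one of the "rules" discussed on slides 18 and 19: changing the code list or the winter months changes the abstracted concept, which is why such rules need to be justified with domain knowledge and checked against concrete data.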