data & content design
Frieda Brioschi - frieda.brioschi@gmail.com
Emma Tracanella - emma.tracanella@gmail.com
HOW TO COLLECT AND ORGANIZE DATA
LESSON 2 - 2019/20
A QUICK INTRO
LET’S START
PRESENT YOUR DATA
DATA IS ALL AROUND US
DATA COLLECTION METHODS
WHAT ARE DATA
Data are individual units of information.
A datum describes a single quality or quantity of some object or phenomenon.
Data are measured, collected and reported, and analyzed, whereupon they can
be visualized using graphs, images or other analysis tools.
PRIMARY VS SECONDARY DATA
▸ Primary data is data that is observed or collected from first-hand sources
▸ Secondary data is data gathered from studies, surveys, or experiments that
have been run by other people
QUALITATIVE VS QUANTITATIVE
▸ Quantitative data comes in the form of numbers, quantities and values. 

Pro: it’s concrete and easily measurable.
▸ Qualitative data is descriptive, based on attributes. 

It helps to explain the “why” behind the information quantitative data
reveals.
PRIMARY DATA COLLECTION
▸ Observation
▸ Surveys & Questionnaires
▸ Interviews
▸ Focus Groups
HOW
PRIMARY DATA COLLECTION
▸ In-Person Interviews

Pros: In-depth, with a high degree of confidence in the data

Cons: Time-consuming, expensive, and can be dismissed as anecdotal
▸ Mail Surveys

Pros: Can reach anyone and everyone – no barrier

Cons: Expensive, data collection errors, lag time
▸ Phone Surveys

Pros: High degree of confidence in the data collected, can reach almost anyone

Cons: Expensive, cannot self-administer, need to hire an agency
▸ Web/Online Surveys

Pros: Cheap, can self-administer, very low probability of data errors

Cons: Not all of your customers may have an email address or internet access, and some may be wary of
divulging information online.
BIAS
Bias in data collection is a distortion which results in the information not being truly representative
of the situation you are trying to investigate. Bias occurs for example when systematic error is
introduced into sampling or testing by selecting or encouraging one outcome or answer over others.
It can result from:
▸ survey questions that are constructed with a particular slant
▸ choosing a known group with a particular background to respond to surveys
▸ reporting data in misleading categorical groupings
▸ non-random selections when sampling
▸ systematic measurement errors
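The effect of a non-random selection can be sketched in a few lines of Python. This is only an illustration: the population, the two groups, and every number below are invented, and the "students only" sample stands in for "choosing a known group with a particular background".

```python
import random
import statistics

# Hypothetical population: 1,000 people with an "hours online per day" value.
# Half are students (higher values on average). All numbers are invented.
rng = random.Random(42)  # fixed seed so the run is repeatable
population = (
    [("student", rng.gauss(6.0, 1.0)) for _ in range(500)]
    + [("other", rng.gauss(3.0, 1.0)) for _ in range(500)]
)


def mean_hours(sample):
    """Average of the measured value over a list of (group, hours) records."""
    return statistics.mean(hours for _, hours in sample)


# Random selection: every person has the same chance of being picked.
random_sample = rng.sample(population, 100)

# Biased selection: only a known group (students) is asked.
biased_sample = [p for p in population if p[0] == "student"][:100]

true_mean = mean_hours(population)
print(f"population mean: {true_mean:.2f}")
print(f"random sample:   {mean_hours(random_sample):.2f}")
print(f"students only:   {mean_hours(biased_sample):.2f}")
```

The random sample should land close to the population mean, while the students-only sample is pulled toward the students' average: the error is systematic, so collecting more answers from the same group would not fix it.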
CASE STUDY: TAY.AI
Tay was an articial intelligence chatter bot that was originally released by
Microsoft via Twitter on March 23, 2016.
It caused subsequent controversy when the bot began to post inflammatory and
offensive tweets through its Twitter account, causing Microsoft to shut down the
service only 16 hours after its launch.
SECONDARY DATA SOURCES
▸ Our data:
▸ Personal information, likes, activities and interests (Facebook, Instagram,
YouTube, …)
▸ Personal data (from mobile phone)
APPLE DATA HEALTH
▸ Heart rate, sleeping habits, workouts,
steps and walking routines
▸ Introduced in September 2014 with iOS
8, the Apple Health app is pre-installed
on all iPhones.
▸ Low-energy sensors constantly
collect information about the user’s
physical activities. With optional extra
hardware (e.g. Apple Watch), Apple
Health can collect significantly more
information.
SECONDARY DATA SOURCES
▸ Other data:
▸ Public data sets
▸ Historical data
FLIGHTRADAR24
▸ Flightradar24 is a global flight tracking
service that provides you with real-time
information about thousands of aircraft
around the world.
▸ Flightradar24 tracks 180,000+ flights, from
1,200+ airlines, flying to or from 4,000+
airports around the world in real time.
▸ https://www.flightradar24.com
HISTORICAL CLIMATE DATA
▸ Many of the historical sources available to
climate historians mention weather in some
way, but these references are buried in a huge
volume of information.
▸ In recent years initiatives have transcribed,
quantified, and digitized:
a) historical observations,
b) historical activities that must have been
strongly influenced by weather.
▸ https://www.historicalclimatology.com/databases.html
ATLAS OF URBAN EXPANSION
▸ As of 2010, the world contained 4,231 cities with
100,000 or more people.
▸ The Atlas of Urban Expansion collects and analyzes
data on the quantity and quality of urban
expansion in a stratied global sample of 200
cities.
▸ The Atlas presents the output of the first two
phases of the Monitoring Global Urban Expansion
Program, an initiative that gathers data and
evidence on cities worldwide.
▸ http://atlasofurbanexpansion.org/cities/view/Milan
THE MOST POPULOUS CITY THROUGH TIME
▸ https://www.youtube.com/watch?v=pMs5xapBewM
DATA COLLECTION MAY BE AFFECTED BY
THEIR USE!
DATA PROCESSING
STRUCTURED DATA
Structured data is usually contained in rows and columns, and its elements can be mapped into a fixed,
predefined model. Examples of sources:
▸ SQL Databases
▸ Spreadsheets such as Excel
▸ OLTP Systems
▸ Online forms
▸ Sensors such as GPS or RFID tags
▸ Network and Web server logs
▸ Medical devices
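A small Python sketch of what "rows and columns mapped into a fixed model" looks like in practice. The table, its column names, and the three rows are hypothetical and exist only for this example.

```python
import csv
import io

# Hypothetical table in CSV form: every row has the same three columns.
raw = io.StringIO(
    "name,country,wake_up_time\n"
    "Student A,Italy,07:30\n"
    "Student B,Norway,06:45\n"
    "Student C,Mexico,08:15\n"
)

# DictReader maps each row onto the fixed column names from the header line.
rows = list(csv.DictReader(raw))

# Because the structure is fixed, a whole column can be read in one pass.
countries = [row["country"] for row in rows]
print(countries)
```

Every record answers the same questions in the same order, which is what makes structured data easy to filter, count, and load into a database.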
UNSTRUCTURED DATA
Unstructured data is data that cannot be contained in a row-column format and doesn’t have a data
model. Examples of sources:
▸ Web pages
▸ Images (JPEG, GIF, PNG, etc.)
▸ Videos
▸ Memos
▸ Reports
▸ Word documents and PowerPoint presentations
▸ Surveys
SEMI-STRUCTURED DATA
Semi-structured data sits between the two previous types: it has some defining or
consistent characteristics but doesn’t conform to a rigid structure. Examples of sources:
▸ E-mails
▸ XML and other markup languages
▸ Binary executables
▸ TCP/IP packets
▸ Zipped files
▸ JSON
▸ Web pages
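JSON is a convenient way to see the difference from a fixed table. In the Python sketch below, the two records are hypothetical: both are valid JSON with named fields, but they do not share the same set of fields.

```python
import json

# Two hypothetical records: consistent syntax, but no fixed schema.
raw = """
[
  {"name": "Student A", "country": "Italy", "interests": ["maps", "music"]},
  {"name": "Student B", "contact": {"email": "b@example.org"}}
]
"""

records = json.loads(raw)

for record in records:
    # .get() returns a default when a field is missing instead of raising an error.
    country = record.get("country", "unknown")
    print(record["name"], "->", country, "| fields:", sorted(record))
```

Code that reads semi-structured data has to decide what to do when a field is absent or nested, which a fixed row-column model would never allow.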
DATA CLEANING - TIME
DATA CLEANING
DATA CLEANING - COUNTRY
DATA CLEANING
▸ Italy - 3
▸ Italy (with space) - 2
▸ Italia
▸ Pisa, Italy
▸ Milan
▸ Milan italy
▸ South Korea - 2
▸ South Korea
▸ Egypt
▸ Mexico
▸ Serbia
▸ The Netherlands
▸ Norway
▸ Taiwan
▸ Taiwan
▸ Costa Rica
▸ Macedonia
▸ Turkey
▸ Australia
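One possible way to clean the country answers above, sketched in Python. The input is a subset of the values on the slide; the alias table (for example, treating "Milan" as "Italy") is an assumption made for this example, and a real cleaning pass would need a decision for each ambiguous answer.

```python
from collections import Counter

# A subset of the raw answers from the slide; "Italy " has a trailing space.
raw_answers = [
    "Italy", "Italy ", "Italia", "Pisa, Italy", "Milan", "Milan italy",
    "South Korea", "The Netherlands", "Taiwan",
]

# Hand-written lookup for variants that trimming alone cannot resolve.
# These mappings are assumptions for illustration.
ALIASES = {
    "italia": "Italy",
    "pisa, italy": "Italy",
    "milan": "Italy",
    "milan italy": "Italy",
}


def clean_country(value: str) -> str:
    """Trim whitespace, then look the lowercased value up in the alias table."""
    trimmed = value.strip()
    return ALIASES.get(trimmed.lower(), trimmed)


counts = Counter(clean_country(v) for v in raw_answers)
print(counts.most_common())
```

Before cleaning, the six Italian answers would be counted as six different countries; after cleaning they collapse into one value.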
DATA CLEANING - NAME
▸ Greta Scuso
▸ Vittoria
▸ Soonji Kwun
▸ Rewan
▸ Aurora
▸ Neithan
▸ Nadja
▸ Andrea
▸ Nadia van 't Klooster
▸ Yeso Lee
▸ Hanne Heimdal
▸ Hsin Yi Chen
▸ Yuri Michieletti
▸ Alessandro Calzoni
▸ Giulia Filippi
▸ Elena Fantini
▸ Stasha
▸ Eugenio Tonoli
▸ Ahmet Karan Oner
▸ Eileen
▸ Matteo
DON’T BE AFRAID OF DATABASES
WHAT IS A DB?
According to Wikipedia, “a database is an organized collection of data, generally
stored and accessed electronically from a computer system”.
Ideally it is organized in such a way that it can be easily accessed, managed, and
updated.
DB JARGON: QUERY
To perform an operation on data stored in a db, you run a query.
In SQL this is typically one of SELECT, INSERT, UPDATE, or DELETE.
SELECT wakeUpTime FROM dCDCourse
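The four kinds of query can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the slide only names the table dCDCourse and the column wakeUpTime, so the name column and the sample rows are hypothetical.

```python
import sqlite3

# In-memory database so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: only the table name and wakeUpTime come from the slide.
conn.execute("CREATE TABLE dCDCourse (name TEXT, wakeUpTime TEXT)")

# INSERT adds rows; the "?" placeholders keep values separate from the SQL text.
conn.executemany(
    "INSERT INTO dCDCourse (name, wakeUpTime) VALUES (?, ?)",
    [("Student A", "07:30"), ("Student B", "06:45")],
)

# SELECT reads data back: this is the query from the slide.
wake_up_times = [row[0] for row in conn.execute("SELECT wakeUpTime FROM dCDCourse")]
print(wake_up_times)

# UPDATE changes existing rows, DELETE removes them.
conn.execute("UPDATE dCDCourse SET wakeUpTime = ? WHERE name = ?", ("08:00", "Student A"))
conn.execute("DELETE FROM dCDCourse WHERE name = ?", ("Student B",))
remaining = conn.execute("SELECT name, wakeUpTime FROM dCDCourse").fetchall()
print(remaining)
conn.close()
```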
DB JARGON: TRANSACTION
When you need to perform a sequence of operations as a single unit of work,
that’s a transaction.
If one of you decides to withdraw from this course, then I need to update both the
list of students enrolled in this course and the total count of students. If I didn’t
operate inside a transaction, there would be a moment when one piece of information (the list of
students or the total count) is wrong.
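The withdrawal example can be sketched with Python's sqlite3 module. The two tables and the student names are hypothetical; the point is only that the two changes are wrapped in one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: a list of enrolled students and a stored total count.
conn.execute("CREATE TABLE students (name TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE course_stats (total INTEGER NOT NULL)")
conn.executemany("INSERT INTO students VALUES (?)", [("Student A",), ("Student B",)])
conn.execute("INSERT INTO course_stats VALUES (2)")
conn.commit()


def withdraw(conn, name):
    """Remove a student and update the count as a single unit of work."""
    # Used as a context manager, the connection commits if the block succeeds
    # and rolls back if it raises, so both changes apply or neither does.
    with conn:
        conn.execute("DELETE FROM students WHERE name = ?", (name,))
        conn.execute("UPDATE course_stats SET total = total - 1")


withdraw(conn, "Student B")
listed = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
total = conn.execute("SELECT total FROM course_stats").fetchone()[0]
print(listed, total)
```

After the call, the number of listed students and the stored total agree, and they never disagree in committed data.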
DB JARGON: ACID
Wikipedia: ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties of database
transactions intended to guarantee validity even in the event of errors, power failures, etc.
▸ Atomicity means that you guarantee that either all of the transaction succeeds or none of
it does.
▸ Consistency means that a transaction can only bring the database from one valid state to
another, so the data always respect the defined rules and constraints.
▸ Isolation guarantees that all transactions will occur in isolation. No transaction will be
affected by any other transaction.
▸ Durability means that, once a transaction is committed, it will remain permanently in the
system.
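Atomicity and consistency can be observed together in a short Python sketch using sqlite3. The tables, the CHECK rule, and the deliberately wrong starting count are assumptions chosen to force a failure in the middle of a transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables; the CHECK constraint is the rule that consistency protects.
conn.execute("CREATE TABLE students (name TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE course_stats (total INTEGER NOT NULL CHECK (total >= 0))")
conn.execute("INSERT INTO students VALUES ('Student A')")
conn.execute("INSERT INTO course_stats VALUES (0)")  # deliberately wrong count
conn.commit()

try:
    with conn:  # one transaction: commit on success, rollback on error
        conn.execute("DELETE FROM students WHERE name = 'Student A'")
        # This would make total negative, so SQLite rejects the statement.
        conn.execute("UPDATE course_stats SET total = total - 1")
except sqlite3.IntegrityError as exc:
    print("transaction rolled back:", exc)

# Atomicity: the DELETE was undone together with the failed UPDATE.
still_enrolled = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
total = conn.execute("SELECT total FROM course_stats").fetchone()[0]
print(still_enrolled, total)
```

The first statement succeeded on its own, yet it leaves no trace, because the transaction as a whole failed.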
DEAR DATA
GIORGIA LUPI
How to collect and organize data

 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
 
ENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptx
ENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptxENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptx
ENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptx
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Science lesson Moon for 4th quarter lesson
Science lesson Moon for 4th quarter lessonScience lesson Moon for 4th quarter lesson
Science lesson Moon for 4th quarter lesson
 
History Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptxHistory Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptx
 

How to collect and organize data

  • 1. data & content design Frieda Brioschi - frieda.brioschi@gmail.com Emma Tracanella - emma.tracanella@gmail.com HOW TO COLLECT AND ORGANIZE DATA LESSON 2 - 2019/20
  • 3. PRESENT YOUR DATA
  • 4. DATA IS ALL AROUND US
  • 6. WHAT ARE DATA
    Data are individual units of information. A datum describes a single quality or quantity of some object or phenomenon. Data are measured, collected and reported, and analyzed, whereupon they can be visualized using graphs, images or other analysis tools.
  • 7. PRIMARY VS SECONDARY DATA
    ▸ Primary data is data that is observed or collected from first-hand sources
    ▸ Secondary data is data gathered from studies, surveys, or experiments that have been run by other people
  • 8. QUALITATIVE VS QUANTITATIVE
    ▸ Quantitative data comes in the form of numbers, quantities and values.
      Pro: it’s concrete and easily measurable.
    ▸ Qualitative data is descriptive, based on attributes.
      It helps to explain the “why” behind the information quantitative data reveals.
  • 9. PRIMARY DATA COLLECTION
    ▸ Observation
    ▸ Surveys & Questionnaire
    ▸ Interviews
    ▸ Focus Group
  • 10. HOW
  • 11. PRIMARY DATA COLLECTION
    ▸ In-Person Interviews
      Pros: In-depth and a high degree of confidence in the data
      Cons: Time consuming, expensive and can be dismissed as anecdotal
    ▸ Mail Surveys
      Pros: Can reach anyone and everyone – no barrier
      Cons: Expensive, data collection errors, lag time
    ▸ Phone Surveys
      Pros: High degree of confidence in the data collected, reach almost anyone
      Cons: Expensive, cannot self-administer, need to hire an agency
    ▸ Web/Online Surveys
      Pros: Cheap, can self-administer, very low probability of data errors
      Cons: Not all your customers might have an email address/be on the internet, customers may be wary of divulging information online.
  • 12. BIAS
    Bias in data collection is a distortion which results in the information not being truly representative of the situation you are trying to investigate. Bias occurs, for example, when systematic error is introduced into sampling or testing by selecting or encouraging one outcome or answer over others.
    It can result from:
    ▸ survey questions that are constructed with a particular slant
    ▸ choosing a known group with a particular background to respond to surveys
    ▸ reporting data in misleading categorical groupings
    ▸ non-random selections when sampling
    ▸ systematic measurement errors
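The second cause in the list (choosing a known group to respond) can be made concrete with a small sketch. All numbers below are made up for illustration: a hypothetical base of 100 customers, 60 of whom can be reached by a web survey. The names `customers`, `satisfaction_rate` and `web_sample` are assumptions, not part of the lesson.

```python
# Made-up customer base: (reachable_online, satisfied) pairs.
customers = (
    [(True, True)] * 48 + [(True, False)] * 12      # 60 customers who are online
    + [(False, True)] * 12 + [(False, False)] * 28  # 40 customers who are not
)


def satisfaction_rate(group):
    """Share of satisfied customers in a list of (online, satisfied) pairs."""
    return sum(1 for _, satisfied in group if satisfied) / len(group)


# A web survey can only ever reach the online customers.
web_sample = [c for c in customers if c[0]]

population_rate = satisfaction_rate(customers)   # what we want to know
web_survey_rate = satisfaction_rate(web_sample)  # what the web survey measures
print(f"population: {population_rate:.0%}, web survey: {web_survey_rate:.0%}")
```

With these invented figures the web survey reports 80% satisfaction while the whole customer base sits at 60%. Nothing was measured wrongly; the gap comes entirely from who could be asked.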
  • 13. CASE STUDY: TAY.AI
    Tay was an artificial intelligence chatter bot that was originally released by Microsoft via Twitter on March 23, 2016. It caused subsequent controversy when the bot began to post inflammatory and offensive tweets through its Twitter account, causing Microsoft to shut down the service only 16 hours after its launch.
  • 14. SECONDARY DATA SOURCES
    ▸ Our data:
      ▸ Personal information, likes, activities and interests (Facebook, Instagram, YouTube, …)
      ▸ Personal data (from mobile phone)
  • 15. APPLE DATA HEALTH
    ▸ Heart rate, sleeping habits, workouts, steps and walking routines
    ▸ Introduced in September 2014 with iOS 8, the Apple Health app is pre-installed on all iPhones.
    ▸ Low-energy sensors, constantly collecting information about the user’s physical activities. With optional extra hardware (e.g. Apple Watch), Apple Health can collect significantly more information.
  • 16. SECONDARY DATA SOURCES
    ▸ Other data:
      ▸ Public data sets
      ▸ Historical data
  • 17. FLIGHTRADAR24
    ▸ Flightradar24 is a global flight tracking service that provides you with real-time information about thousands of aircraft around the world.
    ▸ Flightradar24 tracks 180,000+ flights, from 1,200+ airlines, flying to or from 4,000+ airports around the world in real time.
    ▸ https://www.flightradar24.com
  • 18. HISTORICAL CLIMATE DATA
    ▸ Many of the historical sources available to climate historians mention weather in some way, but these references are buried in a huge volume of information.
    ▸ In recent years initiatives have transcribed, quantified, and digitized:
      a) historical observations,
      b) historical activities that must have been strongly influenced by weather.
    ▸ https://www.historicalclimatology.com/databases.html
  • 19. ATLAS OF URBAN EXPANSION
    ▸ As of 2010, the world contained 4,231 cities with 100,000 or more people.
    ▸ The Atlas of Urban Expansion collects and analyzes data on the quantity and quality of urban expansion in a stratified global sample of 200 cities.
    ▸ The Atlas presents the output of the first two phases of the Monitoring Global Urban Expansion Program, an initiative that gathers data and evidence on cities worldwide.
    ▸ http://atlasofurbanexpansion.org/cities/view/Milan
  • 20. THE MOST POPULOUS CITY THROUGH TIME
    ▸ https://www.youtube.com/watch?v=pMs5xapBewM
  • 21. DATA COLLECTION MAY BE AFFECTED BY THEIR USE!
  • 23. STRUCTURED DATA
    Structured data is usually contained in rows and columns, and its elements can be mapped into a fixed, predefined model.
    Examples of sources:
    ▸ SQL Databases
    ▸ Spreadsheets such as Excel
    ▸ OLTP Systems
    ▸ Online forms
    ▸ Sensors such as GPS or RFID tags
    ▸ Network and Web server logs
    ▸ Medical devices
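As a minimal sketch of what "mapped into a fixed, predefined model" can look like in practice, the snippet below reads a few rows of column data into a typed record. The `Reading` model, its field names and the sample rows are all hypothetical, chosen only to resemble a GPS sensor log.

```python
import csv
import io
from dataclasses import dataclass


# Hypothetical fixed model: every record has exactly these three typed fields.
@dataclass
class Reading:
    sensor_id: str
    latitude: float
    longitude: float


# Made-up sample rows, standing in for a spreadsheet export or a sensor log.
RAW = """sensor_id,latitude,longitude
gps-1,45.4642,9.1900
gps-2,43.7228,10.4017
"""


def load_readings(text):
    """Map every row onto the Reading model, converting columns to their types."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        Reading(row["sensor_id"], float(row["latitude"]), float(row["longitude"]))
        for row in rows
    ]


readings = load_readings(RAW)
print(readings[0])
```

Because every row has the same columns, a missing or non-numeric value fails loudly at load time instead of silently producing a record with a different shape.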
  • 24. UNSTRUCTURED DATA
    Unstructured data is data that cannot be contained in a row-column format and doesn’t have a data model.
    Examples of sources:
    ▸ Web pages
    ▸ Images (JPEG, GIF, PNG, etc.)
    ▸ Videos
    ▸ Memos
    ▸ Reports
    ▸ Word documents and PowerPoint presentations
    ▸ Surveys
  • 25. SEMI-STRUCTURED DATA
    Semi-structured data is a mix of the previous two: it has some defining or consistent characteristics but doesn’t conform to a rigid structure.
    Examples of sources:
    ▸ E-mails
    ▸ XML and other markup languages
    ▸ Binary executables
    ▸ TCP/IP packets
    ▸ Zipped files
    ▸ JSON
    ▸ Web pages
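JSON, one of the sources listed above, shows the idea well. In the sketch below (records and field names are invented for illustration) every record carries an `id`, which is the consistent characteristic, while the remaining fields vary from record to record, so the data would not fit a single fixed set of columns without gaps.

```python
import json

# Made-up records: all of them have "id", everything else varies.
RAW = """[
  {"id": 1, "name": "Ada", "country": "Italy"},
  {"id": 2, "name": "Lin", "tags": ["design", "data"]},
  {"id": 3, "contact": {"email": "someone@example.com"}}
]"""

records = json.loads(RAW)

# Every key that appears in at least one record.
all_keys = sorted({key for record in records for key in record})

# Keys that appear in every record: the consistent part of the structure.
shared_keys = sorted(set.intersection(*(set(record) for record in records)))

print("all keys:   ", all_keys)
print("shared keys:", shared_keys)
```

Turning such records into a table means choosing what to do with the missing and nested fields, which is exactly the decision that structured data makes up front.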
  • 26. DATA CLEANING - TIME
  • 27. DATA CLEANING
  • 28. DATA CLEANING - COUNTRY
  • 29. DATA CLEANING
    ▸ Italy - 3
    ▸ Italy (with space) - 2
    ▸ Italia
    ▸ Pisa, Italy
    ▸ Milan
    ▸ Milan italy
    ▸ South Korea - 2
    ▸ South Korea
    ▸ Egypt
    ▸ Mexico
    ▸ Serbia
    ▸ The Netherlands
    ▸ Norway
    ▸ Taiwan
    ▸ Taiwan
    ▸ Costa Rica
    ▸ Macedonia
    ▸ Turkey
    ▸ Australia
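The answers above describe far fewer countries than there are distinct strings. One possible way to clean them, sketched under assumptions: trim and collapse whitespace, ignore case, then look the result up in a hand-written alias table. The `ALIASES` table and the `clean_country` helper are hypothetical, and deciding that "Milan" should count as "Italy" is a judgment call that a person has to make, not something the code can infer.

```python
from collections import Counter

# Hypothetical alias table: raw answers (lowercased) mapped to one country name.
ALIASES = {
    "italia": "Italy",
    "pisa, italy": "Italy",
    "milan": "Italy",
    "milan italy": "Italy",
}


def clean_country(raw):
    """Normalize whitespace and case, then apply the alias table."""
    key = " ".join(raw.split()).lower()  # "Italy " and "italy" both become "italy"
    if key in ALIASES:
        return ALIASES[key]
    return key.title()  # "south korea" becomes "South Korea"


# A subset of the answers from the slide, including a trailing space.
answers = [
    "Italy", "Italy ", "Italia", "Pisa, Italy", "Milan", "Milan italy",
    "South Korea", "South Korea ", "Taiwan", "Taiwan",
]

counts = Counter(clean_country(answer) for answer in answers)
print(counts.most_common())
```

Ten raw answers collapse into three countries. Keeping the alias table explicit, rather than fixing values by hand in the spreadsheet, leaves a record of every cleaning decision.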
  • 30. DATA CLEANING - NAME
    ▸ Greta Scuso
    ▸ Vittoria
    ▸ Soonji Kwun
    ▸ Rewan
    ▸ Aurora
    ▸ Neithan
    ▸ Nadja
    ▸ Andrea
    ▸ Nadia van 't Klooster
    ▸ Yeso Lee
    ▸ Hanne Heimdal
    ▸ Hsin Yi Chen
    ▸ Yuri Michieletti
    ▸ Alessandro Calzoni
    ▸ Giulia Filippi
    ▸ Elena Fantini
    ▸ Stasha
    ▸ Eugenio Tonoli
    ▸ Ahmet Karan Oner
    ▸ Eileen
    ▸ Matteo
  • 32. WHAT IS A DB?
    According to Wikipedia, “a database is an organized collection of data, generally stored and accessed electronically from a computer system”. Ideally it is organized in such a way that it can be easily accessed, managed, and updated.
  • 33. DB JARGON: QUERY
    To perform an operation on data stored in a db, you run a query. This is typically one of SELECT, INSERT, UPDATE, or DELETE.
    Example: SELECT wakeUpTime FROM dCDCourse
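To try the four query types, here is a minimal sketch using Python's built-in `sqlite3` module and an in-memory database. The table and column names come from the slide's example; the `name` column and the sample rows are assumptions added so there is something to query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway database, nothing is written to disk

# Assumed schema: the slide only shows the table name and the wakeUpTime column.
conn.execute("CREATE TABLE dCDCourse (name TEXT, wakeUpTime TEXT)")

# INSERT: add rows (made-up students).
conn.executemany(
    "INSERT INTO dCDCourse (name, wakeUpTime) VALUES (?, ?)",
    [("student A", "07:30"), ("student B", "08:15")],
)

# UPDATE: change an existing row.
conn.execute("UPDATE dCDCourse SET wakeUpTime = ? WHERE name = ?", ("07:00", "student A"))

# DELETE: remove a row.
conn.execute("DELETE FROM dCDCourse WHERE name = ?", ("student B",))

# SELECT: the query from the slide, reading one column from every remaining row.
wake_up_times = [row[0] for row in conn.execute("SELECT wakeUpTime FROM dCDCourse")]
print(wake_up_times)
```

The `?` placeholders let the database driver insert the values safely instead of building the SQL text by hand.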
  • 34. DB JARGON: TRANSACTION
    When you need to perform a sequence of operations as a single unit of work, that’s a transaction. If one of you decides to withdraw from this course, I need to update both the list of students enrolled in the course and the total count of students. If I didn’t operate inside a transaction, there would be a moment when one piece of information (the list of students or the total count) is wrong.
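The withdrawal scenario can be sketched with Python's built-in `sqlite3` module. The table names (`enrolled`, `course`), the `withdraw` helper and the sample students are hypothetical. The one library behavior relied on is that a connection used as a context manager (`with conn:`) commits when the block finishes and rolls back if it raises.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrolled (student TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE course (id INTEGER PRIMARY KEY, total INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO enrolled VALUES (?)",
    [("student A",), ("student B",), ("student C",)],
)
conn.execute("INSERT INTO course VALUES (1, 3)")
conn.commit()


def withdraw(conn, student):
    """Remove a student and decrement the total as one unit of work."""
    with conn:  # commit on success, roll back if anything in the block raises
        cur = conn.execute("DELETE FROM enrolled WHERE student = ?", (student,))
        if cur.rowcount != 1:
            raise ValueError(f"{student!r} is not enrolled")
        conn.execute("UPDATE course SET total = total - 1 WHERE id = 1")


withdraw(conn, "student B")

listed = conn.execute("SELECT COUNT(*) FROM enrolled").fetchone()[0]
total = conn.execute("SELECT total FROM course WHERE id = 1").fetchone()[0]
print(listed, total)
```

Both changes become visible together at commit time, so no other reader of the database sees a list of two students next to a total of three.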
  • 35. DB JARGON: ACID
    Wikipedia: ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties of database transactions intended to guarantee validity even in the event of errors, power failures, etc.
    ▸ Atomicity means that either all of the transaction succeeds or none of it does.
    ▸ Consistency means that every transaction takes the data from one valid state to another.
    ▸ Isolation means that transactions run as if in isolation: no transaction is affected by any other transaction in progress.
    ▸ Durability means that, once a transaction is committed, it will remain permanently in the system.
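Atomicity is the easiest of the four properties to observe directly. The sketch below, again with Python's built-in `sqlite3` module and hypothetical table names, starts from a deliberately wrong total of 0 and a `CHECK` constraint that forbids negative totals. The second statement of the transaction fails, and the first one is undone with it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrolled (student TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE course ("
    "id INTEGER PRIMARY KEY, "
    "total INTEGER NOT NULL CHECK (total >= 0))"  # the rule that defines a valid state
)
conn.execute("INSERT INTO enrolled VALUES ('student A')")
conn.execute("INSERT INTO course VALUES (1, 0)")  # deliberately wrong count
conn.commit()

try:
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("DELETE FROM enrolled WHERE student = 'student A'")
        # total would become -1, so the CHECK constraint rejects this statement
        conn.execute("UPDATE course SET total = total - 1 WHERE id = 1")
except sqlite3.IntegrityError:
    rolled_back = True
else:
    rolled_back = False

still_enrolled = conn.execute("SELECT COUNT(*) FROM enrolled").fetchone()[0]
print(rolled_back, still_enrolled)
```

The DELETE had already run when the UPDATE failed, yet the student is still in the list afterwards: the transaction took effect as a whole or not at all. The constraint doing the rejecting is a small instance of consistency, since the database refuses to move into a state that breaks its own rules.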