5. Why StatMine?
• The mission of Statistics Netherlands (SN) is to produce
relevant information for:
• Policy makers
• Journalists
• Citizens
• Enterprises
• Economists
• Social scientists
• Etc.
an exploration of dissemination data: StatMine
6. Numbers ≠ Information
StatLine is SN’s online database (over 1 billion figures)
We know from a user study that:
1. Many interesting patterns in StatLine are not
spotted by users
2. Many important topics in StatLine are scattered
across multiple tables
7. Example of problem 2
• A policymaker interested in patients with diabetes needs data on:
• Visits to a medical doctor
• Hospital admissions
• Mortality
• Medication consumption (insulin)
• Obesity
These are all different statistical products (from different
sources)!
8. Data analysis = Data insight
The goal of the StatMine research project is to provide data
insight by:
• (I) Using data visualisation
• (II) Combining data table fragments
• (III) Deriving variables
All hypotheses are (or will be) tested with a prototype, with
internal and external users.
(I): tested and successful
(II), (III), …: work in progress
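One way to picture step (II) is as a join of table fragments on the dimensions they share. The sketch below is only an illustration under stated assumptions: the fragment names, the (year, region) key, and all figures are hypothetical and do not come from StatLine, and the inner join shown is not necessarily how StatMine combines fragments.

```python
# Hypothetical fragments: each maps a (year, region) key to one topic value.
# All figures are invented for illustration only.
hospital_admissions = {
    (2012, "North"): 120,
    (2012, "South"): 95,
    (2013, "North"): 130,
}
mortality = {
    (2012, "North"): 14,
    (2013, "North"): 15,
    (2013, "South"): 11,
}


def glue(fragments):
    """Join fragments on their shared dimension key.

    `fragments` maps a topic name to a {key: value} table. Only keys
    present in every fragment are kept (an inner join).
    """
    tables = list(fragments.values())
    shared_keys = set(tables[0])
    for table in tables[1:]:
        shared_keys &= set(table)
    return {
        key: {topic: table[key] for topic, table in fragments.items()}
        for key in sorted(shared_keys)
    }


glued = glue({"admissions": hospital_admissions, "mortality": mortality})
for key, topics in glued.items():
    print(key, topics)
```

Cells that exist in only one fragment are dropped here; keeping them with a missing-value marker would be an equally reasonable choice.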
9. Chart types
• Bar chart: comparison
• Line chart: development
• Mosaic chart: structure
• Bubble/scatter chart: correlation
10. Chart type – bar chart
11. Chart type – line chart
12. Chart type – mosaic chart
13. Chart type – bubble chart
14. Small multiples
Split chart into different subpopulations
Goal: compare subpopulations
Very few tools offer this functionality!
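The data preparation behind small multiples can be sketched as a partition of the records by one dimension, giving one series per panel. This is a minimal sketch and not the prototype's code: the field names, the `small_multiples` helper, and the values are hypothetical.

```python
from collections import defaultdict

# Hypothetical records; values invented for illustration.
rows = [
    {"year": 2011, "gender": "men", "mean_income": 30.0},
    {"year": 2012, "gender": "men", "mean_income": 31.0},
    {"year": 2011, "gender": "women", "mean_income": 24.0},
    {"year": 2012, "gender": "women", "mean_income": 25.5},
]


def small_multiples(rows, split_by, x, y):
    """Return one (x, y) series per value of the `split_by` dimension."""
    panels = defaultdict(list)
    for row in rows:
        panels[row[split_by]].append((row[x], row[y]))
    # Sort each series by x so every panel is drawn in the same order.
    return {name: sorted(points) for name, points in panels.items()}


panels = small_multiples(rows, split_by="gender", x="year", y="mean_income")

# One y-axis range shared by all panels keeps subpopulations comparable.
all_y = [y for points in panels.values() for _, y in points]
y_range = (min(all_y), max(all_y))
```

Each entry in `panels` would be drawn as its own small chart. Sharing `y_range` across panels is a common choice when the goal is comparison, as it is here.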
16. Composing a chart
Example:
• Categorical variables / dimensions: Year x Region x Gender x Age
• Numeric variables / topics:
• Count
• Mean income
• Employment
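Composing a chart from such a table amounts to choosing which dimensions to show and aggregating a topic over the rest. The sketch below assumes a hypothetical list of cells with the four dimensions from the example and an additive topic (`count`); the `compose` helper and all figures are invented for illustration.

```python
from collections import defaultdict

# Hypothetical cube cells: four dimensions plus one numeric topic.
# The counts are invented for illustration.
DIMENSIONS = ("year", "region", "gender", "age")
cells = [
    {"year": 2012, "region": "North", "gender": "men", "age": "15-44", "count": 10},
    {"year": 2012, "region": "North", "gender": "women", "age": "15-44", "count": 12},
    {"year": 2012, "region": "South", "gender": "men", "age": "45+", "count": 7},
    {"year": 2013, "region": "South", "gender": "women", "age": "45+", "count": 9},
]


def compose(cells, dimensions, topic):
    """Sum an additive topic over every dimension not selected for the chart."""
    unknown = set(dimensions) - set(DIMENSIONS)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    totals = defaultdict(int)
    for cell in cells:
        key = tuple(cell[d] for d in dimensions)
        totals[key] += cell[topic]
    return dict(totals)


by_year_gender = compose(cells, dimensions=("year", "gender"), topic="count")
```

Plain summation only suits additive topics such as Count. A topic like Mean income would need a count-weighted average instead.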
17. Prototype
• Built in PHP and JavaScript (D3)
• Imported 10 StatLine example tables
• Complex tables, e.g.
• Labor participation x gender x cohorts
• Labor market flow per quarter (employed/unemployed)
• Enterprise birth, death and growth x economic activity x
quarter
• Tested on:
• Internal users
• Owners of data
19. Evaluation
• Part I: very successful
• Data owners want to use the prototype to check their own
data
• Provides insights
• Easy detection of anomalies
20. Work in progress
• (II) Combining different table fragments
• Or: “How to glue statistical tables?”
• Testing with policymakers (end of this year)
• (III) Deriving variables + analysis
• Absolute vs. relative (per population unit)
• Turnover / number of employees
• Etc.
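The derived variables in (III) are simple ratios of existing topics. The sketch below shows the two cases named above, assuming hypothetical region figures and helper names (`per_1000`, `ratio`); none of the numbers come from StatLine.

```python
# Hypothetical figures per region, invented for illustration.
regions = {
    "North": {"population": 500_000, "hospital_admissions": 40_000},
    "South": {"population": 2_000_000, "hospital_admissions": 120_000},
}


def per_1000(count, population):
    """Relative figure: events per 1,000 inhabitants."""
    return 1000 * count / population


def ratio(numerator, denominator):
    """Derived variable such as turnover per employee; None if undefined."""
    return numerator / denominator if denominator else None


rates = {
    name: per_1000(r["hospital_admissions"], r["population"])
    for name, r in regions.items()
}
turnover_per_employee = ratio(1_200_000, 8)
```

In this made-up data South has more admissions in absolute terms, but North has the higher rate per 1,000 inhabitants, which is the kind of difference the absolute vs. relative distinction is meant to expose.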