10 Steps for High-Quality Datasets: a "Divide et impera" approach to #data.
I want to share my experience in producing high-quality datasets for data analysts.
How can #standardization improve #dataquality of #datasets for #dataanalysis?
10 STEPS FOR HIGH-QUALITY DATASETS
BY PIER GIUSEPPE DE MEO
#1
Keep your Datasets separate.
#2
Prepare a toolbox with a set of transformation processes (procedures, functions,
scripts, etc.) that can be reused.
#3
Logically group the types of transformations, based on categories (e.g. missing
values, decodes, normalization, etc.).
#4
For every category identified, select a subset of data in a Dataset on which to apply
this type of transformation: repeat this process on all your Datasets separately.
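A minimal sketch of steps #2 to #4, assuming each Dataset is held as a list of dictionaries. The function names (`fill_missing`, `decode`, `apply_to_subset`), the `TOOLBOX` registry and the sample rows are hypothetical and only illustrate the idea of a reusable toolbox, grouped by category and applied to a selected subset of one Dataset.

```python
# Hypothetical toolbox: reusable transformations, grouped by category.
def fill_missing(row, field, default):
    """Missing values: replace None or an empty string with a default."""
    if row.get(field) in (None, ""):
        return {**row, field: default}
    return row


def decode(row, field, mapping):
    """Decodes: map a coded value to its label; unknown codes stay as-is."""
    return {**row, field: mapping.get(row[field], row[field])}


TOOLBOX = {
    "missing_values": fill_missing,
    "decodes": decode,
}


def apply_to_subset(dataset, predicate, transform, **kwargs):
    """Apply one transformation only to the rows selected by predicate."""
    return [transform(row, **kwargs) if predicate(row) else row for row in dataset]


# Illustrative Dataset with made-up values.
sales = [
    {"id": 1, "country": "IT", "amount": 10.0},
    {"id": 2, "country": "FR", "amount": None},
    {"id": 3, "country": "IT", "amount": None},
]

# For each category: select the subset, then apply the transformation.
sales = apply_to_subset(
    sales,
    predicate=lambda r: r["amount"] is None,
    transform=TOOLBOX["missing_values"],
    field="amount",
    default=0.0,
)
sales = apply_to_subset(
    sales,
    predicate=lambda r: r["country"] == "IT",
    transform=TOOLBOX["decodes"],
    field="country",
    mapping={"IT": "Italy"},
)
```

The same two calls can be repeated on every other Dataset separately, which is the point of keeping the transformations in one reusable place.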
#5
For every Dataset, if needed, enrich its data with derived information
(e.g. calculated fields, extracted sub-information, etc.).
#6
Define the minimum level of detail shared across all Datasets (e.g. single
transaction per day, groups of transactions per month, etc.).
#7
For every Dataset, group the data at the same level of granularity.
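As a rough illustration of steps #6 and #7, the sketch below assumes that the shared granularity is "one row per month" and that each transaction carries a `day` and an `amount`. The helper names (`group_at`, `by_month`) and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date


def group_at(rows, key_fn, value_field):
    """Sum value_field for all rows that share the same granularity key."""
    totals = defaultdict(float)
    for row in rows:
        totals[key_fn(row)] += row[value_field]
    return dict(totals)


def by_month(row):
    """Granularity key assumed for this example: (year, month)."""
    return (row["day"].year, row["day"].month)


# Illustrative transactions with made-up values.
transactions = [
    {"day": date(2024, 1, 3), "amount": 10.0},
    {"day": date(2024, 1, 3), "amount": 5.0},
    {"day": date(2024, 1, 20), "amount": 7.5},
    {"day": date(2024, 2, 1), "amount": 4.0},
]

monthly_sales = group_at(transactions, by_month, "amount")
```

Passing the key function as a parameter means the same helper can group every Dataset at whatever level was agreed on in step #6.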
#8
Join all formatted Datasets into a single Master Dataset, based on the granularity defined.
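One possible way to picture the join, assuming every Dataset has already been reduced to a mapping from the shared granularity key to a single value. The function `join_on_granularity`, the `period` field and the sample figures are hypothetical. The sketch uses an outer join, so keys missing from one Dataset show up as `None` in the Master Dataset.

```python
def join_on_granularity(datasets):
    """Outer-join {name: {key: value}} Datasets into one row per key."""
    keys = sorted(set().union(*(data.keys() for data in datasets.values())))
    return [
        {"period": key, **{name: data.get(key) for name, data in datasets.items()}}
        for key in keys
    ]


# Illustrative Datasets, both already grouped per (year, month).
monthly_sales = {(2024, 1): 22.5, (2024, 2): 4.0}
monthly_visits = {(2024, 1): 300, (2024, 3): 120}

master = join_on_granularity({"sales": monthly_sales, "visits": monthly_visits})
```

The `None` values left by the outer join are the kind of subset that a missing-values transformation from the toolbox could then handle on the Master Dataset.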
#9
In the Master Dataset produced, check whether there exists a subset of data on
which to apply any of the transformations in the toolbox.
#10
In the Master Dataset produced, if needed, enrich the data with some extra
information (e.g. metrics from various Datasets combined to form a KPI,
decryption based on a combination of fields, etc.).
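To make the KPI example in step #10 concrete, here is a small sketch that assumes the Master Dataset holds `orders` from one Dataset and `visits` from another. The KPI (`conversion_rate`), the function name and the sample rows are hypothetical.

```python
def add_conversion_rate(row):
    """Add a KPI combining two metrics; None when it cannot be computed."""
    orders, visits = row.get("orders"), row.get("visits")
    if orders is None or not visits:
        # Missing metric or zero visits: leave the KPI undefined.
        return {**row, "conversion_rate": None}
    return {**row, "conversion_rate": orders / visits}


# Illustrative Master Dataset rows with made-up values.
master_rows = [
    {"period": (2024, 1), "orders": 30, "visits": 300},
    {"period": (2024, 2), "orders": 4, "visits": None},
]

master_rows = [add_conversion_rate(row) for row in master_rows]
```

Rows where one of the source metrics is missing keep an undefined KPI rather than a misleading number.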
Knowledge Share Series 1: DATASETS
A "Divide et impera" approach in producing high-quality
Datasets for data analysts.