GROUP 1
• Analytics Life Cycle
• Business Understanding
• Data Understanding
• Data Preparation
• Modeling
• Business understanding – What does the business need?
• Data understanding – What data do we have / need? Is it clean?
• Data preparation – How do we organize the data for modeling?
• Modeling – What modeling techniques should we apply?
• Evaluation – Which model best meets the business objectives?
• Deployment – How do stakeholders access the results?
ANALYTICS LIFE CYCLE
[Figure: cycle diagram with the phases business understanding, data understanding, data preparation, modeling, evaluation, and deployment]
Business understanding:
• ensures that the project is aligned with the business objectives
• helps identify the data sources relevant to the problem, saving time and resources later
• establishes the success criteria for the project
Benefits:
• Improved project outcomes
• Reduced risk
• Efficient use of resources
• Improved communication
Key activities:
• Identifying the business problem
• Defining project objectives
• Determining success criteria
• Assessing project feasibility
• Identifying data sources
• Involving stakeholders
• Conducting a SWOT analysis
• Defining data mining goals
• Communicating findings to stakeholders
Data understanding: a phase that involves gaining familiarity with the data available for analysis.
• Data Collection - understanding the sources, formats, and access methods
• Data Description - understanding the data's structure and composition
• Data Exploration - performing initial exploratory data analysis
• Verify Data Quality - assessing the quality of the data
• Initial Insights and Hypotheses Generation - a starting point for further analysis
• Documentation - ensures that the insights gained are captured
The primary objective of the Data Understanding
phase is to gain a comprehensive understanding
of the data available for analysis. This
understanding informs subsequent phases of the
CRISP-DM methodology, particularly data
preparation and modeling.
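The description, exploration, and quality-verification tasks can be illustrated with a minimal sketch in plain Python. The sample records, field names, and the helper functions `describe`, `explore`, and `verify_quality` are hypothetical and are not part of the original slides; real projects would typically use a data analysis library instead.

```python
from collections import Counter
from statistics import mean

# Hypothetical sample records; in practice these come from the collected sources.
records = [
    {"customer_id": 1, "age": 34, "region": "north"},
    {"customer_id": 2, "age": None, "region": "south"},
    {"customer_id": 3, "age": 51, "region": "north"},
    {"customer_id": 3, "age": 51, "region": "north"},  # exact duplicate row
]

def describe(rows):
    """Data description: number of rows and the fields present."""
    fields = sorted({key for row in rows for key in row})
    return {"rows": len(rows), "fields": fields}

def explore(rows, numeric_field, category_field):
    """Data exploration: a simple average and a frequency count."""
    values = [row[numeric_field] for row in rows if row[numeric_field] is not None]
    return {
        "mean": mean(values),
        "counts": Counter(row[category_field] for row in rows),
    }

def verify_quality(rows):
    """Data quality: missing values per field and exact duplicate rows."""
    missing = Counter(key for row in rows for key, value in row.items() if value is None)
    seen = Counter(tuple(sorted(row.items())) for row in rows)
    duplicates = sum(count - 1 for count in seen.values())
    return {"missing": dict(missing), "duplicates": duplicates}

description = describe(records)
exploration = explore(records, "age", "region")
quality = verify_quality(records)
```

The three result dictionaries are the kind of notes that the documentation task would capture before data preparation begins.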
Data preparation is an important
step in data analytics. It aims at
assessing and improving the
quality of data for secondary
statistical analysis. With this, the
data is better understood and
the data analysis is performed
more accurately and efficiently.
TASKS FOR DATA PREPARATION
• Data Cleaning – deals with missing data, noise, and outliers, and corrects inconsistencies so that the data is accurate.
• Data Integration – combines data derived from various sources (such as databases and flat files) into a consistent data set for both operational and analytical use.
• Data Transformation – transforms the data values into a format, scale, or unit that is more suitable for analysis.
• Data Reduction – obtains a reduced representation of the data set that is much smaller in volume yet produces the same (or almost the same) analytical results.
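The four tasks can be sketched as a small pipeline in plain Python. Everything here is an illustrative assumption rather than content from the slides: the two sample sources, the choice of mean imputation for cleaning, min-max scaling for transformation, and field selection as the form of reduction.

```python
from statistics import mean

# Hypothetical inputs standing in for two sources (e.g. a database table and a flat file).
customers = [
    {"id": 1, "age": 30},
    {"id": 2, "age": None},  # missing value
    {"id": 3, "age": 50},
]
orders = [
    {"id": 1, "total": 100.0},
    {"id": 2, "total": 300.0},
    {"id": 3, "total": 200.0},
]

def clean(rows, field):
    """Data cleaning: fill missing values with the mean of the observed values."""
    observed = [row[field] for row in rows if row[field] is not None]
    fill = mean(observed)
    return [{**row, field: row[field] if row[field] is not None else fill} for row in rows]

def integrate(left, right, key):
    """Data integration: join two sources on a shared key into one data set."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup[row[key]]} for row in left if row[key] in lookup]

def transform(rows, field):
    """Data transformation: rescale a field to the 0..1 range (min-max scaling)."""
    values = [row[field] for row in rows]
    low, high = min(values), max(values)
    span = (high - low) or 1  # avoid dividing by zero when all values are equal
    return [{**row, field: (row[field] - low) / span} for row in rows]

def reduce_fields(rows, keep):
    """Data reduction: keep only the fields needed for modeling."""
    return [{name: row[name] for name in keep} for row in rows]

prepared = reduce_fields(
    transform(integrate(clean(customers, "age"), orders, "id"), "total"),
    keep=["age", "total"],
)
```

Each function returns new rows instead of modifying its input, so the raw sources stay available for comparison after every step.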
FOUR TASKS:
• Selecting modeling techniques
• Generating a test design
• Building model(s)
• Assessing model(s)
As the first step in modeling, select the actual
modeling technique that is to be used initially.
During the business understanding stage, you may
already have picked a tool.
Deliverables for this task include two reports:
• Modeling technique: refers to the actual
modeling technique that is used.
• Modeling assumptions: specific assumptions
about the data, data quality or the data format.
Prior to building a model, a procedure
needs to be defined to test the model’s
quality and validity.
To generate a comprehensive test design:
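The slides do not list the steps here. As an assumption, one common test design is a holdout split: part of the data is set aside before modeling and used only to measure error. The sketch below uses hypothetical data, and the function names `train_test_split` and `error_rate` are chosen for illustration.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def error_rate(model, rows):
    """Share of rows where the model's prediction differs from the label."""
    wrong = sum(1 for features, label in rows if model(features) != label)
    return wrong / len(rows)

# Hypothetical labeled data: (feature, label) pairs.
data = [(x, x >= 5) for x in range(20)]
train, test = train_test_split(data)
```

Fixing the seed makes the split repeatable, so different models are compared on the same held-out rows.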
Run the modeling tool on the prepared dataset to create one or more models.
• Parameter settings – With any modeling tool there are often a large number of parameters that can be adjusted.
• Models – These are the actual models produced by the modeling tool, not a report on the models.
• Model descriptions – Describe the resulting models, report on the interpretation of the models and document any difficulties encountered with their meanings.
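The three outputs of this task can be kept together in one record per run. The sketch below is a toy illustration under stated assumptions: the "modeling tool" is a hypothetical one-parameter threshold rule, and `BuiltModel` and `build_threshold_model` are names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class BuiltModel:
    parameters: dict      # parameter settings used for this run
    predict: Callable     # the model itself, not a report on it
    description: str      # short note on how to interpret the model

def build_threshold_model(threshold):
    """A toy modeling tool: predicts True when the input reaches the threshold."""
    return BuiltModel(
        parameters={"threshold": threshold},
        predict=lambda x: x >= threshold,
        description=f"Predicts True for values of at least {threshold}.",
    )

# Run the tool once per hypothetical parameter setting.
models = [build_threshold_model(t) for t in (3, 5, 8)]
```

Storing the parameter settings next to each model makes it possible to reproduce a run or revise its settings later.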
At this stage you should rank the models and assess
them according to the evaluation criteria.
In most data mining projects a single technique is
applied more than once and data mining results are
generated with several different techniques.
• Model assessment
• Revised parameter settings
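Ranking models against an evaluation criterion can be sketched as follows. Accuracy is assumed as the criterion, and the held-out rows and candidate thresholds are hypothetical; a real project would use the success criteria agreed during business understanding.

```python
def accuracy(predict, rows):
    """Fraction of rows where the prediction matches the label."""
    return sum(1 for x, label in rows if predict(x) == label) / len(rows)

def rank_models(candidates, rows):
    """Return (name, score) pairs sorted from best to worst accuracy."""
    scores = [(name, accuracy(predict, rows)) for name, predict in candidates.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Hypothetical held-out data and candidate models (thresholds are illustrative).
test_rows = [(x, x >= 5) for x in range(10)]
candidates = {
    "threshold=3": lambda x: x >= 3,
    "threshold=5": lambda x: x >= 5,
    "threshold=8": lambda x: x >= 8,
}
ranking = rank_models(candidates, test_rows)
```

The ranking is the model assessment; the gap between the best and the other settings suggests which parameter values to revise in the next modeling round.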
• Give 1 benefit of Business Understanding.
• Give 1 key activity of the Business Understanding phase.
• Name 2 phases of the analytics life cycle.
• What does CRISP-DM stand for?
• What is the process in data preparation where data from different sources is integrated into a single data store?
• Give 2 tasks of Modeling.
• What is the phase that involves gaining familiarity with the data available for analysis?
• What is the primary objective of the Data Understanding phase?
Any of these:
• Improved project outcomes
• Reduced risk
• Efficient use of resources
• Improved communication
Any of these:
• Identifying the business problem
• Defining project objectives
• Determining success criteria
• Assessing project feasibility
• Identifying data sources
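Any two of these phases: business understanding, data understanding, data preparation, modeling, evaluation, deployment
• Cross-Industry Standard Process for Data Mining
• Data Integration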
Any of these:
• Selecting modeling techniques
• Generating a test design
• Building model(s)
• Assessing model(s)
• Data Understanding
• To gain a comprehensive understanding of the data

Group 1 Report CRISP - DM METHODOLOGY.pptx
