VihangShah
Data mining
Introduction
Data mining is the process of extracting useful information from huge databases. It
automatically searches large data sets to discover patterns and trends that go beyond
simple analysis. Data mining is also known as Knowledge Discovery in Data (KDD).
Data Mining Process
Problem Definition
In this stage, the need for the project, its objectives and its requirements are defined.
From these, a basic plan is drawn up and implemented at a preliminary level.
Figure: the four process stages in order: Problem Definition, Data Gathering & Preparation,
Model Building & Evaluation, Knowledge Deployment.
Data Gathering & Preparation
The requirements were collected in the earlier phase. In this phase, additional data may be
gathered, or some data may be omitted, before the later phases. This is also the time to
identify data quality problems. In short, data preparation can significantly improve the
information that can be discovered through data mining. The outcome of data preparation is
the final data set.
Once the data sources are identified, the data needs to be selected, cleaned, constructed and
formatted into the desired form.
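The select, clean, construct and format steps can be sketched in a few lines of Python. This is only an illustration: the records, the field names and the `high_value` attribute are invented here and do not come from the slides.

```python
# Hypothetical raw records; every field name and value is invented for illustration.
raw_records = [
    {"customer": " Alice ", "age": "34", "spend": "120.50", "notes": "vip"},
    {"customer": "Bob", "age": "", "spend": "80.00", "notes": ""},
    {"customer": "alice", "age": "34", "spend": "120.50", "notes": "dup"},
    {"customer": "Cara", "age": "29", "spend": "n/a", "notes": ""},
    {"customer": "dan", "age": "41", "spend": "45", "notes": ""},
]


def prepare(records):
    """Select, clean, construct and format records into a final data set."""
    prepared, seen = [], set()
    for rec in records:
        # Select: keep only the fields the later phases need (drop "notes").
        name, age, spend = rec["customer"], rec["age"], rec["spend"]
        # Clean: normalise the name and skip rows with missing or invalid numbers.
        name = name.strip().title()
        try:
            age, spend = int(age), float(spend)
        except ValueError:
            continue
        if name in seen:  # skip duplicate customers
            continue
        seen.add(name)
        # Construct: derive a new attribute from an existing one (assumed rule).
        high_value = spend >= 100.0
        # Format: emit every record in the same layout.
        prepared.append(
            {"customer": name, "age": age, "spend": spend, "high_value": high_value}
        )
    return prepared


final_data_set = prepare(raw_records)
for row in final_data_set:
    print(row)
```

Only the rows for Alice and Dan survive: the others are dropped because of a missing age, a duplicate name or a non-numeric spend.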
Model Building & Evaluation
In this phase, various modeling techniques are selected and applied in order to obtain
optimal values. Tests are generated to validate the quality and validity of the model. One
or more models are created and run on the prepared data set.
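As a rough sketch of this phase, the Python below builds two candidate models on a training portion of a prepared data set and compares them on held-out test rows. The data, the two model types and the accuracy measure are assumptions chosen for illustration; the slides do not prescribe any of them.

```python
# Hypothetical prepared data: (monthly_spend, churned) pairs, invented for illustration.
data = [(20, 1), (25, 1), (30, 1), (60, 0), (65, 0),
        (70, 0), (80, 0), (90, 0), (28, 1), (75, 0)]

train, test = data[:8], data[8:]  # simple hold-out split


def fit_majority(rows):
    """Baseline model: always predict the most common training label."""
    labels = [y for _, y in rows]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority


def fit_threshold(rows):
    """Pick the spend threshold that best separates the labels in training."""
    best_t, best_acc = None, -1.0
    for t, _ in rows:
        acc = sum((1 if x <= t else 0) == y for x, y in rows) / len(rows)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return lambda x: 1 if x <= best_t else 0


def accuracy(model, rows):
    """Fraction of rows for which the model predicts the true label."""
    return sum(model(x) == y for x, y in rows) / len(rows)


# Build more than one model, then validate each on data it has not seen.
models = {"majority": fit_majority(train), "threshold": fit_threshold(train)}
scores = {name: accuracy(model, test) for name, model in models.items()}
best_model = max(scores, key=scores.get)
print(scores, best_model)
```

Evaluating on rows that were not used for fitting is what makes the comparison a test of the model rather than of its memory.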
Knowledge Deployment
The knowledge or information gained from the data mining process needs to be presented in
such a way that it can be used whenever it is needed. In this phase, plans for deployment,
maintenance and monitoring have to be created for implementation and future support.
What Can Data Mining Do and Not Do?
Do:
• Data mining can help find patterns and relationships within your data.
• Data mining helps you discover hidden information in your data.
• Data mining can yield optimized results from huge databases.
• Data mining can help you analyze data for future use.
Not Do:
• Data mining cannot work entirely on its own, without human guidance.
• Data mining cannot tell you the value of the discovered information to your organization.
• Data mining does not eliminate the need to know your business and understand your data.
Data Mining Techniques
Data mining has six basic techniques: association, classification, clustering, prediction,
sequential patterns and decision trees.
Association
Association works on the relationships between items, which is why it is also called the
relation technique. It is used in market analysis to identify sets of items that customers
frequently purchase together.
Retailers use the association technique to research customers' buying habits. Based on
historical sales data, a retailer might find that customers who buy bread also buy butter.
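The bread and butter example can be made concrete with the two measures commonly used for association rules, support and confidence. The baskets below are invented, and the helper names are hypothetical; this is a minimal sketch rather than a full rule-mining algorithm.

```python
# Hypothetical shopping baskets, invented for illustration.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def count(itemset, baskets):
    """Number of baskets that contain every item in itemset."""
    return sum(1 for basket in baskets if itemset <= basket)


def support(itemset, baskets):
    """Fraction of all baskets that contain the itemset."""
    return count(itemset, baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the fraction that also contain the consequent."""
    return count(antecedent | consequent, baskets) / count(antecedent, baskets)


rule_support = support({"bread", "butter"}, transactions)          # 3 of 5 baskets
rule_confidence = confidence({"bread"}, {"butter"}, transactions)  # 3 of 4 bread baskets
print(f"bread -> butter: support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

A retailer would typically keep only rules whose support and confidence both exceed chosen thresholds.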
Classification
Classification assigns each item to one of a predefined set of classes or groups. For example,
given the records of all employees who have left the company, classification can be used to
predict who will probably leave the company in a future period.
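One simple way to illustrate the employee example is a k-nearest-neighbour classifier, which assigns a new record to the class held by most of its closest historical records. The choice of this method, the two features and all values are assumptions made for the sketch.

```python
# Hypothetical history: (years_at_company, satisfaction 0-10) with a known class.
history = [
    ((1, 3), "left"), ((2, 2), "left"), ((1, 4), "left"),
    ((6, 8), "stayed"), ((8, 7), "stayed"), ((5, 9), "stayed"),
]


def classify(employee, labelled, k=3):
    """Assign the majority class among the k nearest labelled records."""
    def dist2(a, b):
        # Squared Euclidean distance; enough for ranking neighbours.
        return sum((p - q) ** 2 for p, q in zip(a, b))

    nearest = sorted(labelled, key=lambda row: dist2(row[0], employee))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


print(classify((2, 3), history))  # a short-tenure, low-satisfaction employee
print(classify((7, 8), history))  # a long-tenure, high-satisfaction employee
```

The classes "left" and "stayed" exist before the new employee is seen, which is what makes this classification rather than clustering.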
Clustering
In clustering, the technique itself defines the classes and puts the objects into them,
whereas in classification objects are assigned to predefined classes.
For example, consider book management in a library that holds a wide range of books on
different topics. Readers need an easy way to find books on the same topic, so books that
are similar in some way are kept in one cluster, or on one shelf, which is labeled with a
meaningful name.
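The library example can be sketched with k-means, one common clustering method (the slides do not name a specific algorithm). Each book is described by two invented numeric features, and the starting centroids are simply picked by hand so the run is deterministic.

```python
# Hypothetical books as (science_score, history_score) pairs, invented for illustration.
books = [(0.9, 0.1), (0.8, 0.2), (0.85, 0.15),
         (0.1, 0.9), (0.2, 0.8), (0.15, 0.95)]


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def kmeans(points, centroids, iterations=10):
    """Group points around centroids; no class labels are given in advance."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            # Assign each point to its nearest centroid.
            idx = min(range(len(centroids)), key=lambda i: dist2(point, centroids[i]))
            clusters[idx].append(point)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return clusters, centroids


shelves, centres = kmeans(books, centroids=[books[0], books[3]])
for shelf in shelves:
    print(shelf)
```

The two resulting groups play the role of shelves; giving each one a meaningful name is still left to a person.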
Prediction
Prediction is a technique that discovers the relationships among independent variables and
the relationships between dependent and independent variables.
For instance, the prediction technique can be used in sales to predict future profit: if
sales is treated as the independent variable, profit could be the dependent variable.
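A least-squares straight line is one simple way to model the sales and profit example. The figures below are invented and deliberately lie on a perfect line, so this is a sketch of the idea under that assumption, not a realistic forecast.

```python
# Hypothetical history, invented for illustration: sales is the independent
# variable and profit is the dependent variable.
sales = [100.0, 200.0, 300.0, 400.0]
profit = [12.0, 22.0, 32.0, 42.0]


def fit_line(xs, ys):
    """Ordinary least squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(sales, profit)
forecast = slope * 500.0 + intercept  # predicted profit for a future sales level
print(f"profit ~= {slope:.3f} * sales + {intercept:.3f}; forecast at 500: {forecast:.1f}")
```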
Sequential Patterns
This technique seeks to discover or identify similar patterns, regular events or trends in
transaction data over a business period.
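A very small version of this idea counts how many customers bought one item and later another. The purchase histories and the `min_support` threshold below are invented; practical sequential pattern mining handles longer patterns and time windows, which this sketch does not.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase histories, one list per customer, in time order.
sequences = [
    ["phone", "case", "charger"],
    ["phone", "charger"],
    ["laptop", "mouse"],
    ["phone", "case"],
]


def frequent_ordered_pairs(seqs, min_support):
    """Return pairs (a, b) where a was bought before b by at least min_support customers."""
    counts = Counter()
    for seq in seqs:
        # combinations() keeps the list order, so (a, b) means a came before b.
        # The set ensures each customer counts once per pair.
        counts.update(set(combinations(seq, 2)))
    return {pair: n for pair, n in counts.items() if n >= min_support}


patterns = frequent_ordered_pairs(sequences, min_support=2)
print(patterns)
```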
Decision Tree
The decision tree is one of the most widely used data mining techniques because it is easy
to understand. The root of a decision tree is a simple question or condition that has
multiple answers. Each answer leads to a further set of questions or conditions that help
us narrow down the data.
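The question-and-answer structure described above can be written down directly as nested data. The loan scenario, the questions and the answers below are all hypothetical, and the tree is written by hand rather than learned from data.

```python
# Hypothetical hand-built tree: internal nodes ask a question, leaves hold a decision.
tree = {
    "question": "income",
    "branches": {
        "high": {"decision": "approve"},
        "low": {
            "question": "has_guarantor",
            "branches": {
                "yes": {"decision": "approve"},
                "no": {"decision": "reject"},
            },
        },
    },
}


def decide(node, record):
    """Start at the root and follow the record's answers until a leaf is reached."""
    while "decision" not in node:
        answer = record[node["question"]]
        node = node["branches"][answer]
    return node["decision"]


print(decide(tree, {"income": "high"}))
print(decide(tree, {"income": "low", "has_guarantor": "no"}))
```

In practice the questions and their order are usually learned from labeled data; the traversal stays the same.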
Note: We often combine two or more data mining techniques to form an appropriate process
that meets the business needs.
Data Mining Applications
• In marketing, data mining is used for analysis that shows which products were bought
together, when they were bought and in what sequence. It also helps to understand
customer behavior.
• In the banking and finance sector, data mining is used to identify customer loyalty by
analyzing data on customers' purchasing activities. It also helps to retain credit card
customers.
• In the health care and insurance sector, data mining is used to analyze claims, for
example which medical procedures are claimed together. It also forecasts which customers
will potentially purchase new policies.
Note: Data mining is also used to analyze data in many other sectors.