DATA MINING
HISTORY:
DATA MINING:
Data mining turns raw data into useful, readable information.
Software is used to look for patterns in large batches of
data.
Businesses can use those patterns to learn more about their
customers and develop more effective marketing and sales
strategies.
PROCESSES:
Data Cleaning
Data integration
Data Transformation
Data Mining
Pattern Evaluation
Data Presentation
KDD:
Knowledge Discovery from Data.
Growth of data: Terabytes to Petabytes
We are drowning in data but starving for knowledge!
Fraud detection and detection of unusual patterns.
KDD PROCESSES:
Data Cleaning:
Removing noise and inconsistent data.
Data Integration:
Multiple data sources may be combined.
Data Selection:
Retrieving only the data relevant to the analysis.
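The first three KDD steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two sources (`crm_rows`, `sales_rows`), their field names, and the rule that a missing age counts as noise are all hypothetical and do not come from the slides.

```python
# Hypothetical records from two sources; field names are illustrative only.
crm_rows = [
    {"customer_id": 1, "name": "Ana ", "age": "34"},
    {"customer_id": 2, "name": "Ben", "age": ""},   # missing age, treated as noise
    {"customer_id": 2, "name": "Ben", "age": ""},   # exact duplicate
    {"customer_id": 3, "name": "Cy", "age": "29"},
]
sales_rows = [
    {"customer_id": 1, "total": 120.0},
    {"customer_id": 3, "total": 75.5},
]

def clean(rows):
    """Data cleaning: drop duplicates and rows with a missing age, trim names."""
    seen, cleaned = set(), []
    for row in rows:
        key = (row["customer_id"], row["name"].strip(), row["age"])
        if key in seen or not row["age"]:
            continue
        seen.add(key)
        cleaned.append({"customer_id": row["customer_id"],
                        "name": row["name"].strip(),
                        "age": int(row["age"])})
    return cleaned

def integrate(customers, sales):
    """Data integration: join the two sources on customer_id."""
    totals = {s["customer_id"]: s["total"] for s in sales}
    return [dict(c, total=totals[c["customer_id"]])
            for c in customers if c["customer_id"] in totals]

def select(rows, fields):
    """Data selection: keep only the fields relevant to the analysis."""
    return [{f: row[f] for f in fields} for row in rows]

prepared = select(integrate(clean(crm_rows), sales_rows), ["age", "total"])
print(prepared)
```

Each step is a separate function so that the output of one feeds the next, mirroring the order of the KDD steps.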
KDD PROCESSES (CONTINUED):
Data Transformation:
Converting data into a unified format.
Data Mining:
Extracting patterns.
Pattern Evaluation:
Identifying the patterns that are genuinely meaningful.
Data Presentation:
Presenting the results in a readable format.
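The remaining KDD steps can be sketched the same way. This is a toy example, not a definitive pipeline: the event log `raw_events`, the use of simple counts as "patterns", and the `min_count` threshold are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical event log with inconsistent formatting.
raw_events = ["Login", "login ", "PURCHASE", "login", "purchase", "Refund"]

def transform(events):
    """Data transformation: bring every value into one unified format."""
    return [e.strip().lower() for e in events]

def mine(events):
    """Data mining: extract candidate patterns (here, simple event counts)."""
    return Counter(events)

def evaluate(patterns, min_count):
    """Pattern evaluation: keep only patterns seen at least min_count times."""
    return {p: n for p, n in patterns.items() if n >= min_count}

def present(patterns):
    """Data presentation: render the surviving patterns as readable lines."""
    return [f"{p}: {n} occurrences" for p, n in sorted(patterns.items())]

report = present(evaluate(mine(transform(raw_events)), min_count=2))
print("\n".join(report))
```

Real mining steps use far richer pattern definitions than counts; the point here is only the hand-off between the four stages.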
FLOW CHART FOR KDD:
WHAT KIND OF DATA?
Data from databases.
Data from other sources.
Ex: Servers, firewalls, log collectors, etc.
TECHNIQUES USED:
CLASSIFICATION:
This technique sorts data into different classes.
Type of data handled, Ex: Multimedia, text data, time-series data.
Data model involved, Ex: Databases.
Analysis method used, Ex: Neural networks, machine learning, statistics.
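As a concrete illustration of assigning data to classes, here is a minimal k-nearest-neighbours sketch. The slides do not name a specific classifier, so the choice of k-NN is an assumption, as are the training points and the labels "low" and "high".

```python
from collections import Counter
import math

# Hypothetical labelled training points: (feature vector, class label).
training = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((2.0, 1.5), "low"),
    ((6.0, 6.5), "high"),
    ((7.0, 6.0), "high"),
    ((6.5, 7.5), "high"),
]

def classify(point, data, k=3):
    """Label a point by majority vote among its k nearest training points."""
    nearest = sorted(data, key=lambda item: math.dist(point, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(classify((1.2, 1.4), training))
print(classify((6.8, 6.9), training))
```

The classes are known in advance and come from labelled examples, which is what separates classification from clustering.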
ASSOCIATION:
It creates rules that tell how often events occur together.
Ex: When a customer buys a computer or device, 90% of the time they
will also buy software.
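The "90% of the time" figure in such a rule is usually called its confidence. The sketch below computes support and confidence for the rule "computer implies software" on a made-up set of baskets; the basket contents are hypothetical and give a different percentage than the slide's example.

```python
# Hypothetical shopping baskets; item names are illustrative only.
baskets = [
    {"computer", "software"},
    {"computer", "software", "mouse"},
    {"computer", "mouse"},
    {"software"},
    {"computer", "software"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) from the transactions."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)

rule_support = support({"computer", "software"}, baskets)
rule_confidence = confidence({"computer"}, {"software"}, baskets)
print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

Here three of the four baskets that contain a computer also contain software, so the rule's confidence is 75%.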
PREDICTION:
Combines other data mining techniques such as trend analysis,
clustering, classification, etc.
It analyses past events or instances in the right sequence to predict
a future event.
CLUSTERING:
This technique helps to identify similar kinds of data.
It recognizes differences and similarities between data.
Similar to the classification technique, but it groups chunks of
data together based on their similarities.
Similar data is grouped in the same cluster.
Dissimilar data is placed in different clusters.
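A minimal one-dimensional k-means sketch shows the grouping idea. The slides do not name a clustering algorithm, so k-means is an assumed choice; the ages and the two starting centers are made up.

```python
def k_means_1d(values, centers, iterations=10):
    """Assign each number to its nearest center, then move each center to its group's mean."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical customer ages; the starting centers are an arbitrary guess.
ages = [18, 21, 22, 45, 47, 50]
centers, clusters = k_means_1d(ages, centers=[20.0, 40.0])
print(centers, clusters)
```

No labels are supplied: the two groups emerge only from how close the values are to each other.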
REGRESSION:
This technique deals with predicting numeric values.
It is used for planning and modeling data.
Ex: Predicting children's height based on their age, weight
and other factors.
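The height example can be sketched as a simple linear regression with a single predictor. The numbers below are invented so that the fit is easy to check by hand; they are not real growth data, and using age as the only factor is a simplifying assumption.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up data: age in years and height in cm, not real measurements.
ages = [2, 4, 6, 8, 10]
heights = [86, 100, 114, 128, 142]

slope, intercept = fit_line(ages, heights)
predicted = slope * 7 + intercept  # estimated height at age 7
print(slope, intercept, predicted)
```

Because the made-up points lie exactly on a line, the fit recovers a slope of 7 cm per year and an intercept of 72 cm, giving 121 cm at age 7.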
SEQUENTIAL PATTERNS:
Data is evaluated to discover sequential patterns.
It consists of finding interesting subsequences in a set of
sequences, where the value of a subsequence can be measured in terms
of different criteria like length, occurrence frequency, etc.
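A small sketch of the occurrence-frequency criterion: it counts ordered pairs of items (one bought before the other, gaps allowed) and keeps those found in enough sequences. The purchase sequences and the `min_support` threshold are hypothetical, and restricting patterns to length two is a simplification.

```python
from collections import Counter
from itertools import combinations

# Hypothetical per-customer purchase sequences, in time order.
sequences = [
    ["phone", "case", "charger"],
    ["phone", "charger"],
    ["case", "phone", "charger"],
    ["phone", "case"],
]

def frequent_pairs(seqs, min_support):
    """Count ordered pairs (a then b, gaps allowed) by how many sequences contain them."""
    counts = Counter()
    for seq in seqs:
        # combinations preserves input order, so each pair is (earlier, later);
        # the set makes a pair count at most once per sequence.
        counts.update(set(combinations(seq, 2)))
    return {pair: n for pair, n in counts.items() if n >= min_support}

patterns = frequent_pairs(sequences, min_support=3)
print(patterns)
```

Only "phone, then charger" appears in three of the four sequences, so it is the single pattern that survives the threshold.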
DECISION TREES:
Supervised learning technique.
It is a tree that helps in decision making.
The decision tree creates classification or regression models as a
tree structure.
It splits a data set into smaller and smaller subsets while
the decision tree is built up step by step. The final tree has
decision nodes and leaf nodes.
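The splitting step can be illustrated with a single decision node. The sketch below picks the threshold on one numeric feature that gives the lowest weighted Gini impurity; Gini is one common splitting criterion, assumed here because the slides do not name one, and the data is made up.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    total = len(labels)
    return 1.0 - sum((labels.count(c) / total) ** 2 for c in set(labels))

def best_split(rows):
    """Find the threshold on a single numeric feature with the lowest weighted Gini."""
    best = None
    for threshold in sorted({x for x, _ in rows}):
        left = [y for x, y in rows if x <= threshold]
        right = [y for x, y in rows if x > threshold]
        if not right:
            continue  # a split must leave data on both sides
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical rows: (hours studied, outcome).
rows = [(1, "fail"), (2, "fail"), (3, "fail"), (5, "pass"), (6, "pass"), (8, "pass")]
threshold, impurity = best_split(rows)
print(threshold, impurity)
```

A full tree would apply `best_split` again to each resulting subset until the subsets are pure or too small, producing the decision nodes and leaf nodes described above.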
APPLICATIONS OF DATA MINING:
DATA MINING IN HEALTH CARE:
It holds incredible potential for healthcare services due to the
exponential growth in the number of electronic health records.
Previously, doctors and physicians kept patient information on paper,
where the data was quite difficult to maintain.
Digitalization and new techniques reduce human effort
and make data easily accessible.
For example, a computer stores a massive amount of patient data
accurately, which improves the quality of the whole data management
system.
EDUCATIONAL DATA MINING:
This technique uses multiple algorithms to improve educational
results and explain educational procedures for further decision
making.
Learning initially began in the classroom and was based on
behavioural, psychological, and constructive models.
Behavioural models depend on observable changes in the behaviour
of the student to determine the learning results.
TOOLS:
Oracle Data Mining
IBM SPSS Modeler
RapidMiner
Apache Mahout
Weka
KNIME
SSDT (SQL Server Data Tools)
Oracle Data Mining
Oracle Data Mining (ODM), a component of the Oracle Database, provides
powerful data mining algorithms that enable data analysts to discover insights, make
predictions and leverage their Oracle data and investment.
IBM SPSS Modeler
IBM SPSS Modeler is a data mining and text analytics software application.
It is used to build predictive models and conduct other analytic tasks. It has a
visual interface which allows users to leverage statistical and data mining algorithms without
programming.
RapidMiner
RapidMiner is a data science software platform that provides an integrated
environment for data preparation, machine learning, deep learning, text mining, and
predictive analytics. RapidMiner is developed on an open core model.
Apache Mahout
Apache Mahout is a project of the Apache Software Foundation to produce free
implementations of distributed or otherwise scalable machine learning algorithms focused
primarily on linear algebra.
Weka
Weka is a collection of machine learning algorithms for data mining tasks. The algorithms
can either be applied directly to a dataset or called from your own Java code. Weka contains tools
for data pre-processing.
KNIME
KNIME, which stands for Konstanz Information Miner, is a free and open-source data analytics, reporting
and integration platform. KNIME integrates various components for machine learning and data
mining through its modular data pipelining concept.
CONCLUSION:
The ultimate goal of data mining is the prediction of human
behaviour, which is by far its most common business application.
However, it can easily be adapted to meet the objective of
detecting and deterring criminals.
Data mining tools increase not only the speed of analysis, but also the
depth of its approach.
THANK YOU
