Data mining Introduction
Year          Evolution of data mining and warehousing
1960s         Data collection and database creation
1970s         Database management systems
Mid-1980s     Advanced database systems
Late 1980s    Data warehousing and data mining
1990s         Web-based databases
2006          Information systems
2013          Big data retrieval

Data mining refers to extracting or “mining” knowledge from large
amounts of data. It is also known as:
• Knowledge mining from data
• Knowledge extraction
• Data/pattern analysis
• Data archaeology
• Data dredging
• Knowledge discovery from data (KDD)

Knowledge Discovery Process:

Data cleaning

Data integration

Data selection

Data transformation

Data mining

Pattern evaluation

Knowledge presentation
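
The seven steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the two source lists, the function names, and the choice of pair counting as the "data mining" step are all hypothetical and are not taken from the slides or the textbook.

```python
from collections import Counter
from itertools import combinations

# Two hypothetical sources that record purchases with different conventions.
store_a = [
    {"id": 1, "items": "Milk, Bread"},
    {"id": 2, "items": "milk,bread,eggs"},
    {"id": 3, "items": None},  # missing value, dropped during cleaning
]
store_b = [
    {"id": 4, "items": "bread, MILK"},
    {"id": 5, "items": "eggs"},
]


def clean(rows):
    """Data cleaning: drop records with a missing item list."""
    return [r for r in rows if r["items"]]


def integrate(*sources):
    """Data integration: combine records from several sources."""
    return [row for source in sources for row in source]


def select(rows):
    """Data selection: keep only the attribute relevant to the task."""
    return [r["items"] for r in rows]


def transform(item_strings):
    """Data transformation: normalise each record into a set of item names."""
    return [{i.strip().lower() for i in s.split(",")} for s in item_strings]


def mine(baskets):
    """Data mining: count how often each pair of items occurs together."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(sorted(basket), 2))
    return counts


def evaluate(pair_counts, n_baskets, min_support=0.5):
    """Pattern evaluation: keep pairs whose support meets the threshold."""
    return {
        pair: count / n_baskets
        for pair, count in pair_counts.items()
        if count / n_baskets >= min_support
    }


def present(patterns):
    """Knowledge presentation: render the patterns as readable lines."""
    return [f"{a} + {b}: support {s:.2f}" for (a, b), s in sorted(patterns.items())]


def run_pipeline():
    rows = integrate(clean(store_a), clean(store_b))
    baskets = transform(select(rows))
    return present(evaluate(mine(baskets), len(baskets)))
```

Calling run_pipeline() on the toy data returns one pattern, the pair bread and milk, which appears in three of the four cleaned baskets.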










Kinds of data to be mined:
• Relational databases
• Data warehouses
• Transactional databases
• Object-relational databases
• Temporal, sequence, and time-series databases
• Spatial and spatiotemporal databases
• Text and multimedia databases
• Heterogeneous and legacy databases
• Data streams and the World Wide Web

1. Relational database




Each entity in an object-relational database is associated with:
• A set of variables
• A set of messages
• A set of methods




A temporal database typically stores
relational data that include time-related
attributes.
These attributes may involve several
timestamps, each having different
semantics.
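
A minimal sketch of one record carrying several timestamps with different meanings. The split into "valid" time (when a fact holds in the real world) and "recorded" time (when the row was entered) is an assumption chosen for illustration; the slide does not say which semantics are meant, and the record type and data are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SalaryRecord:
    employee: str
    salary: int
    valid_from: date   # when the fact became true in the real world
    valid_to: date     # when it stopped being true (exclusive)
    recorded_on: date  # when the row was entered into the database


records = [
    SalaryRecord("asha", 50000, date(2020, 1, 1), date(2021, 1, 1), date(2020, 1, 5)),
    SalaryRecord("asha", 55000, date(2021, 1, 1), date(9999, 12, 31), date(2021, 2, 10)),
]


def salary_as_of(rows, employee, on_day):
    """Return the salary that was valid for `employee` on `on_day`, or None."""
    for row in rows:
        if row.employee == employee and row.valid_from <= on_day < row.valid_to:
            return row.salary
    return None


def known_by(rows, day):
    """Return the rows that had already been recorded by `day`."""
    return [row for row in rows if row.recorded_on <= day]
```

The two kinds of timestamp answer different questions: salary_as_of asks what was true on a given day, while known_by asks what the database had been told by that day.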





A sequence database stores sequences
of ordered events, with or without a
concrete notion of time.
Examples include customer shopping
sequences, Web click streams, and
biological sequences.




A time-series database stores sequences
of values or events obtained over
repeated measurements of time (e.g.,
hourly, daily, weekly).
Examples include data collected from the
stock exchange, inventory control, and the
observation of natural phenomena (like
temperature and wind).
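
As a rough illustration of repeated measurements over time, the sketch below stores hypothetical temperature readings as (timestamp, value) pairs and aggregates them per day. The data and the function name are invented for the example.

```python
from collections import defaultdict
from datetime import date, datetime
from statistics import mean

# Hypothetical temperature readings: (timestamp, degrees Celsius).
readings = [
    (datetime(2024, 1, 1, 6), 10.0),
    (datetime(2024, 1, 1, 12), 16.0),
    (datetime(2024, 1, 1, 18), 13.0),
    (datetime(2024, 1, 2, 6), 8.0),
    (datetime(2024, 1, 2, 12), 12.0),
]


def daily_mean(series):
    """Aggregate (timestamp, value) pairs into one mean value per calendar day."""
    by_day = defaultdict(list)
    for stamp, value in series:
        by_day[stamp.date()].append(value)
    return {day: mean(values) for day, values in sorted(by_day.items())}
```

Moving from a finer granularity to a coarser one in this way is a common first step before looking for trends in a time series.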


Data Warehouse
A data warehouse is a subject-oriented, integrated, time-variant, and
nonvolatile collection of data in support of management’s
decision-making process.






Spatial databases include geographic (map) databases,
very large-scale integration (VLSI) or
computer-aided design databases, and
medical and satellite image databases.
Spatial data may be represented in
raster format, consisting of n-dimensional
bit maps or pixel maps.
For example, in a 2-D satellite image,
each pixel registers the rainfall in a given
area.

Maps can be represented in vector
format, where roads, bridges, buildings,
and lakes are represented as unions or
overlays of basic geometric constructs,
such as points, lines, polygons, and the
partitions and networks formed by these components.
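
The raster and vector representations can be contrasted with a small sketch. The grid values, the feature dictionaries, and the helper names are hypothetical and chosen only to illustrate the two formats.

```python
# Raster: a 2-D grid in which each cell (pixel) holds a measured value,
# here rainfall in millimetres for the area the pixel covers.
rainfall_raster = [
    [0.0, 1.5, 2.0],
    [0.5, 3.0, 4.5],
]


def rainfall_at(raster, row, col):
    """Look up the value registered by one pixel."""
    return raster[row][col]


# Vector: features described by basic geometric constructs.
road = {"kind": "line", "points": [(0, 0), (3, 4)]}
lake = {"kind": "polygon", "points": [(0, 0), (4, 0), (4, 3), (0, 3)]}


def line_length(points):
    """Total length of a polyline given as (x, y) vertices."""
    return sum(
        ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    twice_area = sum(
        x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1])
    )
    return abs(twice_area) / 2
```

A raster answers "what value is at this location", while a vector feature answers geometric questions such as length or area directly from its vertices.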
A spatial database that stores spatial
objects that change with time is called a
spatiotemporal database
(e.g., one tracking a cricket ball in motion).




Text databases are databases that
contain word descriptions for objects.
Multimedia databases store image,
audio, and video data.




A heterogeneous database consists of a
set of interconnected, autonomous
component databases.
A legacy database is a group of
heterogeneous databases that
combines different kinds of data systems,
such as relational or object-oriented
databases, hierarchical databases,
network databases, spreadsheets,
multimedia databases, or file systems.


Stream data are generated and analyzed
dynamically as they flow in and out of an
observation platform (or window).


Capturing user access patterns in such
distributed information environments is
called Web usage mining (or Weblog
mining).
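
A minimal sketch of capturing access patterns from a Web log. The log format (one "user path" pair per line) is a simplification assumed for the example and does not correspond to any real server's log format; the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical, simplified Web log: one "user path" pair per request, in time order.
log_lines = [
    "u1 /home",
    "u2 /home",
    "u1 /products",
    "u1 /cart",
    "u2 /products",
    "u2 /home",
]


def sessions_by_user(lines):
    """Group requested paths per user, preserving request order."""
    sessions = defaultdict(list)
    for line in lines:
        user, path = line.split()
        sessions[user].append(path)
    return dict(sessions)


def transition_counts(sessions):
    """Count how often one page is followed directly by another."""
    counts = Counter()
    for pages in sessions.values():
        counts.update(zip(pages, pages[1:]))
    return counts
```

Here the most frequent transition, from /home to /products, is the kind of access pattern that Web usage mining aims to surface.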
› Time-Variant
• The warehouse data represent the flow of data
through time. They can even contain projected
data.

› Non-Volatile
• Once data enter the data warehouse, they
are never removed.
• The data warehouse is always growing.
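
The time-variant and non-volatile properties can be illustrated with an append-only table. This is a sketch under assumptions: the class, its method names, and the sample rows are hypothetical and do not reflect any particular warehouse product.

```python
from datetime import date


class WarehouseTable:
    """Append-only table: rows are added with a load date and never deleted."""

    def __init__(self):
        self._rows = []

    def load(self, load_date, record):
        # New snapshots are appended; earlier rows are left untouched.
        self._rows.append({"load_date": load_date, **record})

    def history(self, key, value):
        """All snapshots whose `key` equals `value`, oldest first."""
        matches = [r for r in self._rows if r.get(key) == value]
        return sorted(matches, key=lambda r: r["load_date"])

    def __len__(self):
        return len(self._rows)


table = WarehouseTable()
table.load(date(2024, 1, 1), {"customer": "c1", "city": "Pune"})
table.load(date(2024, 6, 1), {"customer": "c1", "city": "Delhi"})
```

Because a changed value is stored as a new row rather than an overwrite, the table keeps growing and the customer's earlier city remains available for analysis.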








• Teradata
• Oracle
• SAP BW - Business Information Warehouse (SAP NetWeaver BI)
• Microsoft SQL Server
• IBM DB2 (InfoSphere Warehouse)
• SAS




1984 — Metaphor Computer Systems,
founded by David Liddle and Don
Massaro, releases Data Interpretation
System (DIS).
DIS was a hardware/software package
and GUI for business users to create a
database management and analytic
system.
Survey (S): (2 Minutes)
The students are asked to browse the
following titles and subtitles from the
book.
Text Book:
Han and Kamber, “Data Mining”, Second
Edition, Elsevier, 2008.
• Pages 105-109
• Pages 2-21
1. Data mining is otherwise called
a) Knowledge mining
b) Knowledge mining from large data
c) Data extraction
d) None of the above
2. In the knowledge discovery process, data mining comes after which step?
a) Data transformation
b) Data selection
c) Neither (a) nor (b)
d) Both
3. Which characteristic of a data warehouse means that once data enter the
data warehouse, they are never removed?
a) Integrated
b) Time-variant
c) Subject-oriented
d) Non-volatile
4. An object-relational database consists of
entities with
a) Variables
b) Messages
c) Methods
d) All of the above
5. Web usage mining is otherwise called
a) Web mining
b) Weblog mining
c) None of the above
d) Both






• Specify the seven steps in the KDD process.
• Explain the four categories of data
warehousing.
• Define heterogeneous and legacy
databases.
• What are the data mining task
primitives?
• What are the different kinds of data to
be mined?






A data warehouse maintains a copy of
information from the source transaction
systems. This architectural complexity
provides the opportunity to:
• Congregate data from multiple sources
into a single database so a single query
engine can be used to present data.
• Mitigate the problem of database
isolation level lock contention in
transaction processing systems caused by
attempts to run large, long-running
analysis queries in transaction processing
databases.





• Maintain data history, even if the source
transaction systems do not.
• Integrate data from multiple source
systems, enabling a central view across
the enterprise. This benefit is always
valuable, but particularly so when the
organization has grown by merger.
• Improve data quality by providing
consistent codes and descriptions, and by
flagging or even fixing bad data.








• Present the organization's information
consistently.
• Provide a single common data model for
all data of interest regardless of the
data's source.
• Restructure the data so that it makes
sense to the business users.
• Restructure the data so that it delivers
excellent query performance, even for
complex analytic queries, without
impacting the operational systems.
• Add value to operational business
applications, notably customer
relationship management (CRM) systems.
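
Two of the benefits above, integrating several source systems and improving data quality through consistent codes, can be sketched as follows. The source rows, the code mapping, and the needs_review flag are assumptions made for the example.

```python
# Hypothetical source systems that encode the same attribute differently.
crm_rows = [{"customer": "c1", "gender": "F"}, {"customer": "c2", "gender": "M"}]
billing_rows = [{"customer": "c3", "gender": "female"}, {"customer": "c4", "gender": "?"}]

# Assumed mapping from source-specific codes to one warehouse-wide code.
GENDER_CODES = {"f": "F", "female": "F", "m": "M", "male": "M"}


def conform(rows, source):
    """Map source codes to the common code and flag values that cannot be mapped."""
    conformed = []
    for row in rows:
        code = GENDER_CODES.get(row["gender"].strip().lower())
        conformed.append({
            "customer": row["customer"],
            "gender": code,
            "source": source,
            "needs_review": code is None,
        })
    return conformed


# One integrated view across both source systems.
warehouse_rows = conform(crm_rows, "crm") + conform(billing_rows, "billing")
```

Rows that cannot be mapped are flagged rather than silently dropped, so bad data stays visible and can be fixed at the source.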
