Introduction to Data Warehousing and Mining

Introductory class on data warehousing and mining from KL University.

Published in: Education, Technology

  1. DATA WAREHOUSING AND MINING, BY G. RAJESH CHANDRA
  2. EVOLUTION OF DATABASE TECHNOLOGY
     • 1960s (primitive file processing): data collection, database creation, IMS and network DBMS
     • 1970s to early 1980s (DBMS): relational data model, relational DBMS implementation, SQL, OLTP, user interfaces, etc.
     • 1980s to present (advanced databases): RDBMS, advanced data models (extended-relational, OO, deductive, etc.); application-oriented DBMS (spatial, scientific, engineering, etc.)
     • 1990s (advanced data analysis): data mining, data warehousing, multimedia databases, and Web databases
     • 2000s: stream data management and mining; data mining and its applications
  3. WHY MINE DATA? COMMERCIAL VIEWPOINT
     • Lots of data is being collected and warehoused
       - Web data, e-commerce
       - Purchases at department/grocery stores
       - Bank/credit card transactions
     • Competitive pressure is strong
       - Provide better, customized services for an edge (e.g. in Customer Relationship Management)
  4. WHAT IS DATA MINING?
     • Data mining (sometimes called data discovery or knowledge discovery in data) is the process of analyzing data from different perspectives and summarizing it into useful information.
     • It is the extraction of interesting (non-trivial, implicit, previously unknown, and potentially useful) patterns or knowledge from huge amounts of data.
  5. WHY MINE DATA? SCIENTIFIC VIEWPOINT
     • Data is collected and stored at enormous speeds (GB/hour)
       - Remote sensors on a satellite
       - Telescopes scanning the skies
       - Microarrays generating gene expression data
       - Scientific simulations generating terabytes of data
     • Traditional techniques are infeasible for raw data
     • Data mining may help scientists
       - in classifying and segmenting data
       - in hypothesis formation
  6. EXAMPLES: WHAT IS (NOT) DATA MINING?
     • What is not data mining?
       - Looking up a phone number in a phone directory
       - Querying a Web search engine for information about "Amazon"
     • What is data mining?
       - Certain names are more prevalent in certain US locations (O’Brien, O’Rurke, O’Reilly… in the Boston area)
       - Grouping together similar documents returned by a search engine according to their context (e.g. Amazon rainforest, Amazon.com)
  7. DATA MINING IS ALSO CALLED...
     • Knowledge discovery (mining) in databases (KDD), knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting, business intelligence, etc.
     • Real-life example: gold mining
  8. DATA WAREHOUSE = COLLECTION OF DATABASES
  9. WE HAVE TO USE DIFFERENT METHODS
  10. RAW DATA = DATABASES + NOISY DATA
  11. DATA SELECTION AND TRANSFORMATION
  12. DATA CLEANING AND INTEGRATION
  13. DATA MINING
  14. PATTERN EVALUATION
  15. KNOWLEDGE REPRESENTATION
  16. KNOWLEDGE REPRESENTATION
  17. KNOWLEDGE DISCOVERY (KDD) PROCESS (December 26, 2013)
     • Data mining: core of the knowledge discovery process
     • Figure: Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation
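
The KDD stages named on the slides (cleaning, integration, selection, mining, pattern evaluation) can be illustrated with a small sketch. This is only a toy illustration under stated assumptions, not the method used in the class: the two "databases" (`STORE_A`, `STORE_B`), the function names, and the choice of co-purchase counting as the mining step are all hypothetical, picked to keep the example short.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: two small "databases" of purchase records.
STORE_A = [
    {"customer": "c1", "items": ["bread", "milk"]},
    {"customer": "c2", "items": ["bread", "butter", "milk"]},
    {"customer": None, "items": ["milk"]},   # noisy record: missing customer
]
STORE_B = [
    {"customer": "c3", "items": ["bread", "butter"]},
    {"customer": "c4", "items": []},         # noisy record: empty basket
    {"customer": "c5", "items": ["bread", "milk"]},
]


def clean(records):
    """Data cleaning: drop records with a missing customer or an empty basket."""
    return [r for r in records if r["customer"] and r["items"]]


def integrate(*sources):
    """Data integration: combine cleaned sources into one 'warehouse' list."""
    return [r for source in sources for r in source]


def select(warehouse):
    """Selection: keep only the task-relevant field (the basket), sorted and deduplicated."""
    return [tuple(sorted(set(r["items"]))) for r in warehouse]


def mine(baskets):
    """Data mining: count how often each pair of items is bought together."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(basket, 2))
    return counts


def evaluate(pair_counts, n_baskets, min_support=0.5):
    """Pattern evaluation: keep pairs whose support meets the threshold."""
    return {
        pair: count / n_baskets
        for pair, count in pair_counts.items()
        if count / n_baskets >= min_support
    }


def run_kdd():
    """Run the stages in the order shown on the KDD process slide."""
    warehouse = integrate(clean(STORE_A), clean(STORE_B))
    baskets = select(warehouse)
    return evaluate(mine(baskets), len(baskets))


if __name__ == "__main__":
    # Knowledge presentation: print the surviving patterns.
    for pair, support in sorted(run_kdd().items()):
        print(pair, f"support={support:.2f}")
```

With this toy data, two noisy records are dropped during cleaning, and only the item pairs that appear in at least half of the remaining baskets survive pattern evaluation. A real pipeline would replace each function with a far more involved step; the point here is only the order of the stages.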
