Data Mining
Group Members: Alisha Korpal, Nivia Jain, Sharuti Jain
Data Mining?
  • Huge amounts of data
  • Electronic records of our decisions
  • Choices in the supermarket
  • Financial records
Data vs. Information
Data: a collection of raw facts and figures.
Information: the processed form of data.
Data Mining
  • Extracting or “mining” knowledge from large amounts of data
  • Data-driven discovery and modeling of hidden patterns in large volumes of data
  • Extraction of interesting (non-trivial, implicit, previously unknown, and potentially useful) information or patterns from data in large databases
Data Mining Process
  1. Defining the problem
  2. Preparing data
  3. Exploring data
  4. Building models
  5. Exploring and validating models
  6. Deploying and updating models
Data Mining Process
Defining the Problem
  • What are you looking for?
  • What types of relationships are you trying to find?
  • Do you want to make predictions from the data mining model, or just look for interesting patterns and associations?
Defining the Problem (contd.)
  • Which attribute of the dataset do you want to try to predict?
  • How are the columns related?
  • If there are multiple tables, how are the tables related?
  • Does the problem you are trying to solve reflect the policies or processes of the business?
Preparing Data
Exploring Data
You must understand the data in order to make appropriate decisions when you create the mining models. Exploration techniques include calculating the minimum and maximum values, calculating the mean and standard deviation, and looking at the distribution of the data.
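The exploration techniques named above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the `ages` column, the `summarize` and `histogram` helpers, and the bin width of 10 are assumptions made up for this example, not part of the original slides.

```python
import statistics
from collections import Counter


def summarize(values):
    """Return basic exploration statistics for one numeric column."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def histogram(values, bin_width):
    """Count how many values fall into each bin of width bin_width."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical column of customer ages (made-up data for illustration only)
ages = [23, 35, 31, 42, 58, 35, 27, 61, 35, 44]

summary = summarize(ages)
distribution = histogram(ages, 10)

print(summary)
print(distribution)  # keys are the lower edge of each 10-year bin
```

Looking at the binned counts alongside the summary statistics is one simple way to spot skew or outliers before building a model.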
Models
  • Building models
  • Exploring and validating models
  • Deploying and updating models
Evolution of Data Mining
  • Data collection (1960s)
  • Data access (1980s)
  • Data warehousing and decision support (1990s)
  • Data mining (emerging today)
Evolutionary Step | Business Question | Enabling Technologies | Characteristics
Data Collection (1960s) | "What was my total revenue in the last five years?" | Computers, tapes, disks | Retrospective, static data delivery
Data Access (1980s) | "What were unit sales in New England last March?" | Relational databases (RDBMS), Structured Query Language (SQL), ODBC | Retrospective, dynamic data delivery at record level
Data Warehousing & Decision Support (1990s) | "What were unit sales in New England last March? Drill down to Boston." | On-line analytic processing (OLAP), multidimensional databases, data warehouses | Retrospective, dynamic data delivery at multiple levels
Data Mining (Emerging Today) | "What’s likely to happen to Boston unit sales next month? Why?" | Advanced algorithms, multiprocessor computers, massive databases | Prospective, proactive information delivery
Data Mining vs. OLAP
On-line Analytical Processing (OLAP) gives you a very good view of what is happening, but it cannot predict what will happen in the future or explain why it is happening.
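The contrast between a retrospective, OLAP-style question and a prospective one can be sketched as follows. This is only an illustration under stated assumptions: the `sales` records, the `boston_history` figures, and the helper names are invented for the example, and a straight-line trend stands in for the far richer models used in real data mining.

```python
def units_sold(records, **filters):
    """Retrospective aggregate: sum units over records matching all filters."""
    return sum(
        r["units"] for r in records
        if all(r[key] == value for key, value in filters.items())
    )


def linear_forecast(history):
    """Fit a least-squares line through (0, y0), (1, y1), ... and
    return the predicted value for the next period."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two periods to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return intercept + slope * n


# Hypothetical sales records (made-up data for illustration only)
sales = [
    {"region": "New England", "city": "Boston", "month": "March", "units": 120},
    {"region": "New England", "city": "Hartford", "month": "March", "units": 80},
    {"region": "Midwest", "city": "Chicago", "month": "March", "units": 200},
]

# "What were unit sales in New England last March?"
region_total = units_sold(sales, region="New England", month="March")
# "Drill down to Boston."
boston_total = units_sold(sales, region="New England", city="Boston", month="March")

# "What's likely to happen to Boston unit sales next month?"
boston_history = [100, 110, 120, 130]  # hypothetical monthly unit sales
forecast = linear_forecast(boston_history)

print(region_total, boston_total, forecast)
```

The first two queries only summarize what already happened; the forecast extrapolates from a pattern in the data, which is the step OLAP by itself does not take. Note that this simple trend line still says nothing about why sales are changing.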
Scope of Data Mining
  • Automated prediction of trends and behaviors
  • Automated discovery of previously unknown patterns
Applications
Science: chemistry, physics, medicine
  • Biochemical analysis
  • Remote sensors on a satellite
  • Medical image analysis
Applications
Financial industry, banks, businesses, e-commerce
  • Stock and investment analysis
  • Risk management
  • Sales forecasting
Applications
Database analysis and decision support
  • Market analysis and management
  • Target marketing, customer relationship management, market basket analysis, cross-selling
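Market basket analysis, mentioned above, is commonly described in terms of two standard measures, support and confidence. The slides do not define them, so the sketch below is an assumption about how the idea is usually made concrete; the `baskets` data and function names are made up for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for basket in transactions if itemset <= set(basket))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent:
    support(antecedent and consequent) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base


# Hypothetical supermarket baskets (made-up data for illustration only)
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]

bread_milk_support = support(baskets, {"bread", "milk"})
bread_to_milk = confidence(baskets, {"bread"}, {"milk"})

print(bread_milk_support, bread_to_milk)
```

A rule such as "customers who buy bread also buy milk" is a candidate for cross-selling when both its support and its confidence are high enough for the business to act on.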
Applications
Risk analysis and management
  • Forecasting, customer retention, improved underwriting
  • Fraud detection and management
Conclusion
Result of Data Mining
  • Forecasting what may happen in the future
  • Classifying people or things into groups by recognizing patterns
  • Clustering people or things into groups based on their attributes
  • Sequencing which events are likely to lead to later events
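As one concrete instance of the clustering result listed above, here is a minimal one-dimensional k-means sketch. K-means is a common clustering technique but is not named in the slides; the `monthly_spend` values, the starting centers, and the function name are assumptions chosen for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Tiny 1-D k-means: assign each value to its nearest center,
    then move each center to the mean of its group, and repeat."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty
        centers = [
            sum(group) / len(group) if group else center
            for group, center in zip(groups, centers)
        ]
    return centers, groups


# Hypothetical monthly spend per customer (made-up data for illustration only)
monthly_spend = [10, 12, 11, 50, 52, 49]

centers, groups = kmeans_1d(monthly_spend, centers=[10, 50])

print(centers)
print(groups)
```

No group labels are supplied here; the groups emerge from the attribute values alone. Classification differs in that the groups are known in advance and the task is to assign new people or things to them.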
Data Mining is not
  • “Blind” application of algorithms
  • Going to find relations where none exist
  • Presenting data in different ways
  • A difficult-to-understand technology requiring an advanced degree in computer science
Necessity is the mother of invention
Thank you
