DATA MINING – TECHNIQUES AND APPLICATIONS

  1. 1. DATA MINING – TECHNIQUES AND APPLICATIONS Charlie Chough CS157B Spring 2006
  2. 2. TOPICS <ul><li>What is Data Mining? </li></ul><ul><li>How does Data Mining work? </li></ul><ul><li>What are the applications for Data Mining? </li></ul><ul><li>What are the issues surrounding Data Mining? </li></ul>
  3. 3. What Is Data Mining? <ul><li>Data Mining is the extraction of hidden predictive information from large databases. </li></ul><ul><li>Data Mining can predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. </li></ul>
  4. 4. What Is Data Mining? <ul><li>The Evolution of Data Mining (evolutionary step: business question; enabling technologies; characteristics) </li></ul><ul><ul><li>Data Collection (1960s): "What was my total revenue in the last five years?"; computers, tapes, disks; retrospective, static data delivery </li></ul></ul><ul><ul><li>Data Access (1980s): "What were unit sales in New England last March?"; relational databases (RDBMS), Structured Query Language (SQL), ODBC; retrospective, dynamic data delivery at record level </li></ul></ul><ul><ul><li>Data Warehousing & Decision Support (1990s): "What were unit sales in New England last March? Drill down to Boston."; on-line analytic processing (OLAP), multidimensional databases, data warehouses; retrospective, dynamic data delivery at multiple levels </li></ul></ul><ul><ul><li>Data Mining (Emerging Today): "What’s likely to happen to Boston unit sales next month? Why?"; advanced algorithms, multiprocessor computers, massive databases; prospective, proactive information delivery </li></ul></ul>
  5. 5. How Does Data Mining Work? <ul><li>3 Phase Approach </li></ul><ul><ul><li>1) Exploration </li></ul></ul><ul><ul><li>2) Model Building and Validation </li></ul></ul><ul><ul><li>3) Deployment </li></ul></ul>
  6. 6. How Does Data Mining Work? <ul><li>Exploration </li></ul><ul><ul><li>Data Preparation </li></ul></ul><ul><ul><ul><li>Cleaning Data </li></ul></ul></ul><ul><ul><ul><li>Data Transformation </li></ul></ul></ul><ul><ul><ul><li>Feature Selection </li></ul></ul></ul><ul><ul><ul><li>Exploratory Data Analysis </li></ul></ul></ul>
  7. 7. How Does Data Mining Work? <ul><li>Model Building and Validation </li></ul><ul><ul><li>Techniques </li></ul></ul><ul><ul><ul><li>Decision Trees </li></ul></ul></ul><ul><ul><ul><li>Clustering </li></ul></ul></ul><ul><ul><ul><li>Association Rules </li></ul></ul></ul>
  8. 8. How Does Data Mining Work? <ul><li>Model Building and Validation </li></ul><ul><ul><li>Decision Trees </li></ul></ul><ul><ul><ul><li>Tree shaped structures that represent sets of decisions. </li></ul></ul></ul>
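To make the idea of a tree-shaped set of decisions concrete, here is a minimal sketch of a hand-built decision tree and a function that walks it. The feature names (`income`, `years_employed`), the thresholds, and the labels are hypothetical and chosen only for illustration; in practice a tree would be learned from data during model building rather than written by hand.

```python
# A hypothetical, hand-written decision tree stored as nested dictionaries.
# Internal nodes test one feature against a threshold; leaves carry a label.
tree = {
    "feature": "income",
    "threshold": 50000,
    "below": {"label": "decline"},
    "at_or_above": {
        "feature": "years_employed",
        "threshold": 2,
        "below": {"label": "review"},
        "at_or_above": {"label": "approve"},
    },
}


def classify(node, record):
    """Follow one branch per decision until a leaf is reached."""
    while "label" not in node:
        if record[node["feature"]] < node["threshold"]:
            node = node["below"]
        else:
            node = node["at_or_above"]
    return node["label"]


decision = classify(tree, {"income": 60000, "years_employed": 5})
print(decision)
```

Each record follows exactly one path from the root to a leaf, so the path itself reads as the set of decisions that produced the prediction.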
  9. 9. How Does Data Mining Work? <ul><li>Model Building and Validation </li></ul><ul><ul><li>Hierarchical Clustering </li></ul></ul><ul><ul><ul><li>Clusters are discovered successively using previously established clusters. </li></ul></ul></ul><ul><ul><li>Partitional Clustering </li></ul></ul><ul><ul><ul><li>All clusters are discovered at once. </li></ul></ul></ul>
  10. 10. How Does Data Mining Work? <ul><li>Model Building and Validation </li></ul><ul><ul><li>Hierarchical Clustering </li></ul></ul><ul><ul><ul><li>Agglomerative Clustering (bottom-up) </li></ul></ul></ul><ul><ul><ul><ul><li>Each element starts as its own cluster, and clusters are merged into successively larger clusters. </li></ul></ul></ul></ul><ul><ul><ul><li>Divisive Clustering (top-down) </li></ul></ul></ul><ul><ul><ul><ul><li>Begins with the entire data set as one cluster and splits it into successively smaller clusters. </li></ul></ul></ul></ul>
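The agglomerative approach can be sketched in a few lines. The example below is a minimal illustration, not an optimized implementation: it assumes one-dimensional numeric data, uses single linkage (the distance between two clusters is the distance between their closest members), and stops when a requested number of clusters remains. The function name, the linkage choice, and the sample values are assumptions made for this sketch.

```python
def agglomerative(values, target_clusters):
    """Merge the two closest clusters until target_clusters remain."""
    # Every element starts as its own cluster.
    clusters = [[v] for v in values]
    while len(clusters) > target_clusters:
        best = None  # (distance, index_i, index_j) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge cluster j into cluster i and drop cluster j.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


groups = agglomerative([1.0, 1.2, 5.0, 5.1, 9.0], target_clusters=3)
print(groups)
```

A divisive method would run in the opposite direction, starting from a single cluster that holds every value and splitting it repeatedly.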
  11. 11. How Does Data Mining Work? <ul><li>Model Building and Validation </li></ul><ul><ul><li>Partitional Clustering </li></ul></ul><ul><ul><ul><li>K-means clustering </li></ul></ul></ul><ul><ul><ul><li>QT Clustering </li></ul></ul></ul><ul><ul><ul><li>Fuzzy C-means Clustering </li></ul></ul></ul>
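As an illustration of partitional clustering, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It assumes two-dimensional points and takes the starting centroids as an argument so the run is repeatable; the function name, the sample points, and the choice of starting centroids are assumptions made for this example. Real implementations usually pick starting centroids more carefully and run several restarts.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def k_means(points, centroids, max_iters=100):
    """Alternate between assigning points and recomputing centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid's cluster.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = new_centroids
    return centroids, clusters


points = [(1, 1), (1.5, 2), (1, 0), (8, 8), (9, 9), (8, 9)]
centroids, clusters = k_means(points, centroids=[points[0], points[3]])
print(centroids)
print(clusters)
```

All k clusters are produced together on every pass, which is the sense in which partitional methods discover all clusters at once.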
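  12. 12. How Does Data Mining Work? <ul><li>Model Building and Validation </li></ul><ul><ul><li>Association Rules </li></ul></ul><ul><ul><ul><li>Association rules describe events that tend to occur together (if X occurs, Y is likely to occur). Two measures evaluate a rule: </li></ul></ul></ul><ul><ul><ul><ul><li>Support: the fraction of all transactions that contain every item in the rule </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Confidence: the fraction of transactions containing X that also contain Y </li></ul></ul></ul></ul>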
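Support and confidence can be computed directly from a list of transactions. The sketch below uses the standard definitions (support is the share of transactions containing an itemset; confidence of "X implies Y" is support of X and Y together divided by support of X). The basket contents are made-up sample data, and the function names are chosen for this example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing antecedent, the share that also contain consequent."""
    base = support(transactions, antecedent)
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base if base else 0.0


# Made-up market baskets for illustration only.
baskets = [
    frozenset({"milk", "bread"}),
    frozenset({"milk", "bread", "diapers"}),
    frozenset({"diapers", "beer"}),
    frozenset({"milk", "diapers", "beer"}),
    frozenset({"bread"}),
]

rule_support = support(baskets, {"diapers", "beer"})          # 2 of 5 baskets
rule_confidence = confidence(baskets, {"diapers"}, {"beer"})  # 2 of the 3 diaper baskets
print(rule_support, rule_confidence)
```

A rule with high confidence but very low support applies to few transactions, so both measures are usually checked together.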
  13. 13. How Does Data Mining Work? <ul><li>Deployment </li></ul><ul><ul><li>Select the best model from the previous phase and apply it to new data in order to generate predictions or estimates of the expected outcome. </li></ul></ul>
  14. 14. Applications for Data Mining? <ul><li>Retail Market Basket Analysis </li></ul><ul><li>Business Intelligence </li></ul><ul><li>Medicine </li></ul><ul><li>Law Enforcement </li></ul>
  15. 15. Applications for Data Mining? <ul><li>Retail Market Basket Analysis </li></ul><ul><ul><li>Online retailers suggest other products based on what other customers have purchased </li></ul></ul><ul><ul><li>Merchandising based on what items customers purchase together </li></ul></ul><ul><ul><ul><li>Milk and bread </li></ul></ul></ul><ul><ul><ul><li>Diapers and beer </li></ul></ul></ul>
  16. 16. Applications for Data Mining? <ul><li>Business Intelligence </li></ul><ul><ul><li>Business Intelligence tools allow businesses to gather, store, access and analyze corporate data to aid in the decision-making process. </li></ul></ul><ul><ul><ul><li>Customer Profiling </li></ul></ul></ul><ul><ul><ul><li>Inventory and Distribution Analysis </li></ul></ul></ul><ul><ul><ul><li>Market Research and Segmentation </li></ul></ul></ul>
  17. 17. Applications for Data Mining? <ul><li>Medicine </li></ul><ul><ul><li>Data mining can be used to find combinations of prescription drugs that can have harmful interactions or side effects. </li></ul></ul>
  18. 18. Applications for Data Mining? <ul><li>Law Enforcement </li></ul><ul><ul><li>Law enforcement agencies are using data mining to help identify terrorists. </li></ul></ul>
  19. 19. Issues Surrounding Data Mining <ul><li>Privacy Concerns </li></ul><ul><li>Data Dredging </li></ul>
  20. 20. Issues Surrounding Data Mining <ul><li>Privacy Concerns </li></ul><ul><ul><li>Multi-state Anti-Terrorism Information Exchange (MATRIX) </li></ul></ul><ul><ul><ul><li>Massive collection of non-publicly available, personal data managed by a private Florida company. </li></ul></ul></ul>
  21. 21. Issues Surrounding Data Mining <ul><li>Privacy Concerns </li></ul><ul><ul><li>Government agencies failed to properly implement privacy rules for data mining. </li></ul></ul><ul><ul><ul><li>Lapses by the Dept. of Agriculture, FBI, IRS, Small Business Administration and State Department increased the risk of data exposure. </li></ul></ul></ul>
  22. 22. Issues Surrounding Data Mining <ul><li>Data Dredging </li></ul><ul><ul><li>The practice of imposing patterns on data where none exist. </li></ul></ul>
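A small experiment shows how data dredging arises. The sketch below generates a random target and many random, unrelated features, then reports the strongest correlation found among them. The sample size, the number of features, and the fixed seed are arbitrary choices for this illustration; the point is only that searching enough candidates tends to turn up an apparently strong pattern in pure noise.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n_samples, n_features = 10, 200

# The target and every feature are independent random noise.
target = [rng.random() for _ in range(n_samples)]
features = [[rng.random() for _ in range(n_samples)] for _ in range(n_features)]

# Searching all features for the best match is the "dredging" step.
strongest = max(abs(pearson(f, target)) for f in features)
print(f"strongest correlation found in noise: {strongest:.2f}")
```

Validating a discovered pattern on data that was not used in the search is one common guard against this effect.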
  23. 23. Conclusions <ul><li>Data Mining is a powerful tool with real-world applications </li></ul><ul><li>But... Data Mining must be used carefully </li></ul>
  24. 24. References <ul><ul><li>Silberschatz, Korth, Sudarshan. 2006. Database System Concepts, 5th Ed. New York, NY: McGraw Hill </li></ul></ul><ul><ul><li>Wikipedia.com. 2006. (http://en.wikipedia.org/wiki/Data_mining) </li></ul></ul><ul><ul><li>Thearling.com. 2006. (http://www.thearling.com) </li></ul></ul><ul><ul><li>Small Business Computing.com. 2006. (http://sbc.webopedia.com/TERM/B/Business_Intelligence.html) </li></ul></ul>
