Session 10 data


  1. 1. using data strategically (some materials adopted from David Schuff)
  2. 2. why data?
  3. 3. Key Bank saved $500,000 by improving the direct mailing for its Home Equity Loan program using data mining and a data warehouse.
  4. 4. The airline industry can forecast at the seat level for each flight to perform “yield” management.
  5. 5. learn insights such as “male customers over 30 buy a 6-pack of beer and disposable diapers at the same time around 2-4 am”
  6. 6. Progressive Insurance can offer usage-based insurance plans using their database
  7. 7. Systems: ESS – Executive Support Systems; DSS – Decision Support Systems; MIS – Management Information Systems; TPS – Transaction Processing Systems. Levels: Strategic Management, Tactical Management, Business Operations.
  8. 8. data becomes the basis of these different levels of decision making
  9. 9. two different types of data usage: transactional vs. informational
  10. 10. report (using query) vs. decision-making (mining)
  11. 11. What is a database?
          • Structured collection of data items
          • Types of Database Management Systems (DBMS)
            – Hierarchical
            – Network
            – Relational
              • The one most often seen
              • Access, MS SQL Server, Oracle, DB2
  12. 12. What is a Relational Database?
          • A set of two or more tables related to each other through key fields
          • Key field
            – A field on which a table can be sorted (indexed)
          • Primary Key
            – Field which uniquely identifies a record
            – Why have a primary key?
              • There may be many people named John Smith, so how do you tell them apart?
              • Use something which is unique, like a social security number
              • Social security number is a common key field
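To make the primary-key idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `customers` and `loans` tables, their columns, and the sample rows are hypothetical and not from the slides; a numeric `customer_id` stands in for the social security number so the example carries no personal data.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,  -- uniquely identifies each record
        name        TEXT NOT NULL
    );
    CREATE TABLE loans (
        loan_id     INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount      REAL NOT NULL
    );
""")

# Two different people share a name; only the key tells them apart.
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "John Smith"), (2, "John Smith")])
conn.executemany("INSERT INTO loans VALUES (?, ?, ?)",
                 [(10, 1, 5000.0), (11, 2, 12000.0), (12, 2, 3000.0)])

# The two tables are related through the key field customer_id.
rows = conn.execute("""
    SELECT c.customer_id, c.name, SUM(l.amount)
    FROM customers AS c
    JOIN loans AS l ON l.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()
print(rows)

# Reusing an existing primary key value is rejected by the database.
try:
    conn.execute("INSERT INTO customers VALUES (1, 'Jane Doe')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

Grouping by `customer_id` rather than by `name` is the point of the sketch: grouping by name would merge the two John Smiths into one row.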
  13. 13. Data-Driven DSS (a.k.a. Business Intelligence)
          • Also known as Data Mining and OLAP (Online Analytical Processing)
          • Finding non-obvious patterns in data
          • Data Mining generally implies using statistical techniques to find patterns and relationships in large databases
            – correlation analysis
            – clustering
  14. 14. Operational and informational data stores
  15. 15. • Relational databases are optimized for efficiency in data storage
            – OLTP: Online transaction processing
          • Dimensional databases are optimized for efficiency in data retrieval
            – OLAP: Online analytical processing
            – MOLAP: Multidimensional OLAP
              • Stored in cubes that can be easily retrieved and aggregated
          • ROLAP: Relational OLAP
            – “Fakes” MOLAP-style aggregation using a relational database
  16. 16. Data warehouse implementation: The data cube. A data cube stores its data in a single table. That table is organized along dimensions. This cube has three dimensions: store, product, and time.
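As a rough illustration of a cube organized along store, product, and time, the sketch below keeps one cell per (store, product, time) combination in a plain Python dictionary and aggregates along chosen dimensions. The sample rows and the `build_cube` and `roll_up` helpers are hypothetical; a real MOLAP engine stores and indexes cubes very differently.

```python
from collections import defaultdict

# Hypothetical fact rows: (store, product, quarter, units_sold)
facts = [
    ("CA", "light bulb", "2000-Q1", 120),
    ("CA", "light bulb", "2000-Q2", 90),
    ("CA", "battery", "2000-Q1", 40),
    ("NY", "light bulb", "2000-Q1", 75),
    ("NY", "battery", "2000-Q1", 60),
]

DIMENSIONS = ("store", "product", "time")


def build_cube(rows):
    """Store one cell per (store, product, time) combination."""
    cube = defaultdict(int)
    for store, product, time, units in rows:
        cube[(store, product, time)] += units
    return dict(cube)


def roll_up(cube, keep):
    """Sum the cells over every dimension that is not listed in `keep`."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for cell, units in cube.items():
        totals[tuple(cell[i] for i in idx)] += units
    return dict(totals)


cube = build_cube(facts)
by_store = roll_up(cube, ["store"])
by_product_time = roll_up(cube, ["product", "time"])
print(by_store)
print(by_product_time)
```

Rolling up to `["store"]` collapses the product and time dimensions; keeping `["product", "time"]` collapses only the store dimension.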
  17. 17. SQL (OLAP) query
          • How many light bulbs did we sell in the 1st Qtr of 2000 in California vs. New York?
          Data mining query
          • How do the buyers of light bulbs in California and New York differ?
          • What else do the buyers of light bulbs in California buy along with light bulbs?
          • Which sales regions had anomalous sales in the 1st Qtr of 2000?
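The light-bulb question can be answered with a single aggregate query, which is what separates it from the open-ended mining questions. The following is a sketch only: the `sales` table, its columns, and the figures are invented for illustration, and it runs against an in-memory SQLite database rather than a real data warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sales table; dates are ISO strings so they compare correctly as text.
conn.execute(
    "CREATE TABLE sales (state TEXT, product TEXT, sale_date TEXT, quantity INTEGER)"
)
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", [
    ("CA", "light bulb", "2000-01-15", 30),
    ("CA", "light bulb", "2000-03-02", 20),
    ("CA", "light bulb", "2000-04-10", 99),  # outside the 1st quarter
    ("NY", "light bulb", "2000-02-20", 45),
    ("NY", "battery", "2000-02-21", 10),     # different product
    ("TX", "light bulb", "2000-01-05", 70),  # different state
])

# Light bulbs sold in the 1st quarter of 2000, California vs. New York.
query = """
    SELECT state, SUM(quantity) AS units
    FROM sales
    WHERE product = 'light bulb'
      AND state IN ('CA', 'NY')
      AND sale_date BETWEEN '2000-01-01' AND '2000-03-31'
    GROUP BY state
    ORDER BY state
"""
q1_light_bulbs = conn.execute(query).fetchall()
print(q1_light_bulbs)
```

The query states exactly which rows and which total are wanted. The mining questions on the slide have no such predefined answer shape, which is why they call for statistical techniques instead.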
  18. 18. Different Techniques
          • Association:
            – What other products should the store stock up on if the store has a sale on Home Electronics?
            – Attached mailing in direct marketing
          • Sequence:
            – Analysis on click-stream
            – Medical research
          • Time-series clustering
            – Find customers with similar patterns of telephone usage
            – Determine products with similar selling patterns
            – Find stocks with similar price movements
          • Classification
            – Credit rating
            – Target marketing
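Association analysis, the kind of mining behind the beer-and-diapers insight on slide 5, can be sketched by counting how often items appear together in the same basket. The baskets below are made up, and `support` and `confidence` are hypothetical helper names; production tools use algorithms such as Apriori rather than this brute-force pair count.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"beer", "chips"},
    {"diapers", "wipes"},
    {"beer", "diapers", "wipes"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Sort so each pair is always counted under the same key.
    pair_counts.update(combinations(sorted(basket), 2))


def support(a, b):
    """Fraction of all baskets that contain both items."""
    return pair_counts[tuple(sorted((a, b)))] / len(baskets)


def confidence(antecedent, consequent):
    """Of the baskets containing `antecedent`, the fraction that also contain `consequent`."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]


print(support("beer", "diapers"))
print(confidence("beer", "diapers"))
print(confidence("chips", "beer"))
```

High confidence for a rule such as chips to beer suggests stocking or promoting the two together, which is the "what else should the store stock up on" question from the slide.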
  19. 19. A Data Architecture Schema (diagram labels: Divisional DB, Corporate Data Warehouse, Cleaning, Collecting, ERP system, Data Mining, OLAP, Data Visualization)
  20. 20. key issues: • reliability • scalability • security • speed • data integrity • availability
  21. 21. data may be boring, but it is the most critical element in IT architecture