Data Mining and Knowledge Discovery Outline
1. Outline
Data Mining and Knowledge Discovery
„We are drowning in data, but we are starving for knowledge“
Part 2: Clustering in Large Databases
- Hierarchical Clustering
- Divisive Clustering
- Density-based Clustering
Erik Kropat
University of the Bundeswehr
Munich, Germany
2. Why “Data Mining”?
• Companies are collecting massive amounts of data on customers,
operations, and the competitive landscape.
Firms can gain a competitive advantage from these data
• But, there is far too much data
− Online shops record purchase behaviours for millions of customers
(sometimes with hundreds of features for each customer)
− Phone companies keep info on hundreds of millions of accounts
(each with thousands of transactions)
− Databases can often be hundreds of terabytes in size
(this will be peanuts in the future).
3. Why “Data Mining”?
„We are drowning in data, but we are starving for knowledge“
(John Naisbitt)
4. Knowledge Discovery in Large Databases
Process of finding valuable and useful patterns in datasets
5. Analysis of data sets from …
• businesses & investments
• finance & economics
• science & technology
• bioinformatics
• telecommunication
… or more complex data sets
• multimedia & sound
• images & video
• automatic news analysis
• social media analysis.
6. What are the data sources?
Consumer data
− Credit card transactions data
− Supermarket transactions data
− Loyalty cards
− Web server logs
− Social media
Variety of features
− Name and address
− History of shopping and purchases
− Demographics
− Credit rating
− Quality & market share of products
9. Key Tasks
Decision Trees
Association Rule Learning
Neural Networks
Digital Forensics
Automatic Derivation of Ontologies
10. Retail
• Customer segmentation
Identify purchase patterns of „typical“ customers
Targeted advertising, customized pricing, cost-effective promotions
• Market basket analysis
Identify the purchase behaviour of groups of customers
• Sales promotions
Identify likely responders to sales promotions
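Market basket analysis can be illustrated with a minimal sketch, assuming each purchase is recorded as a set of items. The transactions and the helper names `pair_support` and `confidence` are hypothetical and chosen only for illustration; they are not taken from the slides. The sketch counts how often two items are bought together (support) and how often one item is bought given another (confidence).

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions; each is the set of items in one basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"beer", "bread"},
    {"butter", "milk"},
]

def pair_support(baskets):
    """Return the fraction of baskets that contain each item pair."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair always appears in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}

def confidence(baskets, antecedent, consequent):
    """Estimate how often `consequent` is bought when `antecedent` is bought."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)

support = pair_support(transactions)
print(support[("bread", "milk")])
print(confidence(transactions, "butter", "milk"))
```

On real data the number of item pairs grows quickly, so dedicated association rule algorithms prune rare items first; this sketch skips that step.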
11. Banking
• Credit rating
Given a large number of names, which persons are likely
to default on their credit cards?
• Fraud detection
− Credit card fraud detection
− Network intrusion detection
12. Telecommunications
Companies are facing escalating competition and are forced to
aggressively market special pricing programs aimed at retaining
existing customers and attracting new ones.
• Call detail record analysis
Identify customer segments with similar use patterns.
Offer attractive pricing and feature promotions.
• Customer loyalty / customer churn management
Some customers repeatedly „churn“ (switch providers).
Identify those who are likely to switch or who are likely to remain loyal.
Companies can target their spending on customers who will produce the most profit.
• Set pricing strategies in a highly competitive market.
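Finding customer segments with similar use patterns, as in the call detail record analysis above, can be sketched with a small single-linkage hierarchical clustering. This is only an illustration under stated assumptions: the usage figures, the two features (minutes and messages per month), and the function name `single_linkage` are hypothetical and do not come from the slides.

```python
from math import dist

# Hypothetical per-customer usage: (minutes per month, messages per month).
usage = {
    "A": (100.0, 20.0),
    "B": (110.0, 25.0),
    "C": (900.0, 300.0),
    "D": (950.0, 280.0),
    "E": (105.0, 22.0),
}

def single_linkage(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(dist(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    return [sorted(c) for c in clusters]

segments = single_linkage(usage, 2)
print(segments)
```

With these made-up numbers the light users and the heavy users end up in separate segments. In practice the features would usually be rescaled first so that no single feature dominates the distance, and the pairwise search shown here would be too slow for millions of accounts.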
13. Big Data is Big Business
Companies are using their data sets to aim their services
and products with increasing precision.
Business Intelligence
− SAP AG is a German global software corporation
that provides enterprise software applications.
− SAP AG is one of the largest enterprise software companies.
− In October 2007, SAP AG announced a $6.8 billion deal to acquire „Business Objects“.
− Since 2009, „Business Objects“ has been a division of SAP AG rather than a separate company.