Data Mining_SPSS.doc.doc



  1. 1. Data Mining Product Features

The Product Features table includes many of the mainstream data mining products, focusing on those with which Exclusive Ore® has had hands-on experience. While we attempt to keep the table as current as possible, we encourage you to use the hot-links to the vendor sites for the latest information on features and functionality.

Key: NN=Neural Net; Tree=Decision Tree; Naïve Bayes; k-Mns=k-Means; k-NN=k-Nearest Neighbor; Stats=Traditional Statistical Techniques; Pred=Prediction; Time Series; Clust=Clustering; Assoc=Association; Win 32=Windows 95/98/NT; UNIX; Par=Parallel Scalability (in at least one OS); API SDK=API or SDK available; SQL Ext=Uses Special SQL Extensions
  2. 2. Product Features table. Columns: Company; Product; NN; Tree; Naïve Bayes; k-Mns; k-NN; Stats; Pred; Time Series; Clust; Assoc; Win 32; UNIX; Par; API SDK; SQL Ext. Companies and products listed:

• Angoss International Ltd.: KnowledgeSEEKER, KnowledgeSTUDIO
• Business Objects: BusinessMiner
• Cognos Incorporated: 4Thought, Scenario
• Fair, Isaac/HNC Software: DataBase Mining Marksman
• Informix/Red Brick Software Inc.: Red Brick Data Mine
• International Business Machines: Intelligent Miner
• Accrue Software: Decision Series
• NeuralWare: NeuralSIM
• Oracle Corp.: Darwin
• Salford Systems: CART
• SAS Institute: Enterprise Miner
• SPSS, Inc.: Answer Tree, Clementine, Neural Connection
• Unica Technology: Pattern Recognition Workbench, Model 1

[Table cells: the per-product "Yes" entries cannot be aligned to their columns in this copy.]

  3. 3. Data Mining Improves Decision Making
  4. 4. Data mining uncovers patterns in data using predictive techniques. These patterns play a critical role in decision making because they reveal areas for process improvement. Using data mining, organizations can increase the profitability of their interactions with customers, detect fraud, and improve risk management. The patterns uncovered using data mining help organizations make better and timelier decisions.

Data mining helps SPSS customers solve business problems

SPSS data mining solutions and services have enabled hundreds of organizations to achieve remarkable results in many areas. For example, organizations have used data mining to:

• Boost sales by 50 percent and reduce marketing costs by 30 percent by uncovering cross-selling and "rollover" sales opportunities. (HSBC Bank plc)
• Triple online profits by improving personalization features. (Sofmap)
• Secure an additional $50 million in revenue by using an accurate propensity model to target offers. (Standard Life)
• Improve the response rate of direct mail campaigns by 100 percent. (BT)

Using data mining tools

Most analysts separate data mining software into two groups: data mining tools and data mining applications. Data mining tools provide a number of techniques that can be applied to any business problem. Data mining applications, on the other hand, embed techniques inside an application customized to address a specific business problem.

Whether or not we are aware of it, our daily lives are influenced by data mining applications. For example, almost every financial transaction is processed by a data mining application to detect fraud.

Both data mining tools and data mining applications are valuable. Increasingly, organizations are using them together in an integrated environment for predictive analytics.

So what do data mining tools add? Data mining tools are used to ensure flexibility and the greatest accuracy possible. Essentially, data mining tools increase the effectiveness of data mining applications. Since no two organizations or data sets are alike, no single technique delivers the best results for everyone. Data mining tools deliver in-depth techniques, and they also deliver the flexibility to combine techniques to improve predictive accuracy.

Because data mining tools are so flexible, a set of data mining guidelines and a data mining methodology have been developed to help guide the process. The Cross-Industry Standard Process for Data Mining (CRISP-DM) helps ensure that your organization's results with data mining tools are timely and reliable. This methodology was created in conjunction with practitioners and vendors to supply data mining practitioners with checklists, guidelines, tasks, and objectives for every stage of the data mining process.

Clementine data mining transforms data into actionable results

Clementine, the SPSS data mining workbench, enables your organization to quickly develop predictive data mining models and deploy those models into your organization's operations, improving decision making. Using Clementine's powerful, visual data mining interface and your business expertise, you can quickly interact with your data and begin discovering patterns you can use to change your organization for the better.
  5. 5. The latest data mining advances: text mining and Web mining

Recent advances have led to the newest trends in data mining: text mining and Web mining. These two data mining technologies open a rich vein of customer data that was previously unusable, in the form of textual comments from survey research and log files from Web servers. Applying data mining to these data adds richness and depth to the patterns already uncovered through your data mining efforts.

HSBC Bank plc

Situation

Headquartered in London, HSBC Holdings plc is one of the largest banking and financial services organizations in the world. The HSBC Group has more than 9,700 offices in 77 countries. HSBC Bank plc is the U.K. member of the HSBC banking group. Its card services division manages the credit card products. The card services analysis unit uses SPSS software for marketing campaign evaluation and customer segmentation strategies.

About 125 million credit card transactions are made each year by HSBC Bank cardholders. Each transaction carries an income fee payable by the receiving bank to the cardholder's bank. The rules on fee rates applied are somewhat complex. A further complicating factor is that banks processing the transactions from the retailers may use different processing systems.

Challenge

The credit card market is one of the most volatile banking product markets in the U.K., due in part to an influx of new entrants over the past five years. These range from single-product suppliers looking to extend their market to vertical industries diversifying their product range.

  6. 6. Barriers to entry are relatively low, so the credit card tends to be used as a "foot-in-the-door" product. This has increased the need for traditional U.K. credit card issuers such as HSBC to develop appropriate retention and income generation strategies.

Bob Howlett, manager of business analysis, card services, HSBC Bank plc, explains, "We believed that incorrect fees were being received for some transactions. Our task was to verify this and also set up a process which would ensure that the correct transaction fee was being applied on all transactions."

HSBC Bank plc decided to evaluate SPSS after internal software and basic database software had proved ineffectual.

Solution

A member of the card services analysis unit with extensive experience in handling large volumes of data was asked to help solve this challenge. A daily transaction feed was sorted using SPSS for Windows software and, within two weeks, income shortfalls had been identified. Further manipulation of the data indicated substantial variances between the income received and the income that should have been received.

"We anticipate that recovery of this underpayment will generate in excess of one million pounds (~US$1.43 million) per annum," explained Mr. Howlett.

Results

Using SPSS software, HSBC Bank plc was able to analyze millions of bytes of data efficiently to recover lost money. The people who analyze customer-level databases are able to apply their knowledge to sort high volumes of individual transactions and deliver the required outputs with ease.
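
The fee check described in the HSBC case can be pictured with a minimal sketch. The source does not give the actual fee rules (it says only that they are somewhat complex), so the rate table, field names, and sample transactions below are all hypothetical. The sketch shows only the shape of the comparison between the fee that should have been received and the fee that was received.

```python
from dataclasses import dataclass

# Hypothetical fee rates by transaction category (fraction of the amount).
# The real rules are not given in the source.
FEE_RATES = {"retail": 0.010, "travel": 0.012, "online": 0.015}


@dataclass
class Transaction:
    txn_id: str
    category: str
    amount: float        # transaction value
    fee_received: float  # fee actually paid to the cardholder's bank


def expected_fee(txn: Transaction) -> float:
    """Fee that should have been received under the assumed rate table."""
    return round(txn.amount * FEE_RATES[txn.category], 2)


def find_shortfalls(txns, tolerance=0.005):
    """Return (txn_id, shortfall) for every underpaid transaction."""
    shortfalls = []
    for txn in txns:
        gap = round(expected_fee(txn) - txn.fee_received, 2)
        if gap > tolerance:
            shortfalls.append((txn.txn_id, gap))
    return shortfalls


# Hypothetical daily feed of three transactions.
daily_feed = [
    Transaction("T1", "retail", 100.00, 1.00),  # paid in full
    Transaction("T2", "travel", 250.00, 2.50),  # underpaid
    Transaction("T3", "online", 80.00, 1.20),   # paid in full
]

shortfalls = find_shortfalls(daily_feed)
total_shortfall = round(sum(gap for _, gap in shortfalls), 2)
print(shortfalls, total_shortfall)
```

Summing the per-transaction gaps over a daily feed is one plausible way to arrive at an aggregate shortfall figure of the kind quoted above. A production process would also have to handle the differing processing systems the case study mentions.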
Sofmap

Situation
  7. 7. Sofmap Company, Ltd., Tokyo, is one of Japan's top personal computer and software retailers, with 40 retail stores located throughout the country.

Challenge

Sofmap managers believed that many of their customers had difficulty making hardware and software purchasing decisions, which was hindering online sales.

Solution

Sofmap used SPSS Inc.'s Clementine data mining solution to build an engine that recommends appropriate products based on customers' profiles, which are based on information gathered during the online registration process and from past transactions.

Results

• Page views increased 67 percent per month after the recommendation engine went live
• Profits tripled in 2001, as sales increased 18 percent vs. the same period of the previous year

Sofmap first began selling computers over the Internet in 1995, establishing the "Sofmap Virtual Store" to supplement "Sofmap Hyper," their mail order catalog. Over the years, the company's e-commerce sales have grown dramatically. However, Sofmap executives believed that they were losing sales from less computer-savvy individuals who didn't have the expertise to select products that met their needs. The company addressed this problem by developing a recommendation engine that provided personalized recommendations to their online customers.

"We already had a large database with information on over two million customers," said Nobuyuki Matsuda, strategic planning manager for Sofmap. "We wanted to analyze the demographics, individual characteristics and purchase patterns of these customers in order to provide visitors with information that will help them make wise purchases."

The recommendation engine, the first of its kind in the Japanese market, attracted a large amount of publicity and buzz among users.

Sofmap needed to quickly and easily mine the information that was buried in the customer and transactional databases.
The company wanted a software package that their marketing staff, who have the best understanding of customer behavior, could use to perform their own analysis. Sofmap knew this would save time and money by eliminating the need to have programmers, who were already very busy, involved in the early stages of the project. Clementine's ease of use enabled Sofmap's marketers to analyze transaction data and information from their customer database and generate the business rules used in the recommendation engine. "We selected Clementine because it enables users without computer programming knowledge to develop very powerful analytical solutions," explained Matsuda. "This approach is so much more efficient than to have the marketers communicate the goals of the analysis to the programmers. Now the marketers can do the entire job themselves, which saves a considerable amount of time."
  8. 8. Sofmap marketers collected a wide range of information for use in building models of purchasing behavior, including:

• Demographic information such as age, sex and occupation
• Survey data collected from previous events and seminars
• Purchase information from over-the-counter sales
• Transaction data, including both page views and purchase data

Using Clementine to analyze this information, the marketing staff clustered customers based on what it called a "digital lifestyle model." The model was then used to construct both the business rules and the recommendation engine, as well as to personalize the site for returning customers.

The engine works by comparing the customer's profile, registered on their "My Sofmap" page, to the predefined digital lifestyle model. Recommendations are made according to which digital lifestyle model the profile matches. For example, the engine can:

• Select news articles and featured products to be displayed on the customer's personal Sofmap home page
• Select specific products to highlight when the customer visits a particular product section
• Recommend complementary products when a customer makes a purchase

The marketers also identified their most profitable customers, based on shopping frequency and purchase size. This enabled the new recommendation engine to focus on these customers in particular.

Page views increased 67 percent per month after recommendation engine went live

During the first month that the recommendation engine was available, site traffic increased from a typical 18 million page views per month to a new sustained level of more than 30 million views per month. Sofmap managers said that the rise in traffic could be almost entirely attributed to the new recommendation engine.
Overall, the recommendations increased the "stickiness" of the site from 7.8 to 15 page views per session.

Profits tripled in 2001, as sales increased 18 percent vs. the same period of the previous year

Even more important, sales have increased significantly since the recommendation engine went live, driven by the fact that consumers purchased many more items. Prior to the go-live date, sales saw an annual growth rate of approximately 270 percent. After the recommendation engine went live, the growth rate immediately jumped to 320 percent. According to Sofmap, achieving this substantial increase in sales without any additional promotional expenditure has tripled the profitability of the site.
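
As a quick check on the traffic figures quoted for Sofmap, the short sketch below computes the relative change implied by the before and after numbers in the text. The function name is ours. Because the source gives the new traffic level as "more than 30 million," the first result is a lower bound.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Figures quoted in the case study.
traffic_change = percent_change(18e6, 30e6)  # monthly page views
stickiness_change = percent_change(7.8, 15)  # page views per session

print(f"Monthly page views: +{traffic_change:.1f}%")        # +66.7%
print(f"Page views per session: +{stickiness_change:.1f}%")  # +92.3%
```

The first figure is consistent with the 67 percent increase in page views stated in the results. The second shows that page views per session nearly doubled.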
  9. 9. Standard Life

Situation

Standard Life is one of the world's leading mutual financial services companies. Its mutual status is a key factor in its success. This status brings many benefits, the chief one being that there are no shareholders to satisfy, only customers. All the firm's actions are therefore driven by the need to benefit its customers. The Standard Life group includes the Standard Life Assurance Company, Standard Life Bank, Standard Life Healthcare and Standard Life Investments.

Challenge

Standard Life Bank launched its Freestyle Mortgage product in January 1999 with extensive TV and press advertising campaigns. In recent months a number of similar products have appeared from rival providers, so it is essential that Standard Life Bank both consolidate and continue to expand its share of the mortgage market. To achieve these aims the company must:

• identify the key attributes of clients attracted to the mortgage offer,
• cross-sell Standard Life Bank products to the clients of other Standard Life companies,
• develop a remortgage model which could be deployed on the group Web site and
• examine the profitability of the mortgage business being accepted by Standard Life Bank.

A major part of the project was to develop models that could identify customer characteristics relevant to any mortgage product. Donald MacDonald, customer data analyst at Standard Life, further explains, "Our vision was to increase both the speed at which we build our models and the sophistication of those same models. This ultimately leads to improved customer communications as well as greater returns on the bottom line."

Solution

With SPSS data mining software, Standard Life can undertake a growing number of projects. With the aid of SPSS, they were able to build a propensity model for the remortgage offer showing the types of clients attracted to this product. This model is easily portable and applicable to other client databases within the Standard Life group.
  10. 10. Using the propensity models, a remortgage mailing campaign was planned and executed by the Customer Data Analysis team in conjunction with Standard Life Bank. The mailing included a randomly selected control cell. The model allowed the bank to focus its efforts on the best prospects for the remortgage product and create scores for each customer. These scores enabled them to achieve more targeted direct mail and to score prospects with similar characteristics accessing the Web site.

Results

Overall, Standard Life's data mining enabled it to better understand the characteristics of its mortgage clients so it could more accurately search for potential new clients. Also, the bank now has the capability to profile incoming prospects quickly and personalize their Web experiences accordingly. As a result of its data mining efforts, Standard Life:

• Built a propensity model for the Standard Life Bank mortgage offer, identifying key customer types, that can be applied across the whole group prospect pool
• Discovered the key drivers for purchasing a remortgage product
• Achieved, with the model, a response nine times greater than that of the control group
• Secured £33 million (approx. $47 million) worth of mortgage application revenue

Donald MacDonald sums up a highly successful project by explaining, "The effective targeting of our outbound communications achieved through data mining consistently produces significant cost savings for the company. I can honestly say that Clementine has already paid for itself many times over and will continue to do so for many years to come."

BT
  11. 11. Challenge

To obtain the greatest value from its marketing budget, BT (British Telecommunications) needed to identify customers' propensity to purchase and calculate their likely comparative value once they became customers. After creating accurate customer profiles, BT intended to develop new products targeted to specific customer groups.

Solution

BT selected Clementine to analyze data and build exploratory models for its "Business Highway" campaign, which was aimed at small business customers. The expected results? A higher response rate to marketing campaigns, increased product revenues, and an even greater market share for the company.

Results

• Finding hidden patterns with data mining
• Analyzing data and building models
• Identifying a targeted "best prospects" list
• Improving direct mail response rate by 100 percent
• The return on investment

The once-peaceful telecommunications industry has turned cutthroat. BT, a former monopoly, is a leading supplier of local, national, and international phone and data services in the United Kingdom. With annual sales of $29 billion, it also competes with other U.K. telecommunications companies. To retain its customers, gain new customers, and maximize sales, the company needed facts about exactly who was buying its products and services.

To identify these customers, the company established a customer and campaign analysis team, headed by Senior Consultant Stephen O'Brien, within its business connections division. The team's first assignment was to model customer profiles for BT's Business Highway product, which provides small business customers with three telephone numbers, one standard and two digital, on a single line. The launch included a major direct mail campaign and national media coverage.

Finding hidden patterns with data mining

To mine the data sample and find underlying patterns and trends, BT selected Clementine, SPSS' rapid modeling environment. O'Brien called Clementine "a data mining toolkit," because it offered his team a wide range of analytical methods, including clustering, neural networks, association rules, and decision trees. It also easily handled common data problems, such as outliers, missing data, and low-value data.

Analyzing data and building models
  12. 12. The team used Clementine as its primary product for data analysis and experimental modeling. During data analysis, the team employed Clementine to identify data quality issues, become familiar with the data and data distribution, and eliminate data attributes not strongly associated with the purchase of Business Highway. Then it measured the predictive strength of individual data attributes in relation to the customer's propensity to buy the product. For example, two-digit district codes, a geographic indicator, were clearly linked to response and purchase data.

After the analyses, the team quickly built and tested a series of experimental models using Clementine decision trees. The product's greatest strength, said O'Brien, is that "you don't get lost in the data mining project. Clementine reduces the cost of failure by letting you try out lots of ideas quickly and eliminate them from consideration. You can build many explanatory models over a few days."

Identifying a targeted "best prospects" list

"The main output of Clementine is insights about the data (that's what data mining's all about) and visual representations of those insights," O'Brien said. "Our deliverables to sales and marketing were lists of customers and charts showing why these were the customers they should speak to about the Business Highway product," he added.

Improving direct mail response rate by 100 percent

"The Business Highway project raises issues about how you can benefit from using data mining in business. With Clementine, the exploratory data analysis and visualization we were able to do up front enabled us to develop satisfactory customer selection criteria. Even before completing the final models, we were able to surpass our original target, and increase the campaign response rate by 100 percent," O'Brien said.

More work remains. Next, the team plans to use Clementine to identify customers who have the greatest profit potential and those customers who demand lots of attention but do not buy. In the future, the team may also try to determine consistent patterns for customer defection, often referred to as "churn."

The return on investment

Successful customer profiling requires business knowledge, the right data, and the right products. BT's modeling program enables it to target customers over the life of products and campaigns, identify trends in the changing marketplace, and improve its penetration in different market sectors. And at every step, Clementine will support marketing with speedy, statistically sound analyses. The payoff? As the Business Highway project shows, better customers and higher sales.
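
The BT case describes measuring the predictive strength of individual attributes, such as district code, against response. The source does not say how Clementine computes this, so the sketch below substitutes a deliberately simple measure: the response rate for each attribute value divided by the overall response rate (often called lift). The mailing records and district codes are hypothetical.

```python
from collections import defaultdict

# Hypothetical mailing records: (district_code, responded).
records = [
    ("AB", True), ("AB", True), ("AB", False), ("AB", False),
    ("CD", True), ("CD", False), ("CD", False), ("CD", False),
    ("EF", False), ("EF", False), ("EF", False), ("EF", False),
]


def response_lift(records):
    """Map each attribute value to its response rate divided by the overall rate."""
    sent = defaultdict(int)
    responded = defaultdict(int)
    for value, did_respond in records:
        sent[value] += 1
        responded[value] += int(did_respond)
    overall = sum(responded.values()) / sum(sent.values())
    if overall == 0:
        # Nobody responded, so no value can outperform the average.
        return {value: 0.0 for value in sent}
    return {value: (responded[value] / sent[value]) / overall for value in sent}


lift = response_lift(records)
best_first = sorted(lift, key=lift.get, reverse=True)
print(best_first, lift)
```

A lift of 2.0 means customers with that value responded at twice the overall rate. Selecting only such customers for a mailing would, in this toy data, double the response rate, which is the kind of improvement the case study reports.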
  13. 13. Predictive Web insight for more effective action

Data mining is a proven technology for advanced analysis that detects key patterns and trends. But the time-consuming complexity of preparing Web data with the business context necessary for data mining has hampered its use in Web analysis, until now.

The new standard in Web analysis

Web Mining for Clementine is an add-on module that makes it easy for analysts to perform ad hoc predictive Web analysis within Clementine's intuitive visual workflow interface. By bringing together the leading technologies for both Web analytics and data mining, along with SPSS' more than 35 years of analytical experience, Web Mining for Clementine sets a new standard for Web analysis.

Easily transform raw Web data into analysis-ready business events using the integrated Web mining source node, powered by proven NetGenesis Web analytics technology. These business events are available within predictive Web analysis applications that quickly deliver actionable insight. Enable online business decision makers to take more effective action with the ability to:

• Automatically discover user segments
• Detect the most significant sequences
• Understand product and content affinities
• Predict user propensity to convert, buy, or churn
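
To give a rough sense of what "detecting the most significant sequences" could involve, the sketch below counts the most frequent contiguous page sequences across visitor sessions. It is not the Web Mining for Clementine algorithm, which the source does not describe. The session data and function name are hypothetical, and raw frequency stands in for whatever significance measure the product uses.

```python
from collections import Counter

# Hypothetical sessions: each is the ordered list of pages one visitor viewed.
sessions = [
    ["home", "search", "product", "cart", "checkout"],
    ["home", "product", "cart"],
    ["home", "search", "product"],
    ["search", "product", "cart", "checkout"],
]


def top_sequences(sessions, length=2, n=3):
    """Count contiguous page sequences of the given length across all sessions."""
    counts = Counter()
    for pages in sessions:
        for i in range(len(pages) - length + 1):
            counts[tuple(pages[i:i + length])] += 1
    return counts.most_common(n)


top = top_sequences(sessions)
for sequence, count in top:
    print(" -> ".join(sequence), count)
```

In practice the raw log lines would first have to be grouped into sessions and mapped to business events, which is the preparation step the text says the Web mining source node handles.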