Achieve Better Insight and Prediction with Data Mining


Published on

  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Achieve Better Insight and Prediction with Data Mining

  1. 1. Clementine® 11.1 – Specifications Achieve Better Insight and Prediction with Data Mining Data mining provides organizations with a clearer view of Clementine is popular worldwide with data miners current conditions and deeper insight into future events. With and business users alike because it enables you to: Clementine from SPSS Inc., your organization can conduct n Easily access, prepare, and integrate structured data data mining that incorporates many types of data, resulting and also text, Web, and survey data in deeper insight into your customers and every aspect of n Rapidly build and validate models, using the most your operations. advanced statistical and machine-learning techniques available Clementine enables your organization to strengthen n Efficiently deploy insight and predictive models on performance in a number of areas. For example, you could: a scheduled basis or in real time to the people that n Improve customer acquisition and retention make decisions and recommendations, and the n Increase customer lifetime value systems that support them n Detect and minimize risk and fraud n Reduce cycle time while maintaining quality in Clementine offers a number of unique capabilities that product development make it an ideal choice for today’s data-rich organizations. n Support scientific research Use the predictive insight gained through Clementine to guide customer interactions in real time and share that insight throughout your organization. With Clementine’s powerful data preparation, visualization, and predictive modeling capabilities, you can solve business problems faster.
  2. 2. Streamline the data mining process Choose from an unparalleled breadth of techniques Clementine’s intuitive graphical interface enables analysts Clementine offers a broad range of data mining techniques to visualize every step of the data mining process as that are designed to meet the needs of every data mining part of a “stream.” By interacting with streams, analysts application. You can choose from a number of algorithms and business users can collaborate in adding business for clustering, classification, association, and prediction, knowledge to the data mining process. Because data as well as algorithms for automated multiple modeling, miners can focus on knowledge discovery rather than time-series forecasting, and interactive rule building. on technical tasks like writing code, they can pursue “train-of-thought” analysis, explore the data more deeply, Optimize your current information technologies and uncover additional hidden relationships. Clementine is an open, standards-based solution. It integrates with your organization’s existing information From this visual interface, you can easily access and systems, both when accessing data and when deploying integrate data from textual sources, Web logs, SPSS’ results. You don’t need to move data into and out of a Dimensions survey research products, as well as data in ™ proprietary format. This helps you conserve resources, virtually any type of database, spreadsheet, or flat file— deliver results faster, and reduce infrastructure costs. including SPSS for Windows , SAS, and Microsoft Excel ® ® ® files. No other data mining solution offers this versatility. Follow a proven, repeatable process During every phase of the data mining process, Clementine Leverage all your data for improved models supports the de facto industry standard, the CRoss- Only with Clementine can you directly and easily access Industry Standard Process for Data Mining (CRISP-DM). text, Web, and survey data, and integrate these additional This means your company can focus on solving business types of data in your predictive models. SPSS customers problems through data mining, rather than on reinventing have found that using additional types of data increases a new process for every project. Individual Clementine the “lift” or accuracy of predictive models, leading to more projects can be efficiently organized using the CRISP-DM useful recommendations and improved outcomes. project manager. With the fully integrated Text Mining for Clementine® Add enterprise-level capabilities module, you can extract concepts and opinions from any Clementine can efficiently analyze the amounts of data type of text—such as internal reports, call center notes, typically generated by small to mid-sized organizations. customer e-mails, media or journal articles, blogs, and more. If your data mining needs grow in volume or complexity, And, with Web Mining for Clementine , you can discover ® SPSS makes it easy for you to move to our enterprise- patterns in the behavior of visitors to your Web site. level offering. Direct access to survey data in Dimensions products enables you to include demographic, attitudinal, and behavioral information in your models—rounding out your understanding of the people or organizations you serve.
  3. 3. Using client/server architecture, Clementine Server enables Improved reporting features multiple data analysts to work simultaneously without Clementine 11.1 includes enhanced reporting features straining computing resources. You can take advantage of that enable your organization’s analysts to illustrate in-database data mining on leading information platforms results graphically and communicate them quickly and and efficiently process large amounts of data. Clementine clearly to executive management. Server also offers additional deployment options—helping n Generate report-quality graphics via a new graphics you to extend the benefits of data mining across geographic engine or functional lines and put results in the hands of decision n Control the appearance of graphs using new graph makers. editing tools n Leverage SPSS output from within Clementine, You can further support analytical assets throughout your including extensive reports and high-quality graphics organization by using Clementine with SPSS Predictive Enterprise Services™, which enables you to centralize the Enhanced data preparation tools storage and management of data mining models and all Extensive enhancements to Clementine’s data preparation associated processes. With this platform, you can control capabilities help your organization ensure better results the versioning of your predictive models, audit who uses by providing more efficient approaches to cleaning and and modifies them, provide full user authentication, auto- transforming data for statistical analysis. mate the process of updating your models, and schedule n Correct data quality issues such as outliers and missing model execution. As a result, your predictive models values more easily via a unified view of data become real business assets and your organization gains n Streamline the transformation process by visualizing the highest possible return on your data mining invest- and selecting from multiple transformations ment. n Access SPSS scripting capabilities and syntax for greater flexibility in data management What’s new in Clementine 11.1 With this release, SPSS continues its commitment to Additional data mining algorithms delivering a data mining solution that offers the greatest More than a dozen new algorithms enable your analysts possible efficiency and flexibility in the development to perform processes such as modeling, regression, and deployment of predictive models. and forecasting faster and with greater precision, permitting more sophisticated statistical analysis and Clementine 11.1 features extensive enhancements in four more accurate results. key areas to help your organization improve performance, n Easily automate the building, evaluation, and productivity, and return on your data mining investment. automation of multiple models, eliminating repetitive tasks n Create rule-based models interactively that include your business knowledge n Develop self-learning response models that can be trained incrementally on new data instead of re-building from scratch
  4. 4. High-performance architecture Gain significant improvements in security, scalability, integration, and performance as a result of major Business Data Understanding Understanding enhancements to the Clementine 11.1 architecture. n Enjoy greater flexibility and interoperability through improved integration with other systems and architectures Data Preparation and extended support for industry standards such as Deployment PMML 3.1 Modeling n Use parallel processing (when available) to leverage Data high-performance hardware and achieve faster time- to-solution and higher ROI Evaluation Security and Privacy Ensure that sensitive or confidential data does not fall into the wrong hands. The CRISP-DM process, as shown in this diagram, enables data miners to implement efficient data mining projects that yield measurable business results. n Easily transform data into an anonymous form, excluding confidential data values and metadata n Use secure sockets layer (SSL) encryption for secure as long-range planning, with insight into current and communication of sensitive data future conditions. You can accomplish this securely and efficiently, across your entire enterprise, with SPSS The cornerstone of the Predictive Enterprise Predictive Enterprise Services. Clementine makes data predictive and facilitates the delivery of predictive insight to the people in your Clementine's extensive capabilities are supported by the organization who make decisions and the systems most advanced statistical and machine learning techniques that support daily customer interactions. available. For greater value, it is open to operation and integration with your current information systems. If your organization has any data stored in text form or in Web logs—as many do—you can leverage it by using Text Mining for Clementine or Web Mining for Clementine. “We are extremely impressed with the customer-centric response from SPSS in advancing their data mining And you can understand the attitudes and beliefs that product to help organizations, such as EarthLink, lie behind behavior—why customers make the choices better meet their customer marketing needs.” they do—by incorporating survey data from any of SPSS’ Dimensions survey research products. – Atique Shah Vice President of Direct Marketing and Consumer Insights Thanks to its integration with SPSS predictive applications EarthLink, Inc. and other information systems, Clementine enables you to guide daily decisions and recommendations, as well
  5. 5. Features nAutomatically extract concepts from Modeling Clementine’s main features are described any type of text by using Text Mining Employ a wide range of data mining algorithms below in terms of the CRISP-DM process. for Clementine* with many advanced features to get the best – Web site data possible results from your data. Business understanding n Automatically extract Web site events n Interactive model and equation browsers Clementine’s visual interface makes it easy from Web logs using Web Mining for and advanced statistical output for your organization to apply business Clementine* n Combine models through meta-modeling knowledge to data mining projects. In – Survey data – Multiple models can be combined, or addition, optional business-specific n Directly access data stored in the one model can be used to analyze a Clementine Application Templates (CATs) Dimensions Data Model or in the data second model are available to help you get results faster. files of Dimensions* products n Import PMML models from other tools such CATs ship with sample data so that you – Data export as AnswerTree® and SPSS for Windows can easily see the details of best-practice n Work with delimited and fixed-width n Use the Clementine External Module techniques. text files, Microsoft Excel, SPSS, and Interface (CEMI) for custom algorithms n CRM CAT* SAS 6, 7, 8, and 9 files – Purchase add-on tools from the n Telco CAT* n Export in XLS format through the Clementine Partner Plus Program n Fraud CAT* Excel Output Node n Microarray CAT* n Choose from various data-cleaning options Clementine’s data mining algorithms are n Web Mining CAT* (requires the purchase – Remove or replace invalid data organized into a “base” module and optional of Web Mining for Clementine) – Use predictive modeling to additional algorithm modules. The base automatically impute missing values module includes: Data understanding – Automatically generate operations for n C&RT, CHAID & QUEST — Decision tree n Obtain a comprehensive first look at your the detection and treatment of outliers algorithms including interactive tree data using Clementine’s data audit node and extremes building n View data quickly through graphs, summary n Manipulate data n K-means—Clustering statistics, or an assessment of data quality – Work with complete record and field n GRI—Generalized rule induction association n Create basic graph types, such as histograms, operations, including: discovery algorithm distributions, line plots, and point plots n Field filtering, naming, derivation, n Factor/PCA—Data reduction using factor n Edit your graphs to communicate results binning, re-categorization, value analysis and principal component analysis more clearly replacement, and field reordering n Linear Regression—Best-fit linear equation n Use association detection when analyzing n Record selection, sampling, merging modeling Web data (through inner joins, full outer joins, n Interact with data by selecting a region of partial outer joins, and anti-joins), and The Clementine Classification Module a graph and see the selected information concatenation; sorting, aggregation, includes: in a table; or use the information in a later and balancing n Binary classifier—Automate the creation phase of your analysis n Data restructuring, including and evaluation of multiple models n Access SPSS statistics and reporting transposition n Decision list—Interactive rule-building tools from Clementine, including reports n Splitting numerical records into for marketing and customer applications and graphics sub-ranges optimized for prediction n Self-learning response model— Bayesian n Extensive string functions: string model with incremental learning Data preparation creation, substitution, search and n Time-series—Generate and automatically n Access data matching, whitespace removal, select time-series forecasting models – Structured (tabular) data and truncation n C5.0 decision tree and rule set algorithm n Access ODBC-compliant data sources n Preparing data for time-series analysis n Neural Networks—Multi-layer perceptrons with the included SPSS Data Access with the Time Plot node with back-propagation learning, and radial Pack. Drivers in this middleware pack – Partition data into training, test, and basis function networks support IBM DB2®, Oracle, Microsoft validation datasets n Binomial and multinomial logistic SQL Server, Informix®, and Sybase® – Transform data automatically for regression databases. multiple variables n Discriminant analysis n Import delimited and fixed-width text n Visualization of standard n Generalized linear models (GLM) files, any SPSS file, and SAS® 6, 7, 8, transformations and 9 files n Access data management and n Specify worksheets and data ranges transformations performed in when accessing data in Excel SPSS directly from Clementine – Unstructured (textual) data Features subject to change based on final product release. Symbol indicates a new feature. * Separately priced modules
  6. 6. Features (continued) n Automatically export Clementine streams SPSS Predictive Enterprise Services (optional*) The Clementine Segmentation Module to SPSS predictive analytics applications SPSS Predictive Enterprise Services is an includes: – Combine exported Clementine streams enterprise-level platform that enables you n Kohonen Network—Clustering neural network with predictive models, business rules, to manage and automate your analytical n TwoStep Clustering—Select the right and exclusions to optimize customer processes and easily deploy results across number of clusters automatically interactions your organization to increase productivity n Anomaly Detection—Detect unusual records n Cleo™ (optional*) and increase the value of your analytical through the use of a cluster-based algorithm – Implement a Web-based solution for investment. SPSS Predictive Enterprise Base rapid model deployment Services helps you to: The Clementine Association Module includes: – Enable multiple users to simultaneously n Centralize and manage analytical assets n Apriori–Popular association discovery access and immediately score single to leverage organizational knowledge and algorithm with advanced evaluation records, multiple records, or an entire provide powerful change management functions database, through a customizable capabilities to help with auditability and n CARMA–Association algorithm which browser-based interface compliance supports multiple consequents n Scripting n Automate your analytical processes to n Sequence–Sequential association – Use scripts to automate complex increase productivity and ensure reliable, algorithm for order-sensitive analyses repetitive tasks consistent, accurate results n Deploy analytical results by delivering Evaluation Clementine Server (optional*) output through customizable, browser n Easily evaluate models using lift, gains, n Clementine Server, SPSS’ enterprise-level based end-user interfaces or integrating profit, and response graphs offering, provides all of the data mining with your existing applications using – Use a one-step process that shortens capabilities of the client version plus standardized web services interfaces project time when evaluating multiple increased performance and other models functionality. For a complete list of Predictive Enterprise Services Clementine – Define hit conditions and scoring features, refer to the Clementine Server Client Adapter enables the analyst to interact expressions to interpret model specifications. Key capabilities enable directly with the Base services to store, performance you to: retrieve, browse, and search for analytical n Analyze overall model accuracy with n Employ in-database mining to leverage assets. Predictive Enterprise Services coincidence matrices and other automatic high-performance database implementations Clementine Server Adapter enables the evaluation tools n Use in-database modeling to build models Process Manager to control Clementine in the database using leading database tasks. For more information about how your Deployment technologies (DB2 Enterprise Edition 8.2, organization can benefit by using Clementine Clementine offers a choice of deployment Oracle 10g, and Microsoft SQL Server Server with SPSS Predictive Enterprise capabilities to meet your organization’s needs. Analysis Services) and leverage high- Services, see the SPSS Clementine Server n Clementine Solution Publisher Runtime performance database implementations brochure. (optional*) – Vendor-supplied algorithms included – Automate the export of all operations, for IBM DB DWE, Oracle Data Mining, including data access, data manipulation, and Microsoft SQL Server 2005 text mining, model scoring—including n Leverage high-performance hardware, combinations of models—and post- experience quicker time-to solution, and processing achieve greater ROI through parallel – Use a runtime environment for executing execution of streams and multiple models image files on target platforms n Transmit sensitive data securely between Clementine Client and Clementine Server through secure sockets layer (SSL) encryption Features subject to change based on final product release. Symbol indicates a new feature. * Separately priced modules To learn more, please visit For SPSS office locations and telephone numbers, go to SPSS is a registered trademark and the other SPSS products named are trademarks of SPSS Inc. All other names are trademarks of their respective owners. © 2007 SPSS Inc. All rights reserved. CLM111SPC-0407