SQL SERVER: Data Mining
Types of analysis
• Ad-hoc query/Reporting/Analysis
  – What is the purpose?
     • Simple reports
     • Key Performance Indicators
     • OLAP cubes – Slice & Dice
  – In Real time - What happens now?
     • Events/Triggers

• Data Mining
  – How do we do it?
  – What happens?
What does Data Mining Do?
• Explores your data
• Finds patterns
• Performs predictions
Data Mining Algorithms
•   Classification
•   Regression
•   Segmentation
•   Association
•   Forecasting
•   Text Analysis
•   Advanced Data Exploration
Mining Process
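[Diagram: training data is fed to the DM Engine to build a mining model; the mining model and the data to be predicted are then fed to the DM Engine, which returns the data with predictions.]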
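The train-then-predict flow in the diagram can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "model" here is a simple lookup of the most common outcome per input value, not any algorithm that Analysis Services actually uses, and the rows, column names, and the `train` and `predict` helpers are made up for the example.

```python
from collections import Counter, defaultdict


def train(rows, feature, target):
    """Learn, for each value of `feature`, the most common `target` value."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[feature]][row[target]] += 1
    # The "mining model" is just a mapping: feature value -> likeliest outcome.
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}


def predict(model, rows, feature, default=None):
    """Apply the trained model to rows whose target is unknown."""
    return [model.get(row[feature], default) for row in rows]


# Hypothetical training data (known outcomes).
training_data = [
    {"Encouragement": "Encouraged", "CollegePlans": "Yes"},
    {"Encouragement": "Encouraged", "CollegePlans": "Yes"},
    {"Encouragement": "Encouraged", "CollegePlans": "No"},
    {"Encouragement": "Not Encouraged", "CollegePlans": "No"},
    {"Encouragement": "Not Encouraged", "CollegePlans": "No"},
]
model = train(training_data, "Encouragement", "CollegePlans")

# Hypothetical data to be predicted (outcome unknown).
new_students = [
    {"Encouragement": "Encouraged"},
    {"Encouragement": "Not Encouraged"},
]
predictions = predict(model, new_students, "Encouragement")
print(predictions)
```

The point of the sketch is the separation of steps: training consumes labeled rows and produces a model, and prediction consumes the model plus unlabeled rows.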
Data Mining Process
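[Diagram: the CRISP-DM cycle, centered on Data, with the SQL Server tools used at each phase]
• Business Understanding
• Data Understanding – SSAS (OLAP), DSV
• Data Preparation – SSIS, SSAS (OLAP)
• Modeling – SSAS (Data Mining)
• Evaluation
• Deployment – SSIS, SSAS (OLAP), SSRS, Flexible APIs
Source: www.crisp-dm.org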
Data Mining in SQL Server 2008
• New algorithms developed in conjunction
  with Microsoft Research
• Data mining is made accessible and easy to
  use through an integrated user interface,
  cross-product integration and familiar, standard APIs
• Complete framework for building and
  deploying intelligent applications on the fly
• Integration into the cloud.
Top New Features in SQL Server 2008

• Test multiple data mining models simultaneously with statistical
  scores of error and accuracy, and confirm their stability with
  cross-validation
• Build multiple, incompatible mining models within a single
  structure; apply model analysis over filtered data; query against
  structure data to present complete information, all enabled by
  enhanced mining structures
• Combine the best of both worlds by blending optimized near-term
  predictions (ARTXP) and stable long-term predictions (ARIMA) with
  Better Time Series Support
• Discover the relationship between items that are frequently
  purchased together by using Shopping Basket Analysis; generate
  interactive forms for scoring new cases with Predictive Calculator,
  delivered with Microsoft SQL Server 2008 Data Mining Add-ins for
  Office 2007
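To make the cross-validation bullet above concrete, here is a minimal sketch of generic k-fold partitioning in Python. It assumes the textbook scheme (split the rows into k folds, hold each fold out once for testing); the slides do not say how SQL Server 2008 partitions the data internally, and the `k_fold_indices` helper is hypothetical.

```python
def k_fold_indices(n_rows, k):
    """Yield (train_indices, test_indices) once for each of k folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_rows) if i < start or i >= stop]
        yield train, test
        start = stop


folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    # A real run would train on train_idx and score on test_idx here.
    print(len(train_idx), "training rows,", len(test_idx), "test rows")
```

A model is considered stable when its error and accuracy scores stay close across the folds.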
Rich and Innovative Algorithms
•   Benefit from many rich and innovative data mining algorithms, most developed by Microsoft Research to
    support common business problems promptly and accurately.
•   Market Basket Analysis - Discover which items tend to be bought together to create recommendations
    on the fly and to determine how product placement can directly contribute to your bottom line
•   Churn Analysis - Anticipate customers who may be considering canceling their service and identify benefits
    that will keep them from leaving
•   Market Analysis - Define market segments by automatically grouping similar customers together. Use
    these segments to seek profitable customers
•   Forecasting - Predict sales and inventory amounts and learn how they are interrelated to foresee
    bottlenecks and improve performance
•   Data Exploration - Analyze profitability across customers, or compare customers who prefer different
    brands of the same product to discover new opportunities
•   Unsupervised Learning - Identify previously unknown relationships between various elements of your
    business to better inform your decisions
•   Web Site Analysis - Understand how people use your Web site and group similar usage patterns to offer a
    better experience
•   Campaign Analysis - Spend marketing dollars more effectively by targeting the customers most likely to
    respond to a promotion
•   Information Quality - Identify and handle anomalies during data entry or data loading to improve the
    quality of information
•   Text Analysis - Analyze feedback to find common themes and trends that concern your customers or
    employees, informing decisions with unstructured input
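The idea behind Market Basket Analysis in the list above can be illustrated by counting how often pairs of items appear in the same basket. This is a simplified sketch, not the Microsoft Association Rules algorithm: the baskets are invented, and `pair_support` is a hypothetical helper that only counts co-occurrences.

```python
from collections import Counter
from itertools import combinations


def pair_support(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so each pair has one canonical form.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


# Hypothetical transactions.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "eggs", "milk"],
]
support = pair_support(baskets)
for pair, count in support.most_common(3):
    print(pair, count)
```

Pairs with high counts are candidates for recommendations; a production algorithm would also weigh confidence and filter by minimum support.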
Value of Data Mining
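[Chart: SQL Server 2008 capabilities plotted by Business value (vertical axis, toward Business Knowledge) against Usability (horizontal axis, from Simple to Complex). In ascending order: Reports (static), Reports (ad hoc), OLAP, Data Mining.]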
Data Mining User Interface
• SQL Server BI Development Studio
  – Environment for creation and data exploration
  – Data Mining projects in Visual Studio solutions, tightly
    integrated
  – Source Control Integration
• SQL Server Management Studio
  – One tool for all administrative tasks
  – Manage, view and query mining models
BI Integration
• Integration Services
  – Data Mining processing and results integrate
    directly in IS pipeline
• OLAP
  – Processing of mining models directly from
    cubes
  – Use of mining results as dimensions
• Reporting Services
  – Embed Data Mining results directly in
    Reporting Services Reports
Applied Data Mining
• Make Decisions without Coding
   – Learn business rules directly from data
• Client Customization
   – Learn logic customized for each client
• Automatic Update
   – Data mining application logic is updated by model reprocessing
   – Applications do not need to be rewritten, recompiled, or redeployed
Server Mining Architecture
[Diagram: BI Dev Studio (Visual Studio) deploys to the Analysis Services Server, which hosts the Mining Model and the Data Mining Algorithm and draws on a Data Source. Your Application reaches the server over OLE DB / ADOMD / XMLA, passing in App Data.]
Data Mining EXtensions
• OLE DB for Data Mining specification
   – Now part of XML/A specification
   – See www.xmla.org for XML/A details
• Connect to Analysis Server
   – OLEDB, ADO, ADO.Net, ADOMD.Net, XMLA
   Dim cmd As ADOMD.Command
   Dim reader As ADOMD.DataReader
   cmd.Connection = conn
   Set reader = cmd.ExecuteReader("Select Predict(Gender)…")
Typical DM Process Using DMX
Define a model:
CREATE MINING MODEL ….

Train a model (input: training data):
INSERT INTO dmm ….

Prediction using a model (input: prediction input data):
SELECT …
FROM dmm PREDICTION JOIN …

[Diagram: each statement runs against the Data Mining Management System (DMMS), which holds the mining model.]
DMX Commands
• Definition (DDL)
   –   CREATE – Make new model
   –   SELECT INTO – Create model by copying existing
   –   EXPORT – Save model as .abf file
   –   IMPORT – Retrieve model from .abf file
• Manipulation (DML)
   –   INSERT INTO – Train model
   –   UPDATE – Change content of model
   –   DELETE – Clear content
   –   SELECT – Browse model
DMX SELECT Elements
•   SELECT [FLATTENED] [TOP] <columns>
•   FROM <model>
•   PREDICTION JOIN <table>
•   ON <mapping>
•   WHERE <filter>
•   ORDER BY <sort expression>
    – Use query builder to create SELECT statement
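The clause order listed above can be sketched as a small Python helper that assembles the text of a DMX SELECT statement. This is an illustrative string builder only: `build_dmx_select` and its parameters are hypothetical, it does no escaping or validation, and the sample values are taken from the prediction example in this deck with its elided `…` left as is.

```python
def build_dmx_select(columns, model, source=None, mapping=None,
                     where=None, order_by=None, flattened=False, top=None):
    """Assemble a DMX SELECT statement in the clause order from the slide."""
    head = ["SELECT"]
    if flattened:
        head.append("FLATTENED")
    if top is not None:
        head.append(f"TOP {int(top)}")
    head.append(", ".join(columns))
    lines = [" ".join(head), f"FROM {model}"]
    if source is not None:
        lines.append(f"PREDICTION JOIN {source}")
        if mapping:
            # Each mapping pairs a model column with an input column.
            lines.append("ON " + " AND ".join(f"{m} = {s}" for m, s in mapping))
    if where:
        lines.append(f"WHERE {where}")
    if order_by:
        lines.append(f"ORDER BY {order_by}")
    return "\n".join(lines)


query = build_dmx_select(
    columns=["t.ID", "CPModel.Plan"],
    model="CPModel",
    source="OPENQUERY(…, 'SELECT * FROM NewStudents') AS t",
    mapping=[("CPModel.Gender", "t.Gender"), ("CPModel.IQ", "t.IQ")],
)
print(query)
```

In practice the query builder mentioned on the slide generates this text for you; the helper only shows how the optional clauses stack on top of the required SELECT and FROM.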
Training a DM Model: Simple
INSERT INTO CollegePlanModel
  (StudentID, Gender, ParentIncome,
   Encouragement, CollegePlans)
OPENROWSET('<provider>', '<connection>',
      'SELECT    StudentID,
                 Gender,
                 ParentIncome,
                 Encouragement,
                 CollegePlans
       FROM CollegePlansTrainData')
Prediction Using a DM Model
• PREDICTION JOIN
  SELECT t.ID, CPModel.Plan
  FROM CPModel PREDICTION JOIN
      OPENQUERY(…, 'SELECT * FROM NewStudents') AS t
  ON CPModel.Gender = t.Gender AND
     CPModel.IQ = t.IQ
Visit more self help tutorials

• Pick a tutorial of your choice and browse
  through it at your own pace.
• The tutorials section is free, self-guiding and
  will not involve any additional support.
• Visit us at www.dataminingtools.net
