Data Mining with Excel 2007
and SQL Server 2008

Mark Tabladillo Ph.D.
http://www.marktab.net
November 10, 2008
Approach of this Presentation
      • Emphasize
             – Conceptual value of data mining
             – Relationship of data mining to the real world
      • Reserve
             – Specific procedures and mechanics
             – Specific mathematics
             – Production implementation


© 2008 Mark Tabladillo Ph.D.
Introduction
      • Microsoft Data Mining (MDM) is a major
        branch of SQL Server Analysis Services (SSAS)
      • The technology is supported by a new
        language within SSAS called DMX (Data
        Mining Extensions)
      • Currently, the two promoted interfaces are
        BIDS (Business Intelligence Development
        Studio) and Excel 2007

Introduction
      • SQL Server 2008 has some improvements over
        2005, but the main technology is similar
      • A major improvement for 2008 is the
        documentation (Books Online)
      • Microsoft’s team releases technology
        information at
        http://www.sqlserverdatamining.com


Outline
      •    Main Conclusions on Data Mining
      •    Data Mining Definition
      •    Microsoft Data Mining Fundamentals
      •    Overview of Microsoft Data Mining Algorithms
      •    Conclusion




Four Interactive Demos
      •    Card Sorting
      •    Demographic Profiles
      •    Sports (College Football)
      •    Money (American Economy)




Data Mining Definitions
      • Data mining is the automatic or semi-
        automatic process of exploring data for
        meaningful or useful patterns.
      • Data mining algorithms typically use
        estimation or optimization to achieve results
        (as opposed to only calculations).
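      As a rough illustration of that difference (a sketch, not taken from the
      presentation): an average is a calculation, obtained directly from a
      formula, while the slope and intercept of a fitted line can be estimated
      by repeatedly adjusting a guess to reduce error. The data, the learning
      rate, and the function names below are assumptions chosen for illustration.

```python
def average(values):
    """A calculation: one pass, one exact formula."""
    return sum(values) / len(values)


def estimate_line(xs, ys, learning_rate=0.02, steps=5000):
    """An estimation: start from a guess and improve it step by step.

    Fits y = slope * x + intercept by gradient descent on mean squared error.
    """
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        slope_gradient = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        intercept_gradient = 2 * sum(errors) / n
        slope -= learning_rate * slope_gradient
        intercept -= learning_rate * intercept_gradient
    return slope, intercept


# Hypothetical data generated from y = 2x + 1
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
print(average(ys))
print(estimate_line(xs, ys))
```

      The average never changes for the same input, whereas the estimate depends
      on choices such as the starting guess and the number of steps.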




Data Mining Provides Insight
      • Business
             – What reasons contribute to stock price changes?
             – Why did longer-term jobless benefits hit a 25-year
               high?
      • Entertainment
             – Who is more likely to lose a civil lawsuit?
             – How well will new DVD sales do in the next few
               months?


Data Mining Provides Insight
      • Sports
             – How much should a sports team offer for a proven
               free agent?
             – What factors lead to winning a tennis
               championship?
      • Technology
             – How does Cisco know there are warning signals in
               the tech sector?
             – What is the net loss in losing corporate secrets?
Data Mining Provides Insight
      • Politics
             – What priorities do American voters have for the
               new President?
             – Why did a certain candidate win or lose a race?
      • Science
             – What factors contribute to ozone holes over the
               Antarctic?
             – Why do we believe that Tyrannosaurus Rex had a
               good sense of smell?
Functions in Technology
      • Job Titles = Rationalized System to Pay People
        Less or Give them More Responsibility
      • “Engineer”?
      • “Scientist”?




The Scientific Method
      •    (Suppose you are a computer scientist)
      •    Define the question
      •    Gather information and resources (observe)
      •    Form hypothesis
      •    Perform experiment and collect data




The Scientific Method
      • Analyze data – data mining is an option
      • Interpret data and draw conclusions that serve
        as a starting point for new hypotheses
      • Publish results
      • Retest (frequently done by other scientists)




Microsoft Data Mining
      • Microsoft Data Mining refers to Microsoft’s
        specific implementation of certain common
        data mining algorithms for the DMX (Data
        Mining Extensions) language.
      • Also called SQL Server Data Mining, the
        technology is implemented through tools
        rather than through a single, finished
        application interface.

Data Mining Input and Results
      • Data mining input can include continuous
        numeric, categorized (ordinal or nominal), and
        text data.
      • Data mining results consist of a lower-
        dimensional model, either describing the
        empirical data (unsupervised), or the
        relationship between named input and output
        attributes (supervised).

Data Explosion




Donald Farmer – May 2008
      "[We don't] have all the functionality of something like a SAS or
          an SPSS, because that's just not our market," he conceded.
      It comes down to a difference of scale, according to Farmer. SAS
          and SPSS typically target larger, more expensive deployments,
          typically with users well-versed in the usage of their tools.
          Microsoft is targeting a different kind of data mining
          consumer: the Excel analyst, for example, who might not have
          much (if any) experience with data mining, predictive
          analytics or statistical analysis, for that matter.




Donald Farmer – May 2008
      "By the way, I don't mean to say we can't hit the high-end. Within
         Microsoft, we have our own database marketing team. We're
         one of the largest companies in the world. We have a huge
         database marketing team who do classic customer analysis.
         These guys were all SAS users, but when they joined Microsoft,
         they started using our tools. The entire process runs on our
         database, they actually use the Excel [data mining] add-ins to
         do it. It's not that there's nothing they don't miss, [it's that]
         they are able to achieve the same business results using our
         tools."
      Redmond Magazine – May 7, 2008
      http://redmondmag.com/news/article.asp?EditorialsID=9836

Obtaining the Add-in




Obtaining the Add-in (Nov 2008)
            http://www.microsoft.com/sqlserver/2008/en/us/data-mining-addins.aspx




System Requirements
      •    Supported Operating Systems: Windows Server 2003 Service Pack 2; Windows Server 2008;
           Windows Vista Service Pack 1; Windows XP Service Pack 3
      •    Microsoft .NET Framework 2.0.
      •    If installing the Table Analysis Tools or Data Mining Client for Excel, Microsoft Office 2007
           with .NET Programmability Support.
           Supported editions of Office 2007 include:
             – Professional
             – Professional Plus
             – Ultimate
             – Enterprise
      •    If installing the Data Mining Templates for Visio, Microsoft Visio Professional 2007 with .NET
           Programmability Support.
      •    40 MB of available hard disk space.
      •    Note: The Data Mining Add-ins require a connection to one of the following versions of SQL
           Server 2008 Analysis Services:
             – Enterprise
             – Standard


Delivering Predictive Analysis to Every User

      • Comprehensive
             – Extend the benefits of predictive analysis to all users, delivering a full
               data mining development life cycle through the familiar environment
               of the 2007 Microsoft Office system.
      • Intuitive
             – Empower users to harness advanced data mining technologies, hiding
               complexity behind automated tasks that deliver actionable insight
               throughout the organization.
      • Collaborative
             – Share data mining models through interactive graphical visualizations,
               and deliver recommendation and insight with simple and prompt
               publishing capabilities.

Top New Features
      • Score new cases to seek most profitable customers
        with new Prediction Calculator.
      • Discover cross-sell/up-sell opportunities to optimize
        offerings with new Shopping Basket Analysis.
      • Validate accuracy and stability of models
        simultaneously with new, richly formatted Cross
        Validation.
      • Generate summary reports to enhance referencing
        and collaboration with the new Document Model
        feature.
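      The idea behind cross validation can be sketched as follows. This is a
      generic k-fold procedure, not a description of how the add-in implements
      its Cross Validation feature: the rows are split into k folds, a model is
      trained on all folds but one and scored on the held-out fold, and the
      mean and spread of the scores indicate accuracy and stability. The
      majority-class "model", the labels, and the function names are
      assumptions used only to keep the example short.

```python
from statistics import mean, pstdev


def k_fold_indices(n_rows, k):
    """Yield (train, test) index lists for k roughly equal folds."""
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_rows) if i < start or i >= start + size]
        yield train, test
        start += size


def cross_validate(labels, k):
    """Score a majority-class baseline on each fold.

    Returns (mean accuracy, standard deviation of accuracy across folds).
    """
    scores = []
    for train, test in k_fold_indices(len(labels), k):
        train_labels = [labels[i] for i in train]
        # "Training": remember the most common label in the training rows
        majority = max(sorted(set(train_labels)), key=train_labels.count)
        correct = sum(labels[i] == majority for i in test)
        scores.append(correct / len(test))
    return mean(scores), pstdev(scores)


# Hypothetical labels: did the customer respond to an offer?
labels = ["yes", "yes", "no", "yes", "yes", "no", "yes", "yes", "no", "yes"]
accuracy, spread = cross_validate(labels, k=3)
print(accuracy, spread)
```

      A high mean with a small spread suggests the model behaves consistently
      across different subsets of the data.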

SQL Server 2008 Menu Items




Asking Permission




Asking Permission Text
      DBA Person,

      I have downloaded and installed Microsoft SQL Server 2008 Data Mining Add-ins
      for Office 2007 on my machine ARCHITECT. These add-ins let me analyze my
      spreadsheet data in powerful ways by utilizing Microsoft SQL Server 2008 Analysis
      Services.

      In order to use these add-ins, I will need to be connected to an instance of Microsoft
      SQL Server 2008 Analysis Services that has been configured to support the
      add-ins. This configuration needs to be carried out by an administrator by following
      these steps:
          1. Download the add-ins package from
             http://www.microsoft.com/sqlserver/2008/en/us/trial-software.aspx.
          2. Launch the Setup, select the Server Configuration Tool and install it.
          3. Run the Server Configuration Tool and follow the wizard steps.

      I would appreciate it if you could let me know whether it is possible for you to
      configure an instance of SQL Server 2008 Analysis Services as described above and
      give me access to it.

      Thank you,
      Data Miner
What is a model?




List the Data Mining Algorithms
      • Ten Answers
      • Each one is a field of academic focus




The Data Mining Algorithms
      •    Microsoft Decision Trees
      •    Microsoft Clustering
      •    Microsoft Time Series
      •    Microsoft Association Rules
      •    Microsoft Sequence Clustering
      •    Microsoft Naive Bayes
      •    Microsoft Neural Network
      •    Microsoft Linear Regression
      •    Microsoft Logistic Regression
      •    Text Mining

What is a calculation?
      • Business intelligence relies on many common
        calculations.




A Parable of Unity and Diversity
      • One day a parabola met a line. They each
        wondered aloud how much they had in
        common. They moved around to find out.



               Parabola
                                         Line
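      One possible reading of the parable, offered here as an assumption rather
      than the presenter's stated point: depending on where the two curves sit,
      a parabola and a line share zero, one, or two points. The sketch below
      finds those shared points by solving the quadratic that results from
      setting the two equations equal. The function name and coefficients are
      illustrative.

```python
import math


def shared_points(a, b, c, m, k):
    """Return x-values where y = a*x^2 + b*x + c meets y = m*x + k (a != 0)."""
    # Setting the two equal gives a*x^2 + (b - m)*x + (c - k) = 0
    lin = b - m
    const = c - k
    discriminant = lin * lin - 4 * a * const
    if discriminant < 0:
        return []                      # they never meet
    if discriminant == 0:
        return [-lin / (2 * a)]        # the line just touches the parabola
    root = math.sqrt(discriminant)
    return sorted([(-lin - root) / (2 * a), (-lin + root) / (2 * a)])


# The parabola y = x^2 against three horizontal lines
print(shared_points(1, 0, 0, 0, 1))    # line y = 1
print(shared_points(1, 0, 0, 0, 0))    # line y = 0
print(shared_points(1, 0, 0, 0, -1))   # line y = -1
```

      Moving the line up or down changes how much the two have in common, which
      is the kind of search that estimation and optimization perform.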

The Analyze Tab


                 Menu Option                     Data Mining Algorithm
                 Analyze Key Influencers         Naïve Bayes
                 Detect Categories               Clustering
                 Fill from Example               Logistic Regression
                 Forecast                        Time Series
                 Highlight Exceptions            Clustering
                 Scenario Analysis (Goal Seek)   Logistic Regression
                 Scenario Analysis (What If)     Logistic Regression
                 Prediction Calculator           Logistic Regression
                 Shopping Basket Analysis        Association Rules
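      To give a feel for the last row of the table, here is a minimal sketch of
      the two measures that association rules are usually built on: support
      (how often items appear together) and confidence (how often the second
      item appears when the first does). It is a simplified illustration, not
      the Microsoft Association Rules algorithm, and the baskets, threshold,
      and function name are assumptions.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.5):
    """Return {(if_item, then_item): (support, confidence)} for item pairs."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in set(basket))
    pair_counts = Counter(
        pair
        for basket in baskets
        for pair in combinations(sorted(set(basket)), 2)
    )
    rules = {}
    for (first, second), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # too rare to be worth reporting
        rules[(first, second)] = (support, count / item_counts[first])
        rules[(second, first)] = (support, count / item_counts[second])
    return rules


# Hypothetical shopping baskets
baskets = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
for rule, (support, confidence) in sorted(pair_rules(baskets).items()):
    print(rule, support, round(confidence, 2))
```

      In this toy data every basket with butter also contains bread, so the
      rule "butter implies bread" has the highest confidence.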
Why Different Button Names?


                 Menu Option                     Data Mining Algorithm
                 Analyze Key Influencers         Naïve Bayes
                 Detect Categories               Clustering
                 Fill from Example               Logistic Regression
                 Forecast                        Time Series
                 Highlight Exceptions            Clustering
                 Scenario Analysis (Goal Seek)   Logistic Regression
                 Scenario Analysis (What If)     Logistic Regression
                 Prediction Calculator           Logistic Regression
                 Shopping Basket Analysis        Association Rules
The Data Mining Tab


      • The ribbon has different regions:
         • Data Preparation
         • Data Modeling
         • Accuracy and Validation
         • Model Usage
         • Management
         • Connection
Demo 1: Card Sorting
      • Take the sample of cards you have and put
        them into one or more groups. Write in the
        area below what your groups are.




Demo 2: Demographic Profiles
      • Exercise 1: We will assume that each of the 10
        listed people uses SQL Server technology as
        some part of their job. For the column
        marked “UserGroup”, write in YES (and NO
        otherwise) for people you believe would be
        interested in future SQL Server user group
        meetings.



Demo 2: Demographic Profiles
      • Exercise 2: Assume an average house in your
        neighborhood or area is for sale. For the
        column marked “NewNeighbors”, write in YES
        (and NO otherwise) for people you believe
        might be potential buyers for that average
        home.




What is unsupervised?
      • Model of the empirical data.
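      A small sketch of the unsupervised idea, in the spirit of the card-sorting
      demo: no labels are given, and the procedure finds groups from the values
      alone. It uses a two-group version of k-means, a common clustering
      technique, and is not a description of how Microsoft Clustering works
      internally. The data and the function name are assumptions.

```python
def two_means(values, iterations=20):
    """Split numbers into two groups around two centers (k-means with k = 2)."""
    centers = [min(values), max(values)]  # simple starting guess
    groups = [[], []]
    for _ in range(iterations):
        # Assign each value to its nearest center
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the average of its group
        centers = [
            sum(g) / len(g) if g else c for g, c in zip(groups, centers)
        ]
    return centers, groups


# Hypothetical measurements with no labels attached
values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers, groups = two_means(values)
print(centers)
print(groups)
```

      The resulting model is just two centers: a lower-dimensional summary of
      the six original values.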




What is supervised?
      • Model of the process between input and
        output attributes.
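      A small sketch of the supervised idea, in the spirit of the YES/NO columns
      in Demo 2: each example carries a known output, and the procedure learns a
      rule connecting input to output. The rule here is a single threshold (a
      one-level decision tree, sometimes called a decision stump), chosen for
      brevity rather than to mirror any specific Microsoft algorithm. The
      attribute, the data, and the function name are assumptions.

```python
def fit_threshold(examples):
    """Find the threshold t so that 'value >= t means YES' makes fewest errors.

    examples: list of (value, label) pairs, where label is True for YES.
    Returns (threshold, number_of_errors_on_the_examples).
    """
    best_threshold, best_errors = None, len(examples) + 1
    for candidate in sorted({value for value, _ in examples}):
        errors = sum((value >= candidate) != label for value, label in examples)
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold, best_errors


# Hypothetical input: years using SQL Server; output: interested in user group?
examples = [(1, False), (2, False), (3, False), (6, True), (8, True), (10, True)]
threshold, errors = fit_threshold(examples)
print(threshold, errors)

# Apply the learned rule to a new person with 4 years of experience
print(4 >= threshold)
```

      Different people labeling the same ten profiles would produce different
      training examples, and therefore different learned rules.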




Scientific Progress
      • Why might two scientists come to slightly or
        widely different conclusions?




Demo 3: Sports
      • Look at page 8C with the USA Today Coaches
        Poll. Based on this list (and other information
        on college football on this page) do you
        completely agree with the rankings? Why or
        why not?




Demo 4: Money
      • Look at page 6B with the USA Today Market
        Trends. Choose three specific pieces of
        information on this chart which, to you,
        illustrate the current state of the American
        Economy.




Wittgenstein’s Duck-Rabbit




Data Mining Examples Tour




Data Mining
      •    “Data” precedes “Mining”
      •    “Data” – when is it easier?
      •    “Data” – when is it harder?
      •    “Mining” – when is it easier?
      •    “Mining” – when is it harder?




Regroup and Conclusion
      • Main Points from this Presentation




Resources
     •    Microsoft SQL Server 2008
          http://www.microsoft.com/sqlserver/2008/en/us/data-mining.aspx
     •    SQL Server Data Mining
          http://www.sqlserverdatamining.com/ssdm/default.aspx
     •    Adventure Works Tutorial – “SQL Server 2005 Data Mining Tutorial”
          http://www.sqlserverdatamining.com/ssdm/Home/Tutorials/tabid/57/Default.aspx
     •    MSDN Forums (“Katmai” = 2008, “SQL Server” = 2005 and before)
          http://forums.microsoft.com/MSDN/default.aspx?SiteID=1
     •    Data Mining with Microsoft SQL Server 2008 (Coming November 17, 2008)
          by Jamie MacLennan (Author), ZhaoHui Tang (Author), Bogdan Crivat (Author)
     •    Smart Business Intelligence Solutions with Microsoft® SQL Server® 2008 (PRO-Developer)
          (Coming February 4, 2009)
          by Lynn Langit (Author), Matthew Roche (Author)
     •    KD Nuggets (Data Mining and Knowledge Discovery Portal)
          http://www.kdnuggets.com/
     •    Association for Computing Machinery
          http://www.acm.org/



Contact Information
      • Data Mining Portal and Blog
        http://marktab.net

      • Twitter: @marktabnet
      • Also on:
        LinkedIn
        Facebook



More Related Content

What's hot

A Practical Approach To Data Mining Presentation
A Practical Approach To Data Mining PresentationA Practical Approach To Data Mining Presentation
A Practical Approach To Data Mining Presentation
millerca2
 
Introduction to Data Mining
Introduction to Data Mining Introduction to Data Mining
Introduction to Data Mining
Sushil Kulkarni
 
Data Mining : Concepts
Data Mining : ConceptsData Mining : Concepts
Data Mining : Concepts
Pragya Pandey
 
Introduction to Data mining
Introduction to Data miningIntroduction to Data mining
Introduction to Data mining
Hadi Fadlallah
 
Knowledge discovery thru data mining
Knowledge discovery thru data miningKnowledge discovery thru data mining
Knowledge discovery thru data mining
Devakumar Jain
 
Data Mining: an Introduction
Data Mining: an IntroductionData Mining: an Introduction
Data Mining: an Introduction
Ali Abbasi
 
Data Mining and Business Intelligence Tools
Data Mining and Business Intelligence ToolsData Mining and Business Intelligence Tools
Data Mining and Business Intelligence Tools
Motaz Saad
 
Lecture 01 Data Mining
Lecture 01 Data MiningLecture 01 Data Mining
Lecture 01 Data Mining
Pier Luca Lanzi
 
Fundamentals of data mining and its applications
Fundamentals of data mining and its applicationsFundamentals of data mining and its applications
Fundamentals of data mining and its applicationsSubrat Swain
 
Data mining
Data miningData mining
Data mining
imran khan
 
Data mining slides
Data mining slidesData mining slides
Data mining slidessmj
 
Introduction to-data-mining chapter 1
Introduction to-data-mining  chapter 1Introduction to-data-mining  chapter 1
Introduction to-data-mining chapter 1
Mahmoud Alfarra
 
01. Introduction to Data Mining and BI
01. Introduction to Data Mining and BI01. Introduction to Data Mining and BI
01. Introduction to Data Mining and BI
Achmad Solichin
 
Data mining presentation.ppt
Data mining presentation.pptData mining presentation.ppt
Data mining presentation.ppt
neelamoberoi1030
 
Ch 1 intro_dw
Ch 1 intro_dwCh 1 intro_dw
Ch 1 intro_dw
Sushil Kulkarni
 
Data mining - GDi Techno Solutions
Data mining - GDi Techno SolutionsData mining - GDi Techno Solutions
Data mining - GDi Techno Solutions
GDi Techno Solutions
 
Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, Classification
Dr. Abdul Ahad Abro
 
Data mining
Data miningData mining
Data mining
Akannsha Totewar
 
Data Mining & Data Warehousing Lecture Notes
Data Mining & Data Warehousing Lecture NotesData Mining & Data Warehousing Lecture Notes
Data Mining & Data Warehousing Lecture Notes
FellowBuddy.com
 

What's hot (20)

A Practical Approach To Data Mining Presentation
A Practical Approach To Data Mining PresentationA Practical Approach To Data Mining Presentation
A Practical Approach To Data Mining Presentation
 
Introduction to Data Mining
Introduction to Data Mining Introduction to Data Mining
Introduction to Data Mining
 
Data Mining : Concepts
Data Mining : ConceptsData Mining : Concepts
Data Mining : Concepts
 
Introduction to Data mining
Introduction to Data miningIntroduction to Data mining
Introduction to Data mining
 
Data mining
Data miningData mining
Data mining
 
Knowledge discovery thru data mining
Knowledge discovery thru data miningKnowledge discovery thru data mining
Knowledge discovery thru data mining
 
Data Mining: an Introduction
Data Mining: an IntroductionData Mining: an Introduction
Data Mining: an Introduction
 
Data Mining and Business Intelligence Tools
Data Mining and Business Intelligence ToolsData Mining and Business Intelligence Tools
Data Mining and Business Intelligence Tools
 
Lecture 01 Data Mining
Lecture 01 Data MiningLecture 01 Data Mining
Lecture 01 Data Mining
 
Fundamentals of data mining and its applications
Fundamentals of data mining and its applicationsFundamentals of data mining and its applications
Fundamentals of data mining and its applications
 
Data mining
Data miningData mining
Data mining
 
Data mining slides
Data mining slidesData mining slides
Data mining slides
 
Introduction to-data-mining chapter 1
Introduction to-data-mining  chapter 1Introduction to-data-mining  chapter 1
Introduction to-data-mining chapter 1
 
01. Introduction to Data Mining and BI
01. Introduction to Data Mining and BI01. Introduction to Data Mining and BI
01. Introduction to Data Mining and BI
 
Data mining presentation.ppt
Data mining presentation.pptData mining presentation.ppt
Data mining presentation.ppt
 
Ch 1 intro_dw
Ch 1 intro_dwCh 1 intro_dw
Ch 1 intro_dw
 
Data mining - GDi Techno Solutions
Data mining - GDi Techno SolutionsData mining - GDi Techno Solutions
Data mining - GDi Techno Solutions
 
Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, Classification
 
Data mining
Data miningData mining
Data mining
 
Data Mining & Data Warehousing Lecture Notes
Data Mining & Data Warehousing Lecture NotesData Mining & Data Warehousing Lecture Notes
Data Mining & Data Warehousing Lecture Notes
 

Viewers also liked

SQL Server 2005 Everywhere Edition Value Proposition
SQL Server 2005 Everywhere Edition Value PropositionSQL Server 2005 Everywhere Edition Value Proposition
SQL Server 2005 Everywhere Edition Value Propositionbutest
 
EXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONS
EXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONSEXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONS
EXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONS
editorijettcs
 
Data mining test notes (back)
Data mining test notes (back)Data mining test notes (back)
Data mining test notes (back)
BarryK88
 
WEKA: The Knowledge Flow Interface
WEKA: The Knowledge Flow InterfaceWEKA: The Knowledge Flow Interface
WEKA: The Knowledge Flow Interface
weka Content
 
Basics of Ms Excel Office Tool
Basics of Ms Excel Office ToolBasics of Ms Excel Office Tool
Basics of Ms Excel Office Tool
Subodh Shahare(सुबोध शहारे)
 
WEKA:Data Mining Input Concepts Instances And Attributes
WEKA:Data Mining Input Concepts Instances And AttributesWEKA:Data Mining Input Concepts Instances And Attributes
WEKA:Data Mining Input Concepts Instances And Attributes
weka Content
 
WEKA - A Data Mining Tool - by Shareek Ahamed
WEKA - A Data Mining Tool - by Shareek AhamedWEKA - A Data Mining Tool - by Shareek Ahamed
WEKA - A Data Mining Tool - by Shareek Ahamed
Shareek Ahamed
 
rule-based classifier
rule-based classifierrule-based classifier
rule-based classifierSean Chiu
 
Data mining assignment 3
Data mining assignment 3Data mining assignment 3
Data mining assignment 3
BarryK88
 
Wek1
Wek1Wek1
Weka project - Classification & Association Rule Generation
Weka project - Classification & Association Rule GenerationWeka project - Classification & Association Rule Generation
Weka project - Classification & Association Rule Generation
rsathishwaran
 
HCI - Group Report for Metrolink App
HCI - Group Report for Metrolink AppHCI - Group Report for Metrolink App
HCI - Group Report for Metrolink AppDarran Mottershead
 
Data Engineering - Data Mining Assignment
Data Engineering - Data Mining AssignmentData Engineering - Data Mining Assignment
Data Engineering - Data Mining AssignmentDarran Mottershead
 
2.4 rule based classification
2.4 rule based classification2.4 rule based classification
2.4 rule based classification
Krish_ver2
 
McDonald's Supply Chain
McDonald's Supply ChainMcDonald's Supply Chain
McDonald's Supply Chain
Harsh Narula
 
Ejemplo de Aplicaciones en Weka
Ejemplo de Aplicaciones en WekaEjemplo de Aplicaciones en Weka
Ejemplo de Aplicaciones en Weka
Raquel Solano
 
Notes On Intranet Implementation And Roadmap
Notes On Intranet Implementation And RoadmapNotes On Intranet Implementation And Roadmap
Notes On Intranet Implementation And RoadmapAlan McSweeney
 

Viewers also liked (20)

SQL Server 2005 Everywhere Edition Value Proposition
SQL Server 2005 Everywhere Edition Value PropositionSQL Server 2005 Everywhere Edition Value Proposition
SQL Server 2005 Everywhere Edition Value Proposition
 
EXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONS
EXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONSEXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONS
EXPLORING DATA MINING TECHNIQUES AND ITS APPLICATIONS
 
Data mining test notes (back)
Data mining test notes (back)Data mining test notes (back)
Data mining test notes (back)
 
2014 Profile of Results
2014 Profile of Results2014 Profile of Results
2014 Profile of Results
 
WEKA: The Knowledge Flow Interface
WEKA: The Knowledge Flow InterfaceWEKA: The Knowledge Flow Interface
WEKA: The Knowledge Flow Interface
 
Basics of Ms Excel Office Tool
Basics of Ms Excel Office ToolBasics of Ms Excel Office Tool
Basics of Ms Excel Office Tool
 
Excel 2007 Get Up To Speed
Excel 2007  Get Up To SpeedExcel 2007  Get Up To Speed
Excel 2007 Get Up To Speed
 
WEKA:Data Mining Input Concepts Instances And Attributes
WEKA:Data Mining Input Concepts Instances And AttributesWEKA:Data Mining Input Concepts Instances And Attributes
WEKA:Data Mining Input Concepts Instances And Attributes
 
WEKA - A Data Mining Tool - by Shareek Ahamed
WEKA - A Data Mining Tool - by Shareek AhamedWEKA - A Data Mining Tool - by Shareek Ahamed
WEKA - A Data Mining Tool - by Shareek Ahamed
 
rule-based classifier
rule-based classifierrule-based classifier
rule-based classifier
 
Data mining assignment 3
Data mining assignment 3Data mining assignment 3
Data mining assignment 3
 
Wek1
Wek1Wek1
Wek1
 
Weka project - Classification & Association Rule Generation
Weka project - Classification & Association Rule GenerationWeka project - Classification & Association Rule Generation
Weka project - Classification & Association Rule Generation
 
HCI - Group Report for Metrolink App
HCI - Group Report for Metrolink AppHCI - Group Report for Metrolink App
HCI - Group Report for Metrolink App
 
Data Engineering - Data Mining Assignment
Data Engineering - Data Mining AssignmentData Engineering - Data Mining Assignment
Data Engineering - Data Mining Assignment
 
2.4 rule based classification
2.4 rule based classification2.4 rule based classification
2.4 rule based classification
 
McDonald's Supply Chain
McDonald's Supply ChainMcDonald's Supply Chain
McDonald's Supply Chain
 
Ejemplo de Aplicaciones en Weka
Ejemplo de Aplicaciones en WekaEjemplo de Aplicaciones en Weka
Ejemplo de Aplicaciones en Weka
 
Ms excel
Ms excelMs excel
Ms excel
 
Notes On Intranet Implementation And Roadmap
Notes On Intranet Implementation And RoadmapNotes On Intranet Implementation And Roadmap
Notes On Intranet Implementation And Roadmap
 

Similar to Data Mining With Excel 2007 And SQL Server 2008

Data mining with excel 2010 and power pivot
Data mining with excel 2010 and power pivotData mining with excel 2010 and power pivot
Data mining with excel 2010 and power pivotigsc
 
Data Mining with Excel 2010 and PowerPivot
Data Mining with Excel 2010 and PowerPivotData Mining with Excel 2010 and PowerPivot
Data Mining with Excel 2010 and PowerPivot
Mark Tabladillo
 
Enteprise Data Mining with SQL Server by Mark Tabladillo
Enteprise Data Mining with SQL Server by Mark TabladilloEnteprise Data Mining with SQL Server by Mark Tabladillo
Enteprise Data Mining with SQL Server by Mark Tabladillo
Felipe Ferreira
 
Big Data Driven Solutions to Combat Covid' 19
Big Data Driven Solutions to Combat Covid' 19Big Data Driven Solutions to Combat Covid' 19
Big Data Driven Solutions to Combat Covid' 19
Prof.Balakrishnan S
 
Building Data Science Ecosystems for Smart Cities and Smart Commerce
Building Data Science Ecosystems for Smart Cities and Smart CommerceBuilding Data Science Ecosystems for Smart Cities and Smart Commerce
Building Data Science Ecosystems for Smart Cities and Smart Commerce
Alex Liu
 
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)






Data Mining With Excel 2007 And SQL Server 2008

  • 1. Data Mining with Excel 2007 and SQL Server 2008 Mark Tabladillo Ph.D. http://www.marktab.net November 10, 2008
  • 2. Approach of this Presentation • Emphasize – Conceptual value of data mining – Relationship of data mining to the real world • Reserve – Specific procedures and mechanics – Specific mathematics – Production implementation
  • 3. Introduction • Microsoft Data Mining (MDM) is a major branch of SQL Server Analysis Services (SSAS) • The technology is supported by a new language within SSAS called DMX (Data Mining Extensions) • Currently, the two promoted interfaces are BIDS (Business Intelligence Development Studio) and Excel 2007
  • 4. Introduction • SQL Server 2008 has some improvements over 2005, but the main technology is similar • A major improvement for 2008 is the documentation (Books Online) • Microsoft’s team releases technology information at http://www.sqlserverdatamining.com
  • 5. Outline • Main Conclusions on Data Mining • Data Mining Definition • Microsoft Data Mining Fundamentals • Overview of Microsoft Data Mining Algorithms • Conclusion
  • 6. Four Interactive Demos • Card Sorting • Demographic Profiles • Sports (College Football) • Money (American Economy)
  • 7. Data Mining Definitions • Data mining is the automatic or semi-automatic process of exploring data for meaningful or useful patterns. • Data mining algorithms typically use estimation or optimization to achieve results (as opposed to only calculations).
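The distinction in slide 7 between a calculation and an estimation or optimization can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `sales` figures and both function names are hypothetical, and the example is deliberately chosen so that the optimized answer can be checked against the calculated one. It is not how the Microsoft algorithms are implemented.

```python
def mean_by_calculation(values):
    """Calculation: a closed-form formula evaluated once."""
    return sum(values) / len(values)


def center_by_optimization(values, steps=200, rate=0.1):
    """Optimization: repeatedly adjust a guess to reduce the mean squared error."""
    guess = 0.0
    for _ in range(steps):
        # Gradient of the mean squared error with respect to the guess.
        gradient = sum(2 * (guess - v) for v in values) / len(values)
        guess -= rate * gradient
    return guess


sales = [12.0, 15.0, 9.0, 20.0]  # hypothetical data, not from the slides
calculated = mean_by_calculation(sales)
estimated = center_by_optimization(sales)
print(calculated, round(estimated, 6))
```

For this simple error measure the iterative search settles on the same value as the formula. Data mining algorithms typically face objectives with no such closed form, which is why they rely on the iterative route.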
  • 8. Data Mining Provides Insight • Business – What reasons contribute to stock price changes? – Why do longer term jobless benefits hit a 25 year high? • Entertainment – Who is more likely to lose a civil lawsuit? – How well will new DVD sales do in the next few months?
  • 9. Data Mining Provides Insight • Sports – How much should a sports team offer for a proven free agent? – What factors lead to winning a tennis championship? • Technology – How does Cisco know there are warning signals in the tech sector? – What is the net loss in losing corporate secrets?
  • 10. Data Mining Provides Insight • Politics – What priorities do American voters have for the new President? – Why did a certain candidate win or lose a race? • Science – What factors contribute to ozone holes over the Antarctic? – Why do we believe that Tyrannosaurus Rex had a good sense of smell?
  • 11. Functions in Technology • Job Titles = Rationalized System to Pay People Less or Give them More Responsibility • “Engineer”? • “Scientist”?
  • 12. The Scientific Method • (Suppose you are a computer scientist) • Define the question • Gather information and resources (observe) • Form hypothesis • Perform experiment and collect data
  • 13. The Scientific Method • Analyze data – data mining is an option • Interpret data and draw conclusions that serve as a starting point for new hypothesis • Publish results • Retest (frequently done by other scientists)
  • 14. Microsoft Data Mining • Microsoft Data Mining refers to Microsoft’s specific implementation of certain common data mining algorithms for the DMX (Data Mining Extensions) language. • Also called SQL Server Data Mining, the technology is implemented through tools rather than through a single, finished application interface.
  • 15. Data Mining Input and Results • Data mining input can include continuous numeric, categorized (ordinal or nominal), and text data. • Data mining results consist of a lower dimensional model, either describing the empirical data (unsupervised), or the relationship between named input and output attributes (supervised)
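The two kinds of result named in slide 15 can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the data are invented, and a one-dimensional two-cluster k-means and a least-squares line stand in for the far richer Microsoft algorithms. In both cases the "lower dimensional model" is just two numbers that summarize a longer list of observations.

```python
def two_means_1d(values, iterations=20):
    """Unsupervised: describe the data by two cluster centers (1-D k-means, k=2)."""
    low, high = min(values), max(values)
    for _ in range(iterations):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(near_low) / len(near_low)
        if near_high:  # stays empty only when all values are identical
            high = sum(near_high) / len(near_high)
    return low, high


def fit_line(inputs, outputs):
    """Supervised: relate a named input to a named output with a least-squares line."""
    n = len(inputs)
    mean_x = sum(inputs) / n
    mean_y = sum(outputs) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(inputs, outputs))
    variance = sum((x - mean_x) ** 2 for x in inputs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: six observations summarized by two centers.
centers = two_means_1d([1.0, 1.2, 0.8, 9.0, 9.5, 8.5])

# Hypothetical data: four (input, output) pairs summarized by two coefficients.
intercept, slope = fit_line([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
print(centers, intercept, slope)
```

The unsupervised function never sees an output column. The supervised one needs it, which is the practical difference between the two modes.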
  • 16. Data Explosion
  • 17. Donald Farmer – May 2008 "[We don't] have all the functionality of something like a SAS or an SPSS, because that's just not our market," he conceded. It comes down to a difference of scale, according to Farmer. SAS and SPSS typically target larger, more expensive deployments, typically with users well-versed in the usage of their tools. Microsoft is targeting a different kind of data mining consumer: the Excel analyst, for example, who might not have much (if any) experience with data mining, predictive analytics or statistical analysis, for that matter.
  • 18. Donald Farmer – May 2008 "By the way, I don't mean to say we can't hit the high-end. Within Microsoft, we have our own database marketing team. We're one of the largest companies in the world. We have a huge database marketing team who do classic customer analysis. These guys were all SAS users, but when they joined Microsoft, they started using our tools. The entire process runs on our database, they actually use the Excel [data mining] add-ins to do it. It's not that there's nothing they don't miss, [it's that] they are able to achieve the same business results using our tools." Source: Redmond Magazine, May 7, 2008, http://redmondmag.com/news/article.asp?EditorialsID=9836
  • 19. Obtaining the Add-in
  • 20. Obtaining the Add-in (Nov 2008) http://www.microsoft.com/sqlserver/2008/en/us/data-mining-addins.aspx
  • 21. System Requirements • Supported Operating Systems: Windows Server 2003 Service Pack 2; Windows Server 2008; Windows Vista Service Pack 1; Windows XP Service Pack 3 • Microsoft .NET Framework 2.0. • If installing the Table Analysis Tools or Data Mining Client for Excel, Microsoft Office 2007 with .NET Programmability Support. Supported editions of Office 2007 include: – Professional – Professional Plus – Ultimate – Enterprise • If installing the Data Mining Templates for Visio, Microsoft Visio Professional 2007 with .NET Programmability Support. • 40 MB of available hard disk space. • Note: The Data Mining Add-ins require a connection to one of the following versions of SQL Server 2008 Analysis Services: – Enterprise – Standard
  • 22. Delivering Predictive Analysis to Every User • Comprehensive – Extend the benefits of predictive analysis to all users, delivering a full data mining development life cycle through the familiar environment of the 2007 Microsoft Office system. • Intuitive – Empower users to harness advanced data mining technologies, hiding complexity behind automated tasks that deliver actionable insight throughout the organization. • Collaborative – Share data mining models through interactive graphical visualizations, and deliver recommendation and insight with simple and prompt publishing capabilities.
  • 23. Top New Features • Score new cases to seek most profitable customers with new Prediction Calculator. • Discover cross-sell/up-sell opportunities to optimize offerings with new Shopping Basket Analysis. • Validate accuracy and stability of models simultaneously with new, richly formatted Cross Validation. • Generate summary reports to enhance referencing and collaboration with the new Document Model feature.
  • 24. SQL Server 2008 Menu Items
  • 25. Asking Permission
  • 26. Asking Permission Text
      DBA Person,
      I have downloaded and installed Microsoft SQL Server 2008 Data Mining Add-ins for Office 2007 on my machine ARCHITECT. These add-ins let me analyze my spreadsheet data in powerful ways by utilizing Microsoft SQL Server 2008 Analysis Services.
      In order to use these add-ins, I will need to be connected to an instance of Microsoft SQL Server 2008 Analysis Services that has been configured to support the add-ins. This configuration needs to be carried out by an administrator by following these steps:
      1. Download the add-ins package from http://www.microsoft.com/sqlserver/2008/en/us/trial-software.aspx.
      2. Launch the Setup, select the Server Configuration Tool and install it.
      3. Run the Server Configuration Tool and follow the wizard steps.
      I would appreciate it if you could let me know whether it is possible for you to configure an instance of SQL Server 2008 Analysis Services as described above and give me access to it.
      Thank you,
      Data Miner
  • 27. What is a model?
  • 28. List the Data Mining Algorithms • Ten Answers • Each one is a field of academic focus
  • 29. The Data Mining Algorithms • Microsoft Decision Trees • Microsoft Clustering • Microsoft Time Series • Microsoft Association Rules • Microsoft Sequence Clustering • Microsoft Naive Bayes • Microsoft Neural Network • Microsoft Linear Regression • Microsoft Logistic Regression • Text Mining
  • 30. What is a calculation? • Business intelligence relies on many common calculations.
  • 31. A Parable of Unity and Diversity • One day a parabola met a line. They each wondered aloud how much they had in common. They moved around to find out. (Figure labels: Parabola, Line)
  • 32. The Analyze Tab (menu option: data mining algorithm) • Analyze Key Influencers: Naïve Bayes • Detect Categories: Clustering • Fill from Example: Logistic Regression • Forecast: Time Series • Highlight Exceptions: Clustering • Scenario Analysis (Goal Seek): Logistic Regression • Scenario Analysis (What If): Logistic Regression • Prediction Calculator: Logistic Regression • Shopping Basket Analysis: Association Rules
  • 33. Why Different Button Names? (menu option: data mining algorithm) • Analyze Key Influencers: Naïve Bayes • Detect Categories: Clustering • Fill from Example: Logistic Regression • Forecast: Time Series • Highlight Exceptions: Clustering • Scenario Analysis (Goal Seek): Logistic Regression • Scenario Analysis (What If): Logistic Regression • Prediction Calculator: Logistic Regression • Shopping Basket Analysis: Association Rules
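Slides 32 and 33 pair Shopping Basket Analysis with Association Rules. As a rough illustration of what an association rule measures, the Python sketch below computes the two standard textbook quantities, support and confidence, over a handful of made-up baskets. The basket contents and function names are assumptions for illustration, and the sketch makes no claim about how the Microsoft Association Rules algorithm scores rules internally.

```python
# Made-up shopping baskets; each basket is a set of purchased items.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(1 for b in baskets if itemset <= b) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Estimate P(consequent | antecedent) from basket counts."""
    both = support(antecedent | consequent, baskets)
    return both / support(antecedent, baskets)


# Rule "bread -> milk": how often do bread buyers also buy milk?
rule_support = support({"bread", "milk"}, baskets)
rule_confidence = confidence({"bread"}, {"milk"}, baskets)
print(rule_support, rule_confidence)
```

Here two of the four baskets contain both items, and two of the three bread baskets also contain milk, so the rule has support 0.5 and confidence of about 0.67.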
  • 34. The Data Mining Tab • The ribbon has different regions: • Data Preparation • Data Modeling • Accuracy and Validation • Model Usage • Management • Connection
  • 35. Demo 1: Card Sorting • Take the sample of cards you have and put them into one or more groups. Write in the area below what your groups are.
  • 36. Demo 2: Demographic Profiles • Exercise 1. We will assume that each of the 10 listed people uses SQL Server technology as some part of their job. For the column marked “UserGroup”, write in YES (and NO otherwise) for people you believe would be interested in future SQL Server user group meetings.
  • 37. Demo 2: Demographic Profiles • Exercise 2: Assume an average house in your neighborhood or area is for sale. For the column marked “NewNeighbors”, write in YES (and NO otherwise) for each person you believe might be a potential buyer for that average home.
  • 38. What is unsupervised? • Model of the empirical data.
  • 39. What is supervised? • Model of the process between input and output attributes.
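The contrast drawn on slides 38 and 39 can be made concrete with a small Python sketch. The unsupervised function groups ages using only the data itself; the supervised function learns a cut point that best reproduces a given output column. The ages, the labels, and both function names are hypothetical, and these simple routines stand in for, rather than reproduce, the Clustering and Logistic Regression algorithms.

```python
def two_means(values, iterations=10):
    """Unsupervised: group numbers around two centers; no labels are used."""
    low, high = min(values), max(values)  # assumes at least two distinct values
    for _ in range(iterations):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return near_low, near_high


def learn_threshold(examples):
    """Supervised: pick the input cut point that best matches the outputs."""
    candidates = sorted(x for x, _ in examples)

    def errors(cut):
        # Predict True when the input is at or above the cut point.
        return sum(1 for x, label in examples if (x >= cut) != label)

    return min(candidates, key=errors)


# Made-up inputs (ages) and a made-up output column (interested or not).
ages = [22, 25, 31, 58, 61, 64]
labeled = [(22, False), (25, False), (31, True),
           (58, True), (61, True), (64, True)]

low_group, high_group = two_means(ages)
threshold = learn_threshold(labeled)
print(low_group, high_group, threshold)
```

With this data the unsupervised grouping splits the ages between 31 and 58, where the gap in the data is, while the supervised cut point falls at 31, where the labels change. The same inputs can therefore support two different models, depending on whether an output attribute is supplied.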
  • 40. Scientific Progress • Why might two scientists come to slightly or widely different conclusions?
  • 41. Demo 3: Sports • Look at page 8C with the USA Today Coaches Poll. Based on this list (and other information on college football on this page) do you completely agree with the rankings? Why or why not?
  • 42. Demo 4: Money • Look at page 6B with the USA Today Market Trends. Choose three specific pieces of information on this chart which, to you, illustrate the current state of the American Economy.
  • 43. Wittgenstein’s Duck-Rabbit
  • 44. Data Mining Examples Tour
  • 45. Data Mining • “Data” precedes “Mining” • “Data” – when is it easier? • “Data” – when is it harder? • “Mining” – when is it easier? • “Mining” – when is it harder?
  • 46. Regroup and Conclusion • Main Points from this Presentation
  • 47. Resources • Microsoft SQL Server 2008 http://www.microsoft.com/sqlserver/2008/en/us/data-mining.aspx • SQL Server Data Mining http://www.sqlserverdatamining.com/ssdm/default.aspx • Adventure Works Tutorial – “SQL Server 2005 Data Mining Tutorial” http://www.sqlserverdatamining.com/ssdm/Home/Tutorials/tabid/57/Default.aspx • MSDN Forums (“Katmai” = 2008, “SQL Server” = 2005 and before) http://forums.microsoft.com/MSDN/default.aspx?SiteID=1 • Data Mining with Microsoft SQL Server 2008 (Coming November 17, 2008) by Jamie MacLennan, ZhaoHui Tang, and Bogdan Crivat • Smart Business Intelligence Solutions with Microsoft® SQL Server® 2008 (PRO-Developer) (Coming February 4, 2009) by Lynn Langit and Matthew Roche • KDnuggets (Data Mining and Knowledge Discovery Portal) http://www.kdnuggets.com/ • Association for Computing Machinery http://www.acm.org/
  • 48. Contact Information • Data Mining Portal and Blog http://marktab.net • Twitter: @marktabnet • Also on: LinkedIn, Facebook