Data Mining Beyond Adventure Works (Redmond WA 10/3/2009)

    Data Mining Beyond Adventure Works (Redmond WA 10/3/2009) - Presentation Transcript

    1. Data Mining beyond Adventure Works, Mark Tabladillo, Ph.D., MTabladillo <(at)> solidq.com, October 3, 2009
    2. Approach of this Presentation • Emphasize – Conceptual value of data mining – Relationship of data mining to the real world • Reserve – Specific procedures and mechanics – Specific mathematics – Production implementation © 2009 Mark Tabladillo Ph.D.
    3. Outline • Data Mining Fundamentals • Interactive Demos • Conclusion
    4. Interactive Demos • Sports • Government Forecasting
    5. Data Mining Definitions • Data mining is the automatic or semi-automatic process of exploring data for meaningful or useful patterns. • Data mining algorithms typically use estimation or optimization to achieve results (as opposed to only calculations).
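
The contrast between calculation and estimation on this slide can be illustrated with a minimal Python sketch. The toy data, the function names, and the learning-rate settings are hypothetical and are not from the presentation. The fitting loop is plain gradient ascent for a one-variable logistic regression, chosen only for illustration; it is not a Microsoft algorithm.

```python
import math

# Hypothetical toy data: an input measurement (x) and a yes/no outcome (y).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0, 0, 1, 0, 1, 1]


def mean(values):
    """Calculation: a closed-form answer obtained in a single pass."""
    return sum(values) / len(values)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, rate=0.1, steps=5000):
    """Estimation: improve the parameters step by step
    (gradient ascent on the average log-likelihood)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [y - sigmoid(w * x + b) for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w += rate * grad_w
        b += rate * grad_b
    return w, b


average_x = mean(xs)           # exact calculation
w, b = fit_logistic(xs, ys)    # iterative estimate
p_low = sigmoid(w * 1.0 + b)   # estimated probability of "yes" at x = 1
p_high = sigmoid(w * 6.0 + b)  # estimated probability of "yes" at x = 6
```

The mean is fully determined by a formula, while the logistic parameters are reached only by repeated improvement, which is the sense in which data mining algorithms "estimate" rather than "calculate".
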
    6. Microsoft Data Mining • Microsoft Data Mining refers to Microsoft’s specific implementation of certain common data mining algorithms for the DMX (Data Mining Extensions) language. • Also called SQL Server Data Mining, the technology is integrated into SQL Server rather than presented as an independent application.
    7. Data Mining Tasks • Supervised – Answer known, what is correlated? • Unsupervised – Answer unknown (unspecified), what are the groups? • Forecasting – Given a trend, what is next? (Value Slide)
    8. List the Data Mining Algorithms • Ten Answers • Each one is a field of academic focus
    9. The Data Mining Algorithms • Microsoft Naive Bayes • Microsoft Linear Regression • Microsoft Decision Trees • Microsoft Time Series • Microsoft Clustering • Microsoft Sequence Clustering • Microsoft Association Rules • Microsoft Neural Networks • Microsoft Logistic Regression • Text Mining
    10. The Analyze Tab
        Menu Option                     Data Mining Algorithm
        Analyze Key Influencers         Naïve Bayes
        Detect Categories               Clustering
        Fill from Example               Logistic Regression
        Forecast                        Time Series
        Highlight Exceptions            Clustering
        Scenario Analysis (Goal Seek)   Logistic Regression
        Scenario Analysis (What If)     Logistic Regression
        Prediction Calculator           Logistic Regression
        Shopping Basket Analysis        Association Rules
    11. Demo One: National League Baseball • Directions: You are on the management team for the Atlanta Braves. To better serve the team, you have been instructed by the owner to group the players by considering both their position and their salary.
    12. Demo One: National League Baseball • The following rules apply: – You must make more than one group – Each group must have at least two players – Players of different positions may be in the same group
    13. Demo One: National League Baseball • Individual attributes can be used to make groups • Historical statistics can be used to group new players • Both supervised and unsupervised algorithms can be applied to the same data
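
The grouping exercise in Demo One can be sketched in Python as a small clustering problem. Everything concrete here is an assumption for illustration: the roster, the salaries (in millions of dollars), the choice of three groups, and the starting centers are hypothetical, and the method is a basic one-dimensional k-means rather than the Microsoft Clustering algorithm. For brevity the sketch groups on salary alone and carries position along as a label.

```python
# Hypothetical roster: (name, position, salary in millions of dollars).
players = [
    ("Player A", "Pitcher", 0.4),
    ("Player B", "Catcher", 0.5),
    ("Player C", "Outfield", 0.6),
    ("Player D", "Pitcher", 5.0),
    ("Player E", "Infield", 5.5),
    ("Player F", "Outfield", 12.0),
    ("Player G", "Pitcher", 13.0),
]


def nearest(value, centers):
    """Index of the center closest to value."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


def k_means_1d(values, centers, rounds=20):
    """Basic k-means on one numeric attribute with fixed starting centers."""
    for _ in range(rounds):
        buckets = [[] for _ in centers]
        for v in values:
            buckets[nearest(v, centers)].append(v)
        # Keep the previous center if a bucket ends up empty.
        centers = [sum(b) / len(b) if b else c for b, c in zip(buckets, centers)]
    return centers


salaries = [salary for _, _, salary in players]
start = [min(salaries), sum(salaries) / len(salaries), max(salaries)]
centers = k_means_1d(salaries, start)

# Unsupervised step: place each existing player in a group.
groups = {}
for name, position, salary in players:
    groups.setdefault(nearest(salary, centers), []).append((name, position))

# Reusing the historical groups for a new player with a salary of 6.0.
new_player_group = nearest(6.0, centers)
```

With this toy roster every group has at least two players and the lowest-salary group mixes three positions, so the result satisfies the demo's rules. Assigning a new player to the nearest existing center is one simple way to read "historical statistics can be used to group new players".
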
    14. Demo Two: Government Forecasting • Directions: The President is asking your opinion on how the following numbers will increase over the next few months. Because this project is sensitive, you do not know what these numbers measure. However, based on the available history, make your best projection for the next six periods.
    15. Demo Two: Government Forecasting [Chart: monthly values, Jan 2007 through Dec 2008; vertical axis from 0 to 8]
    16. Demo Two: Government Forecasting [Chart: monthly values, Sep 2007 through Aug 2009; vertical axis from 0 to 12]
    17. Demo Two: Government Forecasting • Rapid response is as useful as prediction • Seek intelligent correlations among related metrics • Projections depend on time frame – modeling is continual
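
A six-period projection of the kind Demo Two asks for can be sketched in Python with a straight-line trend. The history values below are hypothetical (the chart values are not recoverable from this transcript), the function names are illustrative, and ordinary least squares is only a simple stand-in; it is not the Microsoft Time Series algorithm.

```python
def fit_trend(values):
    """Least-squares line through the points (0, v0), (1, v1), ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    covariance = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    variance = sum((t - t_mean) ** 2 for t in range(n))
    slope = covariance / variance
    intercept = v_mean - slope * t_mean
    return slope, intercept


def project(values, periods=6):
    """Extend the fitted line for the requested number of future periods."""
    slope, intercept = fit_trend(values)
    n = len(values)
    return [intercept + slope * (n + k) for k in range(periods)]


# Hypothetical monthly history, not the values shown on the slides.
history = [4.6, 4.7, 4.9, 5.0, 5.4, 5.6, 5.8, 6.1]
forecast = project(history)

# Sanity check on a perfectly linear series.
check = project([1.0, 2.0, 3.0, 4.0])
```

Refitting the line each time a new observation arrives, or over a different window of history, changes the projection, which is one concrete reading of "projections depend on time frame".
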
    18. Forecasting Algorithms • Microsoft Time Series (Value Slide)
    19. Supervised Algorithms • Microsoft Naive Bayes • Microsoft Linear Regression • Microsoft Decision Trees • Microsoft Neural Networks • Microsoft Logistic Regression (Value Slide)
    20. Unsupervised Algorithms • Microsoft Clustering • Microsoft Sequence Clustering • Microsoft Association Rules • Text Mining (Value Slide)
    21. Resources • MarkTab.NET: links, video resources and information for data mining • Data Mining with Microsoft SQL Server 2008 by Jamie MacLennan, ZhaoHui Tang, and Bogdan Crivat • Smart Business Intelligence Solutions with Microsoft® SQL Server® 2008 (PRO-Developer) by Lynn Langit and Matthew Roche • Solid Quality Mentors
    22. Regroup and Conclusion • Main Points from this Presentation
    23. Contact Information • Mark Tabladillo mtabladillo <{at}> solidq.com • Also on: LinkedIn, Facebook
    24. Bonus: Sequence Clustering Ideas • Trading players in professional sports • Assigning players to certain positions • Moving from city to city • Store path at the mall • Cancer treatment path • Taking up a musical instrument • Taking up sports • Blogging • Viral news
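
Each idea on this slide involves ordered steps, and a common first step in sequence analysis is to count how often one step follows another. The Python sketch below does that for the "store path at the mall" idea. The paths and store names are hypothetical, and the code only estimates first-order transition probabilities; it does not perform the clustering itself and is not the Microsoft Sequence Clustering algorithm.

```python
from collections import Counter, defaultdict

# Hypothetical shopper paths through a mall.
paths = [
    ["Entrance", "Food Court", "Shoes", "Exit"],
    ["Entrance", "Shoes", "Food Court", "Exit"],
    ["Entrance", "Food Court", "Exit"],
]


def transition_probabilities(sequences):
    """Estimate P(next step | current step) from observed sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return {
        state: {nxt: c / sum(nexts.values()) for nxt, c in nexts.items()}
        for state, nexts in counts.items()
    }


probs = transition_probabilities(paths)
```

Sequences whose transition patterns look alike could then be grouped together, which is the general idea behind clustering paths rather than individual events.
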
