Document Classification using DMX in SQL Server Analysis Services

Presentation for SQL Saturday Raleigh NC, September 18, 2010
Overview of using DMX (Data Mining Extensions) in Excel, SSMS (SQL Server Management Studio), BIDS (Business Intelligence Development Studio), and PowerShell




    Presentation Transcript

    • Document Classification using DMX in Analysis Services. Mark Tabladillo Ph.D., http://marktab.net, September 18, 2010
    • SQL Saturday 46 -- Raleigh NC. #sqlsat46 #MarkTabNet. © 2010 Mark Tabladillo Ph.D.
    • MarkTab & Text Mining
    • Outline: Tools for Demos, Text Mining
    • Data Mining as a Service
    • Text Mining Product Comparison from 2008. Source: Feinerer, I., Hornik, K., & Meyer, D. (2008). Text Mining Infrastructure in R. Journal of Statistical Software, 25(5).
    • SQL Server Data Mining (Activity: How)
      • Preprocess: T-SQL; Integration Services; Data Mining Add-In for Excel; .NET programming
      • Associate: Microsoft Association Rules (algorithm)
      • Cluster: Microsoft Clustering (algorithm)
      • Summarize: Integration Services (Term Extraction, Term Lookup)
      • Categorize: Integration Services
      • API: Includes DMX, XMLA, AMO, ADOMD.NET
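
The Summarize activity relies on the Term Extraction and Term Lookup transformations in Integration Services. As a rough analogue only (plain word counting in Python, not the Integration Services implementation), the sketch below first builds a term list from one text and then counts reference terms in another. The function names, the stopword list, and the sample sentences are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the slides).
STOPWORDS = frozenset({"the", "and", "of", "to", "a", "in", "for", "is"})


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def extract_terms(text, min_count=2):
    """Rough stand-in for term extraction: frequent non-stopword tokens."""
    counts = Counter(w for w in tokenize(text) if w not in STOPWORDS)
    return {term: n for term, n in counts.items() if n >= min_count}


def lookup_terms(text, reference_terms):
    """Rough stand-in for term lookup: count each reference term in a text."""
    counts = Counter(tokenize(text))
    return {term: counts[term] for term in reference_terms if counts[term] > 0}


if __name__ == "__main__":
    speech = "Freedom and liberty. Freedom of the people, liberty of the nation."
    terms = extract_terms(speech)
    print(terms)
    print(lookup_terms("Freedom for the people", terms))
```

The per-document term counts produced by a lookup step of this kind are the sort of input a classification model can then be trained on.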
    • APIs for Data Mining (Acronym, Term: Definition)
      • DMX, Data Mining Extensions: SQL-like queries (OLE DB for Data Mining)
      • XMLA, Extensible Markup Language for Analysis: Client communication protocol
      • AMO, Analysis Management Objects: .NET library to manage Analysis Services
      • ADOMD.NET, ActiveX Data Objects (Multidimensional) for .NET: .NET Framework data provider
    • DMX Tasks
      • Data Definition
        • Create, Alter, Drop – Mining Structure
        • Create, Drop – Mining Model
        • Export and Import Models
      • Data Manipulation
        • Query Models, Content, Cases, Sample Cases, Dimension Content
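
The task list above maps onto a handful of DMX statements. The following Python sketch only assembles statement text; it does not connect to Analysis Services. The statement keywords are DMX, while the helper functions and every structure, model, and column name (Speeches, SpeechNB, and so on) are hypothetical.

```python
def bracket(name):
    """Wrap an identifier in DMX square brackets; reject embedded brackets."""
    if "[" in name or "]" in name:
        raise ValueError(f"unsupported identifier: {name!r}")
    return f"[{name}]"


def create_structure(structure, columns):
    """Data definition: columns is a list of (name, type-and-content) pairs."""
    body = ",\n".join(f"    {bracket(col)} {spec}" for col, spec in columns)
    return f"CREATE MINING STRUCTURE {bracket(structure)} (\n{body}\n)"


def add_model(structure, model, columns, algorithm):
    """Data definition: columns is a list of (name, usage flag) pairs; flag may be ''."""
    lines = [f"    {bracket(col)} {flag}".rstrip() for col, flag in columns]
    body = ",\n".join(lines)
    return (
        f"ALTER MINING STRUCTURE {bracket(structure)}\n"
        f"ADD MINING MODEL {bracket(model)} (\n{body}\n) USING {algorithm}"
    )


def drop(kind, name):
    """Data definition: kind is 'STRUCTURE' or 'MODEL'."""
    if kind not in ("STRUCTURE", "MODEL"):
        raise ValueError("kind must be 'STRUCTURE' or 'MODEL'")
    return f"DROP MINING {kind} {bracket(name)}"


def content_query(model):
    """Data manipulation: query the learned content of a model."""
    return f"SELECT * FROM {bracket(model)}.CONTENT"


if __name__ == "__main__":
    # Hypothetical document-classification structure: predict the speaker.
    print(create_structure("Speeches", [
        ("SpeechId", "LONG KEY"),
        ("President", "TEXT DISCRETE"),
        ("Topic", "TEXT DISCRETE"),
    ]))
    print(add_model("Speeches", "SpeechNB", [
        ("SpeechId", ""),
        ("President", "PREDICT"),
        ("Topic", ""),
    ], "Microsoft_Naive_Bayes"))
    print(content_query("SpeechNB"))
    print(drop("MODEL", "SpeechNB"))
```

The generated text could be pasted into a DMX query window in SSMS or sent through any of the APIs listed earlier; keeping statement construction separate from execution makes the sketch easy to check without a server.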
    • SQL Server Data Mining Applications (User Interface: Activity)
      • Excel (and PowerPivot for Excel): DMX
      • BIDS (Business Intelligence Development Studio): Analysis Services Project; Integration Services Project (T-SQL; DMX; XMLA)
      • SSMS (SQL Server Management Studio): T-SQL; DMX; XMLA
      • PowerShell version 2.0: T-SQL; DMX; XMLA; AMO; ADOMD.NET
      • SharePoint: (Requires Setup or Customization)
      • Your Name Here (Develop Your Own): ?
    • Outline: Tools for Demos, Text Mining
    • Data: Presidential Addresses. http://www.wiley.com/WileyCDA/WileyTitle/productCd-0470277742,descCd-DOWNLOAD.html
    • Excel
      • Use the 32-bit Excel add-in for Data Mining
        • Written for SQL Server 2008, OK for 2008 R2
        • Written for Office 2007, OK for 2010
      • (Optional) Add the free PowerPivot add-in (http://powerpivot.com)
    • Datasets & Models (©2010 Predixion Software)
      • Public Cloud or On-Premise Private Cloud
      • SQL Server Analysis Services
      • PowerPivot
      • Data Sources: SQL Server, Access, Oracle, Teradata, Sybase, Informix, DB2, Data Feeds, Text Files
    • BIDS
      • The preferred application for production data mining
      • Analysis Services Projects
        • Make Mining Structures and Models
        • Data Mining for OLAP Cubes
        • Excellent for Experimentation
      • Integration Services Projects
        • Term Extraction and Term Lookup Text Mining
        • Excellent for Production
      • Reporting Services Projects
        • Similar to Crystal Reports
    • SSMS
      • Production management and maintenance
      • Scripts can become stored procedures
      • T-SQL, DMX, MDX, XMLA
    • PowerShell
      • Object-oriented command prompt, now in version 2
      • Provides complete access to AMO, ADOMD.NET and DMX
    • Excel in Production
      • Can create and manage permanent data mining models
      • Can document data mining models
      • Can do some preprocessing (ETL)
    • BIDS in Production
      • Can create a production workflow with Integration Services projects
      • Can create production data mining models with Analysis Services projects
    • SSMS in Production
      • The standard production user interface for SQL Server
      • Also the standard production user interface for Analysis Services Databases
      • Built for
        • Scripting (T-SQL, MDX, DMX, XMLA)
        • Security
        • Assembly Registration (Analysis Services)
        • Stored Procedures (SQL Server)
    • PowerShell in Production
      • Features
        • Object-oriented
        • Command window or ISE (Integrated Scripting Environment)
        • Accesses .NET libraries and WMI (Windows Management Instrumentation)
        • Version two adds event and exception handling
    • Resources
      • MarkTab.NET: blog, links, video resources, and information for data mining
      • Blog: http://marktab.net/datamining
      • Twitter: @MarkTabNet
    • Regroup and Conclusion
      • Main Points from this Presentation
    • Contact Information
      • Mark Tabladillo http://marktab.net
      • Also on: Twitter @marktabnet, LinkedIn