Microsoft Data Science Technologies 201608


Delivered to SQL Saturday Columbus, GA
Microsoft provides several technologies that support data science, from casual exploration to serious production work. This presentation provides an authoritative overview of two major categories: products and services. The products include SQL Server Analysis Services, the Excel Add-in for SSAS, Semantic Search, SQL Server R Services, Microsoft R Technologies, and F#. The services include Cortana Intelligence and Bing Predicts. The presenter has used these technologies in various companies and industries, and he will speak about how Microsoft uses them today for its largest Azure customers.

Published in: Data & Analytics

  1. Microsoft Technologies for Data Science. Mark Tabladillo, Ph.D., Solution Architect (Data Scientist), Microsoft. August 2016: SQL Saturday Columbus GA
  2. Networking Interactive
  5. Terms: Data Science, Machine Learning, Data Mining, Applied Statistics
      Definition: the automated or semi-automated process of discovering patterns in data; applied scientific method
  6. Links:
      • data-mining-data-science-software-used.html
      • us/server-cloud/products/sql-server/
      • us/services/hdinsight/
  8. Technology Choices
      • SQL Server Analysis Services: Enterprise, Business Intelligence
      • Excel Add-in for SSAS: Office 365, Office 2013 or higher, x64
      • Semantic Search: Enterprise, Business Intelligence, Standard, Web, Express with Advanced Services
      • Microsoft Azure ML: Free (size limited), Paid (web service): Experiment + Query
      • F#: Open source
      • SQL Server R Services: SQL Server 2016 or higher
  9. 4351-4434-A78A-3384CA7515BF/SQL_Server_2016_Deeper_Insights_Across_Data_White_Paper.pdf
  10. SS SQL AS NoSQL
  11. Data mining add-in for business analysts
      • Ease of use
      • Rich data mining
      • Scalable
  12. Rowset Output with Scores Varchar NVarchar Office PDF
  13. Documents; Full-Text Keyword Index “FTI”; iFilters; Semantic Document Similarity Index “DSI”; Semantic Database; Semantic Key Phrase Index – Tag Index “TI”
  14. Simplified Chinese; British English; Portuguese; Chinese (Hong Kong SAR, PRC); Spanish; Chinese (Singapore); Chinese (Macau SAR)
  15. Time in Seconds vs. Number of Documents (2011 – K. Mukerjee, T. Porter, S. Gherman – Microsoft)
  16. Features: Microsoft R Open (R distribution, free) vs. Microsoft R Client (free) vs. Microsoft R Server (commercial)
      • Big Data
          R Open: In-memory bound. Can only process datasets that fit into the available memory.
          R Client: In-memory bound. Can process datasets that fit into the available memory. Operates on large volumes when connected to R Server.
          R Server: Disk scalability. Operates on bigger volumes & factors.
      • Speed of Analysis
          R Open: Multi-threaded when MKL is installed for non-ScaleR functions.
          R Client: Multi-threaded with MKL for non-ScaleR functions. Up to 2 threads for ScaleR functions with a local compute context.
          R Server: Full parallel threading & processing.
      • Enterprise Readiness
          R Open: Community support.
          R Client: Community support.
          R Server: Commercial support.
      • Analytic Breadth & Depth
          R Open: 8000+ open source packages.
          R Client: Leverage & optimize open source R packages plus 'Big Data'-ready ScaleR packages.
          R Server: Leverage & optimize open source R packages plus 'Big Data'-ready + Multithreaded ready ScaleR packages.
      • Commercial Viability
          R Open: Risk of deployment to open source.
          R Client: Free for everyone.
          R Server: Commercial licenses.
      • DeployR Enterprise
          R Open: Not available.
          R Client: Not available.
          R Server: Included.
  17. Microsoft R Server Editions (columns: Description, Install ScaleR, Get Started)
      • R Server for Hadoop: Scale your analysis transparently by distributing work across nodes without complex programming. Install ScaleR: Doc. Get Started: Doc.
      • R Server for Teradata DB: Run advanced analytics in-database for seamless data analysis. Install ScaleR: Doc. Get Started: Doc.
      • R Server for Linux: Bring predictive and prescriptive analytics power to your Linux environments. Install ScaleR: Doc. Get Started: Doc.
  19. Mutable | Immutable
      • Classic Open Source: Java | Scala
      • .NET Now Open Source: C#, C++, VB.NET | F#
  22. Capabilities and Products
      • Preconfigured solutions
          Capabilities: Business scenarios
          Products: Forecasting, churn, etc.
      • Intelligence
          Capabilities: Integration with Cortana; Bot services; Cognitive services
          Products: Cortana; Bot Framework; Cognitive Services
      • Dashboards and visualizations
          Capabilities: Dashboards and visualizations
          Products: Power BI
      • Machine learning and advanced analytics
          Capabilities: Machine learning; Hadoop; Distributed analytics; Complex event processing
          Products: Machine Learning; HDInsight (Data Lake service); Data Lake analytics; Stream Analytics
      • Big data stores
          Capabilities: Big Data repository; Elastic data warehouse
          Products: Data Lake store, Blobs; SQL Data Warehouse
      • Information management
          Capabilities: Data orchestration; Data catalog; Event ingestion
          Products: Data Factory; Data catalog; Event Hubs
  26. Links:
      • https://academy.microso US/professional-degree/data-science/
      • https://borntolearn.msle announcing-the-microsoft-professional-degree-mpd-program
  27. books.html
  29. Links:
      • US/home?forum=MachineLearning
      • videos-february-2015
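
Slide 13 lists the pieces behind Semantic Search: a full-text keyword index, a key phrase (tag) index, and a document similarity index. The Python sketch below is only a toy illustration of those two ideas, using raw term counts and cosine similarity. It is not how SQL Server builds its indexes, and the corpus, stopword list, and function names are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus; document names and text are invented for illustration.
DOCS = {
    "doc_a": "SQL Server Analysis Services supports data mining models.",
    "doc_b": "Data mining models in Analysis Services can be browsed from Excel.",
    "doc_c": "F# is an open source functional language on .NET.",
}

# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "an", "the", "in", "on", "is", "can", "be", "from"}


def term_counts(text):
    """Lowercase the text, split it into tokens, drop stopwords, count terms."""
    tokens = re.findall(r"[a-z0-9#]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def key_phrases(text, k=3):
    """Toy 'tag index': the k most frequent terms of one document."""
    return [term for term, _ in term_counts(text).most_common(k)]


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(name, docs):
    """Toy 'document similarity index': other documents ranked by similarity."""
    target = term_counts(docs[name])
    scores = [
        (other, cosine(target, term_counts(text)))
        for other, text in docs.items()
        if other != name
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    print(key_phrases(DOCS["doc_a"]))
    for other, score in most_similar("doc_a", DOCS):
        print(f"{other}: {score:.3f}")
```

Under these assumptions, the two Analysis Services documents rank as most similar to each other, and the F# document shares no terms with them. A production system would use linguistic statistics per language rather than raw counts.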