The Microsoft BigData Story

deck from my talk at Big Data Tech Con in Boston April 2013

Published in: Technology

  • SSIS tasks: Lookup transformation (this for that, substitutions); Cache transformation (multiple lookups); Fuzzy Lookup (lookup based on threshold matching); Fuzzy Grouping (grouping based on thresholds); Data Mining Query (based on mining model algorithms); DQS Cleansing (uses a KB)
  • Comparison of features from MSDN -- http://msdn.microsoft.com/en-us/library/hh212940(v=sql.110).aspx
  • Lynn
  • Transcript

    • 1. Microsoft’s BigData Story @LynnLangit April 2013 – Big Data Tech Con
    • 2. Data Expertise / Lynn Langit • Industry awards – Microsoft: MVP for SQL Server – Google: GDE for Cloud Platform – 10Gen: Master for MongoDB • Practicing Architect • Technical author / trainer – Pluralsight: Google Cloud Series – DevelopMentor: SQL Server Series – 2 books on SQL Server BI • Former MSFT FTE – 4 years
    • 3. In a Relationship? BigData NoSQL
    • 4. BigData, NoSQL… => No Microsoft? Big Data => keeping / getting more data • Cheap Storage • Cloud Storage • Open Source data projects (Hadoop) NoSQL => schema-lite, scalable storage • NoSQL data projects • Mostly open source • Sharded replicas
    • 5. In a (Open Source) Relationship? NoSQL Hadoop Cloud MongoDB Neo4j Riak AWS Heroku RackSpace OpenStack Cassandra
    • 6. Data Services DEMO: HDINSIGHT (HADOOP)
    • 7. The Reality BigData Small BigData
    • 8. BigData Lifecycle Management: Locate • Quantify • Qualify • Replicate • Process • Present
    • 9. Locating the data • Private source: you buy it • Public source: you find it • Your source: in SQL Server, on desktops
    • 10. Finding Data in Data Markets • Windows Azure Data Market • DataMarket.com • Factual.com • InfoChimps
    • 11. Data Services DEMO: AZURE DATAMARKET
    • 12. Database Lifecycle Management • Evaluating current processes • Improving processes • Adding new tools – SSDT • Data synchronization processes
    • 13. Storing the data Relational • SQL Server – can use partitioning for scalability Beyond relational via relational • Specialized data types • XML, Hierarchy, Filestream/Filetable, Geospatial • Columnstore index Multi-dimensional / in-memory • OLAP cubes / Mining Models • Tabular models
    • 14. Big Data in SQL Server 2012 – Relational Enhancements DEMO: COLUMNSTORE, XML, FILETABLE
    • 15. Data Processing: Raw data • Pre-processed data • Detail data • Aggregate data • Views
    • 16. Valuing the data • De-duplicating • Validating • Correcting errors • Aggregating • Ranking / rating – Social rating, i.e. Yelp-like – Social scoring, i.e. Freebase-like
    • 17. Data Services DEMO: DATA QUALITY SERVICES
    • 18. Types of Data Quality Projects • T-SQL scripts (boolean match): exact matches with WHERE =, WHERE <>, IN; string matching with LIKE and % • Full-text matching (semantic word match): CONTAINS • Semantic Search (semantic phrase match): SEMANTICSIMILARITYTABLE • SSIS tasks (transactional, multi-valued matching): list below • DQS (KB matching): KnowledgeBase rules/matches; Data Quality project to clean / correct data • MDS (One view of truth): Versioned Entities, Attributes and Rules
    • 19. Data Presentation • View-only client • View & manipulate (hide-only) client • View & query (aggregate) client • View & query (drill through) client • View & mash-up (add new data) client • View & update client • Timeliness of data (latency) • Beauty of data
    • 20. But, does it work in Excel? • Import 3rd party data – including Hadoop via ODBC • Mash-up data with PowerPivot • Clean up data with Data Quality Services • Extract-Transform-Load with Data Explorer • Authorize with Master Data Services • Mine with Predixion
    • 21. From Pivot tables to Visualized Data Mash-ups with Mining DEMO: THE POWER OF EXCEL
    • 22. What about the UDM? • UDM / Data Mining is fully supported in SSAS • Must be installed in this mode – Mutually exclusive to Tabular mode • But, should you use it anymore?
    • 23. Big Data in SQL Server 2012 – Non-Relational Features DEMO: TABULAR MODELS, DATA MINING
    • 24. Data Consumability • (Accurate) Valid • (Meaningful) Recognizable • (Useful) Appropriate • (Appealing) Beautiful • (Satisfying) Enjoyable
    • 25. PowerView for Tabular Models DEMO: POWERVIEW
    • 26. Data Fluency and Job Roles • Consumer: View and understand • Analyzer: View, manipulate and decide • Cleaner: Validate and update • Artist: Visualize and present
    • 27. BigData in SQL Server 2012 • Relational engine: Scaling via Partitioning for Tables, indexes, and PDW; Columnstore indexes; Special Data Types (XML, Hierarchy, Filetable) • Analysis service engines: OLAP Cubes; Tabular Models; Data Mining Models • Other services: Data Quality Services; Master Data Services; StreamInsight
    • 28. Other Data Services from Microsoft • Windows Azure Marketplace • SQL Azure • Data Explorer • Power Pivot
    • 29. NoSQL – New Products / Betas • SSRS on Azure • Semantic Search • HDInsight (Hadoop on Azure) • PowerView • Cloud-based Data Explorer
    • 30. Announced Futures
    • 31. The Changing Data Landscape Other ServicesRDBMS NoSQL
    • 32. • recipes) www.TeachingKidsProgramming.org • Free Courseware • Do a Recipe  Teach a Kid (Ages 10 ++) • Java or Microsoft SmallBasic • C# on Pluralsight
    • 33. Toward Data Craftsmanship… Follow me • @LynnLangit • www.LynnLangit.com • YouTube - SoCalDevGal Hire me • To help build your BI/Big Data solution • To teach your team next gen BI • To learn more about using NoSQL solutions
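
Slide 18 and the SSIS slide note describe a ladder of matching techniques: exact (boolean) matches, LIKE-style string matching, and fuzzy lookups based on a similarity threshold. The deck shows these in T-SQL and SSIS demos that are not part of this transcript, so the following is only a minimal Python sketch of the same three ideas. The city list, function names, and the 0.8 threshold are all hypothetical, and `difflib.SequenceMatcher` stands in for the fuzzy matching step; it is not the algorithm SSIS Fuzzy Lookup or DQS uses.

```python
from difflib import SequenceMatcher
from fnmatch import fnmatchcase

# Hypothetical reference values; not taken from the deck.
KNOWN_CITIES = ["Boston", "Seattle", "San Diego"]


def exact_match(value, candidates):
    """Boolean match, comparable in spirit to WHERE col = value or IN (...)."""
    return value in candidates


def like_match(value, pattern):
    """Case-insensitive wildcard match, loosely comparable to LIKE.

    Only the '%' wildcard is handled (translated to '*'); other LIKE
    features such as '_' are intentionally left out of this sketch.
    """
    return fnmatchcase(value.lower(), pattern.lower().replace("%", "*"))


def fuzzy_lookup(value, candidates, threshold=0.8):
    """Return the most similar candidate if its similarity ratio
    reaches the threshold, otherwise None."""
    best, best_score = None, 0.0
    for candidate in candidates:
        score = SequenceMatcher(None, value.lower(), candidate.lower()).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None


if __name__ == "__main__":
    print(exact_match("Boston", KNOWN_CITIES))
    print(like_match("Seattle", "Sea%"))
    print(fuzzy_lookup("Bostn", KNOWN_CITIES))
```

Raising the threshold makes the fuzzy step behave more like an exact match, while lowering it accepts more misspellings at the cost of more false matches, which is the trade-off the "threshold matching" note points at.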