Pass bac jd_sm


  1. Using Power View and Hive to Gain Business Insights: Finding Hidden Answers in Data. Joey D’Antoni, Comcast Cable; Stacia Misner, Data Inspirations. April 10-12 | Chicago, IL
  2. Please silence cell phones
  3. About Us. Joey D’Antoni: Principal Architect for SQL Server at Comcast Cable • @jdanton on Twitter • Joedantoni.wordpress.com. Stacia Misner: Principal Consultant at Data Inspirations • @StaciaMisner on Twitter • blog.datainspirations.com
  4. Agenda • Introducing Big Data • Overview and Summary of Data Set • Insights into the Data • Conclusions
  5. Classic Data Analysis: Loading, Analyzing, Visualization
  6. Classic Data Analysis …Uses Just a Subset: ETL, Data Warehouse & BI Solutions
  7. Classic Data Analysis …Requires Structure: ETL, Data Warehouse & BI Solutions
  8. Why Leave the RDBMS
  9. Key Differences: Basically Available, Soft-state, Eventually consistent • Scale Out As Needed With Commodity Hardware • Impose Schema On Read
  10. Hadoop Ecosystem: MapReduce, HDFS. Note: This is only a subset of the ecosystem!
  11. Hadoop and Hive Demo
  12. Extract, Transform, Load (ETL) Process: Some Database • Some Process Your Business Doesn’t Care About. Credit: Buck Woody, Microsoft
  13. Our ETL Process: Collection Server, HDFS. Hive is a data warehouse system that connects to Hadoop and allows SQL queries to be written against data sets in Hadoop
  14. The Data Set: Set Top Box Engagement Times • Max Set Top Boxes Viewing Channels • Aggregate Viewing Seconds • Potential Total Seconds Watched • Recorded in 5, 15 and 60 minute aggregates. This data is from the week of July 11-17, 2012
  15. Preparation for Data Analysis • Define question to answer • Define ideal data set • Find data
  16. Remember Legal and Privacy Issues
  17. Diving into Data Analysis • Cleanse • Reformat as needed • Decide what is usable • Explore • Create summaries • Perform statistical analysis • Use visualizations
  18. Aggregate Statistics on Data
  19. Resources. Connecting Excel to Hive (Hive ODBC Driver, Excel Hive Add-in) • http://social.technet.microsoft.com/wiki/contents/articles/6226.how-to-connect-excel-to-hadoop-on-azure-via-hiveodbc.aspx Connecting PowerPivot to Hadoop on Azure • http://dennyglee.com/2012/01/21/connecting-powerpivot-to-hadoop-on-azure-self-service-bi-to-big-data-in-the-cloud/ Connecting Power View to Hadoop on Azure • http://dennyglee.com/2012/02/10/connecting-power-view-to-hadoop-on-azurean-awesomesauce-way-to-view-big-data-in-the-cloud/
  20. Thank you! Diamond Sponsor
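
The data set slide mentions viewing figures recorded in 5, 15 and 60 minute aggregates, and a later slide covers aggregate statistics. The deck does not show how those were computed, so the following is only a minimal Python sketch of one way to roll 5-minute records up into wider windows and derive an engagement ratio. The `Record` fields, the sample values, and the choice to combine "max set top boxes" by taking the maximum across the window are all assumptions for illustration, not details from the presentation.

```python
from collections import defaultdict
from typing import NamedTuple


class Record(NamedTuple):
    # Hypothetical layout; the deck does not give the real column names.
    channel: str
    minute: int             # window start, in minutes since midnight
    max_boxes: int          # max set top boxes viewing the channel
    viewing_seconds: int    # aggregate viewing seconds
    potential_seconds: int  # potential total seconds watched


def roll_up(records, window_minutes):
    """Combine fine-grained records into wider windows, per channel."""
    buckets = defaultdict(lambda: [0, 0, 0])
    for r in records:
        window_start = r.minute - r.minute % window_minutes
        b = buckets[(r.channel, window_start)]
        b[0] = max(b[0], r.max_boxes)   # assumption: keep the peak box count
        b[1] += r.viewing_seconds       # seconds simply add up
        b[2] += r.potential_seconds
    return [Record(ch, start, *vals)
            for (ch, start), vals in sorted(buckets.items())]


def engagement(record):
    """Share of potential viewing time that was actually watched."""
    if record.potential_seconds == 0:
        return 0.0
    return record.viewing_seconds / record.potential_seconds


# Made-up 5-minute sample rows, for illustration only.
SAMPLE = [
    Record("A", 0, 10, 1500, 3000),
    Record("A", 5, 12, 1800, 3600),
    Record("A", 10, 8, 1200, 2400),
    Record("A", 15, 9, 900, 2700),
    Record("B", 0, 4, 600, 1200),
]

if __name__ == "__main__":
    for row in roll_up(SAMPLE, 15):
        print(row.channel, row.minute, row.max_boxes, round(engagement(row), 3))
```

Calling `roll_up(SAMPLE, 60)` gives the hourly view of the same rows. In the presenters' setup this kind of summary would more likely be expressed as a SQL query in Hive; the sketch only illustrates the arithmetic.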
