Building a heterogeneous Hadoop Olap system with Microsoft BI stack. PABLO DOVAL & IBON LANDA at Big Data Spain 2012

Session presented at Big Data Spain 2012 Conference
16th Nov 2012
ETSI Telecomunicacion UPM Madrid
www.bigdataspain.org
More info: http://www.bigdataspain.org/es-2012/conference/building-a-heterogeneous-hadoop-olap-system-with-microsoft-bi-stack/pablo-doval-and-ibon-landa

  • Usual presentation and contact stuff… Greet Ibon; he couldn't make it to Madrid. Three questions: - Are you engaged in any Hadoop projects? - Have you played with Microsoft's Hadoop Distribution? - Did you know there was a Microsoft Hadoop Distribution? ;) Microsoft's Big Data Incubation Program.
  • Development as a Proof of Concept allows for new scenarios to be thought out and developed in future iterations with minimum risk. We would start with a 10 min data storage and Data Warehouse, and 1 min data storage. Then analytical processes.
  • Show HDInsight Service and HDInsight Server.

Presentation Transcript

  • 1. BUILDING A HETEROGENEOUS HADOOP/OLAP SYSTEM WITH MICROSOFT'S BI STACK
  • 2. WHO… … AM I? • SQL/BI Team Lead at Plain Concepts • e-mail: pablod@plainconcepts.com • Blog: http://geek.ms/blogs/palvarez • Twitter: @PabloDoval … ARE YOU? • Quick Poll in the Room
  • 3. WHAT… … ARE WE GOING TO SEE? … I'M NOT GOING TO SHOW?
  • 4. SOME PICS…
  • 5. SHARP Overview. SCADA Historical Analysis and Reporting Platform. Demonstrate the feasibility of a custom end-to-end global architecture: • SCADA: Local, Mobile and Central • Historical Data: High speed and High volume • Reporting • Analysis
  • 6. SHARP [Diagram. Labels: Production Centers (Production Center A, Production Center B) and Central; MAGUS with Local Operation, Mobile Operation and Remote Operation; for each Production Center, MongoDB capped collections holding 2 months of 1s data and 1 year of 10m data; Mongo export to DAT files.]
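The retention figures on slide 6 translate into a fixed number of samples per signal, which is the property that makes a capped collection a natural fit. The following is a minimal sketch, assuming that "1s" and "10m" mean one sample per second and one sample per ten minutes, that 2 months is 60 days and 1 year is 365 days. The names `samples_retained` and `CappedBuffer` are hypothetical, and the bounded `deque` only imitates the overwrite-oldest behavior of a capped collection; it is not MongoDB's API (MongoDB caps collections by size in bytes, optionally also by document count).

```python
from collections import deque

SECONDS_PER_DAY = 24 * 60 * 60


def samples_retained(retention_days: int, interval_seconds: int) -> int:
    """Number of samples one signal produces over the retention period."""
    return retention_days * SECONDS_PER_DAY // interval_seconds


class CappedBuffer:
    """Fixed-capacity store that silently drops the oldest entry when full."""

    def __init__(self, capacity: int) -> None:
        # deque(maxlen=...) evicts from the left once capacity is reached.
        self._items = deque(maxlen=capacity)

    def insert(self, item) -> None:
        self._items.append(item)

    def oldest(self):
        return self._items[0]

    def __len__(self) -> int:
        return len(self._items)


# Assumption: 2 months = 60 days, 1 year = 365 days.
high_res = samples_retained(60, 1)         # 1 s data kept for 2 months
low_res = samples_retained(365, 10 * 60)   # 10 min data kept for 1 year
```

Under these assumptions the high-resolution store holds roughly a hundred times more samples per signal than the low-resolution one, even though it covers a much shorter period.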
  • 7. SHARP Historical Data [Diagram. Labels: Production Centers and Central; MAGUS with Mongo export to DAT files; Source 1 to Source 7, each with a Loader; DWH; Hadoop.]
  • 8. SHARP Analysis and Reporting [Diagram. Labels: Production Centers and Central; Events; StreamInsight; DWH; OLAP Tabular; Power Pivot; Power View; Microsoft Office; Future: Cloud?] Reporting Services: • Dynamic reports • Scheduled reports • Automatic distribution • Multiformat (PDF, XLS, etc.)
  • 9. INITIAL ASSESSMENT • Proof of Concept • Microsoft Ecosystem • On-Premise Infrastructure
  • 10. TOOLS OF THE TRADE PowerPivot Power View
  • 11. SO… WHAT DOES IT LOOK LIKE?
  • 12. CURRENT SHARP IMPLEMENTATION [Diagram. Labels: Map Reduce; HDFS; Load Service; Hive; DWH; Hadoop; Azure Storage; SSIS; SSRS; Power View.]
  • 13. LET’S TAKE A DEEPER LOOK…
  • 14. FUTURE IMPROVEMENTS • New Analytical Processes • CEP Integration with StreamInsight • Improvements on the Higher Resolution data
  • 15. COMPLEX EVENT PROCESSING StreamInsight [Diagram. Labels: Production Centers and Central; Events; StreamInsight; DWH; OLAP Tabular; Power Pivot; Power View; Microsoft Office; Future: Cloud?] Reporting Services: • Dynamic reports • Scheduled reports • Automatic distribution • Multiformat (PDF, XLS, etc.)
  • 16. COMPLEX EVENT PROCESSING StreamInsight [Diagram. Labels: Production Centers and Central; Events; StreamInsight.]
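Slides 15 and 16 name StreamInsight for complex event processing but do not show a query. As an illustration of the kind of rule a CEP engine evaluates, here is a minimal sketch of a tumbling-window average with a threshold alert. It is plain Python working on a finished list of events, not StreamInsight's API, and the `Event` fields, the `tumbling_alerts` function and the threshold rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Event:
    timestamp: int  # seconds since an arbitrary epoch (hypothetical field)
    source: str     # e.g. a production center identifier (hypothetical field)
    value: float


def tumbling_alerts(
    events: Iterable[Event], window_seconds: int, threshold: float
) -> List[Tuple[int, str, float]]:
    """Return (window_start, source, mean) for each window whose mean exceeds threshold."""
    totals: Dict[Tuple[int, str], Tuple[float, int]] = {}
    for ev in events:
        # Tumbling windows do not overlap: each event falls in exactly one.
        window_start = ev.timestamp - ev.timestamp % window_seconds
        total, count = totals.get((window_start, ev.source), (0.0, 0))
        totals[(window_start, ev.source)] = (total + ev.value, count + 1)

    alerts = []
    for (window_start, source), (total, count) in sorted(totals.items()):
        mean = total / count
        if mean > threshold:
            alerts.append((window_start, source, mean))
    return alerts
```

A streaming engine would emit each alert as soon as its window closes rather than after all input has been read; the grouping logic is the same.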
  • 17. IMPROV. TO HIGHER RESOLUTION DATA The Goal: Ability to work with data in DW and Hive seamlessly and in a performant way. [Diagram label: Export]
  • 18. IMPROV. TO HIGHER RESOLUTION DATA Sqoop Refresher
  • 19. IMPROV. TO HIGHER RESOLUTION DATA Sqoop with PDW… [Diagram. Labels: Sqoop; Map/Reduce Job; several SQL Server nodes.]
  • 20. IMPROV. TO HIGHER RESOLUTION DATA Sqoop refresher… [Diagram. Labels: Sqoop; Hadoop Cluster; several SQL Server nodes.]
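The Sqoop slides show a Map/Reduce job moving data between SQL Server and the Hadoop cluster. On import, Sqoop parallelizes by finding the minimum and maximum of a split column and handing each map task one slice of that range. The sketch below illustrates that idea only; it is not Sqoop's actual splitter code, and the function names and the integer `id` column are hypothetical.

```python
from typing import List, Tuple


def split_ranges(min_id: int, max_id: int, num_mappers: int) -> List[Tuple[int, int]]:
    """Divide the inclusive key range [min_id, max_id] into contiguous slices, one per mapper."""
    if num_mappers < 1 or max_id < min_id:
        raise ValueError("need num_mappers >= 1 and min_id <= max_id")
    total = max_id - min_id + 1
    slices = min(num_mappers, total)  # never create empty slices
    base, extra = divmod(total, slices)

    ranges = []
    start = min_id
    for i in range(slices):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first slices
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def where_clauses(column: str, ranges: List[Tuple[int, int]]) -> List[str]:
    """Render the predicate each mapper would append to its own SELECT."""
    return [f"{column} >= {lo} AND {column} <= {hi}" for lo, hi in ranges]
```

Even slices only give even work when the split column is uniformly distributed; a skewed key leaves some mappers with far more rows than others.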
  • 21. IMPROV. TO HIGHER RESOLUTION DATA The Goal – Polybase! Ability to work with data in DW and Hive seamlessly and in a performant way. [Diagram. Labels: T-SQL Queries; SQL Server (PDW); SQL; HDF]
  • 22. IMPROV. TO HIGHER RESOLUTION DATA Polybase parallelism via DMS [Diagram. Labels: several SQL Server nodes; Hadoop Cluster.]
  • 23. IMPROV. TO HIGHER RESOLUTION DATA Parallelism
  • 24. IMPROV. TO HIGHER RESOLUTION DATA That's just the beginning… • Uses the same T-SQL syntax to query both worlds at the same time • The query optimizer (QO) is able to check what data to push into what environment to process optimally.
  • 25. STORIES WE COULD TELL What went right… • Cloud Environment • Tabular Model for OLAP • SSIS for ETL via ODBC Hive Driver
  • 26. STORIES WE COULD TELL What was not so good… • Mappers and Reducers in C# via Hadoop Streaming
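Slide 26 mentions mappers and reducers written in C# and run through Hadoop Streaming. Hadoop Streaming's contract is language-neutral: the mapper reads lines from standard input and writes `key<TAB>value` lines, the framework sorts them by key, and the reducer reads the sorted lines. The sketch below shows that contract in Python rather than C#. The input format `signal,epoch_seconds,value` and the 10-minute averaging are assumptions made for illustration, not the authors' job.

```python
from itertools import groupby
from typing import Iterable, Iterator

BUCKET_SECONDS = 600  # assumption: aggregate readings into 10-minute buckets


def map_lines(lines: Iterable[str]) -> Iterator[str]:
    """Mapper: 'signal,epoch_seconds,value' -> 'signal|bucket_start<TAB>value'."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank input lines
        signal, ts, value = line.split(",")
        bucket = int(ts) - int(ts) % BUCKET_SECONDS
        yield f"{signal}|{bucket}\t{float(value)}"


def reduce_lines(sorted_lines: Iterable[str]) -> Iterator[str]:
    """Reducer: average the values of each key; input must already be sorted by key."""
    pairs = (line.rstrip("\n").split("\t") for line in sorted_lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        values = [float(v) for _, v in group]
        yield f"{key}\t{sum(values) / len(values)}"
```

In a real job each function would be wrapped in a small script that iterates over `sys.stdin` and prints every result line; locally, `sorted()` between the two steps stands in for the shuffle phase.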
  • 27. CALL TO ACTION LEARN MORE 1. Microsoft Big Data Solution: www.microsoft.com/bigdata 2. Windows Azure: www.windowsazure.com/en-us/home/scenarios/big-data TRY NOW 1. Preview of the Windows Azure HDInsight Service: https://www.hadooponazure.com 2. Developer CTP of Microsoft HDInsight Server for Windows Server: http://www.microsoft.com/bigdata