Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Impala Unlocks Interactive BI on Hadoop

17,390 views

Published on

The Cloudera Impala project is pioneering the next generation of Hadoop capabilities: the convergence of interactive SQL queries with the capacity, scalability, and flexibility of a Hadoop cluster. In this webinar, join Cloudera and MicroStrategy to learn how Impala works, how it is uniquely architected to provide an interactive SQL experience native to Hadoop, and how you can leverage the power of MicroStrategy 9.3.1 to easily tap into more data and make new discoveries.

Published in: Technology
  • Be the first to comment

Impala Unlocks Interactive BI on Hadoop

  1. 1. Impala unlocks Interactive BI onHadoop with MicroStrategyJustin Erickson | Cloudera | Senior Product ManagerJochen Demuth | MicroStrategy | Director, Partner EngineeringMay 2013
  2. 2. Agenda• Why Impala?• Architectural Overview• Real-World Use Cases• Interactive Analytics with MicroStrategy• Taking Big Data Out of Isolation©2013 Cloudera, Inc. All RightsReserved.2
  3. 3. Why Hadoop?• Scalability• Simply scales just by adding nodes• Local processing to avoid network bottlenecks• Flexibility• All kinds of data (blobs, documents, records, etc)• In all forms (structured, semi-structured, unstructured)• Store anything then later analyze what you need• Efficiency• Cost efficiency (<$1k/TB) on commodity hardware• Unified storage, metadata, security (no duplication orsynchronization)©2013 Cloudera, Inc. All RightsReserved.3
  4. 4. What’s Impala?• Interactive SQL• Typically 5-65x faster than Hive (observed up to 100x faster)• Responses in seconds instead of minutes (sometimes sub-second)• Nearly ANSI-92 standard SQL queries with Hive SQL• Compatible SQL interface for existing Hadoop/CDH applications• Based on industry standard SQL• Natively on Hadoop/HBase storage and metadata• Flexibility, scale, and cost advantages of Hadoop• No duplication/synchronization of data and metadata• Local processing to avoid network bottlenecks• Separate runtime from MapReduce• MapReduce is designed and great for batch• Impala is purpose-built for low-latency SQL queries on Hadoop©2013 Cloudera, Inc. All RightsReserved.4
  5. 5. Benefits of Impala5More & Faster Value from “Big Data” BI tools impractical on Hadoop before Impala Move from 10s of Hadoop users per cluster to 100s of SQL users No delays from data migrationFlexibility Query across existing data Select best-fit file formats (Parquet, Avro, etc.) Run multiple frameworks on the same data at the same timeCost Efficiency Reduce movement, duplicate storage & compute 10% to 1% the cost of analytic DBMSFull Fidelity Analysis No loss from aggregations or fixed schemas©2013 Cloudera, Inc. All RightsReserved.
  6. 6. Impala and Hive6Shares Everything Client-Facing Metadata (table definitions) ODBC/JDBC drivers SQL syntax (Hive SQL) Flexible file formats Machine pool Hue GUIBut Built for Different Purposes Hive: runs on MapReduce and ideal for batchprocessing Impala: native MPP query engine ideal forinteractive SQLStorageIntegrationResource ManagementMetadataHDFS HBaseTEXT, RCFILE, PARQUET, AVRO, ETC. RECORDSHiveSQL Syntax ImpalaSQL Syntax +Compute FrameworkMapReduceCompute FrameworkBatchProcessingInteractiveSQL©2013 Cloudera, Inc. All RightsReserved.
  7. 7. Not All SQL on Hadoop is Created Equal7Batch MapReduceMake MapReduce fasterSlow, still batchRemote QueryPull data from HDFS overthe network to the DWcompute layerSlow, expensiveSiloed DBMSLoad data into aproprietary database fileRigid, siloed data,slow ETLImpalaNative MPP query enginethat’s integrated intoHadoopFast, flexible,cost-effective$©2013 Cloudera, Inc. All RightsReserved.
  8. 8. Our Design Strategy8StorageIntegrationResource ManagementMetadataBatchProcessingMAPREDUCE,HIVE & PIG…InteractiveSQLIMPALAMachineLearningMAHOUT, DATAFUHDFS HBaseTEXT, RCFILE, PARQUET, AVRO, ETC. RECORDSEnginesOne pool of dataOne metadata modelOne security frameworkOne set of system resourcesAn Integrated Part ofthe Hadoop System©2013 Cloudera, Inc. All RightsReserved.
  9. 9. Impala Use Cases9Interactive BI/analytics on more dataAsking new questionsQuery-able archive w/ full fidelityData processing with tight SLAsCost-effective, ad hoc query environment thatoffloads the data warehouse for:©2013 Cloudera, Inc. All RightsReserved.
  10. 10. Global Financial Services Company10Saving 90% on incremental EDW spend &improving performance by 5xOffload data warehouse for query-able archiveStore decades of data cost-effectivelyProcess & analyze on the same systemImprove capabilities through interactive queryon more data©2013 Cloudera, Inc. All RightsReserved.
  11. 11. Six3 Systems11Boosting performance by 20X for mission-critical, real-time cyber securityAnalyze unstructured data with flexibility &real-time responseIntegrate with existing desktop & BI toolsDeploy in minutes with Cloudera Manager©2013 Cloudera, Inc. All RightsReserved.
  12. 12. Expedia12Implementing self-service BI on big data,reducing data latency by 50%Offload data warehouse for archiving, ETL &analyticsUnify IT environmentContinuously ingest & analyze at scaleDrive greater usability & adoption of big datastack©2013 Cloudera, Inc. All RightsReserved.
  13. 13. CONFIDENTIALThe Information Contained In This Presentation Is Confidential And Proprietary To MicroStrategy. The Recipient Of This Document Agrees That They Will Not Disclose Its Contents ToAny Third Party Or Otherwise Use This Presentation For Any Purpose Other Than An Evaluation Of MicroStrategys Business Or Its Offerings. Reproduction or Distribution Is Prohibited.14About MicroStrategyInnovator and Leader In Interactive BICompany• Top independent analyticssoftware platform vendor• 20+ years old, publicly traded• Approximately $600M revenue in 2012.No debt, $200M+ cash in the bank• Global presence with operations in 23countriesTechnology• Long-time market leader andinnovator in analytics• Unique unitary architecture,known for high performance and scalability• Revolutionary Cloud-based analytics services• Innovations in mobile commerce and identityAnalysts• Leader for six consecutiveyears in Gartner’s BIMagic Quadrant• Leader in Forrester BI Self Service Wave• #1 Ranked Mobile BI Vendor by Gartner &Dresner Advisory• Top ranking BI vendor in the BI ScorecardCustomers• Millions of business users• Thousands of mission-criticalapplications• Nearly 4,000 customer institutions globallyacross all industries and government• Customers range from Global 500 giants likeChevron and Carrefour to cutting edgetechnology innovators like eBay and LinkedIn
  14. 14. CONFIDENTIALThe Information Contained In This Presentation Is Confidential And Proprietary To MicroStrategy. The Recipient Of This Document Agrees That They Will Not Disclose Its Contents ToAny Third Party Or Otherwise Use This Presentation For Any Purpose Other Than An Evaluation Of MicroStrategys Business Or Its Offerings. Reproduction or Distribution Is Prohibited.15RetailFinancial ServicesCommunicationsOther Major CompaniesInnovators and Leaders Worldwide4 of the Top 5 Global RetailersManufacturing5 of the Top 10 AutomotiveCompanies8 of the Top 10 Communications& Media CompaniesPharmaceuticals7 of the Top 10 Healthcare & LifeSciences Companies6 of the Top 10 Financial ServiceCompaniesGovernmentFederal, State, and LocalGovernment InstitutionsConsumer Packaged GoodsLeading Consumer PackagedGoods CompaniesOur Customers Are Leaders In All IndustriesSupporting the Most Demanding, Mission Critical BI Applications
  15. 15. CONFIDENTIALThe Information Contained In This Presentation Is Confidential And Proprietary To MicroStrategy. The Recipient Of This Document Agrees That They Will Not Disclose Its Contents ToAny Third Party Or Otherwise Use This Presentation For Any Purpose Other Than An Evaluation Of MicroStrategys Business Or Its Offerings. Reproduction or Distribution Is Prohibited.16Intuitive Interface• Interactive dashboards• Visual Analytics• Build once, deployanywhereData Federation• Virtual model spanningmultiple data sourcesHigh Performance• Push-down analytics• In-memory cubeaccelerationFlexible, Reliable, andEasy to Manage• High-efficiency object reuse• Powerful SDK• Comprehensive admin toolsMicroStrategy Analytics PlatformComprehensive Analytics Suite for Big DataWeb | Mobile | Portals | Office™Data from Across the EnterpriseDashboards Statements Visual DiscoveryMicroStrategy Analytics PlatformReportsData MartsRelationalDatabasesCloudera Impalafor HadoopMultiDimensionalSources
  16. 16. CONFIDENTIALThe Information Contained In This Presentation Is Confidential And Proprietary To MicroStrategy. The Recipient Of This Document Agrees That They Will Not Disclose Its Contents ToAny Third Party Or Otherwise Use This Presentation For Any Purpose Other Than An Evaluation Of MicroStrategys Business Or Its Offerings. Reproduction or Distribution Is Prohibited.17MicroStrategyVisual Insight• Stunningvisualizations• On-screen filtering• Speed-of-thought inmemory databaseCommon UseCases• Interactive dataexploration and rootcause analysis• Dashboard creation• Self-service BIMicroStrategy Visual InsightInteractive Analysis, Drag-and-Drop to Build Intuitive Dashboards in MinutesData MartsRelationalDatabasesMultiDimensionalSourcesData from Across the EnterpriseCloudera Impalafor Hadoop
  17. 17. CONFIDENTIALThe Information Contained In This Presentation Is Confidential And Proprietary To MicroStrategy. The Recipient Of This Document Agrees That They Will Not Disclose Its Contents ToAny Third Party Or Otherwise Use This Presentation For Any Purpose Other Than An Evaluation Of MicroStrategys Business Or Its Offerings. Reproduction or Distribution Is Prohibited.18Combine Data from Multiple Federated SourcesTake Big Data Out of IsolationPut Big Data analysis in context with information from federateddata sources into one single dashboardUser / DepartmentalDataData WarehouseAppliancesHadoopDatabasesRelational DatabasesMultidimensionalDatabasesColumnarDatabases112232 & 3Bring AllRelevant Datato DecisionMakers,No MatterWhere ItResides
  18. 18. CONFIDENTIALThe Information Contained In This Presentation Is Confidential And Proprietary To MicroStrategy. The Recipient Of This Document Agrees That They Will Not Disclose Its Contents ToAny Third Party Or Otherwise Use This Presentation For Any Purpose Other Than An Evaluation Of MicroStrategys Business Or Its Offerings. Reproduction or Distribution Is Prohibited.19Browsers Portals EnterpriseApplicationsWebEmailEmailPDF OfficeDocumentsMobileAndroidiOSBlackBerryBuild Once, Deploy AnywhereMakes Big Data Accessible to a Wider Business AudienceBuildonceDeploy via any media12
  19. 19. CONFIDENTIALThe Information Contained In This Presentation Is Confidential And Proprietary To MicroStrategy. The Recipient Of This Document Agrees That They Will Not Disclose Its Contents ToAny Third Party Or Otherwise Use This Presentation For Any Purpose Other Than An Evaluation Of MicroStrategys Business Or Its Offerings. Reproduction or Distribution Is Prohibited.20From Big Data to Business ValueMicroStrategy Delivers Insights on Big Data FasterAny and AllDataWorld’s MostIntuitive InterfaceBenchmarkingProjectionsTrend AnalysisData SummarizationRelationshipAnalysisRelationalMultidimensionalHadoop-basedStructuredSemi-StructuredUnstructuredComprehensiveAnalyticsCloudera Impalafor HadoopShortened time-to-value for data scientistEnables self-service for the business user
  20. 20. • Submit questions in the Q&A panel• Watch this webinar on-demand athttp://cloudera.com• Follow Cloudera at @Cloudera• Follow MicroStrategy at @microstrategy• Thank you for attending!Learn more about the ClouderaMicroStrategy partnershiphttp://cloudera.com/Microstrategy.htmlDownload Impalahttp://cloudera.com/downloadsLearn more about Impala athttp://cloudera.com/impala

×