Modern Data Architecture: In-Memory with Hadoop - the new BI
 

Is Hadoop ready for high-concurrency complex BI and Advanced Analytics? Roaring performance and fast, low-latency execution are possible when an in-memory analytical platform is paired with the Apache Hadoop framework. Join Hortonworks and Kognitio for an informative Web Briefing on putting Hadoop at the center of your modern data architecture, with zero disruption to business users.

    Presentation Transcript

    • Hadoop and the new BI: The Modern Data Architecture …for in memory Big Data Analytics 10 December 2013
    • Quick Housekeeping: • Q&A box is available for your questions • Webinar will be recorded for future viewing • Thank you for joining! © Hortonworks Inc. 2013
    • Modern Data Architecture …for in memory Big Data Analytics
    • Your Presenters • Paul Groom (@datagroom) – Chief Innovation Officer – 28 years buried in the big data of the data guiding business users to value – Two wheels are more fun than four • John Kreisa (@marked_man) – VP Strategic Marketing, Hortonworks – Over 20 years in data management as a developer and a marketer – Avid camper
    • Today’s Topics • Introduction • Drivers for the Modern Data Architecture (MDA) • Apache Hadoop in the MDA • Kognitio’s role in the MDA • Q&A
    • Existing Data Architecture (diagram). Applications: Business Analytics, Custom Applications, Packaged Applications. Data system: repositories (RDBMS, EDW, MPP). Sources: existing sources (CRM, ERP, Clickstream, Logs). Data growth (source: IDC): 2.8 ZB in 2012; 40 ZB by 2020; 85% from new data types; 15x machine data by 2020.
    • Modern Data Architecture Enabled (diagram). Applications: Business Analytics, Custom Applications, Packaged Applications. Dev & data tools: build & test. Operational tools: manage & monitor. Data system: repositories (RDBMS, EDW, MPP). Sources: existing sources (CRM, ERP, Clickstream, Logs) and emerging sources (Sensor, Sentiment, Geo, Unstructured).
    • Hadoop Powers Modern Data Architecture. Diagram: a Hadoop cluster made up of many compute & storage nodes. Hadoop clusters provide scale-out storage and distributed data processing on commodity hardware. Apache Hadoop is an open source project governed by the Apache Software Foundation (ASF) that allows you to gain insight from massive amounts of structured and unstructured data quickly and without significant investment.
    • Drivers of Hadoop Adoption: New Business Applications, from new types of data (or existing types kept for longer)
    • Most Common NEW TYPES OF DATA: 1. Sentiment: Understand how your customers feel about your brand and products, right now. 2. Clickstream: Capture and analyze website visitors’ data trails and optimize your website. 3. Sensor/Machine: Discover patterns in data streaming automatically from remote sensors and machines. 4. Geographic: Analyze location-based data to manage operations where they occur. 5. Server Logs: Research logs to diagnose process failures and prevent security breaches. 6. Unstructured (text, video, pictures, etc.): Understand patterns in files across millions of web pages, emails, and documents.
    • Keep Existing Data Around Longer • Online archive – Data that was once moved to tape can now be queried to understand long-term trends • Compliance retention – Industry-specific requirements for retention of data • Combine with external historical data sources – Weather, survey, research, purchased, etc.
    • Drivers of Hadoop Adoption: • Architectural: A Modern Data Architecture. Complement your existing data systems: the right workload in the right place • New Business Applications
    • Requirements for Hadoop Adoption. Requirements for Hadoop’s Role in the Modern Data Architecture: • Integrated: Interoperable with existing data center investments • Key Services: Platform, operational and data services essential for the enterprise • Skills: Leverage your existing skills: development, operations, analytics
    • Requirements for Enterprise Hadoop: 1. Key Services: Platform, operational and data services essential for the enterprise 2. Skills: Leverage your existing skills: development, analytics, operations 3. Integrated: Engineered with existing data center investments. Diagram: Hortonworks Data Platform (HDP), with components Ambari, Falcon*, Oozie, HBase, Pig, Hive & HCatalog, Sqoop, Flume, NFS, WebHDFS, Knox*, MapReduce, Tez, YARN and HDFS shown under Operational Services, Data Services (including Load & Extract) and Core Platform Services. Enterprise Readiness: High Availability, Disaster Recovery, Rolling Upgrades, Security and Snapshots. Deployment options: OS/VM, Cloud, Appliance.
    • Requirements for Enterprise Hadoop: 1. Key Services: Platform, operational and data services essential for the enterprise 2. Skills: Leverage your existing skills: development, analytics, operations 3. Integration: Engineered with existing data center investments. Diagram labels: Develop (Collect, Process, Build); Analyze (Explore, Query, Deliver); Operate (Provision, Manage, Monitor).
    • Familiar and Existing Tools: 1. Key Services: Platform, operational and data services essential for the enterprise 2. Skills: Leverage your existing skills: development, analytics, operations 3. Integration: Engineered with existing data center investments. Diagram labels: Develop (Collect, Process, Build); Analyze (Explore, Query, Deliver); Operate (Provision, Manage, Monitor).
    • Requirements for Enterprise Hadoop: 3. Integration: Engineered with existing data center investments. Integrated with: Applications (Business Intelligence, Developer IDEs, Data Integration); Systems (Data Systems & Storage, Systems Management); Platforms (Operating Systems, Virtualization, Cloud, Appliances). Diagram layers: Applications (Business Analytics, Custom Applications, Packaged Applications); Dev & data tools (build & test); Operational tools (manage & monitor); Data system (RDBMS, EDW, MPP repositories); Sources: existing (CRM, ERP, Clickstream, Logs) and emerging (Sensor, Sentiment, Geo, Unstructured).
    • A Modern Data Architecture Applied: • Complement data systems • Right workload, right place. Diagram layers: Applications (Business Analytics, Custom Applications, Packaged Applications); Data system (RDBMS, EDW, MPP repositories); Sources: existing (CRM, ERP, Clickstream, Logs) and emerging (Sensor, Sentiment, Geo, Unstructured).
    • Kognitio in the Modern Data Architecture (diagram). Applications: Business Analytics, Business Intelligence Tools, OLAP Clients. Data system: In-memory MPP Accelerator alongside the RDBMS, EDW and MPP repositories. Dev & data tools: build & test. Operational tools: manage & monitor. Sources: existing (CRM, ERP, Clickstream, Logs) and emerging (Sensor, Sentiment, Geo, Unstructured).
    • Kognitio in the Modern Data Architecture (diagram). Applications: BusinessObjects BI. Data system: In-memory MPP Accelerator alongside RDBMS, HANA, EDW and MPP. Other layers: dev & data tools, operational tools, infrastructure. Sources: existing (CRM, ERP, Clickstream, Logs) and emerging (Sensor, Sentiment, Geo, Unstructured).
    • Today’s Topics • Introduction • Drivers for the Modern Data Architecture (MDA) • Apache Hadoop’s role in the MDA • Kognitio’s role in the MDA • Q&A
    • Hadoop and the new BI. Requirements for Hadoop’s Role in the Modern Data Architecture: 1. Integrated: Interoperable with existing data center investments 2. Skills: Leverage your existing skills: development, operations, analytics 3. Key Services: Platform, operational and data services essential for the enterprise
    • Motivation (1. Key Services: Platform, operational and data services essential for the enterprise) • Historical architecture = Existing investment (Cognos) • Must plug-and-play with MDA – Do not disrupt, enhance! • Performance and behavior expectations – Dynamic ad-hoc access – Drill unlimited – Report on-demand
    • Business [Intelligence] Desires: • More timely • Lower latency • Richer data model • More granularity • Better concurrency • Self service
    • BI Activity Insulate the Hadoop cluster
    • In-memory analytical platform (3. Integration: Engineered with existing data center investments; 2. Skills: Leverage your existing skills: development, analytics, operations) • Software only – Easy to deploy alongside HDP – Simple two-stage install • Commodity Hardware – x86-64 Linux platform with 10GbE network, same as HDP – Biased to more RAM and less disk • Scale-out MPP – Same compute model as Hadoop – Strong focus on 100% effective CPU utilization for any given query • Exploits features of underlying persistent store – Simple ‘Pull data’ access methods – Parallelism: all HDP nodes intercommunicating with all Kognitio nodes • ANSI 2011 SQL – Mature, fully featured – Transaction processing capable • Not-only-SQL – Any script or binaries executed in-line within SQL queries
    • Tight Integration (3. Integration: Engineered with existing data center investments) • MapReduce Connector – Filtered access • HDFS Connector – Low-latency access
    • So why In-memory? (Diagram labels: INSTANT, WAIT) • Exploit the ‘Dynamic’ access element of ‘D’-RAM – Data placed in memory in structures best suited for CPUs, not for disks
    • In-memory – getting work done
    • Building Data Models • Hadoop is a great repository • Perfect to handle volume and variability without effort • Perfect to ‘triage’ the data, to reshape, filter and project into… • Data Virtualisation / Logical Data Warehouse … but with the associated horsepower to dynamically analyse the data • Plug standard tools straight in – not a Java programmer in sight! • Central control and security • Data model shelf life getting shorter – sandboxes and workbenches – Build on-demand to meet today’s needs – just pull data from your HDP – Lots of project-based discovery and analytics – World is changing rapidly – Ever tighter feedback loops
    • Analytical Complexity: Increasing Computation (chart). Labels: Machine learning algorithms, Behaviour modelling, Statistical Analysis, Dynamic Simulation, Clustering, Dynamic Interaction, Reporting & BPM, Campaign Management, Fraud detection. Axis: Technology/Automation.
    • The Analytical Enterprise. Roles: Data Scientist, Systems Admin, Business Analyst. Key: “Graduation” • Projects will need to graduate easily from the Data Science Lab and become part of Business as Usual
    • Mature SQL atop Hadoop. Kognitio is an in-memory analytical platform that is tightly integrated with Hadoop for high-performance advanced analytics that make Big Data more consumable for enterprises, especially those with mature BI environments or ingrained tools. • Powering advanced analytics at organizations worldwide • Privately held • Invented the in-memory analytical platform • Labs in the UK, HQ in New York, NY
    • Kognitio in the Modern Data Architecture (diagram). Applications: Business Analytics, Business Intelligence Tools, OLAP Clients. Data system: In-memory MPP Accelerator alongside the RDBMS, EDW and MPP repositories. Dev & data tools: build & test. Operational tools: manage & monitor. Sources: existing (CRM, ERP, Clickstream, Logs) and emerging (Sensor, Sentiment, Geo, Unstructured).
    • Forrester Wave: a “strong performer” • Kognitio’s entirely in-memory, distributed EDW is appealing for customers looking for fast performance on commodity hardware • Kognitio’s EDW is a strong, cost-effective alternative to SAP HANA. • Kognitio…was designed from the start as an MPP (distributed) in-memory RDBMS, making extensive use of RAM-based processing for maximum performance. © Forrester Corp. Used with permission. Download a complimentary copy of the full report at www.kognitio.com/wave
    • The Modern Data Architecture …for in memory Big Data Analytics • More about Kognitio and Hortonworks: http://hortonworks.com/partner/kognitio • Get started with Hortonworks Sandbox: http://hortonworks.com/hadoop-tutorial/ • Follow us: @hortonworks @kognitio • Question & Answer session will be conducted electronically, using the panel to the right of your screen • Today’s Slides available at: www.slideshare.net/kognitio
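
The "Building Data Models" slide describes using Hadoop as the repository, triaging the data (reshape, filter, project), and then analysing the result in memory with standard SQL tools. The sketch below illustrates that pattern only in miniature and under loud assumptions: a Python list of strings stands in for files held in Hadoop, Python's built-in `sqlite3` in-memory database stands in for the in-memory SQL layer, and the record layout and the names `RAW_LINES`, `triage`, `build_model` and `views_per_page` are hypothetical. It does not use any Kognitio or Hadoop API.

```python
import sqlite3

# Hypothetical raw clickstream records, standing in for files held in Hadoop.
# Assumed layout: timestamp,user,page,http_status
RAW_LINES = [
    "2013-12-10T09:00:01,alice,/home,200",
    "2013-12-10T09:00:05,bob,/pricing,200",
    "2013-12-10T09:00:09,alice,/pricing,500",
    "malformed line",
    "2013-12-10T09:01:12,carol,/home,200",
]


def triage(lines):
    """Filter out unusable rows and project only the columns the model needs."""
    for line in lines:
        parts = line.split(",")
        if len(parts) != 4:
            continue  # drop malformed records
        _timestamp, user, page, status = parts
        if status != "200":
            continue  # keep successful requests only
        yield user, page


def build_model(lines):
    """Load the triaged rows into an in-memory SQL table (sqlite3 stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE page_views (user TEXT, page TEXT)")
    conn.executemany("INSERT INTO page_views VALUES (?, ?)", triage(lines))
    return conn


def views_per_page(conn):
    """Run an ad hoc aggregate of the kind a BI tool might issue."""
    cur = conn.execute(
        "SELECT page, COUNT(*) FROM page_views GROUP BY page ORDER BY page"
    )
    return cur.fetchall()


if __name__ == "__main__":
    connection = build_model(RAW_LINES)
    for page, views in views_per_page(connection):
        print(page, views)
```

The design point mirrored here is the separation of concerns on the slide: the repository keeps everything, the triage step decides what is worth pulling, and the in-memory table is cheap to rebuild on demand when the question changes.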