Webinar | From Zero to Big Data Answers in Less Than an Hour – Live Demo Slides


Slides describing Cloudera and Karmasphere, and how their combined products can install a Hadoop cluster, import data, run queries, and generate results.

Published in: Technology

  1. March 13, 2011
     From Zero to Big Data Answers in Less Than an Hour
     Daniel Templeton | Cloudera Manager, Partner Program Adoption
     Richard Guth | Karmasphere Chief Marketing Officer
  2. The ‘Big Data’ Phenomenon
     Big Data Drivers: More Content, More Devices, More Consumption, New & Better Information
     • The proliferation of data capture and creation technologies
     • Increased “interconnectedness” drives consumption (creating more data)
     • Inexpensive storage makes it possible to keep more, longer
     • Innovative software and analysis tools turn data into information
     Big Data encompasses not only the content itself, but how it’s consumed.
     • Every gigabyte of stored content can generate a petabyte or more of transient data*
     • The information about you is much greater than the information you create
     *Source: IDC 2011
     ©2011 Cloudera, Inc. All Rights Reserved.
  3. What is Apache Hadoop?
     Apache Hadoop is a platform for data storage and processing that is…
     • Scalable
     • Fault tolerant
     • Open source
     CORE HADOOP COMPONENTS
     • Hadoop Distributed File System (HDFS): File Sharing & Data Protection Across Physical Servers
     • MapReduce: Distributed Computing Across Physical Servers
     Has the Flexibility to Store and Mine Any Type of Data
     • Ask questions across structured and unstructured data that were previously impossible to ask or solve
     • Not bound by a single schema
     Excels at Processing Complex Data
     • Scale-out architecture divides workloads across multiple nodes
     • Flexible file system eliminates ETL bottlenecks
     Scales Economically
     • Can be deployed on commodity hardware
     • Open source platform guards against vendor lock-in
  4. Who Is Cloudera?
     The trusted leader in Apache Hadoop.
     • Package the #1 distribution of Apache Hadoop in commercial and non-commercial environments
     • Roadmap control or influence over all Apache Hadoop-related projects
     • Top contributor to the Apache ecosystem overall
     • Tens of thousands of nodes under management
     We make Hadoop enterprise-easy.
     • A distribution of Apache Hadoop that is tested, certified and supported
     • A suite of management software for Hadoop administrators
     • Training and certification programs
     • Comprehensive support and consulting services
     Unrivaled knowledge and experience.
     • Founders, committers and contributors to Apache Hadoop and related projects
     • A wealth of experience in the design and delivery of enterprise software
     Strong executive team with proven abilities.
     • Mike Olson, CEO
     • Amr Awadallah, VP, Engineering
     • Kirk Dunn, COO
     • Mary Rorabaugh, VP, Finance
     • Jeff Hammerbacher, Chief Scientist
     • Charles Zedlewski, VP, Products
     • Doug Cutting, Chief Architect
     • Omer Trajman, VP, Customer Solutions
  5. CDH Overview
     The #1 commercial and non-commercial Apache Hadoop distribution.
     Complete, Integrated Hadoop Stack
     • File System Mount: FUSE-DFS
     • UI Framework: HUE
     • SDK: HUE SDK
     • Workflow: APACHE OOZIE
     • Scheduling: APACHE OOZIE
     • Metadata: APACHE HIVE
     • Languages / Compilers: APACHE PIG, APACHE HIVE
     • Data Integration: APACHE FLUME, APACHE SQOOP
     • Fast Read/Write Access: APACHE HBASE
     • Coordination: APACHE ZOOKEEPER
     CDH Components
     • Apache Hadoop – reliable, scalable distributed computing
     • Apache Hive – SQL-like language and metadata repository
     • Apache Pig – High level language for expressing data analysis programs
     • Apache HBase – Hadoop database for random, real-time read/write access
     • Apache Zookeeper – Highly reliable distributed coordination service
     • Apache Flume* – Distributed service for collecting and aggregating log and event data
     • Apache Whirr* – Library for running Hadoop in the cloud
     • Apache Sqoop* – Integrating Hadoop with RDBMS
     • Apache Oozie* – Server-based workflow engine for Hadoop Activities
     • Fuse-DFS – Module within Hadoop for mounting HDFS as a traditional file system
     • Hue – Browser-based desktop interface for interacting with Hadoop
     * Currently undergoing Incubation at the Apache Software Foundation.
  6. Cloudera Enterprise
     Cloudera Enterprise makes open source Hadoop enterprise-easy
     • Simplify and Accelerate Hadoop Deployment
     • Reduce Adoption Costs and Risks
     • Lower the Cost of Administration
     • Increase Transparency and Control Over Hadoop
     • Leverage the Experience of Our Experts
     CLOUDERA ENTERPRISE COMPONENTS
     • Cloudera Manager: End-to-End Management Application for Apache Hadoop
     • Production-Level Support: Our Team of Experts On-Call to Help You Meet Your SLAs
     EFFECTIVENESS: Ensuring You Get Value From Your Hadoop Deployment
     EFFICIENCY: Enabling You to Affordably Run Hadoop in Production
  7. Big Data Intelligence Applications for Enterprise Data Professionals
     www.karmasphere.com
     © Karmasphere 2012 All rights reserved
  8. About Karmasphere
     Company: Pure-play, singularly focused on Big Data Intelligence and Analytics on Hadoop and NoSQL, in the cloud and on-premise.
     Engineering Expertise: Hadoop, analytics, web analytics, business intelligence, visualizations, programming languages, compilers, architecture, mathematics, database
     Management Experience: Google, Yahoo, Ask, Ning, Omniture, BEA, Oracle, Sybase, Actuate, Apple, Zend, Intel, BMC, Spotfire
  9. Karmasphere Mission
     Provide an EASY way to find INSIGHTS in Big Data to transform business
     Upcoming Skills Shortage
     “By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions”
     Source: “Big Data: the next frontier for innovation, competition and productivity”, McKinsey, May 2011
  10. Karmasphere: Big Data Mining and Analytics ON Hadoop
  11. From Zero to Answers in 60 Minutes
      Our Process
      • Access any cloud or on-premise Cloudera CDH
      • Assemble and organize unstructured and structured data in Hadoop
      • Analyze the data using familiar SQL
      DEMO: Marketing Analyst for Retail Chain
      1 Connect to the preconfigured Cloudera CDH cluster
      2 Access our structured point of sale transactions data and bring up transactional data for lunch meals
      3 Correlate results with unstructured social media data to get some insight on our buyers and buying behavior
      4 Infer from these results on underperforming stores and come up with an action plan to increase sales for these stores
  12. www.karmasphere.com
  13. From Zero to Big Data Answers in Less Than an Hour
      The webinar recording will be made available shortly at:
      • https://www1.gotomeeting.com/register/890391584
      Contact Information:
      • info@cloudera.com
      • 1 (888) 789-1488
      • info@karmasphere.com
      • 1 (650) 292-6100
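
The slides describe MapReduce as distributed computing across physical servers, with a scale-out architecture that divides workloads across multiple nodes. As a rough illustration only, the map, shuffle, and reduce phases can be sketched in a single Python process. This is not Hadoop code and not anything shown in the webinar: the records, store names, and function names below are all hypothetical, and the "nodes" are simulated by splitting a list into chunks.

```python
from collections import defaultdict

# Hypothetical point-of-sale records (store_id, item, amount).
# Invented for illustration; not data from the webinar demo.
TRANSACTIONS = [
    ("store_a", "lunch_combo", 8.50),
    ("store_b", "lunch_combo", 8.50),
    ("store_a", "salad", 6.00),
    ("store_c", "sandwich", 7.25),
    ("store_b", "salad", 6.00),
    ("store_a", "sandwich", 7.25),
]


def split_into_chunks(records, num_chunks):
    """Divide the input into chunks, one per simulated node."""
    return [records[i::num_chunks] for i in range(num_chunks)]


def map_phase(chunk):
    """Map: emit one (key, value) pair per record, here (store_id, amount)."""
    for store_id, _item, amount in chunk:
        yield store_id, amount


def shuffle_phase(mapped_outputs):
    """Shuffle: group values that share a key across every mapper's output."""
    grouped = defaultdict(list)
    for output in mapped_outputs:
        for key, value in output:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: collapse each key's values into one result, here a sum."""
    return {key: sum(values) for key, values in grouped.items()}


def run_job(records, num_chunks=3):
    """Run map, shuffle, and reduce in sequence over the chunked input."""
    chunks = split_into_chunks(records, num_chunks)
    mapped = [map_phase(chunk) for chunk in chunks]
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    for store, total in sorted(run_job(TRANSACTIONS).items()):
        print(f"{store}: {total:.2f}")
```

Because each chunk is mapped independently, the per-store totals do not depend on how many chunks the input is split into. That independence is what lets a real cluster spread the map work over many machines.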