How to Use Apache Zeppelin with HWX HDB

Part five of a five-part series, this webcast demonstrates the integration of Apache Zeppelin with Pivotal HDB. Apache Zeppelin is a web-based notebook for interactive data analytics, used to build data-driven, interactive, and collaborative documents with SQL, Scala, and more. The webinar covers the configuration of the psql interpreter and the basic operations of Apache Zeppelin when used with Hortonworks HDB.
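
As a rough illustration of what configuring the psql interpreter involves, the Python sketch below assembles the kind of connection settings and notebook paragraph text one would enter in Zeppelin. Everything specific here is an assumption rather than something taken from the webcast: the host name, port, database, and user are placeholders, the `postgresql.*` property names and the `%psql.sql` paragraph prefix are recalled from Zeppelin 0.6-era PostgreSQL interpreter documentation and should be checked against your Zeppelin version, and the helper functions are hypothetical.

```python
# Sketch only: values below are placeholders, not settings from the webcast.
HAWQ_MASTER_HOST = "hawq-master.example.com"  # hypothetical HAWQ master host
HAWQ_MASTER_PORT = 5432                       # assumed port; check your cluster
DATABASE = "gpadmin"                          # assumed database name


def jdbc_url(host: str, port: int, database: str) -> str:
    """Build a PostgreSQL-style JDBC URL (HAWQ speaks the PostgreSQL protocol)."""
    return f"jdbc:postgresql://{host}:{port}/{database}"


# Property names are assumed from Zeppelin 0.6-era docs; verify before use.
interpreter_properties = {
    "postgresql.driver.name": "org.postgresql.Driver",
    "postgresql.url": jdbc_url(HAWQ_MASTER_HOST, HAWQ_MASTER_PORT, DATABASE),
    "postgresql.user": "gpadmin",
    "postgresql.password": "",
    "postgresql.max.result": "1000",
}


def psql_paragraph(sql: str) -> str:
    """Prefix a SQL statement so Zeppelin routes it to the psql interpreter."""
    return "%psql.sql\n" + sql.strip()


if __name__ == "__main__":
    for key, value in interpreter_properties.items():
        print(f"{key} = {value}")
    print(psql_paragraph("SELECT version();"))
```

The printed properties correspond to what would be typed into the interpreter settings page, and the last lines show the text of a notebook paragraph that runs a query against the database.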

Published in: Technology

  1. How to Use Apache Zeppelin with Hortonworks HDB. Dan Baskette, December 2016. Hortonworks HDB, powered by Apache HAWQ.
  2. Agenda ● Hortonworks HDB/HAWQ ● Apache Zeppelin ● Demo ● Resources. © 2016 Pivotal Software, Inc. All rights reserved.
  3. What is HDB / Apache HAWQ? A Hadoop-native SQL query engine and advanced analytics MPP database that offers high-performance interactive query execution and machine learning to data analysts and data scientists who want to find insights in large or complex datasets. Pivotal HDB / Hortonworks HDB, powered by Apache HAWQ.
  4. HDB / HAWQ Advantages • Advanced analytics performance: exceptional MPP performance, low latency, ACID reliability, data federation • ANSI SQL compliance: higher degree of SQL compatibility (SQL-92, 99, 2003, OLAP), so existing SQL skills carry over • Advanced query optimizer: maximize performance and run advanced queries with confidence • Elastic architecture for scalability: scale up/down or in/out, expand or shrink clusters on the fly • Integrated with MADlib machine learning: advanced MPP analytics, data science at scale, directly on Hadoop data
  5. Apache MADlib: In-Database Machine Learning • Apache™ MADlib® (incubating) is an open-source library for scalable in-database analytics • Provides parallel implementations of mathematical, statistical and machine learning methods for structured and unstructured data • Supports Apache HAWQ, Greenplum Database and Postgres • Analytics on all data in-database, without sampling (produces more accurate results, less effort) http://madlib.incubator.apache.org
  6. Apache Zeppelin • A web-based notebook that enables interactive data analytics • Used to build data-driven, interactive and collaborative documents with SQL, Scala and more • Used for data ingestion, discovery, analytics, visualization, and collaboration • Very extensible
  7. Apache Zeppelin Interpreters • Any language or data processing engine can be plugged into Zeppelin • Supports many engines out of the box • Supports Apache HAWQ, Greenplum Database, and PostgreSQL via the psql interpreter, which is being merged into the JDBC interpreter
  8. Apache Zeppelin Example
  9. Learn more: http://hortonworks.com/apache/hawq/ Recording: http://hortonworks.com/webinar/use-apache-zeppelin-hortonworks-hdb/
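
The MADlib slide above mentions parallel implementations that run over all the data in-database, without sampling. One common way to achieve this on an MPP system is to have each segment compute small sufficient statistics over its own rows, then merge those partial states and finish the computation once. The Python sketch below shows that pattern for simple linear regression. It is an illustration of the general idea only, not MADlib's code: the `RegState` type, the function names, and the toy data are all hypothetical.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass
class RegState:
    """Sufficient statistics for simple linear regression y = a*x + b."""
    n: int = 0
    sx: float = 0.0
    sy: float = 0.0
    sxx: float = 0.0
    sxy: float = 0.0


def transition(state: RegState, x: float, y: float) -> RegState:
    """Fold one row into a segment-local state."""
    return RegState(state.n + 1, state.sx + x, state.sy + y,
                    state.sxx + x * x, state.sxy + x * y)


def merge(a: RegState, b: RegState) -> RegState:
    """Combine two partial states, e.g. from two different segments."""
    return RegState(a.n + b.n, a.sx + b.sx, a.sy + b.sy,
                    a.sxx + b.sxx, a.sxy + b.sxy)


def final(state: RegState) -> tuple[float, float]:
    """Turn the merged state into (slope, intercept) via least squares."""
    denom = state.n * state.sxx - state.sx ** 2
    slope = (state.n * state.sxy - state.sx * state.sy) / denom
    intercept = (state.sy - slope * state.sx) / state.n
    return slope, intercept


# Toy rows (x, y) spread over three hypothetical segments; all lie on y = 2x + 1.
segments = [[(1, 3), (2, 5)], [(3, 7)], [(4, 9), (5, 11)]]

# Each segment scans only its own rows; the partial states are then merged.
partials = [reduce(lambda s, row: transition(s, *row), rows, RegState())
            for rows in segments]
slope, intercept = final(reduce(merge, partials))

if __name__ == "__main__":
    print(slope, intercept)
```

Because only the small per-segment states move between nodes, every row contributes to the result without the data leaving the database, which is the property the slide highlights.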
