Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

H2O Machine Learning with KNIME Analytics Platform - Christian Dietz - H2O AI World London

675 views

Published on

This talk was recorded in London on October 30, 2018.

KNIME Analytics Platform is an easy to use and comprehensive open source data integration, analysis, and exploration platform, enabling data scientists to visually compose end to end data analysis workflows. The over 2,000 available modules ("nodes") cover each step of the analysis workflow, including blending heterogeneous data types, data transformation, wrangling and cleansing, advanced data visualization, or model training and deployment.

Many of these nodes are provided through open source integrations (why reinvent the wheel?). This provides seamless access to large open source projects such as Keras and Tensorflow for deep learning, Apache Spark for big data processing, Python and R for scripting, and more. These integrations can be used in combination with other KNIME nodes meaning that data scientists can freely select from a vast variety of options when tackling an analysis problem.

The integration of H2O in KNIME offers an extensive number of nodes and encapsulating functionalities of the H2O open source machine learning libraries, making it easy to use H2O algorithms from a KNIME workflow without touching any code - each of the H2O nodes looks and feels just like a normal KNIME node - and the data scientist benefits from the high performance libraries and proven quality of H2O during execution. For prototyping these algorithms are executed locally, however training and deployment can easily be scaled up using a Sparkling Water cluster.

In our talk we give a short introduction to KNIME Analytics Platform and then demonstrate how data scientists benefit from using KNIME Analytics Platform and H2O Machine Learning in combination by using a real world analysis example.

Bio: Christian received a Master’s degree in Computer Science from the University of Konstanz. Having gained experience as a research software engineer at the University of Konstanz, where he developed frameworks and libraries in the fields of bioimage analysis and machine learning, Christian moved on to become a software engineer at KNIME. He now focuses on developing new functionalities and extensions for KNIME Analytics Platform. Some of his recent projects include deep learning integrations built upon Keras and Tensorflow, extensions for image analysis and active learning, and the integration of H2O Machine Learning and H2O Sparkling Water in KNIME Analytics Platform.

Published in: Technology
  • Be the first to comment

H2O Machine Learning with KNIME Analytics Platform - Christian Dietz - H2O AI World London

  1. 1. © 2018 KNIME AG. All rights reserved. Leveraging H2O Machine Learning with KNIME Analytics Platform Christian Dietz KNIME
  2. 2. H2O Distributed Machine Learning Algorithms Supervised Learning • Generalized Linear Models: Binomial, Gaussian, Gamma, Poisson and Tweedie • Naïve Bayes Statistical Analysis Ensembles • Distributed Random Forest: Classification or regression models • Gradient Boosting Machine: Produces an ensemble of decision trees with increasing refined approximations Deep Neural Networks • Deep learning: Create multi-layer feed forward neural networks starting with an input layer followed by multiple layers of nonlinear transformations Unsupervised Learning • K-means: Partitions observations into k clusters/groups of the same spatial size. Automatically detect optimal k Clustering Dimensionality Reduction • Principal Component Analysis: Linearly transforms correlated variables to independent components • Generalized Low Rank Models: extend the idea of PCA to handle arbitrary data consisting of numerical, Boolean, categorical, and missing data Anomaly Detection • Autoencoders: Find outliers using a nonlinear dimensionality reduction using deep learning
  3. 3. Platforms with H2O Integration H2O + KNIME Talk at KNIME Summit March 2018
  4. 4. © 2018 KNIME AG. All rights reserved. 4 KNIME® • KNIME AG founded in 2008 • Offices in Zurich (HQ), Konstanz, Berlin, and Austin • Maintainer of the Open Source KNIME Analytics Platform – comprehensive data loading, processing, analysis, modeling platform – visual frontend – open: to all sorts of data, other tools (R and Python, etc.), various user personas – 20+ open source releases since 2006 – open source. • KNIME Server – 14 commercial product releases since 2008 • KNIME cloud offerings
  5. 5. © 2018 KNIME AG. All rights reserved. 5 KNIME® Software
  6. 6. © 2018 KNIME AG. All rights reserved. 6 KNIME® Analytics Platform
  7. 7. © 2018 KNIME AG. All rights reserved. 7 Analysis & Mining Statistics Data Mining Machine Learning Deep Learning Web Analytics Text Mining Network Analysis Social Media Analysis R, Weka, Python, H2O Community / 3rd Data Access MySQL, Oracle, ... SAS, SPSS, ... Excel, Flat, ... Hive, Impala, ... XML, JSON, PMML Text, Doc, Image, ... Web Crawlers Industry Specific Community / 3rd Transformation Row, Column Matrix Text, Image Time Series Java Python Community / 3rd Visualization R Python JavaScript Community / 3rd Deployment via BIRT PMML XML, JSON Databases Excel, Flat, etc. Text, Doc, Image Industry Specific Community / 3rd Over 2000 Native and Embedded Nodes Included
  8. 8. © 2018 KNIME AG. All rights reserved. 8 KNIME H2O Machine Learning Integration • Offer our users high-performance machine learning algorithms from H2O in KNIME • Allow to mix & match with other KNIME functionality – Data wrangling KNIME Analytics Platform functionality – KNIME Big-Data Connectors – Text Mining, Image Processing, Cheminformatics, … – and more!
  9. 9. © 2018 KNIME AG. All rights reserved. 9 KNIME H2O Machine Learning Integration
  10. 10. © 2018 KNIME AG. All rights reserved. 10 The Data Date Store ID Visitors 2016-01-01 ba937bf13d40fb24 28 … … … 2017-04-22 324f7c39a8410e7c 216 Date Store ID Visitors 2017-04-23 e8ed9335d0c38333 ? … … … 2017-05-31 8f13ef0f5e8c64dd ? Provided data: • Number of visitors • Reservations • Store information • Calendar date info
  11. 11. © 2018 KNIME AG. All rights reserved. 11 Visitor Forecasting Data preparation Model training Model optimization Model evaluation Deployment
  12. 12. © 2018 KNIME AG. All rights reserved. 12 Visitor Forecasting Data preparation Model training Model optimization Model evaluation Deployment
  13. 13. © 2018 KNIME AG. All rights reserved. 13 Data Preparation with KNIME Nodes
  14. 14. © 2018 KNIME AG. All rights reserved. 15 Visitor Forecasting Data preparation Model training Model optimization Model evaluation Deployment
  15. 15. © 2018 KNIME AG. All rights reserved. 16 Visitor Forecasting Data preparation Model training Model optimization Model evaluation Deployment
  16. 16. © 2018 KNIME AG. All rights reserved. 17 Modeling with the H2O Nodes
  17. 17. © 2018 KNIME AG. All rights reserved. 18 Modeling with the H2O Nodes
  18. 18. © 2018 KNIME AG. All rights reserved. 19 Modeling with the H2O Nodes
  19. 19. © 2018 KNIME AG. All rights reserved. 20 Modeling with the H2O Nodes
  20. 20. © 2018 KNIME AG. All rights reserved. 21 Visitor Forecasting Data preparation Model training Model optimization Model evaluation Deployment
  21. 21. © 2018 KNIME AG. All rights reserved. 22 Visitor Forecasting . Data preparation Model training Model optimization Model evaluation Deployment
  22. 22. © 2018 KNIME AG. All rights reserved. 24 Blend H2O with…Python, Java and R Scripting… 24
  23. 23. © 2018 KNIME AG. All rights reserved. 25 …Image Processing... 25
  24. 24. © 2018 KNIME AG. All rights reserved. 26 …Deep Learning...
  25. 25. © 2018 KNIME AG. All rights reserved. 27 ...Text Processing...
  26. 26. © 2018 KNIME AG. All rights reserved. 28 ...Databases...
  27. 27. © 2018 KNIME AG. All rights reserved. 29 ...a growing Big Data Integration.
  28. 28. © 2018 KNIME AG. All rights reserved. 30 H2O Sparkling Water in KNIME
  29. 29. © 2018 KNIME AG. All rights reserved. 31 H2O Sparkling Water in KNIME
  30. 30. © 2018 KNIME AG. All rights reserved. 32 H2O Sparkling Water in KNIME
  31. 31. © 2018 KNIME AG. All rights reserved. 33 H2O Sparkling Water in KNIME
  32. 32. © 2018 KNIME AG. All rights reserved. 34 H2O Sparkling Water in KNIME
  33. 33. © 2018 KNIME AG. All rights reserved. 35 Scoring with H2O MOJOs on Apache Spark
  34. 34. © 2018 KNIME AG. All rights reserved. 36 Thank You! www.knime.com
  35. 35. 37© 2018 KNIME AG. All rights reserved. The KNIME® trademark and logo and OPEN FOR INNOVATION® trademark are used by KNIME AG under license from KNIME GmbH, and are registered in the United States. KNIME® is also registered in Germany.

×