SlideShare a Scribd company logo
© Hitachi America, Ltd. 2017. All rights reserved.
Updates on webSpoon
and other innovations from Hitachi R&D
11/11/2017
Researcher at Hitachi America, Ltd.
Hiromu Hota, PhD
@HiromuHota, hiromu.hota@hal.hitachi.com
© Hitachi America, Ltd. 2017. All rights reserved.
Contents
1
• webSpoon
– Demo
– Updates since PCM16 and missings
– Use cases
© Hitachi America, Ltd. 2017. All rights reserved.
webSpoon: a browser-based Spoon
2
• webSpoon works on any latest browser, accessible over a network.
• webSpoon has advantages:
• webSpoon is NOT supported by Pentaho or by Hitachi.
Smartphone/tablet
CloudData security Ease of mgmt.Remote use
Desktop/laptop
© Hitachi America, Ltd. 2017. All rights reserved.
Demo
3
1. webSpoon (demo instance in AWS)
2. Multi-tenancy (demo instance in local Docker)
© Hitachi America, Ltd. 2017. All rights reserved.
Updates since PCM16
4
• webSpoon became matured in its stability, functionality, and usability.
• Stability
– Fixed many things: menubar, shortcuts, scrollbar/zooming, copy/paste
– Automated UI Testing
– CI/CD (nightly build for every commit)
• Functionality
– Lots of steps/job entries and other type of plugins confirmed to be compatible
– Carte integration
– Multi-user/-tenant
• Usability
– FileDialog to open from the server’s file system / Import from the client’s
– Dockerized
– No longer 9051 port mapping
© Hitachi America, Ltd. 2017. All rights reserved.
Cloud / Scalability
5
• webSpoon is easily deployable to the cloud
– E.g., AWS Elastic Beanstalk
• webSpoon is scalable
webSpoon
Load Balancer
Clients
webSpoon instances
© Hitachi America, Ltd. 2017. All rights reserved.
• webSpoon serves multiple users
– User authentication
– User configuration
– Incomplete privacy among users
• Alice can see Bob’s configuration files
• Alice can see Bob’s Kettle files
(only when they are locally stored)
• webSpoon serves multiple tenants
– I’d assign dedicated instances for each tenant for the privacy concern,
though some argues that this arch is multi-instance, not multi-tenancy [1].
Multi-user / Multi-tenancy
6
webSpoon instances
Alice
Bob
Carol
Dave
Tenant A Tenant B
[1] Krebs, Rouven (2012). "Architectural Concerns in Multi-tenant SaaS Applications".
Proc. 2nd Int. Conf. on Cloud Computing and Services Science (CLOSER 2012).
Alice
Bob
: User authentication
© Hitachi America, Ltd. 2017. All rights reserved.
Compatible with Python/R steps
7
• Most of steps/job entries have been confirmed to be compatible with
webSpoon, including
– Python (CPython Script Executor)
– R (Execute R Script)
– R (R script executor, EE only)
• The rest of steps/job entries is just left un-tested.
© Hitachi America, Ltd. 2017. All rights reserved.
What’s still missing?
8
• Security
– End-users inherit the privileges of the user who runs the Tomcat.
• If root runs the Tomcat, all end-users have the root permission.
– Incomplete privacy among end-users:
• Alice can see Bob’s configuration files.
• Alice can see Bob’s Kettle files (when they are locally stored).
• Integration with Pentaho Server
– Not realized yet due to un-resolved conflicts.
• Some EE features
– DET (Data Exploration Tool)
© Hitachi America, Ltd. 2017. All rights reserved.
Use cases
9
© Hitachi America, Ltd. 2017. All rights reserved.
Data Security: Keep data where they should be
10
Spoon webSpoon
• Data engineers should physically
be near data.
• They might be tempted to
download data to work in their
office.
• They can work from office, home,
or wherever comfortable.
Hospital/Government/Bank
Data
When data cannot leave facility/country due to some regulations,
© Hitachi America, Ltd. 2017. All rights reserved.
Data integration of sensor data in remote sites
12
Thai factory
Tokyo office
Skilled engineer
1. Kettle files need updating frequently for many reasons:
• New machine, new sensor, new analytics, etc.
2. But, remote desktop (RDP) is prohibited and travel costs.
Data copy/move
User interaction
Travel costs
RDP
Remote desktop prohibited
Sensor
Database
Spoon
*Kettle file: Transformation or Job written in PDI
(Cropped) Asia - Single Color by FreeVectorMaps.com
© Hitachi America, Ltd. 2017. All rights reserved.
Data integration of sensor data in remote sites
13
Tokyo office
*HTTPS: HTTP Secure
Skilled engineer
Data copy/move
User interaction
No travel
General protocol
HTTPS
Thai factory
Sensor
Database
webSpoon
(Cropped) Asia - Single Color by FreeVectorMaps.com
1. Kettle files need updating frequently for many reasons:
• New machine, new sensor, new analytics, etc.
2. But, remote desktop (RDP) is prohibited and travel costs.
© Hitachi America, Ltd. 2017. All rights reserved.
Managed Pentaho development environment
15
• Different version, plugin, etc.
slows down collaboration.
• Could possibly be
– Outdated.
– Malicious plugins & drivers.
Spoon webSpoon
Bob Alice
Ver. X.X Ver. Y.Y
Your Kettle file does not
run in my environment!
• All Kettle files run in coworker’s
screen.
• No installation/upgrade/update
required (by end-users).
• Only desired plugins & drivers.
Your Kettle file runs
in my environment!
*Kettle file: Transformation or Job written in PDI
Plugin A
Driver B
Plugin A
Driver C
Bob Alice
© Hitachi America, Ltd. 2017. All rights reserved.
webSpoon streamlines the ML Workflow even more
16
• Data engineers/scientists share
– Tools (Pentaho/Python/R)
– Data stores
– Git repository
– Computing resources (e.g., Hadoop, Spark)
• As a result, collaboration between them becomes even more seamless
– Less dependent on IT staffs to setup tools, data stores, etc.
– No data copy/movement, no data dispersion
Data scientistsData engineers
webSpoon
Data stores
© Hitachi America, Ltd. 2017. All rights reserved.
Resources
17
• Source and binary
– https://github.com/HiromuHota/pentaho-kettle
• Docker image
– https://hub.docker.com/r/hiromuhota/webspoon
© Hitachi America, Ltd. 2017. All rights reserved.
One more thing...
18
© Hitachi America, Ltd. 2017. All rights reserved.
SpoonGit (Git client integrated with Spoon)
19
© Hitachi America, Ltd. 2017. All rights reserved.
Resources
20
• Source and binary
– https://github.com/HiromuHota/pdi-git-plugin
• Binary
– Pentaho Marketplace (in preparation)
© Hitachi America, Ltd. 2016. All rights reserved.
Trademarks and copyrights
21
• Pentaho is a trademark registered by Hitachi Vantara.
• Apache Hadoop and its logo are either registered trademarks or trademarks of
the Apache Software Foundation (ASF).
• Apache Spark, Spark and the Spark logo are trademarks of ASF.
• The Git Logo by Jason Long is licensed under the Creative Commons Attribution
3.0 Unported License.
• The R logo is © 2016 The R Foundation.
• RStudio and the RStudio logo are all registered trademarks of RStudio.
• The Python logo is a trademark of the Python Software Foundation.
• Jupyter and the Jupyter logs are trademarks of the NumFOCUS foundation.
• Docker and the Docker logo are trademarks or registered trademarks of Docker,
Inc. in the United States and/or other countries.
• The Jenkins logo is licensed under the Creative Commons Attribution-ShareAlike
3.0 Unported License.
• GitHub is a trademark registered in the United States by GitHub, Inc.
• Other company and product names mentioned in this document may be the
trademarks of their respective owners.
© Hitachi America, Ltd. 2017. All rights reserved.
Appendix
23
© Hitachi America, Ltd. 2017. All rights reserved.
webSpoon = Spoon - SWT + RWT
24
• Spoon relies on SWT for UI widgets (e.g., button, dialog, canvas).
• RWT is a web alternative to SWT and “largely” implements SWT APIs,
meaning Spoon can become a web app with most codes intact.
• There are
– Unimplemented SWT APIs (e.g., a part of GC, some Mouse events)
– RWT-specific additional APIs (e.g., Multi-user, File Up/Download).
Image adapted from https://angelozerr.wordpress.com/2011/05/24/rap_step5/
Operating System Servlet Container Web Browser
SWT RWT (Server) RWT (Client)
JFace JFace
HTTP
Spoon webSpoon
© Hitachi America, Ltd. 2017. All rights reserved.
1. Local files
– Spoon: local files of the laptop/desktop
– webSpoon: local files of the (remote) server
2. Clipboard
– Spoon and webSpoon do not share the clipboard.
– In other words, no copy & paste between Spoon and webSpoon.
How is webSpoon different from Spoon?
25
File A is local File B is local
File A is local
File B is remote
File A File B
webSpoon
Spoon

More Related Content

What's hot

Git and Github slides.pdf
Git and Github slides.pdfGit and Github slides.pdf
Git and Github slides.pdf
Tilton2
 
Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes Operator
Flink Forward
 
Git & GitHub for Beginners
Git & GitHub for BeginnersGit & GitHub for Beginners
Git & GitHub for Beginners
Sébastien Saunier
 
Airflow - a data flow engine
Airflow - a data flow engineAirflow - a data flow engine
Airflow - a data flow engine
Walter Liu
 
HDFS Namenode High Availability
HDFS Namenode High AvailabilityHDFS Namenode High Availability
HDFS Namenode High Availability
Hortonworks
 
Docker by Example - Basics
Docker by Example - Basics Docker by Example - Basics
Docker by Example - Basics
CodeOps Technologies LLP
 
Airflow for Beginners
Airflow for BeginnersAirflow for Beginners
Airflow for Beginners
Varya Karpenko
 
Red Hat OpenShift on Bare Metal and Containerized Storage
Red Hat OpenShift on Bare Metal and Containerized StorageRed Hat OpenShift on Bare Metal and Containerized Storage
Red Hat OpenShift on Bare Metal and Containerized Storage
Greg Hoelzer
 
SAP Inside Track Munich 2018 - DevOps and Deployment Pipelines in ABAP Landsc...
SAP Inside Track Munich 2018 - DevOps and Deployment Pipelines in ABAP Landsc...SAP Inside Track Munich 2018 - DevOps and Deployment Pipelines in ABAP Landsc...
SAP Inside Track Munich 2018 - DevOps and Deployment Pipelines in ABAP Landsc...
Sascha Junkert
 
Vue d'ensemble Dremio
Vue d'ensemble DremioVue d'ensemble Dremio
Vue d'ensemble Dremio
Modern Data Stack France
 
The 12 Factor App
The 12 Factor AppThe 12 Factor App
The 12 Factor App
rudiyardley
 
Cloudera Customer Success Story
Cloudera Customer Success StoryCloudera Customer Success Story
Cloudera Customer Success Story
Xpand IT
 
Airflow introduction
Airflow introductionAirflow introduction
Airflow introduction
Chandler Huang
 
Intro to Airflow: Goodbye Cron, Welcome scheduled workflow management
Intro to Airflow: Goodbye Cron, Welcome scheduled workflow managementIntro to Airflow: Goodbye Cron, Welcome scheduled workflow management
Intro to Airflow: Goodbye Cron, Welcome scheduled workflow management
Burasakorn Sabyeying
 
Introduction to Apache Airflow - Data Day Seattle 2016
Introduction to Apache Airflow - Data Day Seattle 2016Introduction to Apache Airflow - Data Day Seattle 2016
Introduction to Apache Airflow - Data Day Seattle 2016
Sid Anand
 
Crunching Data In GeoServer: Mastering Rendering Transformations, WPS Process...
Crunching Data In GeoServer: Mastering Rendering Transformations, WPS Process...Crunching Data In GeoServer: Mastering Rendering Transformations, WPS Process...
Crunching Data In GeoServer: Mastering Rendering Transformations, WPS Process...
GeoSolutions
 
The Path Towards Spring Boot Native Applications
The Path Towards Spring Boot Native ApplicationsThe Path Towards Spring Boot Native Applications
The Path Towards Spring Boot Native Applications
VMware Tanzu
 
Container Runtime Security with Falco
Container Runtime Security with FalcoContainer Runtime Security with Falco
Container Runtime Security with Falco
Michael Ducy
 
Flagger: Istio Progressive Delivery Operator
Flagger: Istio Progressive Delivery OperatorFlagger: Istio Progressive Delivery Operator
Flagger: Istio Progressive Delivery Operator
Weaveworks
 
Github basics
Github basicsGithub basics
Github basics
Radoslav Georgiev
 

What's hot (20)

Git and Github slides.pdf
Git and Github slides.pdfGit and Github slides.pdf
Git and Github slides.pdf
 
Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes Operator
 
Git & GitHub for Beginners
Git & GitHub for BeginnersGit & GitHub for Beginners
Git & GitHub for Beginners
 
Airflow - a data flow engine
Airflow - a data flow engineAirflow - a data flow engine
Airflow - a data flow engine
 
HDFS Namenode High Availability
HDFS Namenode High AvailabilityHDFS Namenode High Availability
HDFS Namenode High Availability
 
Docker by Example - Basics
Docker by Example - Basics Docker by Example - Basics
Docker by Example - Basics
 
Airflow for Beginners
Airflow for BeginnersAirflow for Beginners
Airflow for Beginners
 
Red Hat OpenShift on Bare Metal and Containerized Storage
Red Hat OpenShift on Bare Metal and Containerized StorageRed Hat OpenShift on Bare Metal and Containerized Storage
Red Hat OpenShift on Bare Metal and Containerized Storage
 
SAP Inside Track Munich 2018 - DevOps and Deployment Pipelines in ABAP Landsc...
SAP Inside Track Munich 2018 - DevOps and Deployment Pipelines in ABAP Landsc...SAP Inside Track Munich 2018 - DevOps and Deployment Pipelines in ABAP Landsc...
SAP Inside Track Munich 2018 - DevOps and Deployment Pipelines in ABAP Landsc...
 
Vue d'ensemble Dremio
Vue d'ensemble DremioVue d'ensemble Dremio
Vue d'ensemble Dremio
 
The 12 Factor App
The 12 Factor AppThe 12 Factor App
The 12 Factor App
 
Cloudera Customer Success Story
Cloudera Customer Success StoryCloudera Customer Success Story
Cloudera Customer Success Story
 
Airflow introduction
Airflow introductionAirflow introduction
Airflow introduction
 
Intro to Airflow: Goodbye Cron, Welcome scheduled workflow management
Intro to Airflow: Goodbye Cron, Welcome scheduled workflow managementIntro to Airflow: Goodbye Cron, Welcome scheduled workflow management
Intro to Airflow: Goodbye Cron, Welcome scheduled workflow management
 
Introduction to Apache Airflow - Data Day Seattle 2016
Introduction to Apache Airflow - Data Day Seattle 2016Introduction to Apache Airflow - Data Day Seattle 2016
Introduction to Apache Airflow - Data Day Seattle 2016
 
Crunching Data In GeoServer: Mastering Rendering Transformations, WPS Process...
Crunching Data In GeoServer: Mastering Rendering Transformations, WPS Process...Crunching Data In GeoServer: Mastering Rendering Transformations, WPS Process...
Crunching Data In GeoServer: Mastering Rendering Transformations, WPS Process...
 
The Path Towards Spring Boot Native Applications
The Path Towards Spring Boot Native ApplicationsThe Path Towards Spring Boot Native Applications
The Path Towards Spring Boot Native Applications
 
Container Runtime Security with Falco
Container Runtime Security with FalcoContainer Runtime Security with Falco
Container Runtime Security with Falco
 
Flagger: Istio Progressive Delivery Operator
Flagger: Istio Progressive Delivery OperatorFlagger: Istio Progressive Delivery Operator
Flagger: Istio Progressive Delivery Operator
 
Github basics
Github basicsGithub basics
Github basics
 

Viewers also liked

Understanding the Pentaho CDE NewMapComponent
Understanding the Pentaho CDE NewMapComponentUnderstanding the Pentaho CDE NewMapComponent
Understanding the Pentaho CDE NewMapComponent
Kleyson Rios
 
Pentaho 8 Reporting for Java Developers - Because details matter
Pentaho 8 Reporting for Java Developers - Because details matterPentaho 8 Reporting for Java Developers - Because details matter
Pentaho 8 Reporting for Java Developers - Because details matter
Francesco Corti
 
Pentaho PDI and the Jare Ruleengine
Pentaho PDI and the Jare RuleenginePentaho PDI and the Jare Ruleengine
Pentaho PDI and the Jare Ruleengine
uwe geercken
 
Continuous ETL Testing for Pentaho Data Integration (kettle)
Continuous ETL Testing for Pentaho Data Integration (kettle)Continuous ETL Testing for Pentaho Data Integration (kettle)
Continuous ETL Testing for Pentaho Data Integration (kettle)
Slawomir Chodnicki
 
javascriptのデータ構造の話
javascriptのデータ構造の話javascriptのデータ構造の話
javascriptのデータ構造の話
Taketoshi 青野健利
 
私はこうやってSlackを社内で流行らせました
私はこうやってSlackを社内で流行らせました私はこうやってSlackを社内で流行らせました
私はこうやってSlackを社内で流行らせました
NHN テコラス株式会社
 

Viewers also liked (6)

Understanding the Pentaho CDE NewMapComponent
Understanding the Pentaho CDE NewMapComponentUnderstanding the Pentaho CDE NewMapComponent
Understanding the Pentaho CDE NewMapComponent
 
Pentaho 8 Reporting for Java Developers - Because details matter
Pentaho 8 Reporting for Java Developers - Because details matterPentaho 8 Reporting for Java Developers - Because details matter
Pentaho 8 Reporting for Java Developers - Because details matter
 
Pentaho PDI and the Jare Ruleengine
Pentaho PDI and the Jare RuleenginePentaho PDI and the Jare Ruleengine
Pentaho PDI and the Jare Ruleengine
 
Continuous ETL Testing for Pentaho Data Integration (kettle)
Continuous ETL Testing for Pentaho Data Integration (kettle)Continuous ETL Testing for Pentaho Data Integration (kettle)
Continuous ETL Testing for Pentaho Data Integration (kettle)
 
javascriptのデータ構造の話
javascriptのデータ構造の話javascriptのデータ構造の話
javascriptのデータ構造の話
 
私はこうやってSlackを社内で流行らせました
私はこうやってSlackを社内で流行らせました私はこうやってSlackを社内で流行らせました
私はこうやってSlackを社内で流行らせました
 

Similar to Updates on webSpoon and other innovations from Hitachi R&D

Overview of webSpoon @ Pentaho Community Meeting 2016 (PCM16)
Overview of webSpoon @ Pentaho Community Meeting 2016 (PCM16)Overview of webSpoon @ Pentaho Community Meeting 2016 (PCM16)
Overview of webSpoon @ Pentaho Community Meeting 2016 (PCM16)
Hiromu Hota
 
INF104 - HCL Domino AppDev Pack – The Future of Domino App Dev Nobody Knows A...
INF104 - HCL Domino AppDev Pack – The Future of Domino App Dev Nobody Knows A...INF104 - HCL Domino AppDev Pack – The Future of Domino App Dev Nobody Knows A...
INF104 - HCL Domino AppDev Pack – The Future of Domino App Dev Nobody Knows A...
Heiko Voigt
 
IBM ConnectED SPOT104: Lightning-Fast Development of Native Mobile Apps for I...
IBM ConnectED SPOT104: Lightning-Fast Development of Native Mobile Apps for I...IBM ConnectED SPOT104: Lightning-Fast Development of Native Mobile Apps for I...
IBM ConnectED SPOT104: Lightning-Fast Development of Native Mobile Apps for I...
darwinodb
 
Http Services in Rust on Containers
Http Services in Rust on ContainersHttp Services in Rust on Containers
Http Services in Rust on Containers
Anton Whalley
 
Cloud Foundry Summit 2017
Cloud Foundry Summit 2017Cloud Foundry Summit 2017
Serverless computing with Google Cloud
Serverless computing with Google CloudServerless computing with Google Cloud
Serverless computing with Google Cloud
wesley chun
 
Approaches to debugging mixed-language HPC apps
Approaches to debugging mixed-language HPC appsApproaches to debugging mixed-language HPC apps
Approaches to debugging mixed-language HPC apps
Rogue Wave Software
 
Serverless Computing with Google Cloud
Serverless Computing with Google CloudServerless Computing with Google Cloud
Serverless Computing with Google Cloud
wesley chun
 
How to integrate OpenStack Swift to your "legacy" system
How to integrate OpenStack Swift to your "legacy" systemHow to integrate OpenStack Swift to your "legacy" system
How to integrate OpenStack Swift to your "legacy" system
Masaaki Nakagawa
 
Ai pipelines powered by jupyter notebooks
Ai pipelines powered by jupyter notebooksAi pipelines powered by jupyter notebooks
Ai pipelines powered by jupyter notebooks
Luciano Resende
 
Advanced Strategies for Testing Responsive Web
Advanced Strategies for Testing Responsive WebAdvanced Strategies for Testing Responsive Web
Advanced Strategies for Testing Responsive Web
Perfecto by Perforce
 
Sam segal resume
Sam segal resumeSam segal resume
Sam segal resume
samuel segal
 
Apache deep learning 101
Apache deep learning 101Apache deep learning 101
Apache deep learning 101
DataWorks Summit
 
Web Technologies in Automotive & Robotics (BlinkOn 10)
Web Technologies in Automotive & Robotics (BlinkOn 10)Web Technologies in Automotive & Robotics (BlinkOn 10)
Web Technologies in Automotive & Robotics (BlinkOn 10)
Igalia
 
02 intro
02   intro02   intro
02 intro
babak mehrabi
 
Apache MXNet for IoT with Apache NiFi
Apache MXNet for IoT with Apache NiFiApache MXNet for IoT with Apache NiFi
Apache MXNet for IoT with Apache NiFi
Timothy Spann
 
Cloud-Native .Net des applications containerisées .Net sur Linux, Windows e...
 Cloud-Native .Net des applications containerisées .Net sur Linux, Windows e... Cloud-Native .Net des applications containerisées .Net sur Linux, Windows e...
Cloud-Native .Net des applications containerisées .Net sur Linux, Windows e...
VMware Tanzu
 
Streams GitHub Products Overview for IBM InfoSphere Streams V4.0
Streams GitHub Products Overview for IBM InfoSphere Streams V4.0Streams GitHub Products Overview for IBM InfoSphere Streams V4.0
Streams GitHub Products Overview for IBM InfoSphere Streams V4.0
lisanl
 
IoT with Apache MXNet and Apache NiFi and MiniFi
IoT with Apache MXNet and Apache NiFi and MiniFiIoT with Apache MXNet and Apache NiFi and MiniFi
IoT with Apache MXNet and Apache NiFi and MiniFi
DataWorks Summit
 

Similar to Updates on webSpoon and other innovations from Hitachi R&D (20)

Overview of webSpoon @ Pentaho Community Meeting 2016 (PCM16)
Overview of webSpoon @ Pentaho Community Meeting 2016 (PCM16)Overview of webSpoon @ Pentaho Community Meeting 2016 (PCM16)
Overview of webSpoon @ Pentaho Community Meeting 2016 (PCM16)
 
SamSegalResume
SamSegalResumeSamSegalResume
SamSegalResume
 
INF104 - HCL Domino AppDev Pack – The Future of Domino App Dev Nobody Knows A...
INF104 - HCL Domino AppDev Pack – The Future of Domino App Dev Nobody Knows A...INF104 - HCL Domino AppDev Pack – The Future of Domino App Dev Nobody Knows A...
INF104 - HCL Domino AppDev Pack – The Future of Domino App Dev Nobody Knows A...
 
IBM ConnectED SPOT104: Lightning-Fast Development of Native Mobile Apps for I...
IBM ConnectED SPOT104: Lightning-Fast Development of Native Mobile Apps for I...IBM ConnectED SPOT104: Lightning-Fast Development of Native Mobile Apps for I...
IBM ConnectED SPOT104: Lightning-Fast Development of Native Mobile Apps for I...
 
Http Services in Rust on Containers
Http Services in Rust on ContainersHttp Services in Rust on Containers
Http Services in Rust on Containers
 
Cloud Foundry Summit 2017
Cloud Foundry Summit 2017Cloud Foundry Summit 2017
Cloud Foundry Summit 2017
 
Serverless computing with Google Cloud
Serverless computing with Google CloudServerless computing with Google Cloud
Serverless computing with Google Cloud
 
Approaches to debugging mixed-language HPC apps
Approaches to debugging mixed-language HPC appsApproaches to debugging mixed-language HPC apps
Approaches to debugging mixed-language HPC apps
 
Serverless Computing with Google Cloud
Serverless Computing with Google CloudServerless Computing with Google Cloud
Serverless Computing with Google Cloud
 
How to integrate OpenStack Swift to your "legacy" system
How to integrate OpenStack Swift to your "legacy" systemHow to integrate OpenStack Swift to your "legacy" system
How to integrate OpenStack Swift to your "legacy" system
 
Ai pipelines powered by jupyter notebooks
Ai pipelines powered by jupyter notebooksAi pipelines powered by jupyter notebooks
Ai pipelines powered by jupyter notebooks
 
Advanced Strategies for Testing Responsive Web
Advanced Strategies for Testing Responsive WebAdvanced Strategies for Testing Responsive Web
Advanced Strategies for Testing Responsive Web
 
Sam segal resume
Sam segal resumeSam segal resume
Sam segal resume
 
Apache deep learning 101
Apache deep learning 101Apache deep learning 101
Apache deep learning 101
 
Web Technologies in Automotive & Robotics (BlinkOn 10)
Web Technologies in Automotive & Robotics (BlinkOn 10)Web Technologies in Automotive & Robotics (BlinkOn 10)
Web Technologies in Automotive & Robotics (BlinkOn 10)
 
02 intro
02   intro02   intro
02 intro
 
Apache MXNet for IoT with Apache NiFi
Apache MXNet for IoT with Apache NiFiApache MXNet for IoT with Apache NiFi
Apache MXNet for IoT with Apache NiFi
 
Cloud-Native .Net des applications containerisées .Net sur Linux, Windows e...
 Cloud-Native .Net des applications containerisées .Net sur Linux, Windows e... Cloud-Native .Net des applications containerisées .Net sur Linux, Windows e...
Cloud-Native .Net des applications containerisées .Net sur Linux, Windows e...
 
Streams GitHub Products Overview for IBM InfoSphere Streams V4.0
Streams GitHub Products Overview for IBM InfoSphere Streams V4.0Streams GitHub Products Overview for IBM InfoSphere Streams V4.0
Streams GitHub Products Overview for IBM InfoSphere Streams V4.0
 
IoT with Apache MXNet and Apache NiFi and MiniFi
IoT with Apache MXNet and Apache NiFi and MiniFiIoT with Apache MXNet and Apache NiFi and MiniFi
IoT with Apache MXNet and Apache NiFi and MiniFi
 

Recently uploaded

Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Shahin Sheidaei
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
Fermin Galan
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Natan Silnitsky
 
Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"
Donna Lenk
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
vrstrong314
 
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdfEnhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Jay Das
 
GlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote sessionGlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote session
Globus
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus
 
RISE with SAP and Journey to the Intelligent Enterprise
RISE with SAP and Journey to the Intelligent EnterpriseRISE with SAP and Journey to the Intelligent Enterprise
RISE with SAP and Journey to the Intelligent Enterprise
Srikant77
 
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
informapgpstrackings
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
Adele Miller
 
Using IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New ZealandUsing IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New Zealand
IES VE
 
Cyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdfCyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdf
Cyanic lab
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
Matt Welsh
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
Globus
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
rickgrimesss22
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
Google
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
Ortus Solutions, Corp
 
Accelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with PlatformlessAccelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with Platformless
WSO2
 

Recently uploaded (20)

Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
 
Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
 
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdfEnhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
 
GlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote sessionGlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote session
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
 
RISE with SAP and Journey to the Intelligent Enterprise
RISE with SAP and Journey to the Intelligent EnterpriseRISE with SAP and Journey to the Intelligent Enterprise
RISE with SAP and Journey to the Intelligent Enterprise
 
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
 
Using IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New ZealandUsing IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New Zealand
 
Cyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdfCyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdf
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
 
Accelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with PlatformlessAccelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with Platformless
 

Updates on webSpoon and other innovations from Hitachi R&D

  • 1. © Hitachi America, Ltd. 2017. All rights reserved. Updates on webSpoon and other innovations from Hitachi R&D 11/11/2017 Researcher at Hitachi America, Ltd. Hiromu Hota, PhD @HiromuHota, hiromu.hota@hal.hitachi.com
  • 2. © Hitachi America, Ltd. 2017. All rights reserved. Contents 1 • webSpoon – Demo – Updates since PCM16 and missings – Use cases
  • 3. © Hitachi America, Ltd. 2017. All rights reserved. webSpoon: a browser-based Spoon 2 • webSpoon works on any latest browser, accessible over a network. • webSpoon has advantages: • webSpoon is NOT supported by Pentaho or by Hitachi. Smartphone/tablet CloudData security Ease of mgmt.Remote use Desktop/laptop
  • 4. © Hitachi America, Ltd. 2017. All rights reserved. Demo 3 1. webSpoon (demo instance in AWS) 2. Multi-tenancy (demo instance in local Docker)
  • 5. © Hitachi America, Ltd. 2017. All rights reserved. Updates since PCM16 4 • webSpoon became matured in its stability, functionality, and usability. • Stability – Fixed many things: menubar, shortcuts, scrollbar/zooming, copy/paste – Automated UI Testing – CI/CD (nightly build for every commit) • Functionality – Lots of steps/job entries and other type of plugins confirmed to be compatible – Carte integration – Multi-user/-tenant • Usability – FileDialog to open from the server’s file system / Import from the client’s – Dockerized – No longer 9051 port mapping
  • 6. © Hitachi America, Ltd. 2017. All rights reserved. Cloud / Scalability 5 • webSpoon is easily deployable to the cloud – E.g., AWS Elastic Beanstalk • webSpoon is scalable webSpoon Load Balancer Clients webSpoon instances
  • 7. © Hitachi America, Ltd. 2017. All rights reserved. • webSpoon serves multiple users – User authentication – User configuration – Incomplete privacy among users • Alice can see Bob’s configuration files • Alice can see Bob’s Kettle files (only when they are locally stored) • webSpoon serves multiple tenants – I’d assign dedicated instances for each tenant for the privacy concern, though some argues that this arch is multi-instance, not multi-tenancy [1]. Multi-user / Multi-tenancy 6 webSpoon instances Alice Bob Carol Dave Tenant A Tenant B [1] Krebs, Rouven (2012). "Architectural Concerns in Multi-tenant SaaS Applications". Proc. 2nd Int. Conf. on Cloud Computing and Services Science (CLOSER 2012). Alice Bob : User authentication
  • 8. © Hitachi America, Ltd. 2017. All rights reserved. Compatible with Python/R steps 7 • Most of steps/job entries have been confirmed to be compatible with webSpoon, including – Python (CPython Script Executor) – R (Execute R Script) – R (R script executor, EE only) • The rest of steps/job entries is just left un-tested.
  • 9. © Hitachi America, Ltd. 2017. All rights reserved. What’s still missing? 8 • Security – End-users inherit the privileges of the user who runs the Tomcat. • If root runs the Tomcat, all end-users have the root permission. – Incomplete privacy among end-users: • Alice can see Bob’s configuration files. • Alice can see Bob’s Kettle files (when they are locally stored). • Integration with Pentaho Server – Not realized yet due to un-resolved conflicts. • Some EE features – DET (Data Exploration Tool)
  • 10. © Hitachi America, Ltd. 2017. All rights reserved. Use cases 9
  • 11. © Hitachi America, Ltd. 2017. All rights reserved. Data Security: Keep data where they should be 10 Spoon webSpoon • Data engineers should physically be near data. • They might be tempted to download data to work in their office. • They can work from office, home, or wherever comfortable. Hospital/Government/Bank Data When data cannot leave facility/country due to some regulations,
  • 12. © Hitachi America, Ltd. 2017. All rights reserved. Data integration of sensor data in remote sites 12 Thai factory Tokyo office Skilled engineer 1. Kettle files need updating frequently for many reasons: • New machine, new sensor, new analytics, etc. 2. But, remote desktop (RDP) is prohibited and travel costs. Data copy/move User interaction Travel costs RDP Remote desktop prohibited Sensor Database Spoon *Kettle file: Transformation or Job written in PDI (Cropped) Asia - Single Color by FreeVectorMaps.com
  • 13. © Hitachi America, Ltd. 2017. All rights reserved. Data integration of sensor data in remote sites 13 Tokyo office *HTTPS: HTTP Secure Skilled engineer Data copy/move User interaction No travel General protocol HTTPS Thai factory Sensor Database webSpoon (Cropped) Asia - Single Color by FreeVectorMaps.com 1. Kettle files need updating frequently for many reasons: • New machine, new sensor, new analytics, etc. 2. But, remote desktop (RDP) is prohibited and travel costs.
  • 14. © Hitachi America, Ltd. 2017. All rights reserved. Managed Pentaho development environment 15 • Different version, plugin, etc. slows down collaboration. • Could possibly be – Outdated. – Malicious plugins & drivers. Spoon webSpoon Bob Alice Ver. X.X Ver. Y.Y Your Kettle file does not run in my environment! • All Kettle files run in coworker’s screen. • No installation/upgrade/update required (by end-users). • Only desired plugins & drivers. Your Kettle file runs in my environment! *Kettle file: Transformation or Job written in PDI Plugin A Driver B Plugin A Driver C Bob Alice
  • 15. © Hitachi America, Ltd. 2017. All rights reserved. webSpoon streamlines the ML Workflow even more 16 • Data engineers/scientists share – Tools (Pentaho/Python/R) – Data stores – Git repository – Computing resources (e.g., Hadoop, Spark) • As a result, collaboration between them becomes even more seamless – Less dependent on IT staffs to setup tools, data stores, etc. – No data copy/movement, no data dispersion Data scientistsData engineers webSpoon Data stores
  • 16. © Hitachi America, Ltd. 2017. All rights reserved. Resources 17 • Source and binary – https://github.com/HiromuHota/pentaho-kettle • Docker image – https://hub.docker.com/r/hiromuhota/webspoon
  • 17. © Hitachi America, Ltd. 2017. All rights reserved. One more thing... 18
  • 18. © Hitachi America, Ltd. 2017. All rights reserved. SpoonGit (Git client integrated with Spoon) 19
  • 19. © Hitachi America, Ltd. 2017. All rights reserved. Resources 20 • Source and binary – https://github.com/HiromuHota/pdi-git-plugin • Binary – Pentaho Marketplace (in preparation)
  • 20. © Hitachi America, Ltd. 2016. All rights reserved. Trademarks and copyrights 21 • Pentaho is a trademark registered by Hitachi Vantara. • Apache Hadoop and its logo are either registered trademarks or trademarks of the Apache Software Foundation (ASF). • Apache Spark, Spark and the Spark logo are trademarks of ASF. • The Git Logo by Jason Long is licensed under the Creative Commons Attribution 3.0 Unported License. • The R logo is © 2016 The R Foundation. • RStudio and the RStudio logo are all registered trademarks of RStudio. • The Python logo is a trademark of the Python Software Foundation. • Jupyter and the Jupyter logs are trademarks of the NumFOCUS foundation. • Docker and the Docker logo are trademarks or registered trademarks of Docker, Inc. in the United States and/or other countries. • The Jenkins logo is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License. • GitHub is a trademark registered in the United States by GitHub, Inc. • Other company and product names mentioned in this document may be the trademarks of their respective owners.
  • 21.
  • 22. © Hitachi America, Ltd. 2017. All rights reserved. Appendix 23
  • 23. © Hitachi America, Ltd. 2017. All rights reserved. webSpoon = Spoon - SWT + RWT 24 • Spoon relies on SWT for UI widgets (e.g., button, dialog, canvas). • RWT is a web alternative to SWT and “largely” implements SWT APIs, meaning Spoon can become a web app with most codes intact. • There are – Unimplemented SWT APIs (e.g., a part of GC, some Mouse events) – RWT-specific additional APIs (e.g., Multi-user, File Up/Download). Image adapted from https://angelozerr.wordpress.com/2011/05/24/rap_step5/ Operating System Servlet Container Web Browser SWT RWT (Server) RWT (Client) JFace JFace HTTP Spoon webSpoon
  • 24. © Hitachi America, Ltd. 2017. All rights reserved. 1. Local files – Spoon: local files of the laptop/desktop – webSpoon: local files of the (remote) server 2. Clipboard – Spoon and webSpoon do not share the clipboard. – In other words, no copy & paste between Spoon and webSpoon. How is webSpoon different from Spoon? 25 File A is local File B is local File A is local File B is remote File A File B webSpoon Spoon