Building Hopsworks, a cloud-native managed feature store for machine learning

Jim Dowling
CEO, Logical Clocks
Cloud Native London Meetup, March 3 2021

Can we make a Monolith fly in the clouds?

The Hopsworks Feature Store - Available on all Platforms as Managed, Enterprise, and Community
hopsworks.ai (managed platform)
Enterprise Hopsworks (self-hosted platform): runs on any platform* (on-premise, cloud, VMs, etc.)
Community Hopsworks** (self-hosted platform): runs on any platform* (on-premise, cloud, VMs, etc.)
*Supported operating systems: RHEL/CentOS 7.x and Ubuntu 18.04. Minimum requirements: 32GB RAM, 100GB disk, 8 CPUs. Runs in air-gapped environments.
**Community Hopsworks does not include (1) Feature Store Connectors to Third-Party Platforms and (2) SSO with Active Directory/OAuth-2/Azure-AD/AWS.
2016 2018 2020
Only managed Feature Store available today on both AWS and Azure

When do I need a Feature Store for Machine Learning, and what is it anyway?

Business Problem: Use Machine Learning to Predict Money Laundering
Reference: Whitepaper, Webinar

What data can I use to solve my Anti-Money Laundering Problem with?
Know Your Customer Data
Historical Financial Transactions
Recent Financial Transactions
Data Warehouse
Data Lake
Message Bus
TRAIN
SERVE

It is not always easy to get access to Enterprise data for training and serving.
Recent Financial Transactions
Data Warehouse
Data Lake
Message Bus
TRAIN
SERVE

What data can I use to make predictions with?
Recent Financial Transactions
Data Warehouse
Data Lake
Message Bus
TRAIN
SERVE
Feature Store

Where does the Feature Store fit into the ML Pipeline?
FEATURE STORE
TRAIN / SERVE
FEATURIZE

Offline Feature Store - Create Training Data and Batch Predictions
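# fs is the feature store handle; kycFG, rftFG and hftFG are existing feature groups.
# Join all features of the three feature groups into a single query.
df = kycFG.select_all().join(rftFG.select_all()).join(hftFG.select_all())

# Define a training dataset with train/test/validate splits, then materialize it.
td = fs.create_training_dataset("precipitation_training_dataset",
                                version=1,
                                data_format="tfrecord",
                                description="Precipitation Training dataset",
                                splits={'train': 0.7, 'test': 0.2, 'validate': 0.1})
td.save(df)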
FG = Feature Group. Documentation: https://docs.hopsworks.ai/
Diagram: Feature Store (kycFG, rftFG, hftFG) -> Training Data (.tfrecord) -> train -> Model
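
The `splits` argument above assigns each row of the joined feature data to one of three named subsets. The sketch below illustrates that idea in plain Python, assuming a simple shuffled partition. The helper `split_rows` is hypothetical and is not taken from the Hopsworks API; how Hopsworks actually performs the split is not stated on the slide.

```python
import random


def split_rows(rows, splits, seed=42):
    """Partition rows into named subsets sized by the given fractions.

    Hypothetical helper for illustration only.
    """
    if abs(sum(splits.values()) - 1.0) > 1e-9:
        raise ValueError("split fractions must sum to 1.0")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)

    result, start = {}, 0
    names = list(splits)
    for i, name in enumerate(names):
        if i == len(names) - 1:
            # The last split takes the remainder so rounding never drops a row.
            end = len(shuffled)
        else:
            end = start + int(round(splits[name] * len(shuffled)))
        result[name] = shuffled[start:end]
        start = end
    return result


parts = split_rows(range(100), {"train": 0.7, "test": 0.2, "validate": 0.1})
print({name: len(part) for name, part in parts.items()})
```

With 100 rows this yields subsets of 70, 20 and 10 rows, and every row lands in exactly one subset.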

Online Feature Store - the Data Layer for Operational (Online) Models
US-West-1c
US-West-1a
US-West-1b
RonDB2
Model
RonDB1 Model
RonDB3
Model
Online Application
1. JDBC  2. Predict
2-20ms
1. Build Feature Vector Using Online Feature Store
2. Send Feature Vector to Model for Prediction
~5-50ms
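
The two steps above can be sketched in plain Python. This is only an illustration: the in-memory dictionaries stand in for the online feature store (which the slide accesses over JDBC), and the feature group names, feature names, and the rule-based `predict` function are all hypothetical.

```python
import time

# Stand-in for the online feature store: one dict per feature group,
# keyed by customer id. Group and feature names are hypothetical.
ONLINE_STORE = {
    "kyc": {42: {"account_age_days": 310, "country_risk": 0.2}},
    "recent_transactions": {42: {"tx_count_1h": 7, "tx_sum_1h": 1250.0}},
}

# Fixed feature order expected by the (placeholder) model.
FEATURE_ORDER = ["account_age_days", "country_risk", "tx_count_1h", "tx_sum_1h"]


def build_feature_vector(customer_id):
    """Step 1: look up the customer's row in each feature group and merge them."""
    merged = {}
    for group in ONLINE_STORE.values():
        merged.update(group[customer_id])
    return [merged[name] for name in FEATURE_ORDER]


def predict(feature_vector):
    """Step 2: placeholder model; a real deployment would call a model server."""
    tx_count, tx_sum = feature_vector[2], feature_vector[3]
    return 1 if tx_count > 5 and tx_sum > 1000 else 0


start = time.perf_counter()
vector = build_feature_vector(42)
label = predict(vector)
elapsed_ms = (time.perf_counter() - start) * 1000
print(vector, label, f"{elapsed_ms:.3f} ms")
```

The timing printed here only measures local dictionary lookups; the latency figures on the slide refer to network round trips to the database and the model.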

Hopsworks End-to-End Machine Learning (ML) Pipelines
[Diagram labels]
Inputs: Code and configuration; Data Lake, Warehouse, Kafka
Pipeline stages: Feature Engineering, Model Development, Model Training, Model Deploy, Model Serving, Model Monitoring
Platform services: Feature Store, Model Registry, Experiment Tracking, HopsFS, Scaleout Metadata, Elasticsearch, Search (Artifacts, Provenance and Metadata)
Data flows: Features, Retrieve Features, Log Predictions, Sync, Programs, Experiments, A/B Test
Statistics: Feature Statistics, Training Data Statistics, Model Statistics, Serving Statistics

Hopsworks - Develop and Operate ML Applications at Scale
[Diagram labels]
HOPSWORKS DATASOURCE
APPLICATIONS, API, DASHBOARDS
ORCHESTRATION: Airflow
BATCH: Apache Spark
STREAMING: Apache Spark, Apache Flink, Apache Kafka
HOPSWORKS FEATURE STORE
ML DEVELOP AND TRAIN: Notebooks as Jobs, TensorFlow, Scikit-Learn, PyTorch, TensorBoard
FILESYSTEM & METASTORE: HopsFS
MODEL SERVING AND MONITORING: KFServing, TF-Serving, Flask
Phases: Data Preparation & Ingestion; Experimentation & Model Training; Deploy & Productionalize

Transitioning Security to the Cloud...

Project-Based Multi-Tenant Security Model

Moving to the Cloud - Connectors and Integrations
[Diagram labels]
Hopsworks: Project-Based Multi-Tenant Security
Projects: Dev Feature Store, Staging Feature Store, Prod Feature Store
User Login (LDAP, AD, OAuth2, 2FA): Users, Jobs
Access from external platforms: API KEY; IAM Profile or Federated IAM Role
External platforms: Databricks, SageMaker, Kubeflow, Amazon EMR
External storage: Delta Lake, Snowflake, Amazon S3, Amazon Redshift

Making Hopsworks Cloud-Native
Hopsworks Open Source            Cloud Native Service
Open-Source Docker Repository    ECR / ACR
Kubernetes                       EKS / AKS

Hopsworks Services               Rejected Cloud Native Versions
Spark-on-YARN                    Databricks / EMR
HopsFS                           S3
RonDB                            DynamoDB / ElastiCache
Kafka                            Managed Kafka
Elastic Open Distro              AWS Elastic

Developing Hopsworks.ai
The first European company to provide a managed scale-out data and AI platform in the cloud

Hopsworks.ai
Early 2020
Nov 2020 (GA)

Serverless Platform on AWS - Amplify, Cognito, CloudFront, Lambdas, Route 53, DynamoDB

Integration with other Platforms - Databricks

Cloud-Native Kubernetes Integration
https://www.logicalclocks.com/blog/how-we-secure-your-data-with-hopsworks
[Diagram labels]
User; Hopsworks: Project Creation, Jobs UI, Project Jobs
Kubernetes (EKS, AKS): API server, Scheduler, Secrets (one Project_User X.509 certificate and JWT per project user), Pod running a Project-User Docker Container
Access using X.509 / JWT: HopsFS, Hive, Elastic, Kafka

DynamoDB
Expensive,
High Latency (~10ms lookup),
Limited Query Support - Reporting a Problem,
Quotas, Hotspots

RonDB - a new open-source cloud-native distribution of NDB (MySQL Cluster)
Inventor of NDB (MySQL Cluster)
www.rondb.com
RonDB vs Redis - RonDB outperforms on 1 CPU Core and Keeps on Scaling
MySQL Cluster (NDB) - the world’s highest throughput transactional datastore
200m ops/second with NDB - world’s fastest key-value store

RonDB - the first LATS Database in the Cloud. Launched in private beta Feb 2021.
RonDB is a LATS Database
low Latency, high Availability, high Throughput, scalable Storage
< 1ms KV lookup
>10M KV Lookups/sec
>99.999% availability
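
To put the availability and throughput targets above in concrete terms, the following sketch does the arithmetic. It is a back-of-the-envelope calculation, not a benchmark; the function names are illustrative, and the year length ignores leap years.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def max_downtime_minutes_per_year(availability_percent):
    """Downtime budget implied by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def seconds_to_serve(lookups, lookups_per_second):
    """Time needed to serve a number of lookups at a sustained rate."""
    return lookups / lookups_per_second


downtime = max_downtime_minutes_per_year(99.999)
print(f"{downtime:.2f} minutes of downtime per year at 99.999% availability")
print(seconds_to_serve(1_000_000_000, 10_000_000), "seconds for 1e9 lookups at 10M/s")
```

"Five nines" therefore leaves a budget of a little over five minutes of downtime per year, and at 10M lookups per second a billion key-value lookups take 100 seconds.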

Lessons Learnt (so far) in building a Cloud Native Managed Data/AI Platform
Shiny new toys are not always the best
● Lambda functions are a poor fit for synchronous events (e.g. request-reply) due to their slow response times
○ Unsuitable for "web" endpoints: 500-2000 ms response time
○ Cold lambdas are one cause, but so is the JS JIT
○ Parallel operations are difficult due to lack of support in Lambda
● “Amplifeck’d” is a common word on our Slack
● SQL > Key Value APIs
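
One possible reading of "SQL > Key Value APIs" is that a SQL interface can assemble data from several tables in a single statement, whereas a key-value API needs one lookup per table plus merging in application code. The sketch below illustrates that contrast using Python's built-in `sqlite3` purely as a stand-in database; the table and column names are hypothetical, and the slide itself does not spell out the reasoning.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a SQL-capable store.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE kyc (customer_id INTEGER PRIMARY KEY, country_risk REAL);
    CREATE TABLE recent_tx (customer_id INTEGER PRIMARY KEY, tx_count_1h INTEGER);
    INSERT INTO kyc VALUES (42, 0.2);
    INSERT INTO recent_tx VALUES (42, 7);
""")

# Key-value style: one primary-key lookup per table, merged by the application.
kyc_row = conn.execute(
    "SELECT country_risk FROM kyc WHERE customer_id = ?", (42,)
).fetchone()
tx_row = conn.execute(
    "SELECT tx_count_1h FROM recent_tx WHERE customer_id = ?", (42,)
).fetchone()
kv_vector = kyc_row + tx_row

# SQL style: a single statement joins both tables inside the database.
sql_vector = conn.execute(
    """
    SELECT k.country_risk, t.tx_count_1h
    FROM kyc AS k
    JOIN recent_tx AS t ON k.customer_id = t.customer_id
    WHERE k.customer_id = ?
    """,
    (42,),
).fetchone()

print(kv_vector, sql_vector)
conn.close()
```

Both approaches return the same row; the difference is the number of round trips and where the merge logic lives.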

RonDB Competitors
[Diagram labels]
Dimensions: Latency, Throughput, Availability, Scalability
Systems: RonDB; DynamoDB, Cassandra, BigTable; Redis
Online Feature Stores

Demo Time.
github.com/logicalclocks/hopsworks
@logicalclocks
www.logicalclocks.com

Feature Engineering and Model Training Pipeline - With a Feature Store
[Diagram labels]
Data sources: Data Warehouse, Data Lake, KAFKA
Stages: Feature Engineering, Model Training, Model Serving, Batch Scoring
Feature Store: Offline Feature Store (Batch Access), Online Feature Store (Feature Vectors)
Artifacts and sinks: Train/Test Data (S3, HDFS, etc), Model Repository, Result Sink (DB)
Other: Online Application, Deploy, Monitor
