1
Arun Kejariwal
Karthik Ramasamy
MODEL SERVING VIA PULSAR FUNCTIONS
2
AI FOR THE
ENTERPRISE
Annual revenue: $3.7 B (2017) → $80.7 B (2025) [1]
Content Acquisition
✦ Disparate sources
Content Understanding
✦ Unstructured text
๏ Learning the context is key and non-trivial
✦ Dynamic
๏ Continuous learning and maintenance
[1] https://www.tractica.com/research/artificial-intelligence-for-enterprise-applications/
3
ML/AI FOR THE ENTERPRISE
TRAINING
✦ ML MODELS
✦ NEURAL NETWORKS
✦ REINFORCEMENT LEARNING
STORAGE
✦ FEATURES
✦ EMBEDDINGS
SERVING
✦ ONLINE
✦ REAL-TIME
4
SERVING
THREE PILLARS
✦ LOW LATENCY
✦ HIGH THROUGHPUT
✦ GRACEFUL DEGRADATION
5
SERVING
CHALLENGES
✦ APPLICATION CODE CHANGE
✦ DEBUGGING
✦ DEPLOYMENT CYCLES
✦ LONGER TIME TO ITERATE
6
LEVERAGING SERVERLESS
✦ REAL-TIME
✦ HIGHLY SCALABLE
✦ FAULT TOLERANT
✦ EASILY PROGRAMMABLE
✦ SUPPORT FOR PLUG-AND-PLAY ANALYTICS
7
SERVERLESS
EVOLUTION *
* Figure borrowed from "Serverless Is More: From PaaS to Present Cloud Computing", by Eyk et al. 2018.
8
SERVERLESS
AN OVERVIEW
Function as a Service (FaaS): AWS Lambda, Google Cloud Functions, IBM Cloud Functions
Backend as a Service (BaaS): Object storage, Databases, Messaging
9
SERVERLESS
CLOUD FUNCTIONS *
* Figure borrowed from "Serverless Computation with OpenLambda", by Hendrickson et al. 2016.
10
SERVERLESS
AN OVERVIEW
CODE
✦ Execution without managing resource allocation
✦ From x86 machine code to high-level programming languages
COMPUTATION
✦ Is stateless
✦ Event driven
✦ Fine-grain autoscaling
✦ Decoupled from storage
BILLING
✦ Resources used instead of resources allocated
✦ 100 ms increment
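The 100 ms billing increment can be illustrated with a little arithmetic. The sketch below assumes that execution time is rounded up to the next increment, which the slide does not state; `BillingSketch` and `billedMillis` are hypothetical names used only for illustration.

```java
// Hypothetical helper: rounds execution time up to a 100 ms billing increment.
// The round-up rule is an assumption; the slide only mentions the increment size.
final class BillingSketch {
    static final long INCREMENT_MS = 100;

    /** Returns the billed duration for a measured execution time in milliseconds. */
    static long billedMillis(long executionMillis) {
        if (executionMillis <= 0) {
            return 0;
        }
        // Integer ceiling division, then scale back to milliseconds.
        return ((executionMillis + INCREMENT_MS - 1) / INCREMENT_MS) * INCREMENT_MS;
    }
}
```

Under that assumption, a 230 ms invocation would be billed as 300 ms, and a 100 ms invocation as exactly 100 ms.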
11
APACHE PULSAR
AN OVERVIEW
✦ REAL-TIME MESSAGING + STORAGE
✦ MODEL UPDATE WITHOUT RESTARTING THE APPLICATION
✦ NATIVE SUPPORT FOR SERVERLESS STREAM FUNCTIONS
12
APACHE PULSAR
TERMINOLOGY
Apache Pulsar Cluster
Product
Safety
ETL
Fraud
Detection
Topic-1
Account History
Topic-2
User Clustering
Topic-1
Risk Classication
MarketingCampaigns
ETL
Topic-1
Budgeted Spend
Topic-2
Demographic Classication
Topic-1
Location Resolution
Data
Serving
Microservice
Topic-1
Customer Authentication
Tenants
Namespaces
Topics
13
APACHE PULSAR
KEY CHARACTERISTICS
✦ DURABILITY
✦ MULTI-TENANCY
✦ TIERED STORAGE
✦ UNIFIED MESSAGING & QUEUING
✦ HIGHLY SCALABLE
14
APACHE PULSAR
ARCHITECTURAL DESIGN
✦ INDEPENDENT SCALABILITY
✦ INSTANT SCALABILITY
✦ FAULT TOLERANCE
Bookie Bookie Bookie
Broker Broker Broker
Producer Consumer
15
APACHE PULSAR
ACCESS PATTERNS
✦ WRITE
✦ TAILING READS
✦ CATCHUP READS
16
PULSAR FUNCTIONS
INTRODUCTION
✦ SIMPLEST POSSIBLE API: FUNCTION OR A PROCEDURE
✦ SUPPORT FOR MULTI-LANGUAGE (Java & Python)
✦ FLEXIBLE RUNTIME
17
PULSAR FUNCTIONS
INTRODUCTION
Pulsar Function
i/p topic 1
i/p topic 2
i/p topic 3
o/p topic 1
o/p topic 2
o/p topic 3
18
PULSAR FUNCTIONS
INTRODUCTION
import java.util.function.Function;

public class ExclamationFunction implements Function<String, String> {
    @Override
    public String apply(String input) {
        return input + "!";
    }
}
i/p topic 1 (strings) → Exclamation Function → o/p topic 2 (strings)
19
PULSAR FUNCTIONS
PROCESSING GUARANTEES
✦ AT LEAST ONCE
✦ AT MOST ONCE
✦ EXACTLY ONCE
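To make the difference between these guarantees concrete, the sketch below shows one generic way to get exactly-once results on top of at-least-once delivery: remember message identifiers and ignore redeliveries. This is an illustration only and not a description of how Pulsar implements its guarantees; `DedupSketch`, `deliver`, and the numeric message ids are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: deduplicating consumer state, keyed by a hypothetical message id.
final class DedupSketch {
    private final Set<Long> seenIds = new HashSet<>();
    private final List<String> applied = new ArrayList<>();

    /** Applies the payload the first time an id is seen; returns false on a redelivery. */
    boolean deliver(long messageId, String payload) {
        if (!seenIds.add(messageId)) {
            return false; // duplicate delivery: the effect was already applied
        }
        applied.add(payload);
        return true;
    }

    /** Payloads whose effects have been applied, in arrival order. */
    List<String> applied() {
        return applied;
    }
}
```

The set of seen ids grows without bound in this sketch; a real system would have to bound or persist that state.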
20
PULSAR FUNCTIONS
EVENT PROCESSING DESIGN PATTERNS
✦ DYNAMIC DATA ROUTING
✦ DATA FILTERING
✦ DATA ENRICHMENT
✦ ALERTS AND THRESHOLDS
✦ DATA TRANSFORMATIONS
✦ COUNTING WITH WINDOWS
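Several of these patterns fit in a single function each. The sketch below expresses routing, filtering, and threshold alerts with plain `java.util.function.Function`, the same interface the ExclamationFunction slide uses. The topic names and the `PatternSketch` class are hypothetical, and nothing here calls the Pulsar API.

```java
import java.util.Optional;
import java.util.function.Function;

// Illustrative only: three event-processing patterns as plain Java functions.
final class PatternSketch {
    /** Dynamic data routing: choose an output topic name from the event content. */
    static final Function<String, String> ROUTER =
        event -> event.startsWith("ERROR") ? "errors-topic" : "events-topic";

    /** Data filtering: drop blank events and pass everything else through. */
    static final Function<String, Optional<String>> FILTER =
        event -> event.trim().isEmpty() ? Optional.<String>empty() : Optional.of(event);

    /** Alerts and thresholds: produce an alert only when a reading exceeds the limit. */
    static Function<Double, Optional<String>> thresholdAlert(double limit) {
        return reading -> reading > limit
            ? Optional.of("ALERT: reading " + reading + " exceeds " + limit)
            : Optional.<String>empty();
    }
}
```

For example, `PatternSketch.ROUTER.apply("ERROR disk full")` yields the hypothetical topic name `errors-topic`, and `PatternSketch.thresholdAlert(10.0).apply(3.0)` yields an empty result.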
21
PULSAR FUNCTIONS
DEPLOYMENT
✦ THREADS
✦ PROCESSES
✦ CONTAINERS
22
PULSAR FUNCTIONS
DEPLOYMENT
Broker 1
Worker
Function
wordcount-1
Function
transform-2
Broker 1
Worker
Function
transform-1
Function
dataroute-1
Broker 1
Worker
Function
wordcount-2
Function
transform-3
Node 1 Node 2 Node 3
23
PULSAR FUNCTIONS
DEPLOYMENT
Worker
Function
wordcount-1
Function
transform-2
Worker
Function
transform-1
Function
dataroute-1
Worker
Function
wordcount-2
Function
transform-3
Node 1 Node 2 Node 3
Broker 1 Broker 2 Broker 3
Node 4 Node 5 Node 6
24
PULSAR FUNCTIONS
DEPLOYMENT - KUBERNETES
Function
wordcount-1
Function
transform-1
Function
transform-3
Pod 1 Pod 2 Pod 3
Broker 1 Broker 2 Broker 3
Pod 7 Pod 8 Pod 9
Function
dataroute-1
Function
wordcount-2
Function
transform-2
Pod 4 Pod 5 Pod 6
25
PULSAR FUNCTIONS
USE CASES IN DATA ENGINEERING
✦ DATA TRANSFORMATION
✦ DATA EXTRACTION
✦ CONTENT ROUTING & FILTERING
26
APACHE PULSAR + MODEL SERVING
27
ML PIPELINE
TRAINING
DISTRIBUTED ML *
Hundreds of concurrent workers
Map to serverless functions
Backend manages compute resources and task scheduling
* Figure borrowed from "A Case for Serverless Machine Learning", by Carreira et al. 2018.
28
ML PIPELINE
INFERENCE
GUIDING DECISION MAKING
Calls for less computational power than training
RESTful endpoint for Functions
✦ Available via HTTP GET request
Model size
✦ Tens of MB to over a GB
Cold start → Warm start
✦ Function reuse
Backend manages compute resources and task scheduling
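The cold start to warm start point can be sketched as a function instance that loads its model once and reuses it on later invocations. This is a minimal illustration under stated assumptions: the model is just a weight vector scored by a dot product, and `WarmStartSketch`, its `loader`, and `loadCount` are hypothetical names.

```java
import java.util.function.Supplier;

// Illustrative only: a reused function instance pays the model-load cost once.
final class WarmStartSketch {
    private final Supplier<double[]> loader; // hypothetical: fetches model weights
    private double[] weights;                // cached across invocations once loaded
    private int loads = 0;

    WarmStartSketch(Supplier<double[]> loader) {
        this.loader = loader;
    }

    /** Scores one input; only the first (cold) call triggers a model load. */
    synchronized double score(double[] features) {
        if (weights == null) {
            weights = loader.get(); // cold start: load the model
            loads++;
        }
        double sum = 0.0;
        for (int i = 0; i < features.length && i < weights.length; i++) {
            sum += weights[i] * features[i];
        }
        return sum;
    }

    /** Number of times the model was loaded. */
    synchronized int loadCount() {
        return loads;
    }
}
```

With models ranging from tens of MB to over a GB, the load step dominates the first call, which is why reusing the instance matters.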
29
Model Serving
Pulsar Function
Model Stream
Data Stream
model i/p topic
data i/p topic
Inference Stream
Inference o/p topic
MODEL SERVING
30
Model Stream
Data Stream
model i/p topic
data i/p topic
Inference Stream
inference o/p topic
MODEL
MODEL SERVING
31
Model Stream
Data Stream
model i/p topic
data i/p topic
Inference Stream
inference o/p topic
MODEL old
MODEL new
MODEL SERVING
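The model-swap idea on these slides, where a new model arrives on the model stream and replaces the old one while data keeps flowing, can be sketched with an atomic reference. This is a minimal sketch and not the authors' implementation: `ModelServingSketch`, `onModel`, and `onData` are hypothetical names, the model is reduced to a `Function<Double, String>`, and no Pulsar API is used.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

// Illustrative only: one handler per input stream, sharing the current model.
final class ModelServingSketch {
    // Placeholder model used until the first real model arrives.
    private final AtomicReference<Function<Double, String>> model =
        new AtomicReference<Function<Double, String>>(x -> "no-model");

    /** Handles a message from the model stream: swap in the new model atomically. */
    void onModel(Function<Double, String> newModel) {
        model.set(newModel);
    }

    /** Handles a message from the data stream: score it with the current model. */
    String onData(double value) {
        return model.get().apply(value);
    }
}
```

Because the swap is a single reference update, in-flight inferences finish with the old model and later ones pick up the new model, without restarting the application.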
32
Model Stream
Data Stream
model i/p topic
data i/p topic
Inference Stream
inference o/p topic
MODEL
User Defined Metrics
Recall
Precision
MODEL SERVING
METRICS
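Precision and recall, the two user-defined metrics named on this slide, can be maintained incrementally from three counters. The sketch below assumes that a ground-truth label eventually becomes available for each prediction; `MetricsSketch` is a hypothetical name and the code does not use Pulsar's metrics API.

```java
// Illustrative only: running precision and recall for a binary classifier.
final class MetricsSketch {
    private long truePositives;
    private long falsePositives;
    private long falseNegatives;

    /** Records one prediction together with its ground-truth label. */
    void record(boolean predicted, boolean actual) {
        if (predicted && actual) {
            truePositives++;
        } else if (predicted) {
            falsePositives++;
        } else if (actual) {
            falseNegatives++;
        }
    }

    /** TP / (TP + FP); returns 0.0 when nothing has been predicted positive. */
    double precision() {
        long denominator = truePositives + falsePositives;
        return denominator == 0 ? 0.0 : (double) truePositives / denominator;
    }

    /** TP / (TP + FN); returns 0.0 when there have been no actual positives. */
    double recall() {
        long denominator = truePositives + falseNegatives;
        return denominator == 0 ? 0.0 : (double) truePositives / denominator;
    }
}
```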
33
DATA SKETCHES
AS MODELS
Approximate
✦ Probabilistic Bounds
Accuracy-Speed Trade-off
One Pass
Incremental
Low memory footprint
34
FLAVORS
OF DATA SKETCHES
✦ SAMPLING
✦ FILTERING
✦ CARDINALITY
✦ FREQUENT ELEMENTS
✦ ANOMALY DETECTION
✦ QUANTILES
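Of these flavors, sampling is the one the deck does not illustrate with code. A minimal sketch of reservoir sampling (Algorithm R) is given below as one possible instance; it is one-pass, incremental, and uses fixed memory. `ReservoirSketch` and its seed parameter are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative only: keeps a uniform random sample of fixed size from a stream.
final class ReservoirSketch<T> {
    private final int capacity;
    private final Random random;
    private final List<T> sample = new ArrayList<>();
    private long seen = 0;

    ReservoirSketch(int capacity, long seed) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
        this.random = new Random(seed);
    }

    /** Offers one stream element to the reservoir. */
    void add(T item) {
        seen++;
        if (sample.size() < capacity) {
            sample.add(item); // fill phase: keep everything
            return;
        }
        // Pick a slot uniformly in [0, seen); replace only if it falls in the reservoir.
        long slot = (long) (random.nextDouble() * seen);
        if (slot < capacity) {
            sample.set((int) slot, item);
        }
    }

    List<T> sample() {
        return sample;
    }

    long seen() {
        return seen;
    }
}
```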
35
FILTERING
BLOOM FILTER
MEMBERSHIP
import org.apache.pulsar.functions.api.Context;
import org.apache.pulsar.functions.api.Function;
import com.clearspring.analytics.stream.membership.BloomFilter;

public class BloomFilterFunction implements Function<String, Void> {
    private final BloomFilter filter = new BloomFilter(20, 20);

    @Override
    public Void process(String input, Context context) throws Exception {
        // isPresent() can return a false positive, but never a false negative
        if (!filter.isPresent(input)) {
            filter.add(input);
            // Route to "not seen" topic
            context.publish("notSeenTopic", input);
        }
        return null;
    }
}
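For readers who want to see what happens inside such a filter, here is a minimal, dependency-free Bloom filter. It is a sketch for illustration, not the stream-lib implementation used on the slide: `SimpleBloomFilter`, `mightContain`, and the double-hashing scheme derived from `String.hashCode()` are assumptions made for brevity.

```java
import java.util.BitSet;

// Illustrative only: a tiny Bloom filter over strings using double hashing.
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    SimpleBloomFilter(int numBits, int numHashes) {
        if (numBits <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    /** Bit position for the i-th hash, computed as h1 + i * h2 modulo the bit count. */
    private int index(String item, int i) {
        int h1 = item.hashCode();
        int h2 = (Integer.reverse(h1) * 0x9E3779B9) | 1; // odd second hash
        return Math.floorMod(h1 + i * h2, numBits);
    }

    /** Sets every bit position associated with the item. */
    void add(String item) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(item, i));
        }
    }

    /** False means definitely not added; true means possibly added. */
    boolean mightContain(String item) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(item, i))) {
                return false;
            }
        }
        return true;
    }
}
```

The trade-off is the one the data sketches slide lists: memory is fixed up front, and the price is a bounded probability of false positives.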
36
FREQUENT ELEMENTS
COUNT-MIN SKETCH
FREQUENCY
import org.apache.pulsar.functions.api.Context;
import org.apache.pulsar.functions.api.Function;
import com.clearspring.analytics.stream.frequency.CountMinSketch;

public class CountMinFunction implements Function<String, Void> {
    private final CountMinSketch sketch = new CountMinSketch(20, 20, 128);

    @Override
    public Void process(String input, Context context) throws Exception {
        sketch.add(input, 1); // Hashes the input to one counter per row and adds 1 to each
        long count = sketch.estimateCount(input); // May overestimate, never underestimates
        // React to the updated count
        return null;
    }
}
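The mechanics behind `add` and `estimateCount` can be shown with a small, dependency-free count-min sketch. This is an illustrative sketch and not the stream-lib implementation: `SimpleCountMin`, its per-row hash mixing, and the method names are assumptions.

```java
// Illustrative only: a depth x width grid of counters, one hashed column per row.
final class SimpleCountMin {
    private final int depth;
    private final int width;
    private final long[][] counts;

    SimpleCountMin(int depth, int width) {
        if (depth <= 0 || width <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        this.depth = depth;
        this.width = width;
        this.counts = new long[depth][width];
    }

    /** Column for an item in a given row; each row mixes in a different seed. */
    private int bucket(String item, int row) {
        int h = item.hashCode() + row * 0x9E3779B9;
        h ^= (h >>> 16);
        h *= 0x85EBCA6B;
        h ^= (h >>> 13);
        return Math.floorMod(h, width);
    }

    /** Adds count to one counter in every row. */
    void add(String item, long count) {
        for (int row = 0; row < depth; row++) {
            counts[row][bucket(item, row)] += count;
        }
    }

    /** Minimum over the item's counters: an upper bound on its true count. */
    long estimate(String item) {
        long min = Long.MAX_VALUE;
        for (int row = 0; row < depth; row++) {
            min = Math.min(min, counts[row][bucket(item, row)]);
        }
        return min;
    }
}
```

Collisions can only inflate a counter, so taking the minimum across rows gives an estimate that is never below the true count.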
37
CARDINALITY
HYPERLOGLOG
# UNIQUE ELEMENTS
import org.apache.pulsar.functions.api.Context;
import org.apache.pulsar.functions.api.Function;
import io.airlift.stats.cardinality.HyperLogLog;

public class HyperLogLogFunction implements Function<Integer, Void> {
    private final HyperLogLog hll = HyperLogLog.newInstance(2048);

    @Override
    public Void process(Integer value, Context context) throws Exception {
        hll.add(value.longValue());
        long numDistinctElements = hll.cardinality(); // cardinality() returns a long
        // Do something with the distinct elements
        return null;
    }
}
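The idea behind HyperLogLog can be shown in a compact, dependency-free form. The sketch below is a simplified textbook version with a small-range (linear counting) correction; it is not the airlift implementation used on the slide. `SimpleHyperLogLog`, the 64-bit mixing function, and the restriction to 2^7 through 2^16 registers are assumptions made for illustration.

```java
// Illustrative only: simplified HyperLogLog over long values with 2^p registers.
final class SimpleHyperLogLog {
    private final int p;            // number of index bits
    private final int m;            // number of registers, 2^p
    private final byte[] registers; // max rank observed per register

    SimpleHyperLogLog(int p) {
        if (p < 7 || p > 16) {
            throw new IllegalArgumentException("p must be in [7, 16]");
        }
        this.p = p;
        this.m = 1 << p;
        this.registers = new byte[m];
    }

    /** 64-bit mixing function so that nearby inputs spread over all bits. */
    private static long mix(long x) {
        x += 0x9E3779B97F4A7C15L;
        x = (x ^ (x >>> 30)) * 0xBF58476D1CE4E5B9L;
        x = (x ^ (x >>> 27)) * 0x94D049BB133111EBL;
        return x ^ (x >>> 31);
    }

    /** Records one value: top p bits pick a register, the rest determine the rank. */
    void add(long value) {
        long h = mix(value);
        int idx = (int) (h >>> (64 - p));
        long rest = h << p;
        int rank = Math.min(Long.numberOfLeadingZeros(rest), 64 - p) + 1;
        if (rank > registers[idx]) {
            registers[idx] = (byte) rank;
        }
    }

    /** Estimated number of distinct values added so far. */
    long cardinality() {
        double sum = 0.0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.pow(2.0, -r);
            if (r == 0) {
                zeros++;
            }
        }
        double alpha = 0.7213 / (1.0 + 1.079 / m); // bias constant for m >= 128
        double estimate = alpha * m * m / sum;
        if (estimate <= 2.5 * m && zeros > 0) {
            estimate = m * Math.log((double) m / zeros); // small-range correction
        }
        return Math.round(estimate);
    }
}
```

With 2^11 = 2048 registers, the same count as on the slide, the expected relative error is roughly 1.04 / sqrt(2048), or about 2.3%, while the state stays at 2048 bytes regardless of how many values are added.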
38
SERVERLESS ML
GOING FORWARD
GPU SUPPORT
Key for Deep Learning
FAST SHARED STORAGE
Functions do not talk to each other
Example: Crail*, Pocket
EXTENDING TO THE EDGE
Functions running on
✦ Smartphones
✦ IoT Devices
* https://crail.apache.org/
39
“It is better to fail
in originality than
to succeed in
imitation.”
Herman Melville
40
PROJECT DESCRIPTION
41
MORE ON PULSAR
https://streaml.io/blog/eda-real-time-analytics-with-pulsar-functions
https://streaml.io/blog/eda-event-processing-design-patterns-with-pulsar-functions
https://streaml.io/blog/apache-pulsar-architecture-designing-for-streaming-performance-and-scalability
https://www.businesswire.com/news/home/20180306005633/en/Apache-Pulsar-Outperforms-Apache-Kafka-2.5x-OpenMessaging
https://streaml.io/blog/intro-to-pulsar
https://pulsar.apache.org/
42
READINGS
Efficient Construction of Approximate Ad-Hoc ML models Through Materialization and Reuse [Hasani et al. 2018]
A Case for Serverless Machine Learning [Carreira et al. 2018]
Pocket: Elastic ephemeral storage for serverless analytics [Klimovic et al. 2018]
Serving deep learning models in a serverless platform [Ishakian et al. 2018]
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems [Lee et al. 2018]
Cloud Programming Simplified: A Berkeley View on Serverless Computing [Jonas et al. 2019]
43
READINGS
Serverless Is More: From PaaS to Present Cloud Computing [Eyk et al. 2018]
Serverless Computing: Current Trends and Open Problems [Baldini et al. 2018]
Clipper: A low-latency online prediction serving system [Crankshaw et al. 2017]
Borg, Omega, and Kubernetes [Burns et al. 2016]
Serverless Computation with OpenLambda [Hendrickson et al. 2016]
TensorFlow Serving [https://www.tensorflow.org/tfx/guide/serving]
44
READINGS
Architecture of a Serverless Machine Learning Model [https://cloud.google.com/solutions/architecture-of-a-serverless-ml-model]
Pure serverless machine learning inference with AWS Lambda and Layers [https://medium.com/merapar/pure-serverless-machine-learning-inference-with-aws-lambda-and-layers-979702d9ae49]
