Model Serving via Pulsar Functions

This talk walks through an architecture in which models are served in real time and updated, using Apache Pulsar, without restarting the application at hand. It then describes how to apply Pulsar Functions to support two example use cases, sampling and filtering, and explores a concrete case study of the same.

Published in: Technology

  1. Arun Kejariwal, Karthik Ramasamy: MODEL SERVING VIA PULSAR FUNCTIONS
  2. AI FOR THE ENTERPRISE. Annual revenue: $3.7 B (2017) → $80.7 (2025) [1]. Content Acquisition ✦ Disparate sources. Content Understanding ✦ Unstructured text ๏ Learning the context is key and non-trivial ✦ Dynamic ๏ Continuous learning and maintenance. [1] https://www.tractica.com/research/artificial-intelligence-for-enterprise-applications/
  3. ML/AI FOR THE ENTERPRISE: ML MODELS, NEURAL NETWORKS, REINFORCEMENT LEARNING, TRAINING, FEATURES, EMBEDDINGS, STORAGE, ONLINE, REAL-TIME, SERVING
  4. SERVING, THREE PILLARS: LOW LATENCY, HIGH THROUGHPUT, GRACEFUL DEGRADATION
  5. SERVING, CHALLENGES: APPLICATION CODE CHANGE, DEBUGGING, DEPLOYMENT, LONGER TIME TO ITERATE CYCLES
  6. LEVERAGING SERVERLESS ✦ REAL-TIME ✦ HIGHLY SCALABLE ✦ FAULT TOLERANT ✦ EASILY PROGRAMMABLE ✦ SUPPORT FOR PLUG-AND-PLAY ANALYTICS
  7. SERVERLESS: EVOLUTION*. * Figure borrowed from "Serverless Is More: From PaaS to Present Cloud Computing", by Eyk et al. 2018.
  8. SERVERLESS: AN OVERVIEW. Function as a Service (FaaS): AWS Lambda, Google Cloud Functions, IBM Cloud Functions. Backend as a Service (BaaS): Object storage, Databases, Messaging.
  9. SERVERLESS: CLOUD FUNCTIONS*. * Figure borrowed from "Serverless Computation with OpenLambda", by Hendrickson et al. 2016.
  10. SERVERLESS: AN OVERVIEW. CODE: Execution without managing resource allocation; From x86 machine code to high-level programming languages. COMPUTATION: Is stateless; Event driven; Fine-grain autoscaling; Decoupled from storage. BILLING: Resources used instead of resources allocated; 100 ms increment.
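The "100 ms increment" billing point can be illustrated with a little arithmetic. The sketch below is not taken from the slides: `Billing` and `billedMillis` are hypothetical names, and rounding each invocation up to the next increment is an assumption about how such billing typically works.

```java
/** Hypothetical helper illustrating billing in fixed time increments. */
final class Billing {
    private Billing() {}

    /**
     * Rounds an invocation's run time up to the next multiple of the increment
     * (assumed behavior; the slide only states "100 ms increment").
     */
    static long billedMillis(long actualMillis, long incrementMillis) {
        if (actualMillis < 0 || incrementMillis <= 0) {
            throw new IllegalArgumentException("invalid duration or increment");
        }
        // Integer ceiling division, then scale back to milliseconds
        return ((actualMillis + incrementMillis - 1) / incrementMillis) * incrementMillis;
    }
}
```

Under this assumption, a function that runs for 230 ms is billed for 300 ms, while one that runs for exactly 100 ms is billed for 100 ms.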
  11. APACHE PULSAR: AN OVERVIEW. REAL-TIME MESSAGING + STORAGE; MODEL UPDATE WITHOUT RESTARTING THE APPLICATION; NATIVE SUPPORT FOR SERVERLESS STREAM FUNCTIONS
  12. APACHE PULSAR: TERMINOLOGY. Apache Pulsar Cluster (Tenants, Namespaces, Topics). Tenant Product Safety: namespaces ETL, Fraud Detection; topics Account History, User Clustering, Risk Classification. Tenant Marketing: namespaces Campaigns, ETL; topics Budgeted Spend, Demographic Classification, Location Resolution. Tenant Data Serving: namespace Microservice; topic Customer Authentication.
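The tenant, namespace, and topic hierarchy shows up directly in Pulsar topic names, which take the form `persistent://tenant/namespace/topic`. A minimal sketch of composing such a name follows. `TopicNames` is a hypothetical helper, not part of the Pulsar client API, and the lowercase, hyphenated names in the usage note are assumptions derived from the slide's labels.

```java
/** Hypothetical helper (not part of the Pulsar client API) that composes topic names. */
final class TopicNames {
    private TopicNames() {}

    /** Builds a persistent topic name of the form persistent://tenant/namespace/topic. */
    static String persistent(String tenant, String namespace, String topic) {
        return "persistent://" + tenant + "/" + namespace + "/" + topic;
    }
}
```

For example, `TopicNames.persistent("product-safety", "etl", "account-history")` yields `persistent://product-safety/etl/account-history`. Which namespace each topic on the slide belongs to is assumed here.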
  13. APACHE PULSAR: KEY CHARACTERISTICS. DURABILITY, MULTI-TENANCY, TIERED STORAGE, UNIFIED MESSAGING & QUEUING, HIGHLY SCALABLE
  14. APACHE PULSAR: ARCHITECTURAL DESIGN. INDEPENDENT SCALABILITY, INSTANT SCALABILITY, FAULT TOLERANCE. Diagram: Producer, Consumer, three Brokers, three Bookies.
  15. APACHE PULSAR: ACCESS PATTERNS. WRITE, TAILING READS, CATCHUP READS
  16. PULSAR FUNCTIONS: INTRODUCTION. SIMPLEST POSSIBLE API: FUNCTION OR A PROCEDURE; SUPPORT FOR MULTI-LANGUAGE (Java & Python); FLEXIBLE RUNTIME
  17. PULSAR FUNCTIONS: INTRODUCTION. Diagram: i/p topic 1, i/p topic 2, i/p topic 3 → Pulsar Function → o/p topic 1, o/p topic 2, o/p topic 3
  18. PULSAR FUNCTIONS: INTRODUCTION. Diagram: i/p topic 1 (strings) → Exclamation Function → o/p topic 2 (strings)

        import java.util.function.Function;

        public class ExclamationFunction implements Function<String, String> {
            @Override
            public String apply(String input) {
                return input + "!";
            }
        }

  19. PULSAR FUNCTIONS: PROCESSING GUARANTEES. AT LEAST ONCE, AT MOST ONCE, EXACTLY ONCE
  20. PULSAR FUNCTIONS: EVENT PROCESSING DESIGN PATTERNS. DYNAMIC DATA ROUTING, DATA FILTERING, DATA ENRICHMENT, ALERTS AND THRESHOLDS, DATA TRANSFORMATIONS, COUNTING WITH WINDOWS
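Two of the patterns named above, "counting with windows" and "alerts and thresholds", can be combined in one small function. The sketch below uses plain `java.util.function.Function`, as the Exclamation example does. `WindowedThresholdAlert`, its parameters, and the choice of a sliding count window over the mean are all assumptions for illustration, not the speakers' implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Function;

/** Hypothetical function: signals when the mean of the last N values exceeds a threshold. */
final class WindowedThresholdAlert implements Function<Double, Boolean> {
    private final int windowSize;
    private final double threshold;
    private final Deque<Double> window = new ArrayDeque<>();
    private double sum = 0.0;

    WindowedThresholdAlert(int windowSize, double threshold) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.windowSize = windowSize;
        this.threshold = threshold;
    }

    @Override
    public Boolean apply(Double value) {
        // Slide the window: add the newest value, evict the oldest once full
        window.addLast(value);
        sum += value;
        if (window.size() > windowSize) {
            sum -= window.removeFirst();
        }
        // Only alert once a full window has been observed
        return window.size() == windowSize && sum / windowSize > threshold;
    }
}
```

In a deployed function the boolean result would typically decide whether a message is published to an alert topic; that wiring is left out here.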
  21. PULSAR FUNCTIONS: DEPLOYMENT. THREADS, PROCESSES, CONTAINERS
  22. PULSAR FUNCTIONS: DEPLOYMENT. Node 1: Broker 1, Worker (Function wordcount-1, Function transform-2). Node 2: Broker 1, Worker (Function transform-1, Function dataroute-1). Node 3: Broker 1, Worker (Function wordcount-2, Function transform-3).
  23. PULSAR FUNCTIONS: DEPLOYMENT. Node 1: Worker (Function wordcount-1, Function transform-2). Node 2: Worker (Function transform-1, Function dataroute-1). Node 3: Worker (Function wordcount-2, Function transform-3). Nodes 4 to 6: Broker 1, Broker 2, Broker 3.
  24. PULSAR FUNCTIONS: DEPLOYMENT - KUBERNETES. Pods 1 to 3: Function wordcount-1, Function transform-1, Function transform-3. Pods 4 to 6: Function dataroute-1, Function wordcount-2, Function transform-2. Pods 7 to 9: Broker 1, Broker 2, Broker 3.
  25. PULSAR FUNCTIONS: USE CASES IN DATA ENGINEERING. DATA TRANSFORMATION, DATA EXTRACTION, CONTENT ROUTING & FILTERING
  26. APACHE PULSAR + MODEL SERVING
  27. ML PIPELINE: TRAINING. DISTRIBUTED ML*: Hundreds of concurrent workers; Map to serverless functions; Backend manages compute resources and task scheduling. * Figure borrowed from "A Case for Serverless Machine Learning", by Carreira et al. 2018.
  28. ML PIPELINE: INFERENCE. GUIDING DECISION MAKING. Calls for less computational power than training. RESTful endpoint for Functions ✦ Available via HTTP GET request. Model size ✦ Tens of MB to over a GB. Cold start → Warm start ✦ Function reuse. Backend manages compute resources and task scheduling.
  29. MODEL SERVING. Diagram: Model Stream (model i/p topic) and Data Stream (data i/p topic) → Model Serving Pulsar Function → Inference Stream (inference o/p topic)
  30. MODEL SERVING. Diagram: Model Stream (model i/p topic) and Data Stream (data i/p topic) → MODEL → Inference Stream (inference o/p topic)
  31. MODEL SERVING. Diagram: Model Stream (model i/p topic) and Data Stream (data i/p topic) → MODEL_old, MODEL_new → Inference Stream (inference o/p topic)
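The three diagrams above describe a function fed by two input topics: one carrying data to score and one carrying replacement models, so that the old model can be swapped for the new one without a restart. A minimal, Pulsar-independent sketch of that swap follows. `HotSwapModelServer`, `onModel`, and `onData` are hypothetical names, and modeling a "model" as a `java.util.function.Function` is an assumption; model deserialization and the way a real function tells the two topics apart are left out.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

/** Hypothetical sketch: serves predictions while allowing the model to be replaced at run time. */
final class HotSwapModelServer<I, O> {
    // The currently active model; swapped atomically so readers never see a partial update
    private final AtomicReference<Function<I, O>> current;

    HotSwapModelServer(Function<I, O> initialModel) {
        this.current = new AtomicReference<>(initialModel);
    }

    /** Would be invoked for each message arriving on the model topic. */
    void onModel(Function<I, O> newModel) {
        current.set(newModel);
    }

    /** Would be invoked for each message arriving on the data topic; returns the inference. */
    O onData(I input) {
        return current.get().apply(input);
    }
}
```

Records already being scored finish with the old model, and every record handled after `onModel` returns sees the new one, which is the behavior the MODEL_old to MODEL_new slide suggests.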
  32. MODEL SERVING: METRICS. Diagram: Model Stream (model i/p topic) and Data Stream (data i/p topic) → MODEL → Inference Stream (inference o/p topic). User Defined Metrics: Recall, Precision
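Precision and recall, the two user-defined metrics named on this slide, can be tracked incrementally from true-positive, false-positive, and false-negative counts. The sketch below assumes a binary classifier and that ground-truth labels eventually become available; `BinaryMetrics` is a hypothetical class, and how the values are reported from a function is out of scope.

```java
/** Hypothetical running precision/recall tracker for a binary classifier. */
final class BinaryMetrics {
    private long truePositives;
    private long falsePositives;
    private long falseNegatives;

    /** Records one prediction together with its ground-truth label. */
    void record(boolean predicted, boolean actual) {
        if (predicted && actual) {
            truePositives++;
        } else if (predicted) {
            falsePositives++;
        } else if (actual) {
            falseNegatives++;
        }
        // True negatives affect neither precision nor recall, so they are not counted
    }

    /** TP / (TP + FP); defined as 0 when nothing has been predicted positive. */
    double precision() {
        long denominator = truePositives + falsePositives;
        return denominator == 0 ? 0.0 : (double) truePositives / denominator;
    }

    /** TP / (TP + FN); defined as 0 when there have been no actual positives. */
    double recall() {
        long denominator = truePositives + falseNegatives;
        return denominator == 0 ? 0.0 : (double) truePositives / denominator;
    }
}
```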
  33. DATA SKETCHES AS MODELS. Approximate ✦ Probabilistic Bounds. Accuracy-Speed Trade-off. One Pass. Incremental. Low memory footprint.
  34. FLAVORS OF DATA SKETCHES: SAMPLING, FILTERING, CARDINALITY, FREQUENT ELEMENTS, ANOMALY DETECTION, QUANTILES
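The slides give code for filtering, frequent elements, and cardinality but not for sampling. One common one-pass, low-memory technique is reservoir sampling (Algorithm R); the sketch below is offered under that assumption and is not claimed to be what the speakers used. `ReservoirSampler` and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical reservoir sampler: keeps a uniform random sample of fixed size from a stream. */
final class ReservoirSampler<T> {
    private final int capacity;
    private final List<T> reservoir;
    private final Random random;
    private long seen = 0;

    ReservoirSampler(int capacity, long seed) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
        this.reservoir = new ArrayList<>(capacity);
        this.random = new Random(seed);
    }

    /** Processes one stream element in O(1) time. */
    void offer(T item) {
        seen++;
        if (reservoir.size() < capacity) {
            // Fill phase: the first `capacity` items are always kept
            reservoir.add(item);
            return;
        }
        // Pick a slot in [0, seen); the item survives only if the slot falls inside the reservoir
        long slot = (long) (random.nextDouble() * seen);
        if (slot < capacity) {
            reservoir.set((int) slot, item);
        }
    }

    /** Returns a copy of the current sample. */
    List<T> sample() {
        return new ArrayList<>(reservoir);
    }
}
```

Memory use is bounded by the capacity regardless of stream length, which matches the "one pass, incremental, low memory footprint" properties listed for data sketches.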
  35. FILTERING: BLOOM FILTER (MEMBERSHIP)

        import org.apache.pulsar.functions.api.Context;
        import org.apache.pulsar.functions.api.Function;
        import com.clearspring.analytics.stream.membership.BloomFilter;

        public class BloomFilterFunction implements Function<String, Void> {
            private final BloomFilter filter = new BloomFilter(20, 20);

            @Override
            public Void process(String input, Context context) throws Exception {
                if (!filter.isPresent(input)) {
                    filter.add(input);
                    // Route to "not seen" topic
                    context.publish("notSeenTopic", input);
                }
                return null;
            }
        }

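To show what the library call hides, here is a self-contained Bloom filter sketch: a bit array plus several hash positions per item. `SimpleBloomFilter` is a hypothetical class; the FNV-1a-style hashing with double hashing is a choice made for this illustration and is not how stream-lib's `BloomFilter` is implemented.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Hypothetical minimal Bloom filter: no false negatives, tunable false-positive rate. */
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    SimpleBloomFilter(int numBits, int numHashes) {
        if (numBits <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("numBits and numHashes must be positive");
        }
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    /** Sets every bit position derived from the item. */
    void add(String item) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(item, i));
        }
    }

    /** Returns false only if the item was definitely never added. */
    boolean mightContain(String item) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(item, i))) {
                return false;
            }
        }
        return true;
    }

    // Double hashing: the i-th position is h1 + i * h2, reduced modulo the bit count
    private int index(String item, int i) {
        byte[] data = item.getBytes(StandardCharsets.UTF_8);
        int h1 = fnv1a(data, 0x811C9DC5);
        int h2 = fnv1a(data, 0x01000193) | 1; // force odd so the step is never zero
        return Math.floorMod(h1 + i * h2, numBits);
    }

    // FNV-1a style hash with a caller-supplied starting value
    private static int fnv1a(byte[] data, int seed) {
        int h = seed;
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 16777619;
        }
        return h;
    }
}
```

An item that was added always tests positive; an item that was never added may occasionally test positive as well, which is the accuracy-for-memory trade-off of the structure.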
  36. FREQUENT ELEMENTS: COUNT-MIN SKETCH (FREQUENCY)

        import org.apache.pulsar.functions.api.Context;
        import org.apache.pulsar.functions.api.Function;
        import com.clearspring.analytics.stream.frequency.CountMinSketch;

        public class CountMinFunction implements Function<String, Void> {
            private final CountMinSketch sketch = new CountMinSketch(20, 20, 128);

            @Override
            public Void process(String input, Context context) throws Exception {
                // Calculates bit indexes and performs +1
                sketch.add(input, 1);
                long count = sketch.estimateCount(input);
                // React to the updated count
                return null;
            }
        }

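The idea behind the sketch can be shown in a few lines: a small table of counters with one hash per row, where the estimate for an item is the minimum across rows. `SimpleCountMin` is a hypothetical class and its hash mixing is an assumption for illustration; it is not stream-lib's implementation.

```java
/** Hypothetical minimal Count-Min sketch: estimates never fall below the true count. */
final class SimpleCountMin {
    private final int depth;
    private final int width;
    private final long[][] table;

    SimpleCountMin(int depth, int width) {
        if (depth <= 0 || width <= 0) {
            throw new IllegalArgumentException("depth and width must be positive");
        }
        this.depth = depth;
        this.width = width;
        this.table = new long[depth][width];
    }

    /** Adds a non-negative count to the item's counter in every row. */
    void add(String item, long count) {
        if (count < 0) {
            throw new IllegalArgumentException("count must be non-negative");
        }
        for (int row = 0; row < depth; row++) {
            table[row][bucket(item, row)] += count;
        }
    }

    /** Returns the minimum counter across rows; collisions can only inflate it. */
    long estimate(String item) {
        long min = Long.MAX_VALUE;
        for (int row = 0; row < depth; row++) {
            min = Math.min(min, table[row][bucket(item, row)]);
        }
        return min;
    }

    // Row-dependent hash: mix the row index into the item's hash code, then scramble
    private int bucket(String item, int row) {
        int h = item.hashCode() ^ (row * 0x9E3779B9);
        h ^= (h >>> 16);
        h *= 0x85EBCA6B;
        h ^= (h >>> 13);
        return Math.floorMod(h, width);
    }
}
```

Taking the minimum limits the damage from hash collisions: an item's estimate is inflated only if it collides with other items in every row.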
  37. CARDINALITY: HYPERLOGLOG (# UNIQUE ELEMENTS)

        import org.apache.pulsar.functions.api.Context;
        import org.apache.pulsar.functions.api.Function;
        import io.airlift.stats.cardinality.HyperLogLog;

        public class HyperLogLogFunction implements Function<Integer, Void> {
            private final HyperLogLog hll = HyperLogLog.newInstance(2048);

            @Override
            public Void process(Integer value, Context context) throws Exception {
                hll.add(value);
                // cardinality() returns a long estimate of the distinct count
                long numDistinctElements = hll.cardinality();
                // Do something with the distinct elements
                return null;
            }
        }

  38. SERVERLESS ML: GOING FORWARD. GPU SUPPORT: Key for Deep Learning. FAST SHARED STORAGE: Functions do not talk to each other; Example: Crail*, Pocket. EXTENDING TO THE EDGE: Functions running on ✦ Smartphones ✦ IoT Devices. * https://crail.apache.org/
  39. “It is better to fail in originality than to succeed in imitation.” Herman Melville
  40. PROJECT DESCRIPTION
  41. MORE ON PULSAR: https://streaml.io/blog/eda-real-time-analytics-with-pulsar-functions https://streaml.io/blog/eda-event-processing-design-patterns-with-pulsar-functions https://streaml.io/blog/apache-pulsar-architecture-designing-for-streaming-performance-and-scalability https://www.businesswire.com/news/home/20180306005633/en/Apache-Pulsar-Outperforms-Apache-Kafka-2.5x-OpenMessaging https://streaml.io/blog/intro-to-pulsar https://pulsar.apache.org/
  42. READINGS: Efficient Construction of Approximate Ad-Hoc ML models Through Materialization and Reuse [Hasani et al. 2018]; A Case for Serverless Machine Learning [Carreira et al. 2018]; Pocket: Elastic ephemeral storage for serverless analytics [Klimovic et al. 2018]; Serving deep learning models in a serverless platform [Ishakian et al. 2018]; PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems [Lee et al. 2018]; Cloud Programming Simplified: A Berkeley View on Serverless Computing [Jonas et al. 2019]
  43. READINGS: Serverless Is More: From PaaS to Present Cloud Computing [Eyk et al. 2018]; Serverless Computing: Current Trends and Open Problems [Baldini et al. 2018]; Clipper: A low-latency online prediction serving system [Crankshaw et al. 2017]; Borg, Omega, and Kubernetes [Burns et al. 2016]; Serverless Computation with OpenLambda [Hendrickson et al. 2016]; TensorFlow Serving [https://www.tensorflow.org/tfx/guide/serving]
  44. READINGS: Architecture of a Serverless Machine Learning Model [https://cloud.google.com/solutions/architecture-of-a-serverless-ml-model]; Pure serverless machine learning inference with AWS Lambda and Layers [https://medium.com/merapar/pure-serverless-machine-learning-inference-with-aws-lambda-and-layers-979702d9ae49]