
Serverless Streaming Architectures and Algorithms for the Enterprise

In recent years, serverless has gained momentum in the realm of cloud computing. Broadly speaking, it comprises function as a service (FaaS) and backend as a service (BaaS). The distinction between the two is that under FaaS, one writes and maintains the code (e.g., the functions) for serverless compute; in contrast, under BaaS, the platform provides the functionality and manages the operational complexity behind it. Serverless provides a great means to boost development velocity. With greatly reduced infrastructure costs, more agile and focused teams, and faster time to market, enterprises are increasingly adopting serverless approaches to gain a key advantage over their competitors.

Early use cases of serverless include data transformation in batch and ETL scenarios and data processing using MapReduce patterns. As a natural extension, serverless is being used in streaming contexts such as real-time bidding, fraud detection, and intrusion detection. Serverless is, arguably, naturally suited to extracting insights from fast data, that is, high-volume, high-velocity data. Example tasks in this regard include filtering and reducing noise in the data and leveraging machine learning and deep learning models to provide continuous insights about business operations.

We walk the audience through the landscape of streaming systems for each stage of an end-to-end data processing pipeline—messaging, compute, and storage. We overview the inception and growth of the serverless paradigm. Further, we deep dive into Apache Pulsar, which provides native serverless support in the form of Pulsar functions, and paint a bird’s-eye view of the application domains where Pulsar functions can be leveraged.

Baking in intelligence in a serverless flow is paramount from a business perspective. To this end, we detail different serverless patterns (event processing, machine learning, and analytics) for different use cases and highlight the trade-offs. We present perspectives on how advances in hardware technology and the emergence of new applications will impact the evolution of serverless streaming architectures and algorithms. The topics covered include an introduction to streaming, an introduction to serverless, serverless and streaming requirements, Apache Pulsar, application domains, serverless event processing patterns, serverless machine learning patterns, and serverless analytics patterns.

Published in: Technology

Serverless Streaming Architectures and Algorithms for the Enterprise

  1. 1. Serverless Streaming Architectures & Algorithms for the Enterprise Anurag Khandelwal, Arun Kejariwal, Karthik Ramasamy @anuragk_ @arun_kejariwal @karthikz
  2. 2. WHY SERVERLESS? 2
  3. 3. WHY BOTHER? CODE: written in a high-level language (e.g., Python, JavaScript, …). COMPUTATION: demand-driven execution; runs whenever new requests arrive. BILLING: pay based on runtime, at ~millisecond granularity.
  4. 4. SERVERLESS COMPUTING SIMPLIFIES CLOUD PROGRAMMING Upload code. Pay for what you use. Run at any scale.
  5. 5. EVENT-DRIVEN APPLICATION EXAMPLE: IMAGE RESIZING [1] Cloud Storage, Serverless (λ), Save thumbnail, Save path, Cloud Storage, Cloud Database [1] Slide adapted from talk by Eric Jonas and Johann Schleier-Smith, “A Berkeley View on Cloud Computing”
  6. 6. BATCH ANALYTICS EXAMPLE: VIDEO ANALYTICS λ No car: filter locally Car detected: analyze in cloud Analyze video using DNNs Law enforcement Traffic video analytics λ Video encoding/decoding Encoder/Decoder 6
  7. 7. STREAMING EXAMPLE: FIGHTING SPAM ON TWITTER Spammy Tweet, Regular Tweet, λ Similarity Clustering, Message Queue, Key-Value Store ✦ Fight spammy content, engagements, and behaviors on Twitter ✦ Spam campaigns come in large batches ✦ Despite randomized tweaks, enough similarity among spammy entities is preserved
  8. 8. A REAL USE-CASE: HOW FINANCIAL ENGINES CUT COSTS 90% USING SERVERLESS [1] ✦ Financial Engines: Independent Investment Advisor ๏ 9 million people across 743 companies, $1.8 trillion in assets ✦ Automated portfolio management using computational engines ๏ Core engine component: Integer programming optimizer (IPO) ๏ Linear Programming to compute optimization/feasibility [1] Financial Engines Cuts Costs 90% Using AWS Lambda and Serverless Computing, https://aws.amazon.com/solutions/case-studies/financial-engines/
  9. 9. IPO SERVER FARM 9 … Solver Library Solver Library Solver Library Solver Library ✦ IPO consumes > 30% of total CPU capacity ๏ Spikes of up to 1000 requests/s, 100ms per request ๏ Capacity planning during marketing campaigns that produce large traffic spikes is hard… 40 IPO Servers
  10. 10. NEED TO DO A LOT OF WORK … 10 ✦ Scaling in response to load variations ✦ Request routing and load balancing ✦ Monitoring to respond to problems ✦ Provision servers based on budget, requirements ✦ System upgrades, including security patching ✦ Migration to new hardware as it becomes available …
  11. 11. A REAL USE-CASE: HOW FINANCIAL ENGINES CUT COSTS 90% USING SERVERLESS [1] λ Solver Library, λ Solver Library, λ Solver Library ✦ AWS Lambda function for each IPO request ๏ Run as many copies of the IPO function as needed in parallel ✦ Serverless benefits ๏ Up to 94% cost savings annually, not including operational savings ๏ 200-300 M IPO requests/month, 60,000 per minute at peak ๏ Increased reliability: just instantiate new lambda requests on crash [1] Financial Engines Cuts Costs 90% Using AWS Lambda and Serverless Computing, https://aws.amazon.com/solutions/case-studies/financial-engines/
  12. 12. EVOLUTION OF CLOUD PLATFORMS On-prem virtualization, Platform as a Service (PaaS), Backend as a Service (BaaS), Container Orchestration, Serverless Platforms. App Engine, Heroku. Borg, Kubernetes. ✦ AWS Lambda, Google Cloud Functions, Azure Functions ✦ Big Query, DynamoDB ✦ Cloud Dataflow ✦ Easy switch from legacy infrastructure ✦ Added cloud services (e.g., storage, pub-sub). VMs in the cloud
  13. 13. EVOLUTION OF SHARING RESOURCES App Runtime OS Hardware No Sharing App Runtime OS Hardware VM App Runtime OS VM Virtual Machines OS Hardware App Runtime App Runtime Containers Runtime OS Hardware App App FaaS Increasing Virtualization [1] [1] Serverless Computation with OpenLambda, Hendrickson et al.
  14. 14. SERVERLESS TODAY: FUNCTION-AS-A-SERVICE (FAAS) ✦ Many FaaS platforms: AWS Lambda, Google Cloud Functions, IBM Cloud Functions, Azure Functions, Cloudflare Workers, Alibaba Function Compute ✦ Different pricing models, resource allocations ✦ Security and isolation support ✦ Programming language support, OS support, etc. [1,2] [1] Peeking Behind the Curtains of Serverless Platforms, Wang et al. [2] Evaluation of Production Serverless Computing Environments, Lee et al.
  15. 15. FAAS ORCHESTRATION [1] ✦ Many orchestration frameworks: AWS Step Functions, Azure Durable Functions, IBM Composer ✦ Varying pricing models, programming models, parallel execution support, state management, architectures, etc. [1] ✦ Serverless trilemma: ๏ black boxes ๏ substitution principle ๏ double-billing [1] Comparison of FaaS Orchestration Systems, Lopez et al.
  16. 16. SERVERLESS IS MORE THAN FaaS … Serverless = FaaS + BaaS ✦ Object Storage (e.g., S3) ✦ Key-Value Stores (e.g., DynamoDB) ✦ Database (e.g., Cloud Firestore) ✦ Data Processing (e.g., Cloud Dataflow) ✦ Complexity Hiding ✦ Consumption based billing ✦ Automatic scaling λ Storage Database FaaS Data Processing Messaging
  17. 17. … NOT EVERYTHING IS SERVERLESS! ✦ The “buzzword” effect ๏ Cloud providers market services as “serverless” without its properties: ๏ Complexity hiding ๏ Consumption-based billing ๏ Automatic scaling ✦ “Semi”-serverless ๏ Do not provide one or more of these properties
  18. 18. PLAYERS IN SERVERLESS: EVERYONE IS A WINNER Cloud Provider, Developer, Enterprise
  19. 19. DEVELOPER BENEFITS 19 Developer ✦ Simplified programming ✦ Delegate scaling, scheduling, etc., to cloud def function(event, context): doComplexComputation() Resources automatically scale with load Close to zero configuration No scheduling, load balancing, ….
  20. 20. ENTERPRISE BENEFITS 20 Enterprise ✦ Delegate DevOps to cloud ✦ Cost savings: pay for what you use Time Resources Used Paid for (server-based) Paid for (serverless)
  21. 21. THE COST OF SERVERLESS Function Execution Cost ✦ Charged at ~100ms granularity ✦ Charged per GB memory Data Transfer Cost ✦ Charged per GB ✦ Function fusion: combine functions to avoid data transfer for performance and cost ๏ But fusing functions with different memory requirements can be expensive… ✦ Function placement: place function close to source for cost savings ๏ But limited compute power at source may slow things down… ✦ How to balance cost with performance? [1] ✦ Use fusion and placement judiciously to optimize cost and performance [1] Costless: Optimizing Cost of Serverless Computing through Function Fusion and Placement, Elgamal et al.
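The fusion trade-off on this slide can be made concrete with a small cost model. The sketch below is illustrative only: the two prices are hypothetical placeholders rather than any provider's actual rates, and the helper names (`execution_cost`, `separate_cost`, `fused_cost`) are invented for this example. It assumes a fused function is billed for the combined runtime at the larger of the two memory sizes.

```python
import math

# Hypothetical prices for illustration only (not any provider's real rates).
PRICE_PER_GB_SECOND = 0.0000167
PRICE_PER_GB_TRANSFER = 0.09
BILLING_QUANTUM_MS = 100  # the slide's ~100 ms billing granularity


def execution_cost(duration_ms, memory_gb):
    """Cost of one invocation; runtime is rounded up to the billing quantum."""
    billed_ms = math.ceil(duration_ms / BILLING_QUANTUM_MS) * BILLING_QUANTUM_MS
    return (billed_ms / 1000) * memory_gb * PRICE_PER_GB_SECOND


def separate_cost(f, g, payload_gb):
    """Two functions (duration_ms, memory_gb) plus the transfer between them."""
    return execution_cost(*f) + execution_cost(*g) + payload_gb * PRICE_PER_GB_TRANSFER


def fused_cost(f, g):
    """One fused function: no transfer, but billed at the larger memory size."""
    return execution_cost(f[0] + g[0], max(f[1], g[1]))


# Similar memory needs and a real payload: fusion is cheaper.
same = ((250, 0.5), (250, 0.5))
print(fused_cost(*same) < separate_cost(*same, payload_gb=0.001))

# Very different memory needs and a tiny payload: fusion is more expensive.
skewed = ((100, 0.125), (100, 3.0))
print(fused_cost(*skewed) > separate_cost(*skewed, payload_gb=1e-9))
```

Under these assumed numbers both comparisons hold, which matches the slide's advice to apply fusion selectively rather than everywhere.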
  22. 22. PROVIDER BENEFITS Cloud Provider ✦ Higher utilization by multiplexing resources across users Time Resources Capacity User1 User2 User3
  23. 23. APPLICATION DOMAINS 23
  24. 24. USE CASES 24 Streaming data transformation Data distribution Real-time analytics Real-time monitoring and notifications IoT analytics ! Event-driven workflows SERVERLESS Interactive applications Log processing and analytics
  25. 25. TRADING SUPPORT PLATFORM Scenario ✦ Major bank looking to move to next-generation data pipeline to support continuous reconciliation of trading activity Challenges ✦ Zero tolerance for data loss ✦ Performance at scale difficult to achieve ✦ Need to support future data and usage growth 25
  26. 26. INDUSTRIAL IOT ANALYTICS Data from sensors on power generation equipment Combined with data from sensors in distribution network Brought together and analyzed in the cloud For immediate insights into capacity, failures, alerts ! 26
  27. 27. STREAMING DATA TRANSFORMATIONS 27 Move best-fit transformations and those needed for fast data access into streaming systems Provide users and applications access to data at multiple stages of transformation Leverage batch systems for specialized capabilities and complex transformations
  28. 28. CONNECTED VEHICLE 28 Scenario Continuously-arriving data generated by connected cars needs to be quickly collected, processed and distributed to applications and partners Challenges Require scalability to handle growing data sources and volumes without complex mix of technologies Solution Leverage Apache Pulsar solution to provide data backbone that can receive, transform, and distribute data at scale
  29. 29. CONNECTED VEHICLE 29 Telemetry data from connected vehicles transmitted and published to Pulsar Data cleansing, enrichment and refinement processed inside Pulsar Data made available to internal teams for analysis and reports Data feeds supplied to partners and partner applications
  30. 30. DATA DRIVEN WORKFLOWS 30 Scenario Application processes incoming events and documents that generate processing workflows Challenges Operational burdens and scalability challenges of existing technologies growing as data grows Solution Process incoming events and data and create work queues in same system Decrypt, extract, convert, dispatch, process, store
  31. 31. APPLICATION CHARACTERISTICS 31
  32. 32. BIG DATA ANALYTICS Analyze volumes of data. Wide range of applications: text analytics, machine learning, predictive analytics, data mining, statistics, natural language processing. Why Serverless? No server management, transparent resource elasticity, pay for what you use. Building Analytics on FaaS platforms: PyWren, Flint, Locus, ExCamera, …
  33. 33. BIG DATA ANALYTICS: SORT Partition Tasks (λ) and Merge Tasks (λ), with input and output on S3 and intermediate data on S3 OR REDIS. Service / Capacity / IOPS: S3 / High / Low; Redis / Low / High
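The partition-then-merge structure of the sort slide can be sketched in a few lines. This is a single-process illustration, not the deck's implementation: a plain dict stands in for the intermediate store (S3 or Redis on the slide), each call to `partition_task` or `merge_task` stands in for one function invocation, and all names and the `boundaries` parameter are assumptions made for the example.

```python
import bisect


def partition_task(task_id, chunk, boundaries, store):
    """Range-partition one input chunk and write each bucket to the store."""
    buckets = {r: [] for r in range(len(boundaries) + 1)}
    for value in chunk:
        buckets[bisect.bisect_right(boundaries, value)].append(value)
    for r, values in buckets.items():
        store[(task_id, r)] = values


def merge_task(r, num_partition_tasks, store):
    """Gather bucket r from every partition task and sort it."""
    values = []
    for task_id in range(num_partition_tasks):
        values.extend(store[(task_id, r)])
    return sorted(values)


def serverless_sort(chunks, boundaries):
    store = {}  # stand-in for the intermediate S3/Redis layer
    for task_id, chunk in enumerate(chunks):
        partition_task(task_id, chunk, boundaries, store)
    result = []
    for r in range(len(boundaries) + 1):
        result.extend(merge_task(r, len(chunks), store))
    return result


print(serverless_sort([[5, 1, 9], [3, 8, 2]], boundaries=[4]))
```

Every partition task writes one object per range and every merge task reads one object per partition task, so the number of small reads and writes grows with the product of the two task counts. That is why the capacity-versus-IOPS contrast between S3 and Redis on the slide matters.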
  34. 34. BIG DATA ANALYTICS: LOCUS … S3 λ λ λ λ REDIS λ PARTITION λ MERGE λ FINAL MERGE Hybrid Sort 34
  35. 35. BIG DATA ANALYTICS: FLINT Input Partition Input Partition Input Partition Output Partition Output Partition Flint Executor Flint Executor Flint Executor Flint Executor Flint Executor Queue Queue S3 Lambda SQS AWSClient Spark Context Flint Scheduler
  36. 36. How is it done today? ✦ Video = Series of Chunks ๏ Chunk = KeyFrame (large) + InterFrames (small deltas from KeyFrame) Thread#1 Thread#2 Thread#3 Thread#4 1 5 6 KF I I… Frames: Encoded: 1 5 6 KF I I… 1 5 6 KF I I… 1 5 6 KF I I… VIDEO ENCODING/DECODING 36 ✦ High parallelism = worse compression (more KeyFrames)
  37. 37. VIDEO ANALYTICS: EXCAMERA VIDEO ENCODING/DECODING ON AWS LAMBDA Lambda#1 Lambda#2 Lambda#3 Lambda#4 1 5 6 KF I I… 1 5 6 KF I I… 1 5 6 KF I I… 1 5 6 KF I I… 37
  38. 38. VIDEO ANALYTICS: EXCAMERA VIDEO ENCODING/DECODING ON AWS LAMBDA Lambda#1 Lambda#2 Lambda#3 Lambda#4 Serial Pass: Rebase 1 5 6 KF I I… 1 5 6 I I… 1 5 6 I I… 1 5 6 I I… State State StateI I I 37 ✦ 60X faster and 6x cheaper than Google’s vpxenc on 128 cores
  39. 39. VIDEO ANALYTICS: EXCAMERA Making lambdas talk to each other ✦ Lambdas are only permitted outbound TCP/IP connections ✦ Establish outbound cxns to rendezvous server (R) at init ✦ If A wants to talk to B, it sends R an init msg connect(A, B) ๏ R forwards all of A’s subsequent msgs to B Rendezvous Server (R) A B C ... Lambdas 38
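The rendezvous protocol described on this slide can be modelled without any networking. In the sketch below, in-memory lists stand in for the outbound TCP connections that each lambda opens at init, and the class and method names are hypothetical. It shows only the routing logic: `connect(A, B)` installs a route, and R forwards A's later messages to B.

```python
class RendezvousServer:
    """In-memory model of the relay R; real lambdas would use outbound TCP."""

    def __init__(self):
        self.inboxes = {}  # worker id -> messages forwarded to that worker
        self.routes = {}   # sender id -> receiver id

    def register(self, worker_id):
        """Models a lambda opening its outbound connection to R at init."""
        self.inboxes[worker_id] = []

    def connect(self, src, dst):
        """Models the init message connect(src, dst) sent to R."""
        if src not in self.inboxes or dst not in self.inboxes:
            raise KeyError("both workers must be registered with R first")
        self.routes[src] = dst

    def send(self, src, message):
        """R forwards every subsequent message from src to its peer."""
        self.inboxes[self.routes[src]].append((src, message))


relay = RendezvousServer()
for worker in ("A", "B", "C"):
    relay.register(worker)
relay.connect("A", "B")
relay.send("A", "encoder state for chunk 2")
print(relay.inboxes["B"])
```

The point of the design is that neither lambda ever accepts an inbound connection; both only talk to R.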
  40. 40. APACHE PULSAR 39
  41. 41. APACHE PULSAR OVERVIEW Cloud Native Messaging + Compute System backed by a durable log storage
  42. 42. KEY CHARACTERISTICS MULTI-TENANCY DURABILITY TIERED STORAGE UNIFIED MESSAGE & QUEUING 41 HIGHLY SCALABLE
  43. 43. CORE CONCEPTS 42 Apache Pulsar Cluster Product Safety ETL Fraud Detection Topic-1 Account History Topic-2 User Clustering Marketing Campaigns ETL Topic-1 Budgeted Spend Topic-2 Demographic Classification Topic-1 Location Resolution Data Serving Microservice Topic-1 Customer Authentication Tenants Namespaces TENANTS, NAMESPACES & TOPICS Topic-1 Risk Classification
  44. 44. TOPICS AND STREAMS 43 TopicProducers Consumers Time Consumers Consumers Producers
  45. 45. TOPIC PARTITIONS 44 Topic - P0 Time Topic - P1 Topic - P2 Producers Producers Consumers Consumers Consumers
  46. 46. PARTITIONS AND SEGMENTS 45 Time Segment 1 Segment 2 Segment 3 Segment 1 Segment 2 Segment 3 Segment 4 Segment 1 Segment 2 Segment 3 P0 P1 P2
  47. 47. STREAMING CONSUMPTION - EXCLUSIVE SUBSCRIPTION 46 Pulsar topic/ partition Producer 2 Producer 1 Consumer 1 Consumer 2 Subscription A M4 M3 M2 M1 M0 M4 M3 M2 M1 M0 X Exclusive
  48. 48. STREAMING CONSUMPTION - FAILOVER SUBSCRIPTION 47 Pulsar topic/ partition Producer 2 Producer 1 Consumer 1 Consumer 2 Subscription B M4 M3 M2 M1 M0 M4 M3 M2 M1 M0 Failover In case of failure in consumer 1
  49. 49. MESSAGE QUEUEING - SHARED SUBSCRIPTION 48 Pulsar topic/ partition Producer 2 Producer 1 Consumer 2 Consumer 3 Subscription C M4 M3 M2 M1 M0 Shared Traffic is equally distributed across consumers Consumer 1 M4M3 M2M1M0
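The difference between the failover and shared subscription modes on the preceding slides can be sketched as two dispatch rules. This is a simplified model under stated assumptions, not Pulsar's broker code: the shared mode is modelled as strict round-robin, the failover mode as "first live consumer receives everything", and the function names are invented for the example.

```python
from itertools import cycle


def dispatch_shared(messages, consumers):
    """Shared subscription: spread messages round-robin over all consumers."""
    assignment = {c: [] for c in consumers}
    for message, consumer in zip(messages, cycle(consumers)):
        assignment[consumer].append(message)
    return assignment


def dispatch_failover(messages, consumers, failed=()):
    """Failover subscription: the first live consumer receives every message."""
    for consumer in consumers:
        if consumer not in failed:
            return {consumer: list(messages)}
    raise RuntimeError("no live consumer on the subscription")


msgs = ["M0", "M1", "M2", "M3", "M4"]
print(dispatch_shared(msgs, ["consumer-1", "consumer-2", "consumer-3"]))
print(dispatch_failover(msgs, ["consumer-1", "consumer-2"], failed={"consumer-1"}))
```

Shared dispatch gives up per-topic ordering across consumers in exchange for parallelism, while failover keeps a single ordered reader at any given time.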
  50. 50. MULTI-LAYER AND SCALABLE ARCHITECTURE Consumer Producer Producer Producer Consumer Consumer Consumer Messaging Broker Broker Broker Bookie Bookie Bookie Bookie Bookie Event storage Function Processing Worker Worker ✦ Independent layers for processing, serving and storage ✦ Messaging and processing built on Apache Pulsar ✦ Storage built on Apache BookKeeper
  51. 51. DATA FLOW 50 Bookie Bookie BookieBroker Producer Journal Journal Journal fsync fsync fsync Segment storage Segment storage Segment storage background process Consumer
  52. 52. STORAGE ARCHITECTURE Logical View Partition Processing & Storage Segment 1 Segment 2 Segment 3 … Segment n Partition Broker Partition (primary) Broker Partition (copy) Broker Partition (copy) Broker Broker Broker Processing (brokers) Warm Storage ✦ Storage co-resident with serving ✦ Partition centric ✦ Cumbersome to scale ๏ Data redistribution ๏ Performance impact ✦ Storage decoupled from processing ✦ Partition stored as segments ✦ Flexible and easy scalability
  53. 53. DATA ACCESS PATTERNS DATA WORK LOAD WRITES TAILING READS CATCHUP READS HISTORICAL READS HCTW 52
  54. 54. DATA ACCESS PATTERNS Partition Broker Broker Broker Processing (brokers) Warm Storage Cold Storage Tailing reads: served from in-memory cache Catch-up reads: served from persistent storage layer Historical reads: served from cold storage
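The three read paths on this slide amount to a routing decision based on how far behind the tail a reader is. The sketch below is a deliberately simplified illustration: the offsets, the two thresholds, and the function name are hypothetical and are not Pulsar configuration settings.

```python
def read_tier(offset, cache_start, warm_start):
    """Pick the tier that serves a read at `offset`.

    cache_start: oldest offset still held in the broker's in-memory cache.
    warm_start:  oldest offset still held in the persistent storage layer.
    Both thresholds are illustrative parameters, not real Pulsar settings.
    """
    if offset >= cache_start:
        return "in-memory cache"      # tailing read
    if offset >= warm_start:
        return "persistent storage"   # catch-up read
    return "cold storage"             # historical read


for offset in (990, 500, 10):
    print(offset, read_tier(offset, cache_start=900, warm_start=100))
```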
  55. 55. PULSAR FEATURES 54
  56. 56. MULTITENANCY 55 SEVERAL TEAMS SHARING THE SAME CLUSTER ✦ Authentication / Authorization / Namespaces / Admin APIs ✦ I/O isolation between writes and reads ๏ Provided by storage layer - ensure readers draining backlog won’t affect publishers Soft isolation ✦ Storage quotas — flow-control — back-pressure — rate limiting Hardware isolation ✦ Constrain some tenants on a subset of brokers or bookies
  57. 57. STORAGE TIERING 56 TAKING ADVANTAGE OF LOW COST CLOUD STORAGE ✦ Offload cold topic data to lower-cost storage (e.g. cloud storage, HDFS) ✦ Manual or automatic (configurable threshold) ✦ Transparent to publishers and consumers ✦ Allows near-infinite event storage at low cost Cold storage Hot storage Topic
  58. 58. SCHEMA REGISTRY 57 MAKING SENSE OF THE BYTES IN DATA ✦ Provides type safety to applications built on top of Pulsar ✦ Two approaches ๏ Client side enforcement: type safety enforcement up to the application ๏ Server side enforcement: system enforces type safety and ensures that producers and consumers remain synced ✦ Schema registry enables clients to upload data schemas on a topic basis. ✦ Schemas dictate which data types are recognized as valid for that topic ✦ Supports JSON, protobuf, binary schemas
  59. 59. SCHEMA REGISTRY MAKING SENSE OF THE BYTES IN DATA ✦ Means for publishers and consumers to communicate structure of topic data ✦ Validates schema as data is published ✦ Supports JSON, protobuf, binary schemas PulsarClient client = PulsarClient.builder() .serviceUrl("pulsar://localhost:6650") .build(); Producer<SensorReading> producer = client.newProducer(JSONSchema.of(SensorReading.class)) .topic("sensor-data") .create(); Consumer<SensorReading> consumer = client.newConsumer(JSONSchema.of(SensorReading.class)) .topic("sensor-data") .subscriptionName("sensor-subscriber") .subscribe();
  60. 60. ON THE FLY SCALABILITY 59 ADJUST PULSAR ON DEMAND BASED ON LOAD Scale serving ✦ New nodes immediately available to process requests, no data rebalancing required Scale processing ✦ Add threads, processes or containers to increase parallelism Scale storage retention ✦ Add nodes to increase capacity, no data redistribution required Messaging Broker Broker Broker Bookie Bookie Bookie Bookie Bookie Stream storage Processing WorkerWorker
  61. 61. TOPIC COMPACTION 60 ADJUST PULSAR ON DEMAND BASED ON LOAD ✦ Efficient way to enable consumer to catch up to current state ✦ Process that creates version of a topic that only has current values for each key ✦ Triggered via simple command {key: “A”, value: “foo”} {key: “B”, value: “foobar”} {key: “B”, value: “bar”} {key: “A”, value: “binky”} {key: “A”, value: “bar”} Complete topic Compacted topic {key: “B”, value: “foobar”} {key: “A”, value: “bar”}
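Compaction as described on this slide keeps only the current value for each key. The sketch below illustrates that idea on a list of dicts. It is not Pulsar's compaction code, the function name is invented, and the order of messages in the complete topic is an assumption, since the slide's layout does not survive in the transcript.

```python
def compact(messages):
    """Return one message per key, holding that key's most recent value."""
    latest = {}
    for message in messages:
        # Re-insert the key so the result is ordered by each key's last write.
        latest.pop(message["key"], None)
        latest[message["key"]] = message["value"]
    return [{"key": k, "value": v} for k, v in latest.items()]


# Assumed publish order, oldest first.
complete_topic = [
    {"key": "A", "value": "foo"},
    {"key": "B", "value": "bar"},
    {"key": "A", "value": "binky"},
    {"key": "B", "value": "foobar"},
    {"key": "A", "value": "bar"},
]
print(compact(complete_topic))
```

With this assumed ordering the result holds `B: foobar` and `A: bar`, the same two entries the slide shows in its compacted topic. A consumer that only needs current state reads two messages instead of five.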
  62. 62. SQL QUERYING Enable SQL clients to directly query data in Streamlio ✦ Integrated with schema registry ✦ Uses Presto as query engine ✦ Query engine reads data directly from storage layer ✦ Data visible to SQL engine as soon as published Processing Messaging and queuing Stream storage Data Access Msg Queue Pub-Sub SQL engine (Presto) Functions SQL Clients Metadata
  63. 63. INTERACTIVE QUERYING USING SQL 1234…20212223…40414243…60616263… Segment 1 Segment 3 Segment 2 Segment 2 Segment 1 Segment 3 Segment 4 Segment 3 Segment 2 Segment 1 Segment 4 Segment 4 Segment Reader Segment Reader Segment Reader Segment Reader Coordinator
  64. 64. PULSAR AVAILABILITY AND RESILIENCY
  65. 65. DURABILITY 64 (CONTD.) Bookie Bookie BookieBrokerProducer Journal Journal Journal fsync fsync fsync Segment storage Segment storage Segment storage background process https://drivescale.com/2017/03/whatever-happened-durability/
  66. 66. RESILIENCY AND RECOVERY BROKER, BOOKIE AND DATA CENTER FAILURES Segment 1 Segment 2 … Segment n Segment 2 Segment 3 … Segment n Segment 3 Segment 1 … Segment n Segment 1 Segment 2 … Segment n Storage Broker Serving Broker Broker ✦ Broker Failure ๏ Topic reassigned to available broker based on load ๏ Can construct the previous state consistently ๏ No data needs to be copied ✦ Bookie Failure ๏ Immediate switch to a new node ๏ Background process copies segments to other bookies to maintain replication factor ✦ Datacenter Failure ๏ Built-in multi-datacenter replication ๏ Brokers in any datacenter can immediately serve replicated topics
  67. 67. BROKER FAILURE RECOVERY 66 BROKER, BOOKIE AND DATA CENTER FAILURES ๏ Topic reassigned to available broker based on load ๏ Can construct the previous state consistently ๏ No data needs to be copied ๏ Failure handled transparently by client library
  68. 68. BOOKIE FAILURE RECOVERY 67 1234…20212223…40414243…60616263… Segment 1 Segment 3 Segment 2 Segment 2 Segment 1 Segment 3 Segment 4 Segment 3 Segment 2 Segment 1 Segment 4 Segment 4
  69. 69. BOOKIE FAILURE RECOVERY 68 ๏ After a write failure, BookKeeper will immediately switch write to a new bookie, within the same segment ๏ As long as we have any 3 bookies in the cluster, we can continue to write ๏ In background, starts a many-to-many recovery process to regain the configured replication factor
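The write-path behavior on this slide (swap the failed bookie for a spare and keep writing) can be sketched as a small function. This is a toy model under stated assumptions, not BookKeeper's client logic: the ensemble is a plain list, the function name is hypothetical, and the background re-replication step is left out.

```python
def replace_failed_bookie(ensemble, failed, live_bookies):
    """Return a new write ensemble with `failed` swapped for a live spare.

    ensemble:     bookies currently receiving writes for the segment.
    live_bookies: every bookie in the cluster that is still healthy.
    """
    spares = [b for b in live_bookies if b not in ensemble and b != failed]
    if not spares:
        raise RuntimeError("not enough live bookies to continue writing")
    return [spares[0] if bookie == failed else bookie for bookie in ensemble]


ensemble = ["bookie-1", "bookie-2", "bookie-3"]
live = ["bookie-1", "bookie-3", "bookie-4"]
print(replace_failed_bookie(ensemble, "bookie-2", live))
```

In this model, writes continue as long as a spare is available to bring the ensemble back to three live bookies. Restoring the replication factor for data already written is the separate background recovery the slide mentions.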
  70. 70. SEAMLESS CLUSTER EXPANSION 1234…20212223…40414243…60616263… Segment 1 Segment 3 Segment 2 Segment 2 Segment 1 Segment 3 Segment 4 Segment 3 Segment 2 Segment 1 Segment 4 Segment 4 Segment Y Segment Z Segment X 69
  71. 71. MULTI-DATACENTER REPLICATION 70 ๏ Scalable asynchronous replication ๏ Integrated in the broker message flow ๏ Simple configuration to add/remove regions Topic (T1) Topic (T1) Topic (T1) Subscription (S1) Subscription (S1) Producer (P1) Consumer (C1) Producer (P3) Producer (P2) Consumer (C2) Data Center A Data Center B Data Center C DISASTER RECOVERY
  72. 72. SYNCHRONOUS REPLICATION DISASTER RECOVERY ✦ Each topic owned by one broker at a time, i.e., in one datacenter ✦ ZooKeeper cluster spread across multiple locations ✦ Broker commits writes to bookies in both datacenters ✦ In event of datacenter failure, broker in surviving datacenter assumes ownership of topic ZooKeeper Producers Datacenter 1 Consumers Pulsar Cluster Datacenter 2 Producers Consumers
  73. 73. ASYNCHRONOUS REPLICATION DISASTER RECOVERY Producers (active) Datacenter 1 Consumers (active) Pulsar Cluster (primary) Datacenter 2 Producers (standby) Consumers (standby) Pulsar Cluster (standby) Pulsar replication ZooKeeper ZooKeeper ✦ Two independent clusters, primary and standby ✦ Configured tenants and namespaces replicate to standby ✦ Data published to primary is asynchronously replicated to standby ✦ Producers and consumers restarted in second datacenter upon primary failure 72
  74. 74. REPLICATED SUBSCRIPTIONS DISASTER RECOVERY Producers Datacenter 1 Consumers Pulsar Cluster 1 Subscriptions Datacenter 2 Consumers Pulsar Cluster 2 Subscriptions Pulsar Replication MarkerMarker Marker 73
  75. 75. GROWING ECOSYSTEM OF APACHE PULSAR DISASTER RECOVERY 74
  76. 76. ADOPTED BY 300+ COMPANIES 75
  77. 77. APACHE PULSAR COMMUNITY ✦ Twitter: @apache_pulsar ✦ WeChat Subscription: ApachePulsar ✦ Mailing Lists: dev@pulsar.apache.org, users@pulsar.apache.org ✦ Slack: https://apache-pulsar.slack.com ✦ Localization: https://crowdin.com/project/apache-pulsar ✦ GitHub: https://github.com/apache/pulsar, https://github.com/apache/bookkeeper
  78. 78. APACHE PULSAR AS A SAAS - PREVIEW https://cloud.streamlio.com 77
  79. 79. PULSAR FUNCTIONS 78
  80. 80. COMPUTE REPRESENTATION - ABSTRACT VIEW 79 f(x) Incoming Messages Output Messages
  81. 81. WHAT’S NEEDED: STREAM NATIVE COMPUTATION 80 ✦ Simplest possible API ๏ Method/Procedure/Function ๏ Multi Language API ๏ Scale developers ✦ Message bus native concepts ๏ Input/Output/Log as topics ✦ Flexible runtime ๏ Simple standalone applications vs system managed applications
  82. 82. PULSAR FUNCTIONS 81 Execute user-defined functions to process and transform data ✦ Dynamic filtering, transformation, routing and analytics ✦ Easy for developers: serverless deployment, fully managed by cluster ✦ Multiple input topics, multiple output topics ✦ Access to windows of messages ✦ Integrated global state storage ✦ Integrated with schema registry f(x)
  83. 83. PULSAR FUNCTIONS 82 SDK-LESS API import java.util.function.Function; public class ExclamationFunction implements Function<String, String> { @Override public String apply(String input) { return input + "!"; } }
  84. 84. PULSAR FUNCTIONS 83 SDK API import org.apache.pulsar.functions.api.PulsarFunction; import org.apache.pulsar.functions.api.Context; public class ExclamationFunction implements PulsarFunction<String, String> { @Override public String process(String input, Context context) { return input + "!"; } }
  85. 85. PULSAR FUNCTIONS 84 INPUT AND OUTPUT ✦ Function executed for every message of input topic ๏ Supports multiple topics as inputs ✦ Function Output goes to the output topic ๏ Function Output can be void/null ✦ SerDe takes care of serialization/deserialization of messages ๏ Custom SerDe can be provided by the users ๏ Integrates with Schema Registry
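The input/output rules above can be illustrated with the SDK-less style, which needs only `java.util.function.Function`. This is a minimal sketch, not taken from the deck: the class name `NonBlankUppercaseFunction` is hypothetical, and it assumes (per the slide) that a null return value means no message is published to the output topic.

```java
import java.util.Locale;
import java.util.function.Function;

// Hypothetical filter-and-transform function in the SDK-less style.
// Blank messages are dropped; everything else is trimmed and upper-cased.
final class NonBlankUppercaseFunction implements Function<String, String> {
    @Override
    public String apply(String input) {
        if (input == null || input.trim().isEmpty()) {
            // Assumption from the slide: a null output publishes nothing.
            return null;
        }
        return input.trim().toUpperCase(Locale.ROOT);
    }
}
```

Because the class depends only on the JDK, it can be unit-tested without a running cluster.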
  86. 86. PULSAR FUNCTIONS 85 AT MOST ONCE AT LEAST ONCE EXACTLY ONCE PROCESSING GUARANTEES
  87. 87. PULSAR FUNCTIONS 86 AS A STANDALONE APPLICATION bin/pulsar-admin functions localrun --input persistent://sample/standalone/ns1/test_input --output persistent://sample/standalone/ns1/test_result --className org.mycompany.ExclamationFunction --jar myjar.jar ✦ Runs as a standalone process ✦ Run as many instances as you want. Framework automatically balances data ✦ Run and manage via Mesos/K8/Nomad/your favorite tool
  88. 88. PULSAR FUNCTIONS 87 RUNNING INSIDE PULSAR CLUSTER ✦ ‘Create’ and ‘Delete’ Functions in a Pulsar Cluster ✦ Pulsar brokers run functions as either threads/processes/docker containers ✦ Unifies Messaging and Compute cluster into one, significantly improving manageability ✦ Ideal match for Edge or small startup environment ✦ Serverless in a jar
  89. 89. PULSAR FUNCTIONS - DEPLOYMENT CONTAINERS THREADS PROCESSES 88
  90. 90. PULSAR FUNCTIONS - DEPLOYMENT 89 (CONTD.) Broker 1 Worker Function wordcount-1 Function transform-2 Broker 1 Worker Function transform-1 Function dataroute-1 Broker 1 Worker Function wordcount-2 Function transform-3 Node 1 Node 2 Node 3
  91. 91. PULSAR FUNCTIONS - DEPLOYMENT 90 (CONTD.) Worker Function wordcount-1 Function transform-2 Worker Function transform-1 Function dataroute-1 Worker Function wordcount-2 Function transform-3 Node 1 Node 2 Node 3 Broker 1 Broker 2 Broker 3 Node 4 Node 5 Node 6
  92. 92. PULSAR FUNCTIONS - DEPLOYMENT 91 (CONTD.) Function wordcount-1 Function transform-1 Function transform-3 Pod 1 Pod 2 Pod 3 Broker 1 Broker 2 Broker 3 Pod 7 Pod 8 Pod 9 Function dataroute-1 Function wordcount-2 Function transform-2 Pod 4 Pod 5 Pod 6
  93. 93. STATE MANAGEMENT IN PULSAR FUNCTIONS 92
  94. 94. PULSAR FUNCTIONS BUILT-IN STATE ✦ Functions can store state in stream storage ๏ Framework provides a simple library around this ✦ Support server side operations like counters ✦ Simplified application development ๏ No need to stand up an extra system
  95. 95. PULSAR FUNCTIONS BUILT-IN STATE MANAGEMENT ✦ Pulsar uses BookKeeper as its stream storage ✦ Functions can store State in BookKeeper ✦ Framework provides the Context object for users to access State ✦ Support server side operations like Counters ✦ Simplified application development ๏ No need to stand up an extra system to develop/test/integrate/operate
  96. 96. PULSAR FUNCTIONS STATE EXAMPLE import org.apache.pulsar.functions.api.Context; import org.apache.pulsar.functions.api.PulsarFunction; public class CounterFunction implements PulsarFunction<String, Void> { @Override public Void process(String input, Context context) throws Exception { /* String.split takes a regex, so the literal dot must be escaped */ for (String word : input.split("\\.")) { context.incrCounter(word, 1); } return null; } }
  97. 97. PULSAR FUNCTIONS 96 STATE IMPLEMENTATION ✦ The built-in state management is powered by Table Service in BookKeeper ✦ BP-30: Table Service ๏ Originated for a built-in metadata management within BookKeeper ๏ Expose for general usage. e.g. State management for Pulsar Functions ✦ Available from Pulsar 2.4
  98. 98. PULSAR FUNCTIONS STATE IMPLEMENTATION ✦ Updates are written to log streams in BookKeeper ✦ Materialized into a key/value table view ✦ The key/value table is indexed with RocksDB for fast lookup ✦ The source of truth is the log streams in BookKeeper ✦ RocksDB instances are transient key/value indexes ✦ RocksDB instances are incrementally checkpointed and stored in BookKeeper for fast recovery
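The log-plus-materialized-view design described above can be sketched in plain Java. This is an in-memory illustration under stated assumptions, not the BookKeeper Table Service: the class `LogBackedCounterTable` and its methods are hypothetical, a list stands in for the log stream, a hash map stands in for the RocksDB index, and the checkpoint is a full snapshot rather than an incremental one.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model: the log is the source of truth,
// the map is a transient index that can always be rebuilt from it.
final class LogBackedCounterTable {
    /** One log record: add delta to the counter stored under key. */
    static final class Update {
        final String key;
        final long delta;
        Update(String key, long delta) { this.key = key; this.delta = delta; }
    }

    private final List<Update> log = new ArrayList<>();        // stands in for the log stream
    private final Map<String, Long> view = new HashMap<>();    // stands in for the key/value index
    private Map<String, Long> checkpoint = new HashMap<>();    // last snapshot of the index
    private int checkpointOffset = 0;                          // log position covered by the snapshot

    void incrCounter(String key, long delta) {
        log.add(new Update(key, delta));      // 1. append to the log
        view.merge(key, delta, Long::sum);    // 2. materialize into the table view
    }

    long get(String key) { return view.getOrDefault(key, 0L); }

    /** Full snapshot (the slide describes incremental checkpoints). */
    void checkpoint() {
        checkpoint = new HashMap<>(view);
        checkpointOffset = log.size();
    }

    /** Simulates losing the index: restore the snapshot, then replay the log tail. */
    void recover() {
        view.clear();
        view.putAll(checkpoint);
        for (int i = checkpointOffset; i < log.size(); i++) {
            Update u = log.get(i);
            view.merge(u.key, u.delta, Long::sum);
        }
    }
}
```

Recovery cost is proportional to the log tail written since the last checkpoint, which is why checkpointing the index speeds up restarts.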
  99. 99. EVENT PROCESSING DESIGN PATTERNS DYNAMIC DATA ROUTING ETL DATA ENRICHMENT FILTERING 98 WINDOW AGGREGATION
  100. 100. JIFFY 99
  101. 101. STATEFUL SERVERLESS APPLICATIONS 100
  102. 102. STATEFUL SERVERLESS APPLICATIONS 100 Generate and exchange intermediate data or ephemeral state
  103. 103. STATEFUL SERVERLESS APPLICATIONS 100 Generate and exchange intermediate data or ephemeral state MapReduce (Spark, Hadoop)
  104. 104. STATEFUL SERVERLESS APPLICATIONS 100 Generate and exchange intermediate data or ephemeral state M M M M M R R R M M R R M M R MapReduce (Spark, Hadoop)
  107. 107. STATEFUL SERVERLESS APPLICATIONS 100 Generate and exchange intermediate data or ephemeral state MapReduce (Spark, Hadoop), Stateful Streaming, Video Analytics, …
  108. 108. STATEFUL SERVERLESS APPLICATIONS 100 Generate and exchange intermediate data or ephemeral state Need a serverless layer for sharing and exchanging ephemeral state MapReduce (Spark, Hadoop), Stateful Streaming, Video Analytics, …
  109. 109. STATEFUL SERVERLESS APPLICATIONS 100 Requirements: Low Latency, High IOPS; Lifetime Management; Fine-grained Elasticity Generate and exchange intermediate data or ephemeral state Need a serverless layer for sharing and exchanging ephemeral state MapReduce (Spark, Hadoop), Stateful Streaming, Video Analytics, …
  110. 110. Requirements Low Latency, High IOPS Lifetime Management Fine-grained Elasticity EXISTING APPROACHES 101
  111. 111. Requirements Low Latency, High IOPS Lifetime Management Fine-grained Elasticity EXISTING APPROACHES 101 CPU CPUCPU Remote Persistent Storage (e.g., S3) … …Stateful Tasks CPU
  113. 113. Requirements Low Latency, High IOPS Lifetime Management Fine-grained Elasticity EXISTING APPROACHES 101 Video Encoding in ExCamera [NSDI’17] Task#1 Task#2 Task#N … Rendezvous Server Adhoc
  114. 114. Sorting data on PyWren using Locus [NSDI’19] Requirements Low Latency, High IOPS Lifetime Management Fine-grained Elasticity EXISTING APPROACHES 101 Reduce#1 Reduce#2 Reduce#M … Map#1 Map#2 Map#N … Video Encoding in ExCamera [NSDI’17] Task#1 Task#2 Task#N … Rendezvous Server Redis Adhoc
  116. 116. Sorting data on PyWren using Locus [NSDI’19] Requirements Low Latency, High IOPS Lifetime Management Fine-grained Elasticity EXISTING APPROACHES 101 Reduce#1 Reduce#2 Reduce#M … Map#1 Map#2 Map#N … Video Encoding in ExCamera [NSDI’17] Task#1 Task#2 Task#N … Rendezvous Server Redis Adhoc General
  117. 117. Sorting data on PyWren using Locus [NSDI’19] Requirements Low Latency, High IOPS Lifetime Management Fine-grained Elasticity EXISTING APPROACHES 101 Reduce#1 Reduce#2 Reduce#M … Map#1 Map#2 Map#N … Video Encoding in ExCamera [NSDI’17] Task#1 Task#2 Task#N … Rendezvous Server Redis Adhoc General Anna [VLDB’19, IEEE TKDE’19]
  118. 118. Sorting data on PyWren using Locus [NSDI’19] Requirements Low Latency, High IOPS Lifetime Management Fine-grained Elasticity EXISTING APPROACHES 101 Reduce#1 Reduce#2 Reduce#M … Map#1 Map#2 Map#N … Video Encoding in ExCamera [NSDI’17] Task#1 Task#2 Task#N … Rendezvous Server Redis Adhoc General Pocket [OSDI’18] Anna [VLDB’19, IEEE TKDE’19]
  119. 119. JIFFY: MEMORY MANAGEMENT UNIT FOR SERVERLESS OS 102 … CPU CPUCPU …CPU
  120. 120. JIFFY: MEMORY MANAGEMENT UNIT FOR SERVERLESS OS 102 … CPU CPUCPU …CPU Jiffy: Remote Ephemeral Storage Application: Scale ephemeral storage resources independent of other resources Cloud Provider: Multiplex ephemeral storage for high utilization
  121. 121. JIFFY: MEMORY MANAGEMENT UNIT FOR SERVERLESS OS 102 … CPU CPUCPU …CPU Jiffy: Remote Ephemeral Storage Application: Scale ephemeral storage resources independent of other resources Cloud Provider: Multiplex ephemeral storage for high utilization Challenges: What is the right interface? How can we share ephemeral storage across applications with isolation? How should we manage lifetimes of application storage? How to facilitate efficient communication across tasks?
  122. 122. JIFFY INTERFACE ✦ Stateful Programming Models: Use data structures to exchange state between tasks ๏ MapReduce, Dataflow, Streaming Dataflow, Piccolo, … ✦ Distributed Data Structure Layer: Wrap “blocks” to efficiently support rich semantics ๏ FIFO Queues: Enqueue(), Dequeue() ๏ Files: Read(), Write() ๏ Hash Table: Get(), Put(), … ๏ B-Tree: Lookup(), Insert(), … ✦ Virtual Memory Layer: Transparent memory scaling at “block” granularity for each namespace ๏ CreateNamespace(), DestroyNamespace()
  123. 123. JIFFY: HIGH UTILIZATION WITH ISOLATION ✦ High utilization by multiplexing ephemeral storage across apps ✦ Provide isolation guarantees across applications ✦ Jiffy Approach ๏ Isolation: Separate data structure per namespace ๏ Multiplexing: Blocks multiplexed across data structures ๏ Transparent scaling by adding/removing blocks & data-structure specific repartitioning [Diagram: App#1 … App#N use data structures DS#1 … DS#N over shared ephemeral storage on Server #1 … Server #N]
  124. 124. JIFFY: STATE LIFETIME MANAGEMENT 105
  125. 125. JIFFY: STATE LIFETIME MANAGEMENT 105 New challenges in serverless compute platforms: independent compute/memory lifetimes
  126. 126. JIFFY: STATE LIFETIME MANAGEMENT 105 Server-centric Architectures New challenges in serverless compute platforms: independent compute/memory lifetimes
  128. 128. JIFFY: STATE LIFETIME MANAGEMENT 105 Serverless Architectures New challenges in serverless compute platforms: independent compute/memory lifetimes
  130. 130. JIFFY: STATE LIFETIME MANAGEMENT 105 Serverless Architectures Goal: Couple lifetime of storage resources to application lifetime New challenges in serverless compute platforms: independent compute/memory lifetimes
  131. 131. JIFFY: STATE LIFETIME MANAGEMENT 105 Goal: Couple lifetime of storage resources to application lifetime
  132. 132. Existing storage systems: do not couple JIFFY: STATE LIFETIME MANAGEMENT 105 Goal: Couple lifetime of storage resources to application lifetime
  133. 133. Existing storage systems: do not couple JIFFY: STATE LIFETIME MANAGEMENT 105 Goal: Couple lifetime of storage resources to application lifetime Programming languages: scoping & garbage collection
  134. 134. Challenge: Identify data scope, lifetime when compute and storage are separated Existing storage systems: do not couple JIFFY: STATE LIFETIME MANAGEMENT 105 Goal: Couple lifetime of storage resources to application lifetime Programming languages: scoping & garbage collection
  135. 135. Challenge: Identify data scope, lifetime when compute and storage are separated Existing storage systems: do not couple JIFFY: STATE LIFETIME MANAGEMENT 105 Jiffy Approach: Hierarchical namespaces with lease management Goal: Couple lifetime of storage resources to application lifetime Programming languages: scoping & garbage collection
  136. 136. Challenge: Identify data scope, lifetime when compute and storage are separated Existing storage systems: do not couple JIFFY: STATE LIFETIME MANAGEMENT 105 Jiffy Approach: Hierarchical namespaces with lease management Goal: Couple lifetime of storage resources to application lifetime Programming languages: scoping & garbage collection App1 App2 Task1 Task1 Task1 Task2 Subtask1 Subtask2 App3 /
  137. 137. Challenge: Identify data scope, lifetime when compute and storage are separated Existing storage systems: do not couple JIFFY: STATE LIFETIME MANAGEMENT 105 Jiffy Approach: Hierarchical namespaces with lease management Goal: Couple lifetime of storage resources to application lifetime Programming languages: scoping & garbage collection App1 App2 Task1 Task1 Task1 Task2 Subtask1 Subtask2 App3 lease duration, last renewed Lease Renewals Application Tasks /
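The lease idea above (each namespace records a lease duration and a last-renewed time, and storage is reclaimed when renewals stop) can be sketched as follows. This is a hypothetical illustration, not Jiffy's API: the class `LeaseDirectory`, its method names, and the rule that renewing a path also renews everything beneath it are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical lease table over hierarchical namespace paths such as "/app1/task1".
final class LeaseDirectory {
    private final long leaseDurationMs;
    private final Map<String, Long> lastRenewed = new TreeMap<>(); // path -> last renewal time

    LeaseDirectory(long leaseDurationMs) { this.leaseDurationMs = leaseDurationMs; }

    void createNamespace(String path, long nowMs) { lastRenewed.put(path, nowMs); }

    boolean exists(String path) { return lastRenewed.containsKey(path); }

    /** Assumption: renewing a path also renews every namespace beneath it. */
    void renew(String prefix, long nowMs) {
        for (Map.Entry<String, Long> e : lastRenewed.entrySet()) {
            String path = e.getKey();
            if (path.equals(prefix) || path.startsWith(prefix + "/")) {
                e.setValue(nowMs);
            }
        }
    }

    /** Reclaims every namespace whose lease has lapsed and returns the reclaimed paths. */
    List<String> expire(long nowMs) {
        List<String> reclaimed = new ArrayList<>();
        lastRenewed.entrySet().removeIf(e -> {
            boolean lapsed = nowMs - e.getValue() > leaseDurationMs;
            if (lapsed) {
                reclaimed.add(e.getKey());
            }
            return lapsed;
        });
        return reclaimed;
    }
}
```

With this shape, an application that crashes or finishes simply stops renewing, and its storage is reclaimed without any explicit cleanup call.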
  138. 138. JIFFY: INTER-TASK COMMUNICATION 106 Ephemeral Remote Storage ? A CPU BCPU How does B know it has data to consume?
  139. 139. JIFFY: INTER-TASK COMMUNICATION 106 Ephemeral Remote Storage ? A CPU BCPU How does B know it has data to consume? Jiffy: in-built notification mechanism to indicate availability of data Jiffy CPUA CPU B
  140. 140. JIFFY: INTER-TASK COMMUNICATION 106 Ephemeral Remote Storage ? A CPU BCPU How does B know it has data to consume? Jiffy: in-built notification mechanism to indicate availability of data Jiffy CPUA CPU B Subscribe(Put)
  141. 141. JIFFY: INTER-TASK COMMUNICATION 106 Ephemeral Remote Storage ? A CPU BCPU How does B know it has data to consume? Jiffy: in-built notification mechanism to indicate availability of data Jiffy Notify(Put, K, V) CPUA CPU B Put(K, V)
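The Subscribe(Put) / Notify(Put, K, V) exchange above is a publish-subscribe hook on the store. A minimal single-process sketch, assuming a hypothetical `NotifyingStore` class (the real system delivers notifications across machines, which this example does not model):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical key/value store that notifies subscribers on every put.
final class NotifyingStore {
    private final Map<String, String> data = new HashMap<>();
    private final List<BiConsumer<String, String>> putSubscribers = new ArrayList<>();

    /** Consumer task B registers interest in put operations. */
    void subscribePut(BiConsumer<String, String> listener) { putSubscribers.add(listener); }

    /** Producer task A writes a value; every subscriber is told which key and value arrived. */
    void put(String key, String value) {
        data.put(key, value);
        for (BiConsumer<String, String> listener : putSubscribers) {
            listener.accept(key, value);
        }
    }

    String get(String key) { return data.get(key); }
}
```

The consumer no longer has to poll the store to learn that data is available.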
  142. 142. JIFFY: SYSTEM OVERVIEW 107 Directory Service Storage Service Hierarchical namespaces Data Structure per Namespace Jiffy Client Lease Renewal Lease Management Notification Framework Block-level allocator CONTROL DATA
  143. 143. TWOFOLD JIFFY: KEY IDEAS SEPARATION OF CONTROL PLANE AND DATA PLANE HIERARCHICAL NAMESPACES (for resource multiplexing and lifetime management) ELASTIC SCALING MILLISECOND TIMESCALES ISOLATION BETWEEN TASKS
  144. 144. EVALUATION LATENCY ELASTICITY MBPS IOPS 109 FOUR DIMENSIONS
  145. 145. HOW WELL DOES JIFFY PERFORM? ✦ Serverless Platform: AWS Lambda Service ✦ Storage Service: Amazon EC2 (m4.16xlarge instances) ✦ Compared Storage Systems: Redis, Apache Crail, Pocket, DynamoDB, Amazon S3 ✦ Latency/IOPS/MBPS comparable to state-of-the-art (Redis, Apache Crail, Pocket) ๏ ~100 µs/operation for 64 B requests, at ~100,000 operations per second ✦ Transparent fine-grained elasticity for various data structures within 2-500 ms
  146. 146. PERFORMANCE FOR STATEFUL APPLICATIONS [Figure, three panels, each plotting task latency (s): (1) Encode 15min 4k video on ExCamera, by TaskID, ExCamera vs. ExCamera + Jiffy; (2) Sort 50GB data on PyWren, map and reduce tasks, S3 vs. Redis vs. Jiffy; (3) TPC-DS Queries on 100GB data on Hive, Q1 to Q5, legend: Local HDFS Jiffy] Takeaway: Jiffy performance is comparable to state-of-the-art, even while providing fine-grained transparent elasticity, lifetime management, etc.
  147. 147. BENEFITS OF MULTIPLEXING 50GB sort jobs arriving every 50s, on storage system with fixed 50GB capacity [Figure: used capacity (GB) over time (s) for jobs Sort-1 to Sort-5, Redis vs. Jiffy; annotations: Total Capacity, Used capacity, Delay until capacity available, No Available Capacity]
  148. 148. SERVERLESS STREAMING ANALYTICS 113
  149. 149. 114 TIME TO MARKET
  150. 150. STREAMING APPROXIMATE REAL-TIME 115
  151. 151. DATA SKETCHES: EXAMPLE FAMILIES ✦ CARDINALITY ✦ QUANTILES ✦ FREQUENT ELEMENTS ✦ MEMBERSHIP
  152. 152. IP/ Device ID Blacklisting Databases (e.g., speed up semi-join operations), Caches, Routers, Storage Systems Reduce space requirement in probabilistic routing tables MEMBERSHIP APPLICATIONS 117
  153. 153. MEMBERSHIP CUCKOO FILTER [Fan et al. 2014] BLOOM FILTER [Bloom 1970] NEURAL BLOOM FILTER [Rae et al. 2019] LEARNED BLOOM FILTER [Mitzenmacher 2018] 118 FLAVORS
  154. 154. BLOOM FILTER [1] [1] Bloom (1970). “Space/Time Trade-offs in Hash Coding with Allowable Errors”. [2] Illustration borrowed from http://www.eecs.harvard.edu/~michaelm/postscripts/im2005b.pdf
  155. 155. BLOOM FILTER ✦ Natural generalization of hashing ✦ False positives are possible ✦ No false negatives ✦ No deletions allowed ✦ False positive probability $p \approx (1 - e^{-kn/m})^k$, where n = # elements, k = # hash functions, m = # bits in the array ✦ For false positive rate ε, # hash functions $k = \log_2(1/\varepsilon)$
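A Bloom filter sized from a target false positive rate can be sketched in a few lines. This is an illustrative implementation, not one referenced by the deck: the class name `SimpleBloomFilter` is hypothetical, the bit-array size uses the standard formula $m = -n \ln \varepsilon / (\ln 2)^2$, and the k probe positions are derived from `String.hashCode()` by double hashing, which is a simplification of using k independent hash functions.

```java
import java.util.BitSet;

// Hypothetical Bloom filter over strings: add() and mightContain() only, no deletions.
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int m; // number of bits in the array
    private final int k; // number of hash functions

    SimpleBloomFilter(int expectedElements, double falsePositiveRate) {
        double ln2 = Math.log(2);
        // m = -n ln(eps) / (ln 2)^2 and k = log2(1/eps)
        this.m = Math.max(1, (int) Math.ceil(-expectedElements * Math.log(falsePositiveRate) / (ln2 * ln2)));
        this.k = Math.max(1, (int) Math.round(Math.log(1.0 / falsePositiveRate) / ln2));
        this.bits = new BitSet(m);
    }

    /** i-th probe position via double hashing: h1 + i * h2 (mod m). */
    private int index(String item, int i) {
        int h1 = item.hashCode();
        int h2 = Integer.rotateLeft(h1 * 0x9E3779B1, 15) | 1; // derived second hash, forced odd
        return Math.floorMod(h1 + i * h2, m);
    }

    void add(String item) {
        for (int i = 0; i < k; i++) {
            bits.set(index(item, i));
        }
    }

    /** False means definitely absent; true means present or a false positive. */
    boolean mightContain(String item) {
        for (int i = 0; i < k; i++) {
            if (!bits.get(index(item, i))) {
                return false;
            }
        }
        return true;
    }

    int hashCount() { return k; }
}
```

For ε = 1% the constructor picks k = 7 hash functions, since log2(100) ≈ 6.64.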
  156. 156. CUCKOO FILTER [1] ✦ Key Highlights ๏ Add and remove items dynamically ๏ For false positive rate ε < 3%, more space efficient than Bloom filter ๏ Higher performance than Bloom filter for many real workloads ๏ Asymptotically worse performance than Bloom filter ‣ Min fingerprint size ∝ log (# entries in table) ✦ Overview ๏ Stores only a fingerprint of an item inserted ‣ Original key and value bits of each item not retrievable ๏ Set membership query for item x: search hash table for fingerprint of x [1] Fan et al. (2014). “Cuckoo Filter: Practically Better Than Bloom”, CoNEXT.
  157. 157. CUCKOO FILTER Cuckoo Hashing [1] ✦ High space occupancy ✦ Practical implementations: multiple items/bucket ✦ Example uses: Software-based Ethernet switches Cuckoo Filter [2] ✦ Uses a multi-way associative Cuckoo hash table ✦ Employs partial-key cuckoo hashing ๏ Store fingerprint of an item ๏ Relocate existing fingerprints to their alternative locations [1] R. Pagh and F. Rodler. “Cuckoo hashing,” Journal of Algorithms, 51(2):122-144, 2004. [2] Illustration of cuckoo hashing borrowed from Fan et al. (2014). “Cuckoo Filter: Practically Better Than Bloom”, CoNEXT.
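Partial-key cuckoo hashing can be sketched as below. This is a simplified illustration and not the authors' implementation: the class `SimpleCuckooFilter`, the 8-bit fingerprints, the bucket size of 4, the kick limit of 500, and the integer mixing function are all choices made for the example. The alternate bucket is computed as `index XOR hash(fingerprint)`, so it can be found from the fingerprint alone without the original key.

```java
import java.util.Random;

// Hypothetical cuckoo filter over strings with 8-bit fingerprints; 0 marks an empty slot.
final class SimpleCuckooFilter {
    private static final int BUCKET_SIZE = 4;
    private static final int MAX_KICKS = 500;
    private final byte[][] buckets;
    private final int mask;
    private final Random random = new Random(42);

    SimpleCuckooFilter(int numBuckets) {
        if (numBuckets < 2 || Integer.bitCount(numBuckets) != 1) {
            throw new IllegalArgumentException("numBuckets must be a power of two");
        }
        buckets = new byte[numBuckets][BUCKET_SIZE];
        mask = numBuckets - 1;
    }

    private static int mix(int h) { // integer bit mixer, illustrative choice
        h ^= h >>> 16; h *= 0x85EBCA6B; h ^= h >>> 13; h *= 0xC2B2AE35; h ^= h >>> 16;
        return h;
    }

    private static byte fingerprint(String item) {
        byte f = (byte) (mix(item.hashCode()) >>> 24);
        return f == 0 ? (byte) 1 : f; // never use the empty marker as a fingerprint
    }

    private int index1(String item) { return mix(item.hashCode()) & mask; }

    private int altIndex(int index, byte f) { return (index ^ mix(f)) & mask; }

    private boolean place(int i, byte f) {
        for (int s = 0; s < BUCKET_SIZE; s++) {
            if (buckets[i][s] == 0) { buckets[i][s] = f; return true; }
        }
        return false;
    }

    boolean add(String item) {
        byte f = fingerprint(item);
        int i1 = index1(item);
        int i2 = altIndex(i1, f);
        if (place(i1, f) || place(i2, f)) return true;
        int i = random.nextBoolean() ? i1 : i2;
        for (int n = 0; n < MAX_KICKS; n++) {
            int slot = random.nextInt(BUCKET_SIZE);
            byte evicted = buckets[i][slot]; // relocate an existing fingerprint
            buckets[i][slot] = f;
            f = evicted;
            i = altIndex(i, f);
            if (place(i, f)) return true;
        }
        return false; // table treated as full; this sketch drops the last evicted fingerprint
    }

    boolean contains(String item) {
        byte f = fingerprint(item);
        int i1 = index1(item);
        return has(i1, f) || has(altIndex(i1, f), f);
    }

    boolean remove(String item) {
        byte f = fingerprint(item);
        int i1 = index1(item);
        return clear(i1, f) || clear(altIndex(i1, f), f);
    }

    private boolean has(int i, byte f) {
        for (byte b : buckets[i]) if (b == f) return true;
        return false;
    }

    private boolean clear(int i, byte f) {
        for (int s = 0; s < BUCKET_SIZE; s++) {
            if (buckets[i][s] == f) { buckets[i][s] = 0; return true; }
        }
        return false;
    }
}
```

Deletion is safe only for items that were actually inserted; removing an item that was never added can evict another item's fingerprint.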
  158. 158. ADAPTIVE CUCKOO FILTER [1] KEY HIGHLIGHTS ✦ Motivation ๏ Minimize false positive rate ✦ Selectively remove false positives without introducing false negatives ✦ Maintain a replica of cuckoo hash table with raw elements ✦ Indices of buckets are determined by hash values of the element, and not solely by the fingerprint ✦ Allow different hash functions for the fingerprints ๏ Enables removal and reinsertion of elements to remove false positives ✦ Insertion complexity and space overhead [1] Mitzenmacher et al. (2017). “Adaptive Cuckoo Filters”.
  159. 159. CONCURRENT CUCKOO FILTER [1] ✦ Support for multiple writers ✦ Optimistic cuckoo hashing ✦ Minimizes the size of the locked critical section during updates ✦ Leverage Intel’s Hardware Transactional Memory (HTM) ✦ Optimize TSX lock elision to reduce transactional abort rate ✦ Algorithmic/Architectural tuning ๏ Breadth-first Search for an Empty Slot ๏ Lock After Discovering a Cuckoo Path ๏ Striped fine-grain spin locks ๏ Increase set-associativity ๏ Prefetching [1] Li et al. (2014), “Algorithmic Improvements for Fast Concurrent Cuckoo Hashing”.
  160. 160. CUCKOO FILTER 125 CUCKOO++ HASH TABLES [Scouarnec 2018] MORTON FILTER [Breslow et al. 2018] SMART CUCKOO [Sun et al. 2017] POSITION-AWARE CUCKOO [Kwon et al. 2018] VARIANTS
  161. 161. PERFORMANCE COMPARISON [1] METRICS ✦ Space-precision trade-off ✦ Memory footprint ✦ False positive rate ✦ Rate of negative lookups ✦ Throughput ✦ Cache misses, # Network messages, Local disk I/O ✦ Saved work per lookup that filtering avoids ✦ Optimizations ๏ Register blocking ๏ Cache sectorization [1] Illustration borrowed from Lang et al. (2019). “Performance-Optimal Filtering: Bloom Overtakes Cuckoo at High Throughput”.
  162. 162. LEARNED BLOOM FILTER [1] KEY HIGHLIGHTS ✦ Bloom filter as a binary classifier: predict whether a key exists in a set or not (membership) ๏ Subtleties: no false negatives ‣ Learned model + auxiliary data structure ✦ Learn structure of lookup keys ๏ Minimize collisions between keys and non-keys ๏ Leverage continuous functions to capture the underlying data distribution ✦ Learn different models for read-heavy vs. write-heavy workloads [1] Kraska et al. (2018). “The Case for Learned Index Structures”, SIGMOD.
  163. 163. LEARNED BLOOM FILTER SANDWICHING [1] ✦ Challenges ๏ Deletion of keys ‣ Re-train the model ✦ Sandwich Learned Bloom Filter ๏ Increased robustness ✦ Pre-filtering ๏ Remove keys not present ๏ Minimizes the distance between the distribution of the queries and test set used to estimate the learned Bloom filter’s false positive probability ๏ Limits the size of the Backup filter ✦ Computationally more complex than Learned Bloom Filter [1] Mitzenmacher (2018). “A Model for Learned Bloom Filters and Optimizing by Sandwiching”, NIPS.
  164. 164. 129[1] Rae et al. (2019). “Meta-Learning Neural Bloom Filters”. NEURAL BLOOM FILTER [1] ✦ Inputs arrive at high throughput, or are ephemeral ๏ Few-shot neural data structures ✦ Learning membership in one-shot via meta-learning ✦ Overview ๏ Sample tasks from a common distribution ๏ Network learns to specialize to a given task with few examples KEY HIGHLIGHTS [1]
  165. 165. FREQUENT ELEMENTS 130 TOP-K ELEMENTS, HEAVY HITTERS ✦ E-commerce ✦ Security ✦ Network measurements ✦ Sensing ✦ Databases ✦ Feature selection
  166. 166. FREQUENT ELEMENTS ✦ COUNT-SKETCH [Charikar et al. 2002] ✦ COUNT-MIN [Cormode & Muthukrishnan 2005] ✦ COUNT-MIN-LOG [Pitel & Fouquier 2015] ✦ LEARNED COUNT-MIN [Hsu et al. 2019]
  167. 167. COUNT-MIN [1] ✦ A two-dimensional array of counts with w columns and d rows ✦ Each entry of the array is initially zero ✦ d hash functions $h_1, \ldots, h_d : \{1, \ldots, n\} \to \{1, \ldots, w\}$ are chosen uniformly at random from a pairwise independent family ✦ Update ๏ For a new element i, for each row j and $k = h_j(i)$, increment the kth column of row j by one ✦ Point query: $\hat{a}_i = \min_j \mathrm{sketch}[j, h_j(i)]$, where sketch is the table ✦ Parameters $(\varepsilon, \delta)$: $w = \lceil e/\varepsilon \rceil$, $d = \lceil \ln(1/\delta) \rceil$ [1] Cormode and Muthukrishnan (2005). "An Improved Data Stream Summary: The Count-Min Sketch and its Applications". J. Algorithms 55: 29–38.
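The update and point-query rules above translate directly into code. The following is a minimal sketch with hypothetical names (`CountMinSketch`, `update`, `estimate`); it assumes items are non-negative long identifiers and uses the common pairwise-independent family $h(x) = ((ax + b) \bmod p) \bmod w$ with the prime $p = 2^{31} - 1$.

```java
import java.util.Random;

// Hypothetical Count-Min sketch with w = ceil(e / epsilon) columns and d = ceil(ln(1 / delta)) rows.
final class CountMinSketch {
    private static final long PRIME = 2147483647L; // 2^31 - 1
    private final int w;
    private final int d;
    private final long[][] sketch;
    private final long[] a;
    private final long[] b;

    CountMinSketch(double epsilon, double delta, long seed) {
        this.w = (int) Math.ceil(Math.E / epsilon);
        this.d = (int) Math.ceil(Math.log(1.0 / delta));
        this.sketch = new long[d][w];
        this.a = new long[d];
        this.b = new long[d];
        Random r = new Random(seed);
        for (int j = 0; j < d; j++) {
            a[j] = 1 + r.nextInt((int) PRIME - 1); // a in [1, p - 1]
            b[j] = r.nextInt((int) PRIME);         // b in [0, p - 1]
        }
    }

    /** Column for the item in row j: ((a * x + b) mod p) mod w. */
    private int bucket(int j, long item) {
        long x = Math.floorMod(item, PRIME);
        return (int) (((a[j] * x + b[j]) % PRIME) % w);
    }

    void update(long item, long count) {
        for (int j = 0; j < d; j++) {
            sketch[j][bucket(j, item)] += count;
        }
    }

    /** Point query: the minimum over rows; never below the true count for non-negative updates. */
    long estimate(long item) {
        long min = Long.MAX_VALUE;
        for (int j = 0; j < d; j++) {
            min = Math.min(min, sketch[j][bucket(j, item)]);
        }
        return min;
    }

    int width() { return w; }
    int depth() { return d; }
}
```

With ε = δ = 0.01 the table has 272 columns and 5 rows, independent of the number of distinct items in the stream.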
  168. 168. COUNT-SKETCH FEATURE SELECTION ✦ Millions/billions of features are routine ๏ NLP, genomics, computational biology, chemistry ✦ Accuracy vs. Performance trade-off ๏ Model vs. runtime ✦ Model Interpretability ✦ Feature Hashing ๏ Loss of interpretability ✦ Count-Sketch + top-k heap ๏ top-k values of the sketch used for iterative update [1] Illustration borrowed from Aghazadeh et al. (2018). “MISSION: Ultra Large-Scale Feature Selection using Count-Sketches”.
  169. 169. COUNT-MIN VARIANTS [1] ✦ Count-Min sketch with conservative update (CU sketch) ✦ Update an item with frequency c ๏ Avoid unnecessary updating of counter values => Reduce over-estimation error ๏ Prone to over-estimation error on low-frequency items ✦ Lossy Conservative Update (LCU) - SWS ๏ Divide stream into windows ๏ At window boundaries, ∀ 1 ≤ i ≤ w, 1 ≤ j ≤ d, decrement sketch[i,j] if 0 < sketch[i,j] ≤ [1] Cormode, G. 2009. Encyclopedia entry on ’Count-Min Sketch’. In Encyclopedia of Database Systems. Springer, 511–516.
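Conservative update changes only the write path of a Count-Min sketch: compute the current estimate, then raise each of the item's counters only as far as estimate + c. A minimal sketch, assuming a hypothetical class `ConservativeCountMin` and an ad hoc per-row hash derived from `String.hashCode()` (chosen for brevity, not a pairwise-independent family):

```java
// Hypothetical Count-Min sketch with conservative update (CU).
final class ConservativeCountMin {
    private final int w;
    private final int d;
    private final long[][] sketch;

    ConservativeCountMin(int w, int d) {
        this.w = w;
        this.d = d;
        this.sketch = new long[d][w];
    }

    /** Illustrative per-row hash: mix the item hash with a row-specific offset. */
    private int bucket(int j, String item) {
        int h = item.hashCode() + j * 0x9E3779B9;
        h ^= h >>> 16; h *= 0x85EBCA6B; h ^= h >>> 13;
        return Math.floorMod(h, w);
    }

    long estimate(String item) {
        long min = Long.MAX_VALUE;
        for (int j = 0; j < d; j++) {
            min = Math.min(min, sketch[j][bucket(j, item)]);
        }
        return min;
    }

    /** Raise each counter to at most estimate + c instead of adding c everywhere. */
    void update(String item, long c) {
        long target = estimate(item) + c;
        for (int j = 0; j < d; j++) {
            int k = bucket(j, item);
            sketch[j][k] = Math.max(sketch[j][k], target);
        }
    }
}
```

Counters that are already inflated by collisions are left untouched, which is where the reduction in over-estimation comes from.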
  170. 170. COUNT-MIN-LOG [1] ✦ Minimize error of low frequency items ✦ Overview ๏ Same structure as Count-Min Sketch with conservative update ๏ Replace the classical binary counting cells by log counting cells ✦ UPDATE ✦ QUERY [1] Pitel and Fouquier (2015). "Count-Min-Log sketch: Approximately counting with approximate counters”.
  171. 171. UnivMon [1] ✦ Universal sketch ✦ Provably accurate for estimating a large class of functions ✦ Generality ๏ Delay binding to application of interest ✦ High fidelity ✦ Applications ๏ Changepoint/Global Iceberg Detection ๏ Entropy Estimation [Diagram: ONLINE SKETCHING STEP, OFFLINE ESTIMATION] [1] Liu et al. (2016). "One Sketch to Rule Them All: Rethinking Network Flow Monitoring with UnivMon”.
  172. 172. HASH-PIPE [1] ✦ Need for line rate processing: 10-100 Gbps ✦ Limited memory in switching hardware ๏ Memory ∝ # heavy flows ✦ Small time budget: 1 ns ๏ Manipulate state & process packets at each stage ๏ Process each packet only once [1] Sivaraman et al. (2017). “Heavy-Hitter Detection Entirely in the Data Plane”.
  173. 173. LEARNED COUNT-MIN [1] ✦ Exploit patterns in the input ๏ For example, in text data, word frequency ∝ 1/word length ✦ Mitigate large estimation error ๏ Collisions between high-frequency elements ✦ Learn properties to identify heavy hitters ✦ Does not need to know the data distribution a priori ✦ Logarithmic improvement in error bound ✦ Key high level idea ๏ Assign each heavy hitter to its unique bucket [1] Hsu et al. (2019). “Learning-based Frequency Estimation Algorithms”, ICLR.
  174. 174. LEARNED COUNT-MIN ✦ Frequency of an element in a unique bucket is exact ✦ Provably reduces estimation errors [1] Illustration borrowed from Hsu et al. (2019). “Learning-based Frequency Estimation Algorithms”, ICLR.
  175. 175. REAL-TIME FREQUENT ELEMENTS in PULSAR & HERON 140 Streamlio (Apache Pulsar and Apache Heron) Data Source 2 clean-fn 2 Data Source 1 Data Source 3 clean-fn 1 trend- topology 3 Trending Application T1 T2 T3
  176. 176. PRIVATE COUNT-MIN [1] ✦ out-of-dictionary words → auto-complete ✦ Why not employ homomorphic encryption for privacy-preserving aggregation? ✦ Perform private aggregation over the sketches, rather than the raw inputs ✦ Reduce the communication and computation complexity ๏ Linear to logarithmic in the size of their input ✦ Real-world privacy-friendly systems ๏ Recommendations for media streaming services ๏ Prediction of user locations ‣ Improve transportation services and predict future trends ✦ Federated learning [1] Melis et al. (2016), “Efficient Private Statistics with Succinct Sketches”.
  177. 177. FEDERATED LEARNING [1] [1] Illustration borrowed from https://ai.googleblog.com/2017/04/federated-learning-collaborative.html.
  178. 178. FEDERATED & DIFFERENTIALLY PRIVATE ✦ Discover the heavy hitters but not their frequencies ✦ Without additional noise ✦ Iterative algorithm [1] ๏ Randomly select a set of users ๏ Each user votes on a single character extension to an already discovered popular prefix ๏ Server aggregates the received votes using a trie structure and prunes nodes that have counts that fall below a chosen threshold θ [1] Zhu et al. (2019), “Federated Heavy Hitters with Differential Privacy”.
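The prefix-voting loop above can be sketched as follows. This is a deliberately simplified illustration and not the algorithm from the cited paper: the class `PrefixVotingHeavyHitters` is hypothetical, every user votes in every round (the random user selection on which the privacy argument relies is omitted), and the result is the set of surviving trie nodes, that is, every prefix whose vote count reached θ.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical round-by-round trie growth: one character of prefix is discovered per round.
final class PrefixVotingHeavyHitters {
    static Set<String> discover(List<String> userWords, int theta, int maxLength) {
        Set<String> popular = new HashSet<>();
        Set<String> frontier = new HashSet<>();
        frontier.add(""); // the trie root
        for (int len = 1; len <= maxLength && !frontier.isEmpty(); len++) {
            Map<String, Integer> votes = new HashMap<>();
            for (String word : userWords) {
                // A user votes only if the word extends an already popular prefix.
                if (word.length() >= len && frontier.contains(word.substring(0, len - 1))) {
                    votes.merge(word.substring(0, len), 1, Integer::sum);
                }
            }
            Set<String> next = new HashSet<>();
            for (Map.Entry<String, Integer> e : votes.entrySet()) {
                if (e.getValue() >= theta) {
                    next.add(e.getKey()); // keep the node; prefixes below theta are pruned
                }
            }
            popular.addAll(next);
            frontier = next;
        }
        return popular;
    }
}
```

The server only ever sees one-character extensions of prefixes that are already popular, so rare words are pruned before they are fully revealed.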
  179. 179. Customer CARDINALITY ESTIMATION # DISTINCT ELEMENTS IN A DATABASE # UNIQUE SEARCH QUERIES # UNIQUE WEBSITE VISITORS # DISTINCT NETWORK FLOWS 144 APPLICATIONS
  180. 180. 145 CARDINALITY ESTIMATION ✦ Hash values as strings ✦ Occurrence of particular patterns in the binary representation ✦ Example: Hyperloglog [Flajolet et al. 2008] BIT-PATTERN OBSERVABLES ✦ Hash values as real numbers ✦ k-th smallest value ๏ Insensitive to distribution of repeated values ✦ Examples: MinCount [Giroire, 2000] ORDER STATISTIC OBSERVABLES
  181. 181. CARDINALITY ESTIMATION FLAVORS ✦ SKETCH-BASED VS. SAMPLING BASED ✦ UNIFORM HASHING VS. LOGARITHMIC HASHING ✦ INTERNAL BASED VS. BUCKET BASED ✦ Examples ๏ Adaptive sampling, Distinct sampling, Method-of-Moments Estimator, (Smoothed) Jackknife Estimator ๏ LogLog, SuperLogLog, HyperLogLog, and HyperLogLog++ ๏ MinCount ๏ Counting Bloom filter
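  182. 182. HYPERLOGLOG ✦ Apply hash function h to every element in a multiset ✦ Cardinality of multiset is estimated as $2^{\max(\varrho)}$, where $0^{\varrho-1}1$ is the bit pattern observed at the beginning of a hash value ✦ The above suffers from high variance ๏ Employ stochastic averaging ๏ Partition input stream into m sub-streams $S_i$ using first p bits of hash values ($m = 2^p$) ✦ Estimate: $E = \alpha_m m^2 \left( \sum_{i=1}^{m} 2^{-M[i]} \right)^{-1}$, where $M[i]$ is the maximum $\varrho$ observed in sub-stream $S_i$ and $\alpha_m$ is a bias-correction constant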
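The register update and the harmonic-mean estimate can be sketched compactly. This is an illustrative implementation with hypothetical names (`HyperLogLogSketch`, `add`, `estimate`): it hashes long identifiers with a SplitMix64-style mixer (an arbitrary choice for the example), uses the approximation $\alpha_m \approx 0.7213 / (1 + 1.079/m)$ that holds for m ≥ 128, and falls back to linear counting for small cardinalities. It omits the bias correction and sparse representation of later variants.

```java
// Hypothetical HyperLogLog over long identifiers with m = 2^p one-byte registers.
final class HyperLogLogSketch {
    private final int p;
    private final int m;
    private final byte[] registers;

    HyperLogLogSketch(int p) {
        if (p < 7 || p > 16) throw new IllegalArgumentException("p must be in [7, 16]");
        this.p = p;
        this.m = 1 << p;
        this.registers = new byte[m];
    }

    private static long hash64(long x) { // SplitMix64-style bit mixer
        x += 0x9E3779B97F4A7C15L;
        x = (x ^ (x >>> 30)) * 0xBF58476D1CE4E5B9L;
        x = (x ^ (x >>> 27)) * 0x94D049BB133111EBL;
        return x ^ (x >>> 31);
    }

    void add(long item) {
        long h = hash64(item);
        int idx = (int) (h >>> (64 - p));  // first p bits select the sub-stream
        long rest = h << p;                // remaining 64 - p bits
        // rho = position of the first 1-bit in the remaining bits (1-based)
        int rho = Math.min(Long.numberOfLeadingZeros(rest) + 1, 64 - p + 1);
        if (rho > registers[idx]) {
            registers[idx] = (byte) rho;
        }
    }

    double estimate() {
        double sum = 0.0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.pow(2.0, -r);
            if (r == 0) zeros++;
        }
        double alpha = 0.7213 / (1.0 + 1.079 / m);
        double raw = alpha * m * m / sum;
        if (raw <= 2.5 * m && zeros > 0) {
            return m * Math.log((double) m / zeros); // linear counting for small cardinalities
        }
        return raw;
    }
}
```

With p = 12 the sketch uses 4,096 registers and has a standard error of roughly 1.04/√m ≈ 1.6%.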
  183. 183. HYPERLOGLOG OPTIMIZATIONS ✦ Use of 64-bit hash function ๏ Total memory requirement $5 \cdot 2^p \to 6 \cdot 2^p$, where p is the precision ✦ Empirical bias correction ๏ Uses empirically determined data for cardinalities smaller than 5m and uses the unmodified raw estimate otherwise ✦ Sparse representation ๏ For n ≪ m, store an integer obtained by concatenating the bit patterns for idx and ϱ(w) ๏ Use variable length encoding for integers that uses variable number of bytes to represent integers ๏ Use difference encoding: store the difference between successive elements ✦ Other optimizations [1, 2] [1] http://druid.io/blog/2014/02/18/hyperloglog-optimizations-for-real-world-systems.html [2] http://antirez.com/news/75
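The two space optimizations named for the sparse representation, variable-length integers and difference encoding, combine naturally: sort the integers, store each as the gap from its predecessor, and write each gap with 7 payload bits per byte. A minimal sketch with a hypothetical `DeltaVarintCodec` class; the exact byte layout used by any particular HyperLogLog implementation may differ.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical codec: delta encoding of a sorted int array followed by base-128 varints.
final class DeltaVarintCodec {
    static byte[] encode(int[] sortedValues) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0;
        for (int v : sortedValues) {
            int delta = v - previous; // non-negative because the input is sorted
            previous = v;
            while ((delta & ~0x7F) != 0) {
                out.write((delta & 0x7F) | 0x80); // low 7 bits, continuation bit set
                delta >>>= 7;
            }
            out.write(delta); // final byte, continuation bit clear
        }
        return out.toByteArray();
    }

    static int[] decode(byte[] bytes, int count) {
        int[] values = new int[count];
        int pos = 0;
        int previous = 0;
        for (int i = 0; i < count; i++) {
            int delta = 0;
            int shift = 0;
            int b;
            do {
                b = bytes[pos++] & 0xFF;
                delta |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            previous += delta; // undo the difference encoding
            values[i] = previous;
        }
        return values;
    }
}
```

For the values 3, 300, 301, 70000 the gaps are 3, 297, 1, 69699, which take 1 + 2 + 1 + 3 = 7 bytes instead of 16 bytes as fixed-width 32-bit integers.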
  184. 184. 149 ANOMALY DETECTION QuantelAI
  185. 185. 150 ANOMALY DETECTION Aggregate function{} Aggregate function{} Aggregate function{} Partitioned Pulsar topic* Pulsar Broker* Aggregate function{} FIX logs Market Data Alternate Data Fluent-bit (Producer) Partitioned topic Windowing & aggregations applied here* Higher level aggregations* Eagle AI Model Server (Consumer) Pulsar topic * Pulsar topic * Aggregate function{} Aggregate function{} Aggregate function{} Aggregate function{} Aggregate function{} Aggregate function{} Aggregate function{} Aggregate function{} Aggregate function{} Aggregate function{} Aggregate function{} Aggregate function{} * Indicates components that can be load balanced QuantelAI
  186. 186. SKETCHING FOR MACHINE LEARNING 151
  187. 187. ✦ Stochastic/Incremental gradient descent ๏ Slow to converge ✦ Variance reduction, Accelerated gradient descent ๏ AdaBound, AMSGrad, Nesterov, Adamax, Adam, RMSProp, AdaDelta ‣ Stragglers worsen the convergence ✦ Select a subset of training data points along with their corresponding learning rates ๏ Greedily maximize the facility location function ‣ Minimizes the upper-bound on the estimation error of the full gradient FASTER TRAINING [1] Mirzasoleiman et al. (2019). “Data Sketching for Faster Training of Machine Learning Models”. 152 KEY IDEA
  188. 188. SERVERLESS MACHINE LEARNING 153
  189. 189. CATEGORIES: REGRESSION, CLASSIFICATION 154
  190. 190. TRAINING INFERENCE 155
  191. 191. OPTIMIZATION Problem Statement ✦ $f_n(x)$: smooth function ✦ $h(x)$: non-smooth function (such as $\ell_1$ and $\ell_2$ penalty) ✦ Leverage ADMM ๏ Worker w updates its own copy $x_w$ and master updates global variable z [1] Aytekin and Johansson (2019), “Harnessing the Power of Serverless Runtimes for Large-Scale Optimization”.
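The worker/master split above matches the textbook global-consensus form of ADMM. As a hedged reconstruction (the slide's own formula did not survive extraction, so the objective, the penalty parameter $\rho$, the scaled dual variables $u_w$, and the worker count $W$ are assumptions taken from the standard formulation rather than from the cited paper), the problem and one iteration can be written as:

```latex
\begin{aligned}
&\text{minimize} && \sum_{w=1}^{W} f_w(x_w) + h(z)
  \quad \text{subject to } x_w = z,\; w = 1, \dots, W \\[4pt]
&\text{worker } w: && x_w^{k+1} = \arg\min_{x} \; f_w(x) + \tfrac{\rho}{2}\,\lVert x - z^{k} + u_w^{k} \rVert_2^2 \\
&\text{master}: && z^{k+1} = \arg\min_{z} \; h(z) + \tfrac{W\rho}{2}\,\Bigl\lVert z - \tfrac{1}{W}\sum_{w=1}^{W}\bigl(x_w^{k+1} + u_w^{k}\bigr) \Bigr\rVert_2^2 \\
&\text{dual update}: && u_w^{k+1} = u_w^{k} + x_w^{k+1} - z^{k+1}
\end{aligned}
```

Each worker step touches only its own smooth term $f_w$, which is what makes it a natural fit for independent serverless invocations; the non-smooth term $h$ is handled once, in the master's proximal step.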
  192. 192. OPTIMIZATION [1] ✦ Discussion ๏ Utilization, Cold start, Responsiveness [1] Illustration borrowed from Aytekin and Johansson (2019), “Harnessing the Power of Serverless Runtimes for Large-Scale Optimization”.
  193. 193. 158 OPTIMIZATION [1] [1] Gupta et al. (2019). “OverSketched Newton: Fast Convex OpCmizaCon for Serverless Systems”. ✦ Large-scale optimization problems ๏ Second order methods ‣ Use gradient and Hessian ‣ Faster convergence ‣ Do not require step size tuning ‣ Computationally prohibitive when training data is large ๏ Go Serverless ‣ Invoke thousands of workers ‣ Communication costs (# iterations) ‣ Compute approximate Hessian ✦ Matrix sketching ๏ Randomized Numerical Linear Algebra (RandNLA) ๏ Inbuilt resiliency against stragglers ‣ Leverage ECC to create redundant computation
  194. 194. OPTIMIZATION 159 ✦ Gradient computation ๏ Matrix-vector multiplication ‣ Coded Matrix Multiplication - distributed, straggler resilient [1] [1] Illustration borrowed from Gupta et al. (2019). “OverSketched Newton: Fast Convex Optimization for Serverless Systems”.
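To make the straggler-resilience idea concrete, here is a minimal sketch of coded matrix-vector multiplication. It uses a simple single-parity code over two row blocks (any two of three worker results suffice), which is an assumption chosen for brevity and not the coding scheme of the cited paper. The function names `encode`, `decode`, and `matvec` are hypothetical.

```python
def matvec(rows, x):
    """Plain matrix-vector product for a list-of-rows matrix."""
    return [sum(r_i * x_i for r_i, x_i in zip(row, x)) for row in rows]


def encode(A):
    """Split A into two row blocks and add a parity block A1 + A2.
    Assumes A has an even number of rows."""
    half = len(A) // 2
    A1, A2 = A[:half], A[half:]
    parity = [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(A1, A2)]
    return {"A1": A1, "A2": A2, "parity": parity}


def decode(results):
    """Recover A @ x from any two of the three partial products."""
    if len(results) < 2:
        raise ValueError("need results from at least two workers")
    if "A1" in results and "A2" in results:
        return results["A1"] + results["A2"]
    if "A1" in results:
        y1 = results["A1"]
        y2 = [p - a for p, a in zip(results["parity"], y1)]
    else:
        y2 = results["A2"]
        y1 = [p - b for p, b in zip(results["parity"], y2)]
    return y1 + y2


A = [[1, 2], [3, 4], [5, 6], [7, 8]]
x = [1, 1]
tasks = encode(A)
# Simulate a straggler: the worker holding block A2 never returns.
finished = {name: matvec(block, x) for name, block in tasks.items() if name != "A2"}
y = decode(finished)
```

The parity worker costs 50% extra computation here; the trade-off is that the slowest of the three invocations can be ignored.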
  195. 195. 160 OPTIMIZATION ✦ Hessian computation ๏ Matrix-matrix multiplication (MM) ‣ Block partitioning of input matrices ‣ Sparse sketching matrix based on Count-Sketch [1] [1] Illustration borrowed from Gupta et al. (2019). “OverSketched Newton: Fast Convex Optimization for Serverless Systems”. ✦ Applications - Distributed, Straggler resilient ๏ Ridge Regularized Linear Regression
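The role of a Count-Sketch based sketching matrix can be sketched as follows, assuming the Hessian has the form A^T A. Each row of A is hashed to one bucket with a random sign, and the Gram matrix of the much smaller sketched matrix approximates A^T A in expectation. This is a plain single Count-Sketch, not the block-partitioned, straggler-resilient variant on the slide, and the names `count_sketch` and `gram` are hypothetical.

```python
import random


def count_sketch(A, num_buckets, seed=0):
    """Return S @ A, where S has one random +/-1 entry per column.
    Each input row is added, with a random sign, into one random bucket row."""
    rng = random.Random(seed)
    d = len(A[0])
    SA = [[0.0] * d for _ in range(num_buckets)]
    for row in A:
        bucket = rng.randrange(num_buckets)
        sign = rng.choice((-1.0, 1.0))
        for j, value in enumerate(row):
            SA[bucket][j] += sign * value
    return SA


def gram(M):
    """Compute M^T @ M for a list-of-rows matrix."""
    d = len(M[0])
    return [[sum(r[i] * r[j] for r in M) for j in range(d)] for i in range(d)]


# 1000 x 3 data matrix compressed to 64 rows before forming the Gram matrix.
data_rng = random.Random(1)
A = [[data_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(1000)]
SA = count_sketch(A, num_buckets=64)
approx_hessian = gram(SA)  # approximates gram(A) in expectation
```

The accuracy depends on the number of buckets; rows that collide in a bucket contribute cross terms that cancel only on average.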
  196. 196. INFERENCE IN SERVERLESS ENVIRONMENTS 161 [1] Illustration borrowed from Dakkak et al. (2018). “TrIMS: Transparent and Isolated Model Sharing for Low Latency Deep Learning Inference in Function as a Service Environments”. Key Challenge: Low Latency Cold Start: moving large amounts of model data within and across servers Persistent model store across the GPU, CPU, local storage, and cloud storage hierarchy [1]
  197. 197. 162 LOW LATENCY [1] Illustration borrowed from Crankshaw et al. (2017). “Clipper: A Low-Latency Online Prediction Serving System”. ✦ Content recommendation service ๏ Example: News ๏ Latency < 100 ms ✦ Scalability: Hundreds of millions/billions per sec ✦ Deployment and Maintenance ✦ Optimizations ๏ Throughput ‣ Caching, Adaptive Batching ๏ Accuracy ‣ Bandit and Ensemble Methods ๏ Model Selection ‣ On a per user/session basis ‣ Straggler mitigation [1]
  198. 198. CHALLENGES RESOURCE MANAGEMENT 163 [1] Illustration borrowed from Yadwadkar et al. (2019), “A Case for Managed and Model-less Inference Serving”. [1]
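Adaptive batching under a latency target can be sketched with an additive-increase, multiplicative-decrease (AIMD) rule: grow the batch while latency stays within the objective, and back off when it does not. The code below is an illustrative sketch and not Clipper's implementation; the linear latency model, the 100 ms objective taken from the slide's example, and the names `aimd_batch_sizes` and `measure_latency_ms` are assumptions.

```python
def aimd_batch_sizes(measure_latency_ms, slo_ms, steps, increase=1, backoff=0.9):
    """Return the sequence of batch sizes tried by an AIMD controller."""
    batch = 1
    tried = []
    for _ in range(steps):
        tried.append(batch)
        if measure_latency_ms(batch) <= slo_ms:
            batch += increase                      # additive increase
        else:
            batch = max(1, int(batch * backoff))   # multiplicative decrease
    return tried


def toy_latency_ms(batch_size):
    """Hypothetical model: 5 ms fixed overhead plus 2 ms per query."""
    return 5.0 + 2.0 * batch_size


history = aimd_batch_sizes(toy_latency_ms, slo_ms=100.0, steps=100)
```

With this toy model the controller climbs until a batch first violates the objective, then oscillates just below that size, which is the behaviour one wants when the real latency curve is unknown and drifts over time.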
  199. 199. SERVERLESS IOT 164
  200. 200. WHAT MAKES IOT ANALYTICS DIFFERENT? 165 More Data ✦ High-volume, continuous data in motion from multiple sensors ✦ Store, blend and manage time-series data More Complexity ✦ Use of multiple analytics techniques ✦ Distributed analytics (edge) More Automation ✦ Integration with operations systems and BPS ✦ Bidirectional communication and control of endpoints
  201. 201. WHAT MAKES IOT ANALYTICS DIFFERENT? 166 Devices Gateways Data Collectors Data Transport Processing Repositories Applications
  202. 202. CHALLENGES 167 ✦ Latency - delay resulting from data transmission from edge to cloud or datacenter may exceed application requirements ✦ Capacity - volume of data streams would require expensive network bandwidth to collect and transmit detailed data ✦ Processing lag - time required to process incoming data streams to make them ready for applications may exceed requirements ✦ Complexity - complicated mix of technologies and tools creates inconsistency and operations burdens
  203. 203. WHAT’S NEEDED? 168 ✦ Simplified infrastructure for data movement and processing ✦ Performance and scalability to keep up with data ✦ Ability to process, understand and act on data wherever it is Resilient, scalable data movement From edge to cloud to datacenter (and back) Unified platform Consistent development and processing environment across edge, cloud, datacenter Intelligence everywhere Dynamically filter, process, analyze and route data as needed at edge, cloud and datacenter
  204. 204. IOT DATA FABRIC 169 Apache Pulsar Edge Cloud Datacenter Integrated solution for event data movement, processing and storage Scalable for deployment across edge, cloud and datacenter Simple framework for filtering, transformation, enrichment, analytics Built on Apache Pulsar open source technology, proven at massive scale
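As an illustration of the filtering and enrichment role mentioned on this slide, here is a minimal sketch written in the style of a native Python Pulsar function, that is, a module exposing `process(input)`. The JSON message layout, the field names `temperature_c` and `alert`, and the threshold are hypothetical, and the assumption that returning `None` publishes nothing to the output topic should be checked against the Pulsar Functions documentation for the version in use.

```python
import json

# Hypothetical threshold; readings below it are treated as noise at the edge.
TEMPERATURE_LIMIT_C = 80.0


def process(input):
    """Drop normal sensor readings and enrich abnormal ones with an alert tag."""
    reading = json.loads(input)
    if reading.get("temperature_c", 0.0) < TEMPERATURE_LIMIT_C:
        return None  # assumed to mean: emit no output message
    reading["alert"] = "over_temperature"
    return json.dumps(reading, sort_keys=True)


# Local check without a Pulsar cluster.
dropped = process('{"device": "d1", "temperature_c": 20}')
forwarded = process('{"device": "d1", "temperature_c": 95}')
```

Because the function has no dependency on the Pulsar client library, the same logic can be unit-tested locally and then deployed at the edge, in the cloud, or in the datacenter.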
  205. 205. IOT ARCHITECTURE WITH APACHE PULSAR 170 Devices Gateways Data Collectors Data Transport Processing Repositories Applications Apache Pulsar
  206. 206. WRAPPING UP … 171
  207. 207. SERVERLESS: MISSING PIECES 172 SLA Guarantees Performance guarantees, Performance isolation Security Side-channels, Information leakage via network communications Heterogeneous Hardware FPGAs, GPUs, TPUs, etc.
  208. 208. 173 ✦ Increased co-residency: side-channels ๏ Rowhammer attacks on DRAM [1] ๏ Exploiting Micro-architectural vulnerabilities ✦ Information leakage via network communications ✦ Potential solutions ๏ Hardware-level security and isolation ๏ Light-weight and secure container isolation ๏ Task-placement strategies Security MISSING PIECES: SECURITY
  209. 209. 174 ✦ Increased multiplexing = less predictable performance ๏ Resource-allocation delays ๏ Scheduling delays ๏ Cold-start latencies ✦ Potential solutions ๏ Hardware-level isolation, container-level isolation ๏ Bin-packing based on performance needs (throughput, latency) ๏ Bin-packing based on complementary resource needs MISSING PIECES: SLA GUARANTEES SLA Guarantees
  210. 210. 175 ✦ Only CPU resources, no hardware heterogeneity ๏ GPU ๏ TPU ๏ FPGAs ✦ Not fundamental, providers eventually will offer them ✦ Leads to new opportunities: ๏ Greater degree of multiplexing for different resource types ๏ Bin-pack applications with different hardware needs MISSING PIECES: HETEROGENEOUS HARDWARE Heterogeneous Hardware
  211. 211. 176 ✦ Serverless enables: ๏ Complexity hiding ๏ Consumption based billing ๏ Automatic scaling ✦ All players benefit: ๏ Developers (simpler programming) ๏ Enterprises (lower costs) ๏ Cloud providers (high resource utilization) ✦ Future Serverless infrastructures will address today’s shortcomings ๏ Security, SLA guarantees, Heterogeneous hardware. SERVERLESS IS THE FUTURE
  214. 214. 179 ACKNOWLEDGEMENTS RACHIT AGARWAL, ION STOICA, ADITYA AKELLA, ERIC JONAS, JOHANN SCHLEIER-SMITH, VIKRAM SREEKANTI, CHIA-CHE TSAI, QIFAN PU, VAISHAAL SHANKAR, JOAO MENEZES CARREIRA, KARL KRAUTH, NEERAJA YADWADKAR, JOSEPH GONZALEZ, RALUCA ADA POPA, DAVID A. PATTERSON
  215. 215. READINGS 180
  216. 216. SERVERLESS Peeking Behind The Curtains Of Serverless Platforms [Wang et al. 2018] The Serverless Data Center : Hardware Disaggregation Meets Serverless Computing [Pemberton and Schleier-Smith, 2019] A Berkeley View On Serverless Computing [Jonas et al. 2018] SAND: Towards High-Performance Serverless Computing [Akkus et al. 2018] The Server Is Dead, Long Live The Server: Rise Of Serverless Computing, Overview Of Current State And Future Trends In Research And Industry [Castro et al. 2019] Agile Cold Starts For Scalable Serverless [Mohan et al. 2019] 181
  217. 217. SERVERLESS 182 Trust More, Serverless [Brenner and Kapitza, 2019] Clemmys: towards secure remote execution in FaaS [Trach et al. 2019] No More, No Less - A Formal Model For Serverless Computing [Gabbrielli et al. 2019] Serverless Computing: One Step Forward, Two Steps Back [Hellerstein et al. 2019] Formal Foundations Of Serverless Computing [Jangda et al. 2019]
  218. 218. SERVERLESS ANALYTICS/MACHINE LEARNING 183 numpywren: serverless linear algebra [Shankar et al. 2018] Shuffling, Fast and Slow: Scalable Analytics on Serverless Infrastructure [Pu et al. 2019] A Serverless Real-Time Data Analytics Platform for Edge Computing [Nastic et al. 2017] Serving deep learning models in a serverless platform [Ishakian et al. 2017] A Case for Serverless Machine Learning [Carreira et al. 2018] BARISTA: Efficient and Scalable Serverless Serving System for Deep Learning Prediction Services [Bhattacharjee et al. 2019] Serverless Data Analytics with Flint [Kim and Lin 2018] Exploring Serverless Computing for Neural Network Training [Feng et al. 2018]
  219. 219. ACCELERATED STOCHASTIC GRADIENT DESCENT 184 On the momentum term in gradient descent learning algorithms [Qian 1999] Accelerating stochastic gradient descent using predictive variance reduction [Johnson and Zhang 2013] A method for unconstrained convex minimization problem with the rate of convergence O(1/k^2) [Nesterov 1983] Adaptive Subgradient Methods for Online Learning and Stochastic Optimization [Duchi et al. 2011] Incorporating Nesterov Momentum into Adam [Dozat 2016] Adam: a Method for Stochastic Optimization [Kingma and Ba 2015] Fast Stochastic Variance Reduced Gradient Method with Momentum Acceleration for Machine Learning [Shang et al. 2017] On the Convergence of Adam and Beyond [Reddi et al. 2019]
  220. 220. OPTIMIZATION 185 [Drineas and Mahoney 2016] RandNLA: Randomized Numerical Linear Algebra [Gupta et al. 2019] OverSketched Newton: Fast Convex Optimization for Serverless Systems [Boyd et al. 2010] Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers [Parikh and Boyd, 2014] Proximal Algorithms [Roosta et al. 2018] Newton-MR: Newton’s method without smoothness or convexity
  221. 221. APPROXIMATION A stochastic approximation method [Robbins and Monro 1951] On a stochastic approximation method [Chung et al. 1954] An analysis of approximations for maximizing submodular set functions - I [Nemhauser et al. 1978] An analysis of approximations for maximizing submodular set functions - II [Nemhauser et al. 1978] Accelerated greedy algorithms for maximizing submodular set functions [Minoux 1978] 186
  222. 222. MEMBERSHIP 187 A general-purpose counting filter: Making every bit count [Pandey et al. 2017] Multiple Set Matching and Pre-Filtering with Bloom Multifilters [Concas et al. 2019] Cuckoo filter: Practically better than Bloom [Fan et al. 2014] Improving retouched bloom filter for trading off selected false positives against false negatives [Donnet et al. 2010] Bloom filters in adversarial environments [Naor and Yogev 2015] Bloom Filters, Adaptivity, and the Dictionary Problem [Bender et al. 2018] Don’t thrash: how to cache your hash on flash [Bender et al. 2012] The bloomier filter: an efficient data structure for static support lookup tables [Chazelle et al. 2004]
  223. 223. FREQUENT ELEMENTS [Sivaraman et al. 2017] Heavy-Hitter Detection Entirely in the Data Plane [Roy et al. 2016] Augmented Sketch: Faster and more Accurate Stream Processing [Aghazadeh et al. 2018] MISSION: Ultra Large-Scale Feature Selection using Count-Sketches [Harrison et al. 2018] Network-Wide Heavy Hitter Detection with Commodity Switches 188
  224. 224. CARDINALITY ESTIMATION NEURAL NETWORK BASED APPROACHES Cardinality estimation with local deep learning models [Woltmann et al. 2019] Learned Cardinalities: Estimating Correlated Joins with Deep Learning [Kipf et al. 2018] Cardinality estimation using neural networks [Liu et al. 2015] An Empirical Analysis of Deep Learning for Cardinality Estimation [Ortiz et al. 2019] 189
  225. 225. ✦ Federated Optimization: Distributed Machine Learning for On-Device Intelligence [Konečný et al. 2016] ✦ Communication-Efficient Learning of Deep Networks from Decentralized Data [McMahan et al. 2016] ✦ Federated Learning: Strategies for Improving Communication Efficiency [Konečný et al. 2016] ✦ Towards Federated Learning at Scale: System Design [Bonawitz et al. 2019] ✦ Asynchronous Federated Optimization [Xie et al. 2019] ✦ Federated Heavy Hitters with Differential Privacy [Zhu et al. 2019] FEDERATED LEARNING 190
  226. 226. RESOURCES 191
  227. 227. ON THE WWW 192 Serverless deep/machine learning in production - the pythonic way https://medium.com/@waya.ai/deploy-deep-machine-learning-in-production-the-pythonic-way-a17105f1540e Serverless Inference https://github.com/castorini/serverless-inference Serverless Architectures https://martinfowler.com/articles/serverless.html An overview of gradient descent optimization algorithms http://ruder.io/optimizing-gradient-descent/ Amazon Elastic Inference https://aws.amazon.com/machine-learning/elastic-inference OpenLambda - An open source serverless computing platform https://github.com/open-lambda/open-lambda Serverless Predictions at Scale http://aws-de-media.s3.amazonaws.com/images/AWS_Summit_2018/June6/Doppler/Serverless_Predictions_At_Scale.pdf
