
[232] MIST: A High-Performance IoT Stream Processing System


MIST: a high-performance IoT stream processing system

Published in: Technology


  1. 1. MIST: Towards Large-Scale IoT Stream Processing. 이계원, 엄태건. Joint work with 전병곤, 조성우, 김경태, 이정길, 이산하
  2. 2. Many IoT devices in various places produce continuous data streams: heartbeat, location, temperature, humidity, …. (Icons made by Freepik, Icon Pond, and Roundicons from www.flaticon.com, licensed under CC 3.0 BY)
  3. 3. * : IoT stream query. Temperature data stream. Q1: Adjust the air conditioner's cooling temperature. Q2: Adjust the fan speed of the electric fan
  4. 4. * : IoT stream query. Temperature data stream. Q1: Adjust the air conditioner's cooling temperature. Q2: Adjust the fan speed of the electric fan ● Long-running ● Small data streams ● Large numbers ● Various types
  5. 5. Our Scope IoT Stream Processing System * : IoT stream query
  6. 6. Our Scope IoT Stream Processing System * : IoT stream query Focus of this work: How to efficiently handle billions of IoT stream queries in a cluster of machines?
  7. 7. Current Stream Processing Systems ● Optimized for handling a small number of big stream queries
  8. 8. MIST
  9. 9. User & Application Building Manager (User) Building Management Application (Android, iOS, Web, ...) MIST “I want to monitor a room!”
  10. 10. User & Application Building Manager (User) Building Management Application (Android, iOS, Web, ...) MIST “I want to monitor a room!” “OK… I will submit the necessary query for you using MIST API”
  11. 11. User & Application Building Manager (User) Building Management Application (Android, iOS, Web, ...) MIST “I want to monitor a room!” “OK… I will submit the necessary query for you using MIST API” “I will give you notifications when something happens!”
  12. 12. MIST Architecture. User, App / Client, Query Submit (DAG, CEP, …). Cluster of machines: MIST Master and three MIST Processing Engines
  13. 13. MIST Architecture. User, App / Client. Cluster of machines: MIST Master and three MIST Processing Engines. 1. A query is submitted to the MIST Master
  14. 14. MIST Architecture. User, App / Client. Cluster of machines: MIST Master and three MIST Processing Engines. 2. The MIST Master assigns the query to a MIST Processing Engine
  15. 15. MIST Architecture. User, App / Client. Cluster of machines: MIST Master and three MIST Processing Engines. 3. Many IoT stream queries are processed in a cluster of machines
  16. 16. MIST Architecture. User, App / Client, Query Submit (DAG, CEP, …). Cluster of machines: MIST Master and three MIST Processing Engines
  17. 17. MIST Front-end
  18. 18. MIST Query API ●MIST provides query API for application developers ○ Implemented in Java 8 ○ Support UDFs (User-Defined Functions) in the form of Java lambda function ■ Ex) Map, Filter, … ■ Provide more flexible programming model than SQL
  19. 19. MIST Query API ●MIST supports two types of query APIs ○ Dataflow Model ■ Support low-level query construction using UDF ○ Complex Event Processing (CEP) ■ Support high-level pattern detection
  20. 20. MIST Dataflow Query Example ●Simple Noise Sensing Query Noise sensors inside the building MQTT Broker Building manager MIST MQTT Publish MQTT Subscribe (Noti) MQTT Pub/Sub
  21. 21. How to Define and Submit an IoT Stream Query? ●Configure the input stream source ●Define operations on how the input events are transformed ●Configure the output sink ●Submit the query to the MIST master
  22. 22. MIST Dataflow Query Example public static void main(final String[] args) { final SourceConfiguration mqttSourceConf = MQTTSourceConfiguration.newBuilder() .setTopic("snu/building302/room420/noisesensor") .setBroker("tcp://mqtt_broker_address:1883") .build(); ... Configure MQTT source
  23. 23. MIST Dataflow Query Example final MISTQueryBuilder queryBuilder = new MISTQueryBuilder("room_noise_sensing"); final ContinuousStream<Integer> sensedData = queryBuilder.mqttStream(mqttSourceConf) .map((mqttMessage) -> new String(mqttMessage.getPayload())) .map(stringData -> Integer.parseInt(stringData)); Set application name
  24. 24. MIST Dataflow Query Example final MISTQueryBuilder queryBuilder = new MISTQueryBuilder("room_noise_sensing"); final ContinuousStream<Integer> sensedData = queryBuilder.mqttStream(mqttSourceConf) .map((mqttMessage) -> new String(mqttMessage.getPayload())) .map(stringData -> Integer.parseInt(stringData)); Get data from MQTT source
  25. 25. MIST Dataflow Query Example final MISTQueryBuilder queryBuilder = new MISTQueryBuilder("room_noise_sensing"); final ContinuousStream<Integer> sensedData = queryBuilder.mqttStream(mqttSourceConf) .map((mqttMessage) -> new String(mqttMessage.getPayload())) .map(stringData -> Integer.parseInt(stringData)); map() transforms the incoming MQTT message into an integer value
  26. 26. MIST Dataflow Query Example sensedData .filter(value -> value < 200) .map(value -> new MqttMessage("Noisy".getBytes())) .mqttOutput("tcp://mqtt_broker_address:1883", "snu/building302/room420/monitor"); final MISTQuery query = queryBuilder.build(); Notify if the room is noisy
  27. 27. MIST Dataflow Query Example sensedData .filter(value -> value < 200) .map(value -> new MqttMessage("Noisy".getBytes())) .mqttOutput("tcp://mqtt_broker_address:1883", "snu/building302/room420/monitor"); final MISTQuery query = queryBuilder.build(); Send the notification via MQTT
  28. 28. MIST Dataflow Query Example sensedData .filter(value -> value < 200) .map(value -> new MqttMessage("Noisy".getBytes())) .mqttOutput("tcp://mqtt_broker_address:1883", "snu/building302/room420/monitor"); final MISTQuery query = queryBuilder.build(); Build the query
  29. 29. MIST Dataflow Query Example final MISTExecutionEnvironment executionEnvironment = new MISTDefaultExecutionEnvironmentImpl( "mist_master_address", mistPort); final QueryControlResult result = executionEnvironment.submit(query, jarPath); System.out.println(result); } Submit the query to MIST Master
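The decode, parse, filter, and notify steps of the noise-sensing query can be tried without a MIST cluster or an MQTT broker. The following is a minimal sketch that uses only `java.util.stream`, with byte arrays standing in for MQTT payloads. `NoisePipelineSketch` and `process` are hypothetical names for illustration and are not part of the MIST API; the predicate is copied unchanged from the slides.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for the slide's query; it does not use the MIST API.
final class NoisePipelineSketch {

  // Applies the same steps as the query: decode the payload, parse it as an
  // integer, keep readings that pass the slide's predicate, and emit one
  // notification text per surviving reading.
  static List<String> process(final List<byte[]> payloads) {
    return payloads.stream()
        .map(payload -> new String(payload, StandardCharsets.UTF_8))
        .map(Integer::parseInt)
        .filter(value -> value < 200) // same predicate as on the slides
        .map(value -> "Noisy")
        .collect(Collectors.toList());
  }

  public static void main(final String[] args) {
    final List<byte[]> payloads = Arrays.asList(
        "150".getBytes(StandardCharsets.UTF_8),
        "250".getBytes(StandardCharsets.UTF_8),
        "90".getBytes(StandardCharsets.UTF_8));
    // Two of the three readings pass the filter.
    System.out.println(process(payloads));
  }
}
```

In the real query the source and sink are MQTT topics and the chain runs continuously; here the input is a finite list so the result can be inspected directly.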
  30. 30. Demo: Noise Sensing Query
  31. 31. MIST CEP Query ●Complex Event Processing enables higher-level pattern detection on stream data ●CEP query consists of ○ Event Pattern which meets Qualification ○ Action ●MIST transforms high-level CEP queries into DAG before running them
  32. 32. MIST CEP Query Example ●Find a sequence of heart rates ○ Higher than the normal upper heart rate limit designated by a doctor ○ Showing ascending pattern ○ In recent 5 minutes ●Notify through MQTT when finding the abnormal pattern
  33. 33. MIST CEP Query Example final MISTCepQuery<CepHRClass> cepQuery = new MISTCepQuery.builder<CepHRClass>("bpm_monitor") .input(mqttInput) .setEventSequence(eventD, eventP) .setQualifier( … ) .within(300000) .setAction(mqttNotify) .build();
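The pattern described for this CEP query (readings above a limit, ascending, within the last 5 minutes) can be sketched as a small stateful detector in plain Java. This is an illustration of the pattern only, not how MIST compiles CEP queries into a DAG. `HeartRatePatternSketch`, `onReading`, and the `minRunLength` parameter are assumptions; the slides do not say how many readings make up a sequence.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical detector for the heart-rate pattern; not part of the MIST API.
final class HeartRatePatternSketch {
  private static final long WINDOW_MS = 300_000L; // 5 minutes, as in within(300000)

  private final int upperLimit;   // limit designated by a doctor
  private final int minRunLength; // assumed: readings needed to report a pattern
  // Current ascending run; each element is {timestampMs, bpm}.
  private final Deque<long[]> run = new ArrayDeque<>();

  HeartRatePatternSketch(final int upperLimit, final int minRunLength) {
    this.upperLimit = upperLimit;
    this.minRunLength = minRunLength;
  }

  // Feeds one reading and returns true while the abnormal pattern is present.
  boolean onReading(final long timestampMs, final int bpm) {
    // A reading at or below the limit, or one that does not ascend, ends the run.
    if (bpm <= upperLimit || (!run.isEmpty() && bpm <= run.peekLast()[1])) {
      run.clear();
    }
    // A reading above the limit extends the run or starts a new one.
    if (bpm > upperLimit) {
      run.addLast(new long[] {timestampMs, bpm});
    }
    // Drop readings that fell out of the 5-minute window.
    while (!run.isEmpty() && timestampMs - run.peekFirst()[0] > WINDOW_MS) {
      run.removeFirst();
    }
    return run.size() >= minRunLength;
  }

  public static void main(final String[] args) {
    final HeartRatePatternSketch detector = new HeartRatePatternSketch(100, 3);
    System.out.println(detector.onReading(0L, 105));
    System.out.println(detector.onReading(60_000L, 110));
    System.out.println(detector.onReading(120_000L, 120));
  }
}
```

In the slide's query the action on a match is an MQTT notification; the boolean return value plays that role here.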
  34. 34. Demo: CEP Abnormal Heart Rate Detection
  35. 35. MIST Back-end
  36. 36. MIST Architecture Revisited. User, App / Client, Query Submit (DAG, CEP, …). Cluster of machines: MIST Master and three MIST Processing Engines
  37. 37. MIST in a single machine. User, App / Client, Query Submit (DAG, CEP, …). Cluster of machines: MIST Master and three MIST Processing Engines. How to process many IoT queries in a single machine?
  38. 38. MIST in a single machine. User, App / Client, Query Submit (DAG, CEP, …). Cluster of machines: MIST Master and three MIST Processing Engines. How to process many IoT queries in a cluster of machines? (In progress)
  39. 39. MIST in a Single Machine
  40. 40. Design Principle: Reuse system resources as much as possible! 1. Code sharing 2. Exploit the locality of code references 3. Query merging
  41. 41. Design Principle: Reuse system resources as much as possible! 1. Code sharing 2. Exploit the locality of code references 3. Query merging
  42. 42. 1.Code sharing src map sink User-Defined Function (temp) -> { if (temp > threshold) { return "action:speed:fast"; } ... }Query 1 Compiled code Room A
  43. 43. 1.Code sharing src map sink User-Defined Function (temp) -> { if (temp > threshold) { return "action:speed:fast"; } ... }Query 1 Compiled code src map sink User-Defined Function (temp) -> { if (temp > threshold) { return "action:speed:fast"; } ... } Query 2 Compiled code Bad! Room A Room B
  44. 44. 1.Code sharing User-Defined Function (temp) -> { if (temp > threshold) { return "action:speed:fast"; } ... } Compiled code User-Defined Function (temp) -> { if (temp > threshold) { return "action:speed:fast"; } ... } Code sharing ⇒ Reduce working set size of code references Query1 Query2 Great!
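One way to picture code sharing is a registry that hands every query with the same UDF code the same function object, so the code is loaded once. The sketch below is a simplification under stated assumptions: `SharedUdfRegistry`, `getOrLoad`, and the string `codeId` are hypothetical, and the `"action:speed:slow"` branch is invented to complete the lambda that the slide elides. It does not show how MIST itself identifies identical code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical registry: queries that carry the same UDF code receive one
// shared function object instead of one copy per query.
final class SharedUdfRegistry {
  private final Map<String, Function<Integer, String>> byCodeId = new HashMap<>();

  // codeId is an assumed identifier for "the same code". The loader runs only
  // the first time a given codeId is seen.
  Function<Integer, String> getOrLoad(final String codeId,
                                      final Supplier<Function<Integer, String>> loader) {
    return byCodeId.computeIfAbsent(codeId, id -> loader.get());
  }

  int loadedCount() {
    return byCodeId.size();
  }

  public static void main(final String[] args) {
    final SharedUdfRegistry registry = new SharedUdfRegistry();
    final int threshold = 28; // assumed value; the slide leaves it unspecified
    final Supplier<Function<Integer, String>> fanUdf =
        () -> temp -> temp > threshold ? "action:speed:fast" : "action:speed:slow";

    // Room A and Room B submit the same code under the same identifier.
    final Function<Integer, String> roomA = registry.getOrLoad("fan-udf", fanUdf);
    final Function<Integer, String> roomB = registry.getOrLoad("fan-udf", fanUdf);

    System.out.println(roomA == roomB);         // both queries hold one object
    System.out.println(registry.loadedCount()); // one entry for two queries
  }
}
```

With one object per distinct UDF, the working set of code grows with the number of distinct UDFs rather than with the number of queries.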
  45. 45. Design Principle: Reuse system resources as much as possible! 1. Code sharing 2. Exploit the locality of code references 3. Query merging
  46. 46. Instruction cache (size = 2). Event queue: e1, e2, e3, e4, e5, e6, e7, e8, e9. UDF1, UDF2, UDF3. Query1, Query2, Query3, Query4, Query5, Query6, Query7, Query8, Query9
  47. 47. Instruction cache (size = 2). Event queue: e1, e2, e3, e4, e5, e6, e7, e8, e9. UDF1, UDF2, UDF3. Query1, Query2, Query3, Query4, Query5, Query6, Query7, Query8, Query9. Bad! Frequent cache misses!
  48. 48. Instruction cache (size = 2). Event queue: e1, e2, e3, e4, e5, e6, e7, e8, e9. UDF1, UDF2, UDF3. Query1, Query2, Query3, Query4, Query5, Query6, Query7, Query8, Query9. Cache misses 9 ⇒ 3! Great!
  49. 49. Instruction cache (size = 2). Event queue: e1, e2, e3, e4, e5, e6, e7, e8, e9. UDF1, UDF2, UDF3. Query1, Query2, Query3, Query4, Query5, Query6, Query7, Query8, Query9. Cache misses 9 ⇒ 3! Great! How to realize this event processing mechanism?
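The drop from 9 misses to 3 can be reproduced with a toy model. The sketch below treats the instruction cache as an LRU cache that holds two UDFs and counts misses for two processing orders: arrival order (events cycle through UDF1, UDF2, UDF3) and grouped order (all events for one UDF before the next). LRU replacement and the name `CodeLocalitySketch` are assumptions for illustration; a real instruction cache works on cache lines, not whole functions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of the slide: an LRU "instruction cache" that holds `capacity` UDFs.
final class CodeLocalitySketch {

  // Counts how many events find their UDF absent from the cache.
  static int countMisses(final int[] udfPerEvent, final int capacity) {
    // Access-ordered LinkedHashMap that evicts the least recently used entry.
    final Map<Integer, Boolean> cache =
        new LinkedHashMap<Integer, Boolean>(16, 0.75f, true) {
          @Override
          protected boolean removeEldestEntry(final Map.Entry<Integer, Boolean> eldest) {
            return size() > capacity;
          }
        };
    int misses = 0;
    for (final int udf : udfPerEvent) {
      if (cache.get(udf) == null) { // get() also refreshes recency on a hit
        misses++;
        cache.put(udf, Boolean.TRUE);
      }
    }
    return misses;
  }

  public static void main(final String[] args) {
    // e1..e9 in arrival order: events cycle through UDF1, UDF2, UDF3.
    final int[] arrivalOrder = {1, 2, 3, 1, 2, 3, 1, 2, 3};
    // The same nine events processed group by group.
    final int[] groupedOrder = {1, 1, 1, 2, 2, 2, 3, 3, 3};
    System.out.println(countMisses(arrivalOrder, 2)); // 9
    System.out.println(countMisses(groupedOrder, 2)); // 3
  }
}
```

In arrival order every event evicts the UDF that is needed two events later, so all nine accesses miss; in grouped order only the first event of each group misses.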
  50. 50. Exploit the locality of code references: Group-Aware Execution Model ● Fixed Number of Threads ● Query Grouping ● Group Assignment ● Group Reassignment
  51. 51. Exploit the locality of code references: Fixed number of threads (1/4) Thread 1 Thread 2
  52. 52. Exploit the locality of code references: Query Grouping (2/4) Query1 Query4 Query7 UDF1 e 1 e 4 e 7 Query2 Query5 Query8 e 2 e 5 e 8 UDF2 Thread 1 Thread 2
  53. 53. Query1 Query4 Query7 UDF1 e 1 e 4 e 7 Exploit the locality of code references: Group Assignment (3/4) Query2 Query5 Query8 e 2 e 5 e 8 UDF2 Query3 Query6 Query9 e 3 e 6 e 9 UDF3 Thread 1 Thread 2
  54. 54. Query1 Query4 Query7 UDF1 e 1 e 4 e 7 Exploit the locality of code references: Group Assignment (3/4) Query2 Query5 Query8 e 2 e 5 e 8 UDF2 Query3 Query6 Query9 e 3 e 6 e 9 UDF3 Thread 1 Thread 2 UDF4 ? ?
  55. 55. Exploit the locality of code references: Group Assignment (3/4) λ: Event arrival rate μ: Event process rate Load = λ / μ
  56. 56. Query1 Query4 Query7 UDF1 e 1 e 4 e 7 Exploit the locality of code references: Group Assignment (3/4) Query2 Query5 Query8 e 2 e 5 e 8 UDF2 Query3 Query6 Query9 e 3 e 6 e 9 UDF3 Thread 1 Thread 2 Load=0.3 Load=0.2 Load=0.2 Load=0.5 Load=0.2
  57. 57. Query1 Query4 Query7 UDF1 e 1 e 4 e 7 Exploit the locality of code references: Group Assignment (3/4) Query2 Query5 Query8 e 2 e 5 e 8 UDF2 Query3 Query6 Query9 e 3 e 6 e 9 UDF3 Load=0.3 Load=0.2 Load=0.2 UDF4 Load=0.5 Load=0.2Thread 1 Thread 2
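The assignment step can be sketched as follows, assuming the policy is "place a new group on the thread with the smallest total load" and reading the diagram as Thread 1 carrying 0.3 + 0.2 = 0.5 and Thread 2 carrying 0.2. Both the policy and that reading are assumptions, as are the class name `GroupAssignmentSketch` and the arrival and processing rates chosen for the UDF4 group.

```java
// Hypothetical sketch of least-loaded group assignment; not MIST's actual code.
final class GroupAssignmentSketch {

  // Load of a query group: event arrival rate (lambda) divided by event
  // processing rate (mu).
  static double load(final double arrivalRate, final double processRate) {
    return arrivalRate / processRate;
  }

  // Returns the index of the thread whose current total load is smallest.
  static int pickThread(final double[] threadLoads) {
    int best = 0;
    for (int i = 1; i < threadLoads.length; i++) {
      if (threadLoads[i] < threadLoads[best]) {
        best = i;
      }
    }
    return best;
  }

  public static void main(final String[] args) {
    // Assumed reading of the slide: thread 1 runs the UDF1 (0.3) and UDF2 (0.2)
    // groups, thread 2 runs the UDF3 (0.2) group.
    final double[] threadLoads = {0.3 + 0.2, 0.2};
    // Assumed rates for the new UDF4 group: 20 events/s arriving, 100 events/s processed.
    final double udf4Load = load(20.0, 100.0);
    final int target = pickThread(threadLoads);
    threadLoads[target] += udf4Load;
    System.out.println("UDF4 -> thread " + (target + 1));
  }
}
```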
  58. 58. Query1 Query4 Query7 UDF1 e 1 e 4 e 7 Exploit the locality of code references: Group Reassignment (4/4) Query2 Query5 Query8 e 2 e 5 e 8 UDF2 Query3 Query6 Query9 e 3 e 6 e 9 UDF3 Load=0.3 Load=0.2 Load=0.2 Load=0.5 Load=0.2Thread 1 Thread 2
  59. 59. Query1 Query4 Query7 UDF1 e 1 e 4 e 7 Exploit the locality of code references: Group Reassignment (4/4) Query2 Query5 Query8 e 2 e 5 e 8 UDF2 Query3 Query6 Query9 e 3 e 6 e 9 UDF3 Load=0.7 Load=0.2 Load=0.2 Load=0.9 (Overloaded) Load=0.2 Thread 1 Thread 2. Thresholds: Load >= 0.9 ~ Overloaded, Load < 0.7 ~ Underloaded
  60. 60. Query1 Query4 Query7 UDF1 e 1 e 4 e 7 Exploit the locality of code references: Group Reassignment (4/4) Query2 Query5 Query8 e 2 e 5 e 8 UDF2 Query3 Query6 Query9 e 3 e 6 e 9 UDF3 Load=0.7 Load=0.2 Load=0.2 Load=0.7 Load=0.4Thread 1 Thread 2
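The reassignment step can be sketched with the thresholds from the slides (overloaded at load >= 0.9, underloaded below 0.7). The sketch assumes the lightest group is moved first and that a move is allowed only if the receiving thread stays underloaded; neither rule is stated in the slides. Loads are kept as integer percentages because 0.7 + 0.2 does not compare equal to 0.9 in floating point. `GroupReassignmentSketch` and `rebalance` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of group reassignment; loads are integer percentages.
final class GroupReassignmentSketch {
  static final int OVERLOADED = 90;  // load >= 0.9 per the slides
  static final int UNDERLOADED = 70; // load < 0.7 per the slides

  static int total(final List<Integer> groupLoads) {
    int sum = 0;
    for (final int load : groupLoads) {
      sum += load;
    }
    return sum;
  }

  // Moves the lightest groups from an overloaded thread to another thread as
  // long as the receiver stays underloaded. Returns the number of groups moved.
  static int rebalance(final List<Integer> from, final List<Integer> to) {
    int moved = 0;
    while (total(from) >= OVERLOADED && from.size() > 1) {
      final Integer lightest = Collections.min(from);
      if (total(to) + lightest >= UNDERLOADED) {
        break; // moving it would push the receiver out of the underloaded range
      }
      from.remove(lightest); // Integer argument selects remove(Object), not remove(int)
      to.add(lightest);
      moved++;
    }
    return moved;
  }

  public static void main(final String[] args) {
    final List<Integer> thread1 = new ArrayList<>(Arrays.asList(70, 20)); // total 90: overloaded
    final List<Integer> thread2 = new ArrayList<>(Arrays.asList(20));     // total 20: underloaded
    rebalance(thread1, thread2);
    System.out.println(total(thread1) + " / " + total(thread2)); // 70 / 40
  }
}
```

The result matches the last slide of the sequence, where Thread 1 ends at 0.7 and Thread 2 at 0.4.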
  61. 61. Design Principle: Reuse system resources as much as possible! 1. Code sharing 2. Exploit the locality of code references 3. Query merging
  62. 62. src map sink Same UDF Query 1 src map sink Query 2 Query Merging Same UDF Bad!
  63. 63. src map sink Same UDF Query 1 src map sink Query 2 Query Merging Same UDF Process same data stream
  64. 64. src map sink Same UDF Query 1 src map sink Query 2 Query Merging Same UDF Have same operations
  65. 65. sink Query 1 src map sink Query 2 Query Merging Same UDF Merge two queries! Great!
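Query merging can be sketched as keeping one `src -> map` chain per (source, UDF) pair and attaching every matching query's sink to it, so the UDF runs once per event instead of once per query. The merge key used here (a topic string plus a UDF identifier) and the names `QueryMergingSketch`, `submit`, and `onEvent` are assumptions for illustration, not MIST's actual merging algorithm.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical sketch: queries with the same source and the same UDF share
// one src -> map chain and differ only in their sinks.
final class QueryMergingSketch {

  private static final class MergedQuery {
    final Function<Integer, String> udf;
    final List<Consumer<String>> sinks = new ArrayList<>();

    MergedQuery(final Function<Integer, String> udf) {
      this.udf = udf;
    }
  }

  // topic -> udfId -> merged chain; (topic, udfId) is the assumed merge key.
  private final Map<String, Map<String, MergedQuery>> byTopic = new HashMap<>();
  private int udfCalls = 0;

  // Registers a query; a second query with the same key only adds its sink.
  void submit(final String topic, final String udfId,
              final Function<Integer, String> udf, final Consumer<String> sink) {
    byTopic.computeIfAbsent(topic, t -> new HashMap<>())
        .computeIfAbsent(udfId, id -> new MergedQuery(udf))
        .sinks.add(sink);
  }

  // Delivers one event: each merged chain runs its UDF once, then fans out.
  void onEvent(final String topic, final int value) {
    final Map<String, MergedQuery> chains =
        byTopic.getOrDefault(topic, Collections.emptyMap());
    for (final MergedQuery query : chains.values()) {
      udfCalls++;
      final String result = query.udf.apply(value);
      for (final Consumer<String> sink : query.sinks) {
        sink.accept(result);
      }
    }
  }

  int udfCalls() {
    return udfCalls;
  }

  public static void main(final String[] args) {
    final QueryMergingSketch engine = new QueryMergingSketch();
    final Function<Integer, String> udf = temp -> "temp=" + temp;
    engine.submit("room/temp", "fmt-udf", udf, out -> System.out.println("sink 1: " + out));
    engine.submit("room/temp", "fmt-udf", udf, out -> System.out.println("sink 2: " + out));
    engine.onEvent("room/temp", 30);
    System.out.println("UDF calls: " + engine.udfCalls());
  }
}
```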
  66. 66. Single Machine Evaluation
  67. 67. Evaluation Environment ● Environment: 28-core NUMA machine (35M cache, 8x 16GB RDIMM) ● Data transfer protocol: MQTT (a lightweight messaging protocol for IoT) ● Metrics: Max. # of queries ● Baseline: Flink & Thread-Per-Query (TPQ) ● # of queries per code: 100. Setup: Data Stream Generator, MQTT Broker (EMQ), MIST (or Flink, TPQ)
  68. 68. Performance Comparison: the number of queries that can be processed with < 10 ms latency. [Chart; speedup labels: 13.6x, 375x, 3.18x, 87.5x]
  69. 69. MIST in a Cluster of Machines (Ongoing work)
  70. 70. Ongoing work ●Distributed Masters ○ Prevent a bottleneck, No single point of failure ●Load balancing among nodes ○ Query Allocation, Dynamic Query Migration ●Fault tolerance ○ Checkpointing, Upstream backup
  71. 71. Summary ●Processes a large number of IoT stream queries efficiently ●Techniques for scaling up stream processing ○ Code sharing ○ Exploiting the locality of code references ○ Query merging ●On a single machine, MIST handles 375x more queries than Apache Flink and 13.6x more than TPQ
  72. 72. We will release MIST as an open-source project soon! We look forward to contributions from many developers! Contact: mist@spl.snu.ac.kr Software Platform Lab Site: http://spl.snu.ac.kr
  73. 73. MIST: Towards Large-Scale IoT Stream Processing. 이계원, 엄태건. Joint work with 전병곤, 조성우, 김경태, 이정길, 이산하
