
My Ph.D. Defense - Software-Defined Systems for Network-Aware Service Composition and Workflow Placement

The presentation slides of my Ph.D. thesis. For more information - https://kkpradeeban.blogspot.com/2019/07/my-phd-defense-software-defined-systems.html

  1. Software-Defined Systems for Network-Aware Service Composition and Workflow Placement. Pradeeban Kathiravelu. Supervisors: Prof. Luís Veiga and Prof. Peter Van Roy. Lisboa, Portugal. July 1st, 2019.
  2. Introduction ● Service providers and tenants in the cloud ecosystem. – Challenges in interoperability and control. ● Network Softwarization: management, control, & reusability. ● Network Softwarization typically focuses on a single provider. ● Network-awareness for multi-domain workflows.
  3. Network Softwarization ● Software-Defined Networking (SDN) ● Network Functions Virtualization (NFV) – Network middleboxes → Virtual Network Functions (VNFs) ● Software-Defined Systems (SDS) – Storage, security, data centers, etc. – Improved configurability.
  4. Motivation ● Enhanced control for tenants in service workflow placements. – Tenant policies and Service Level Objectives (SLOs). ● Address the technical, economic, and policy challenges of workflows.
  5. Thesis Goals: Network-Aware Service Composition and Workflow Placement, at scales ranging from intra-domain and multi-domain networks to the edge and the Internet.
  6. Q1: Execution Migration Across Development Stages. Can we seamlessly scale and migrate network applications through network softwarization across development and deployment stages? Scale: Data center (CoopIS’16, SDS’15, and IC2E’16).
  7. Q2: Economic & Performance Benefits. Can network softwarization offer economic and performance benefits to the end users? Scale: Data center → Inter-cloud (Networking’18 and IM’17).
  8. Q3: Service Chain Placement. Can we efficiently chain services from several edge and cloud providers to compose tenant workflows, by federating SDN deployments of the providers, using SOA? Scale: Multi-domain → Edge (ETT’18, ICWS’16, and SDS’16).
  9. Q4: Interoperability. Can we enhance the interoperability of diverse network applications, by leveraging network softwarization and SOA? Scale: Data center → Multi-domain and Edge (CLUSTER’18, DAPD’19, SDS’17, and CoopIS’15).
  10. Q5: Application to Big Data. Can we improve the performance, modularity, and reusability of big data applications, by leveraging network softwarization and SOA? Scale: Data center → the Internet (CCPE’19 and SDS’18).
  11. Thesis Contributions. Q1: Seamless development & deployment of cloud networks. Q2: Economic & performance benefits – Cloud-Assisted Networks as an alternative connectivity provider. Q3: Service chain placement – network service chain orchestration at the edge. Q4: Interoperability of multi-domain service workflows. Q5: Application to big data.
  12. I) Cloud-Assisted Networks as an Alternative Connectivity Provider. Kathiravelu, P., Chiesa, M., Marcos, P., Canini, M., Veiga, L. Moving Bits with a Fleet of Shared Virtual Routers. In IFIP Networking 2018, May 2018, pp. 370–378.
  13. Introduction ● Increasing demand for bandwidth. ● Decreasing bandwidth prices. ● Pricing disparity. E.g., IP transit price per Mbps in 2014 – USA: $0.94 – Kazakhstan: $15 – Uzbekistan: $347. ● What about latency?
  14. Motivation ● Dedicated connectivity* of the cloud providers. – Increasing geographical presence. – Well-provisioned network → low-latency network links. ● Cloud-Assisted Networks – Can a network overlay built over cloud instances be a better connectivity provider? ● High performance ● Cost-effectiveness * James Hamilton, VP, AWS (AWS re:Invent 2016).
  15. Our Proposal: NetUber • A Cloud-Assisted Network as a third-party virtual connectivity provider with no fixed infrastructure. – Better network paths compared to the Internet.
  16. NetUber Application Scenarios • Cheaper data transfers between two endpoints. • Higher throughput and lower latency. • Network services. • Alternative to Software-as-a-Service replication.
  17. NetUber Inter-Cloud Architecture • Deploy SaaS applications in one or a few regions. – Fast access from more regions with NetUber. (Figure: regions Ohio, London, and Belgium, spanning AWS and GCP.)
  18. Monetary Costs to Operate NetUber. A. Cost of cloud VMs (per second) – Spot instances: volatile, but up to 90% savings. B. Cost of bandwidth (per transferred data volume). C. Cost to connect to the cloud provider (per port-hour). A simple cost model combining the three components is sketched below.
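As a hedged illustration (the symbols below are assumptions for this transcript, not notation from the thesis), the total cost of operating a NetUber path adds up the three components A, B, and C above:

    \[
    C_{\text{total}} \;=\; \underbrace{c_{\mathrm{vm}}\,T}_{A} \;+\; \underbrace{c_{\mathrm{bw}}\,V}_{B} \;+\; \underbrace{c_{\mathrm{port}}\,H}_{C}
    \]

where c_vm is the per-second (spot) VM price and T the rental duration, c_bw the price per unit of transferred data and V the transferred volume, and c_port the per-port-hour connection price and H the port-hours used.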
  19. Evaluation • NetUber prototype with AWS r4.8xlarge spot instances. • Cheaper point-to-point connectivity. ● Better throughput and reduced latency & jitter. – Origin: RIPE Atlas probes and our distributed servers. – Destination: VMs of multiple AWS regions. ● Network services: compression.
  20. 1) Cheaper Point-to-Point Connectivity • Cost for 10 Gbps flat connectivity from the EU & USA. – Cheaper for data transfers < 50 TB/month.
  21. 2) Low Latency with Cloud Routes • NetUber data transfer A → Z via the path A → B → Z. – Cloud region B is closer to the origin server A. – B and Z are cloud VMs connected by the NetUber overlay.
  22. Ping Times – ISP vs. NetUber (via region, % improvement) • NetUber cuts Internet latencies by up to 30%. • Direct Connect would make NetUber even faster.
  23. 3) Throughput: ISP, NetUber, and Selectively Using NetUber ● Better throughput with NetUber via a nearby cloud region. – Selective use of the overlay when there is no proximate region.
  24. 4) Low Jitter with Cloud Overlay ● NetUber for latency-sensitive web applications.
  25. Key Findings • A connectivity provider that does not own the infrastructure. – Low-latency cloud-assisted overlay network. – Better data rates than ISPs. • Previous research does not consider the economic aspects. – A cheaper alternative (< 50 TB/month). • Similar industrial efforts: – Voxility, an alternative to transit providers. – Teridion, Internet fast lanes for SaaS providers.
  26. II) Network Service Chain Orchestration at the Edge. Kathiravelu, P., Van Roy, P., & Veiga, L. Composing Network Service Chains at the Edge: A Resilient and Adaptive Software-Defined Approach. In Transactions on Emerging Telecommunications Technologies (ETT), Aug. 2018, Wiley, pp. 1–22.
  27. Motivation ● Network services: on-premise vs. centralized cloud? Edge! ● Network Service Chaining (NSC). ● Finding the optimal service chain at the edge while abiding by the tenant SLOs.
  28. Our Proposal: Évora ● A graph-based algorithm to incrementally construct user workflows as service chains at the edge. ● SDN with Message-Oriented Middleware (MOM). – For multi-domain edge environments. – Place and migrate user service chains. ● Adhering to the user policies.
  29. Deployment Architecture ● Distributed execution: an orchestrator in each user device.
  30. Évora Orchestration. 1) Initialize an Orchestrator in each Device ● Construct a service graph in the user device. – As a snapshot of the service instances at the edge.
  31. 2) Identify Potential Workflow Placements ● Construct potential chains incrementally. – Subgraphs from the service graph to match the user chain. – Noting individual service properties. ● A complete match? – Save it as a potential service chain placement.
  32. 3) Service Chain Placement ● Calculate a penalty value for each potential placement. – Normalized values: cost, latency, and throughput. – α, β, γ ← user-specified weights. ● Place the NSC on the composition with the minimal penalty value. – Formulated as a Mixed Integer Linear Program (MILP). – Extensible with powers and more properties. A plausible form of the penalty is sketched below.
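As a hedged reconstruction from this slide (the exact formulation in the thesis may differ; the note on "powers" suggests the terms can also be raised to user-chosen exponents), the penalty of a candidate placement p over normalized cost Ĉ, latency L̂, and throughput T̂ could read:

    \[
    \mathrm{penalty}(p) \;=\; \alpha\,\hat{C}(p) \;+\; \beta\,\hat{L}(p) \;+\; \gamma\,\bigl(1-\hat{T}(p)\bigr),
    \qquad
    p^{*} \;=\; \arg\min_{p}\,\mathrm{penalty}(p)
    \]

Throughput enters as (1 − T̂) since higher throughput should lower the penalty, while higher cost and latency should raise it.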
  33. Evaluation ● Model a sample edge environment. – Service nodes and a user device. – User policies for the service workflow. ● Microbenchmark Évora workflow placement. – Effectiveness in satisfying user policies. – Efficacy in closeness to optimal results: a lower penalty value implies a higher quality of experience.
  34. User Policies with Two Properties ● Equal weights to two properties among C, L, and T. ● Darker circles – compositions with minimal penalty. – The ones that Évora chooses (circled). (Panels: T↑ and C↓; T↑ and L↓; C↓ and L↓.)
  35. Policies with Three Properties: one property given more prominence (weight = 10) than the other two (weight = 3). The radius of the circles indicates the monthly cost.
  36. Two properties given more prominence (weight = 10) than the third (weight = 3). ● Évora effectively satisfies the user policies – multiple properties with different weights.
  37. Key Findings ● Brings control back to the users for edge workflows. ● Previous research focuses on a single NSC provider. ● Évora offers efficient workflow placement. – Abiding by the user policies. – Multi-domain edge with multiple providers. – Extending SDN with MOM to wide-area networks. ● Network-aware execution from user devices. – Decentralized and distributed.
  38. Conclusion ● Seamless migration across development and deployment stages. ● A case for Cloud-Assisted Networks as a connectivity provider. ● Composing & placing workflows in multi-domain networks. ● Increased interoperability with network softwarization & SOA. ● Applicability of our contributions in the context of big data. Future Work ● NetUber as an enterprise connectivity provider. ● Adaptive network service chains on hybrid networks. Thank you! Questions?
  39. Additional Slides
  40. (0) *Overview*
  41. Publications
  42. Multitenancy and the Tenant Users of a Cloud Environment
  43. Contributions and Relationships
  44. Why SOA for our SDS? ● Beyond data center scale. – Thanks to the standardization of services. ● SOA and RESTful reference architectures. – Multiple implementation approaches such as Message-Oriented Middleware (MOM). ● Publish/subscribe to a message broker over the Internet. ● Service endpoints to hand messages over to the broker. ● Flexibility, modularity, loose coupling, and adaptability. A minimal publish/subscribe sketch follows below.
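As context for the MOM option, here is a minimal, hedged publish/subscribe sketch using JMS with ActiveMQ (the broker URL, topic name, and message content are assumptions for illustration; the thesis prototypes instead build on OpenDaylight's message-based northbound, described on a later slide):

    import javax.jms.*;
    import org.apache.activemq.ActiveMQConnectionFactory;

    public class BrokerPubSubSketch {
        public static void main(String[] args) throws Exception {
            // Hypothetical broker reachable over the Internet.
            ConnectionFactory factory =
                    new ActiveMQConnectionFactory("tcp://broker.example.org:61616");
            Connection connection = factory.createConnection();
            connection.start();
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Topic topic = session.createTopic("sds.control.events"); // assumed topic name

            // A subscribed endpoint (e.g., another controller domain) receives messages.
            MessageConsumer consumer = session.createConsumer(topic);
            consumer.setMessageListener(msg -> System.out.println("received: " + msg));

            // A service endpoint hands a message over to the broker.
            MessageProducer producer = session.createProducer(topic);
            producer.send(session.createTextMessage("flow-update: tenant-42"));

            Thread.sleep(1000); // allow asynchronous delivery before closing
            connection.close();
        }
    }

Loose coupling comes from the topic: producers and consumers only share the broker and the topic name, never direct endpoints.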
  45. OpenDaylight ● Incremental development of OSGi bundles. – Checkpointing and versioning of the modules. ● State of executions and transactions. – Stored in the controller's distributed data tree.
  46. What does MOM have to do with the controller? ● Expose the internals of the controller (e.g., OpenDaylight). – Through a message-based northbound API. ● E.g., AMQP (Advanced Message Queuing Protocol). – Publish/subscribe with a broker (e.g., ActiveMQ). ● What can be exposed: – The data tree (internal data structures of the controller). – Remote procedure calls (RPCs). – Notifications. ● Thanks to the Model-Driven Service Abstraction Layer (MD-SAL) of OpenDaylight. – Compatible internal representation of the data plane. – The Messaging4Transport project.
  47. State-Aware Adaptive Scaling ● Adaptive scaling through shared state. – Horizontal scalability through In-Memory Data Grids (IMDGs). – State of the executions for scaling decisions. ● Pause-and-resume executions. – Parallel multi-tenant executions.
  48. (1) *SENDIM* ● Simulation, Emulation, aNd Deployment Integration Middleware. ● CoopIS’16, SDS’15, and IC2E’16.
  49. Introduction ● Networks are simulated or emulated at early stages. ● Programmable networks → continuous development. – Native integration of emulators into SDN. – Network simulators supporting SDN and emulation. – Cloud simulators extended for clouds with SDN. ● Lack of “Software-Defined” network simulators. – Policies/algorithms locked in simulator-imperative code. ● Demand for easy migration and programmability.
  50. Motivation ● An integrated network simulation and emulation environment. ● Extend SDN controllers for cloud network simulations. – Bring the benefits of SDN to its own simulations! ● Reusability, scalability, easy migration, . . . – Run control plane code in the controller itself (portability). – Simulate the data plane (scalability, efficiency). ● By programmatically invoking the southbound API.
  51. Integrated Modeling and Development
  52. Our Proposal: SENDIM ● Separation of the application logic from the execution environment.
  53. Solution Architecture
  54. SENDIM Execution
  55. “Software-Defined Simulations” ● Application logic expressed in “descriptors”. – Deployed into the SDN controller with a Java API. ● The system is simulated in the simulation sandbox. An illustrative descriptor sketch follows below.
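A purely illustrative sketch of what such a descriptor could look like in Java (every interface below is an assumption; the actual SENDIM API is not shown on the slides). The point is that the same descriptor logic can run against a simulated or an emulated data plane, because it only talks to an abstraction:

    // Hypothetical sketch; names do not come from SENDIM itself.
    public final class SendimDescriptorSketch {
        /** Minimal data-plane abstraction a simulation sandbox could expose. */
        interface Topology {
            void addSwitch(String id);
            void addLink(String a, String b, int bandwidthMbps);
        }

        /** Application logic packaged as a deployable descriptor. */
        interface TopologyDescriptor {
            void build(Topology t);
        }

        public static void main(String[] args) {
            TopologyDescriptor pair = t -> {   // a trivial two-switch network
                t.addSwitch("s1");
                t.addSwitch("s2");
                t.addLink("s1", "s2", 10_000);
            };
            pair.build(new Topology() {        // stub sandbox: just log the calls
                public void addSwitch(String id) { System.out.println("switch " + id); }
                public void addLink(String a, String b, int bw) {
                    System.out.println("link " + a + "-" + b + " @ " + bw + " Mbps");
                }
            });
        }
    }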
  58. Prototype Implementation ● Oracle Java 1.8.0 – development language. ● Apache Maven 3.1.1 – build the bundles and execute the scripts. ● Infinispan 7.2.0.Final – distributed cluster. ● Apache Karaf 3.0.3 – OSGi runtime. ● OpenDaylight Beryllium – default controller. ● Multiple deployment options: – As a stand-alone simulator. – Distributed execution with an SDN controller. – As a bundle in an OSGi-based SDN controller.
  59. Evaluation ● A cluster of up to 6 identical computers. – Intel Core i7-4700MQ CPU @ 2.40 GHz, 8 logical CPUs. – 8 GB memory, Ubuntu 14.04 LTS, 64-bit. ● Simulating routing algorithms in a fat-tree topology. – Up to 100,000 nodes and varying node degrees. ● Simulation performance: benchmark against CloudSimSDN/Cloud2Sim. ● Evaluate the migration performance. – Emulation (Mininet) → Simulation (SENDIM). – Simulation (SENDIM) → Emulation (Mininet).
  60. Automated Code Migration: Simulation → Emulation ● Time taken to programmatically convert a SENDIM simulation script into a Mininet script.
  61. Modeling Performance ● Network construction efficiency and adaptiveness. – Simulate when resources are scarce for emulation.
  62. Simulation Performance and Scalability ● Higher performance for larger simulations. ● Smart scale-out → higher horizontal scalability.
  63. Performance with Incremental Updates ● Smaller simulations: up to 1,000 nodes. ● SENDIM: controller and middleware execution completion time.
  64. Performance with Incremental Updates ● The initial execution takes longer, due to initializations.
  65. Performance with Incremental Updates ● Faster once SENDIM and the controller are initialized.
  66. Test-Driven Development ● Faster executions once the system is initialized.
  67. Subsequent Incremental Updates ● Even faster executions for subsequent simulations.
  68. Deploy Changesets to the Controller ● No change in the simulated environment.
  69. Revert Changesets from the Controller ● No change in the simulated environment.
  70. Scale/Migrate the Simulated Environment ● No change in the controller.
  72. Key Findings ● SENDIM: separation of the execution from the infrastructure. – Easy migration between simulations and emulations. – Enabling incremental modeling of cloud networks. ● Performance and scalability. – Reuse the same controller code to simulate larger deployments. – Adaptive parallel and distributed simulations. Future Work ● Extension points for easy migrations. – More emulator and controller integrations.
  73. (2) *NetUber* (Complementary Slides) ● Networking’18
  74. Cost of Cloud Spot VMs ● Pairs of 10 Gbps R4 instances offered only a maximum of 1.2 Gbps for inter-region data transfers. – The full 10 Gbps is available only inside a placement group.
  75. Price disparity is real! Cost of Bandwidth: regions 1–9 (US, Canada, and EU) are much cheaper than the others.
  76. Potential for Network Services ● NetUber uses memory-optimized R4 spot instances. – Each with 244 GB memory, 32 vCPUs, and a 10 GbE interface. ● Deploy network services at the instances. – Value-added services for the customer: encryption, WAN optimizer, load balancer, etc. – Services for cost-efficiency: compression.
  77. (3) *SMART* ● SDN Middlebox Architecture for Reliable Transfers. ● EI2N’16 and IM’17.
  78. Introduction ● Differentiated QoS in multi-tenant cloud networks. – Different priorities among tenant processes. – Application-level user preferences and system policies. – Performance guarantees at the network level. ● The network is shared among the tenants. – SLA guarantees despite congestion for critical flows.
  79. Motivation ● Cross-layer optimization of clouds with SDN. – Centralized network-as-a-service control plane.
  80. Our Proposal: SMART ● Cross-layer architecture for differentiated QoS of flows. ● FlowTags – a software middlebox to tag the network flows with contextual information. – Application-level preferences to the control plane as tags. – Dynamic flow routing modifications based on the tags. ● Timely delivery of priority flows by dynamically diverting them or cloning them to a less congested path. – Selective redundancy. – An adaptive approach in cloning and diverting.
  81. SMART Approach ● Divert or clone subflows by setting breakpoints in the priority flows, to avert congestion. – A trade-off of redundancy to ensure the SLA. – Adaptiveness with contextual information. A hedged sketch of the decision logic follows below.
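A hedged Java sketch of the clone/divert decision at a breakpoint (method names, thresholds, and the congestion metric are assumptions for illustration; the thesis formulation may differ):

    // Illustrative only: not the actual SMART Enhancer implementation.
    public final class SmartEnhancerSketch {
        enum Action { NONE, DIVERT, CLONE }

        /**
         * Decide how to treat a priority subflow at a breakpoint, given the
         * congestion on the current path (0..1) and whether an alternative
         * path exists.
         */
        static Action decide(boolean priority, double congestion, boolean altPathAvailable,
                             double divertThreshold, double cloneThreshold) {
            if (!priority || !altPathAvailable) return Action.NONE;
            // Heavy congestion: clone the subflow on both paths (redundancy for the SLA).
            if (congestion >= cloneThreshold) return Action.CLONE;
            // Moderate congestion: divert the subflow to the less congested path.
            if (congestion >= divertThreshold) return Action.DIVERT;
            return Action.NONE;
        }

        public static void main(String[] args) {
            System.out.println(decide(true, 0.9, true, 0.6, 0.85)); // CLONE
            System.out.println(decide(true, 0.7, true, 0.6, 0.85)); // DIVERT
        }
    }

Cloning trades bandwidth redundancy for delivery guarantees; diverting costs nothing extra but helps only while the alternative path stays uncongested.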
  84. SMART Deployment
  85. SMART Workflow
  86. I: Tag Generation for Priority Flows ● Tag generation query and response. – Between hosts and the FlowTags controller. ● A centralized controller for FlowTags. ● Tag the flows at the origin. ● FlowTagger software middlebox. – A generator of the tags. – Invoked by the host application layer. – Similar to the FlowTags-capable middleboxes for NATs.
  87. II: Regular Routing until Policy Violation
  88. III: When a Threshold is Met ● The controller is triggered through the OpenFlow API. ● A series of control flows inside the control plane. ● Modify flow entries in the relevant switches.
  89. SMART Control Flows: Rules Manager ● A software middlebox in the control plane. ● Consumes the tags from the packet. – Similar to FlowTags-capable firewalls.
  90. Rules Manager Tags Consumption ● Interprets the tags – as input to the SMART Enhancer.
  91. SMART Enhancer ● Gets the input to the enhancement algorithms. ● Decides the flow modifications. – Breakpoint node and packet. – Clone/divert decisions.
  92. Prototype Implementation ● Developed in Oracle Java 1.8.0. ● OpenDaylight Beryllium as the core SDN controller. ● Enhancer & Rules Manager middleboxes: controller extensions. – Deployed in the OpenDaylight Karaf runtime as OSGi bundles. ● The FlowTags middlebox controller is deployed with the SDN controller. – FlowTags was originally a POX extension. ● Network nodes and flows emulated with Mininet. – Larger-scale cloud deployments simulated.
  93. Evaluation Strategy ● Data center network with 1,024 nodes and a leaf-spine topology. – Path lengths of more than two hops. – Up to 100,000 short flows. ● Flow completion time < 1 s. ● A few non-priority elephant flows. – SLA → the maximum permitted flow completion time for priority flows. – Uniformly randomized congestion: ● hitting a few uplinks of nodes concurrently; ● an overwhelming number of flows through the same nodes and links. ● Benchmark: SMART enhancements over base routing algorithms. – Performance (SLA-awareness), redundancy, and overhead.
  94. SMART Adaptive Clone/Replicate ● Replicate subsequent flows once a previous flow has been cloned. – Shortest path and Equal-Cost Multi-Path (ECMP).
  95. Related Work ● Multipath TCP (MPTCP) uses the available multiple paths between the nodes concurrently to route the flows. – Performance, bandwidth utilization, & congestion control – through distributed load balancing. ● ProgNET: WS-Agreement and SDN for SLA-aware clouds. ● pFabric for deadline-constrained data flows with minimal completion time. ● QJump, a Linux traffic control module for latency-sensitive applications.
  96. Key Findings ● SMART leverages redundancy in the flows. – Improves the SLA of the priority flows. ● Cross-layer optimizations through tagging the flows. – For differentiated QoS. Future Work ● Implementation of SMART on a real data center network. ● Evaluate against the related work quantitatively.
  97. (4) *Mayan* ● Software-Defined Service Compositions. ● ICWS’16 and SDS’16.
  98. Introduction ● eScience workflows. – Computation-intensive. – Execute on highly distributed networks. ● Complex service composition workflows. – To automate scientific and enterprise business processes.
  99. Motivation ● Better orchestration of service workflow compositions in wide-area networks. ● Software-Defined Service Composition.
  100. Our Proposal: Mayan ● An SDN-based approach for adaptively composing multi-domain service workflows. – Efficient service instance selection. – Loose coupling of service definitions and implementations. – Availability of a logically centralized control plane. ● State of executions and transactions stored in the controller's distributed data tree. – Clustered and federated deployments with MOM.
  101. Alternative Representations
  102. Mayan Services Registry: Modeling Language
  103. Service Composition Representation ● <Service3, (<Service1, Input1>, <Service2, Input2>)>. A Java sketch of this nested representation follows below.
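A minimal, hedged sketch of the nested representation in modern Java (the types are assumptions for illustration; Mayan's actual modeling language is referenced on the previous slide). A composition is a service applied to inputs, each of which is either a literal input or another composition:

    import java.util.List;

    public final class CompositionSketch {
        interface Node {}
        record Literal(String value) implements Node {}
        record Invocation(String service, List<Node> inputs) implements Node {}

        public static void main(String[] args) {
            // <Service3, (<Service1, Input1>, <Service2, Input2>)>
            Node composition = new Invocation("Service3", List.of(
                    new Invocation("Service1", List.of(new Literal("Input1"))),
                    new Invocation("Service2", List.of(new Literal("Input2")))));
            System.out.println(composition);
        }
    }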
  104. Service Instances: Alternative Implementations and Deployments
  105. Solution Architecture ● Mayan Controller Farm: Inter-Domain Compositions
  109. Evaluation ● Evaluation environment: – Smaller physical deployments in a cluster. – Larger deployments as simulations and emulations (Mininet). ● Evaluation strategy: – A workflow performing distributed data cleaning and consolidation. ● A distributed web service composition vs. the Mayan approach with the extended SDN architecture.
  110. Speedup and Horizontal Scalability ● No performance degradation for larger deployments.
  111. Controller Throughput ● The number of messages entirely processed by the controller. – Publisher → Controller → Receiver. ● 5,000 messages/s at a concurrency of 10 million messages.
  112. Processing Time ● Total time to process the complete set of messages. – Against a varying number of messages. ● Linear scaling with the number of parallel messages. – 10 million messages in 40 minutes.
  113. Success Rate ● Success rate of the controller vs. the number of messages processed in parallel. – 100% for up to 10,000 parallel messages. – 99.5% for up to 10 million parallel messages.
  114. Scalability of the Mayan Controller ● Presented results are for a single stand-alone controller. ● Mayan is designed as a federated deployment. – Scales horizontally to: ● manage a wider area with a more substantial number of service nodes and improved latency; ● handle more concurrent messages in each controller domain.
  115. Key Findings ● An SDN-based approach that enables efficient and flexible large-scale service composition workflows. – Multi-tenant and multi-domain executions. – Service composition with web services and distributed execution frameworks. ● Related work on SDN for distributed frameworks and service workflows. – Palantir: SDN for MapReduce performance with network proximity data.
  116. (5) *Évora* (Complementary Slides) ● ETT’18
  117. A User-Defined NSC Among the Edge Nodes
  118. Problem Scale: Representation of the Service Graph from the Node Graph ● The number of links in this service graph grows: – linearly with the number of edges or links between the edge nodes; – exponentially with the average number of services per edge node.
  119. MILP and Graph Matching can be Computationally Intensive ● But initialization happens once per user chain with a given policy. – This procedure does not repeat once initialized, – unless updates are received from the edge network: ● a new node with a service offering appears at the edge; ● an existing node or a service offering fails to respond. ● Each NSC typically has 5–10 services. – The Évora algorithm therefore follows a greedy approach, rather than typical graph matching; a hedged sketch follows below.
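A hedged Java sketch of a greedy service-chain placement (the data model and per-instance penalty are illustrative assumptions; Évora's actual algorithm, with its weighted penalty over cost, latency, and throughput, is in the thesis):

    import java.util.*;

    public final class GreedyChainSketch {
        record Instance(String service, String node, double penalty) {}

        /**
         * For each service in the requested chain, greedily pick the offering
         * with the lowest penalty, instead of enumerating all subgraph matches.
         */
        static List<Instance> place(List<String> chain, Map<String, List<Instance>> offerings) {
            List<Instance> placement = new ArrayList<>();
            for (String service : chain) {
                offerings.getOrDefault(service, List.of()).stream()
                         .min(Comparator.comparingDouble(Instance::penalty))
                         .ifPresent(placement::add);
            }
            return placement;
        }

        public static void main(String[] args) {
            Map<String, List<Instance>> offerings = Map.of(
                "firewall", List.of(new Instance("firewall", "edge-1", 0.4),
                                    new Instance("firewall", "edge-2", 0.2)),
                "cache",    List.of(new Instance("cache", "edge-2", 0.3)));
            System.out.println(place(List.of("firewall", "cache"), offerings));
        }
    }

For chains of 5–10 services this runs in time linear in the number of offerings, which is why greediness is attractive despite being only an approximation of the MILP optimum.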
  120. Performance and Scalability of the Évora Orchestrator Algorithms
  124. (6) *SD-CPS* ● Software-Defined Cyber-Physical Systems. ● CLUSTER’18, SDS’17, and M4IoT’15.
  125. Cyber-Physical System (CPS) ● A system composed of cyber and physical elements. ● Challenges in CPS: – Modeling. – Large-scale heterogeneous execution environments. – Decision making: communication and coordination. – Management and orchestration of the intelligent agents.
  126. Motivation ● An SDS to address the challenges of CPS. Desired properties in a new CPS framework: ● Easy to adopt from current CPS approaches. ● Should not introduce more/new challenges.
  127. Our Proposal: Software-Defined Cyber-Physical Systems (SD-CPS) ● An SDS framework for CPS workflows at the edge. – CPS workload as edge service workflows. ● A dual (physical and virtual/cyber) execution environment for CPS executions. – Efficient CPS modeling and simulations. – Mitigate the unpredictability of the physical execution environment. ● Resilience for critical flows with differentiated QoS. – End-to-end delivery guarantees.
  128. SD-CPS Controller Architecture
  129. Controller Farm and Software-Defined Sensor Networks
  130. Modeling and Simulating CPS ● Cyberspace to model the smart devices as virtual intelligent agents. ● Mapped interactions between the actors in the physical & cyber spaces. ● Incrementally model and load from the controller farm.
  131. Evaluation Environment ● Edge nodes and service resource requirements, with properties normalized. ● Resource requirement: – A negative value: even the smallest node satisfies it. – A high positive value: a higher demand for the resource.
  132. Service Deployment Over the Nodes ● How each service is deployed across nodes. ● How each node hosts several services.
  133. Parallel Execution of 1 Million Workflows ● Minimal idling nodes. ● High resource utilization.
  134. Related Work ● SDN for heterogeneous devices. – Sensor OpenFlow: software-defined wireless sensor networks. ● Scaling SDN: clustering the SDN controller with Akka. ● OpenDaylight Federation ● Conceptual Data Tree projects. ● SDS for smart environments. ● Albatross: taming challenges of distributed systems.
  135. Key Findings ● Increased resource efficiency using edge workflows. ● An approach to mitigate the design and operations challenges in CPS. ● Benefits of SDN to CPS: – Unified and centralized control. – Improved QoS, management, and resilience. – Reduced repeated effort in modeling.
  136. (7) *Óbidos* ● On-demand Big Data Integration, Distribution, and Orchestration System. ● DAPD’19, CoopIS’15, and DMAH’17.
  137. Introduction ● The volume, variety, and distribution of big data are rising. – Structured, semi-structured, unstructured, or ill-formed. ● Integration of data is crucial for data science. – Multiple types of data: imaging, clinical, and genomic. – Numerous data sources: no shared messaging protocol. – Do we really need to integrate all the data? ● Sharing of integrated data and results for reproducibility.
  138. Human-in-the-Loop On-Demand Data Integration ● Service-based data access through APIs. – Thanks to specifications such as HL7 FHIR. ● The researchers possess domain knowledge. ● Integrate on demand. – Avoid eager loading of binary data or its textual metadata. – Use the researcher query as an input in loading data. ● Scalable storage in-house. – Load, integrate, index, and query unstructured data.
  139. Data Sharing Intra-Organization ● Load data only once per organization. – Bandwidth and storage efficiency.
  140. Data Sharing Inter-Organization ● Do not duplicate data! – We “own” our interest, not the data. ● Point to the data in the data sources. – Pointers to data, like Dropbox shared links. ● Avoids outdated duplicate data. ● Easy to maintain. ● APIs – Access the list of research data sets.
  141. Problems ● How to: – Load data from several big data sources. ● Avoid repeated loading and near-duplicate data. – Integrate disparate data and persist it for future accesses. – Share pointers to data internally and externally.
  142. Our Proposal: Óbidos – On-demand Big Data Integration, Distribution & Orchestration System ● Define subsets of data that are of interest, – using the hierarchical structure of medical data. ● Medical images (DICOM), clinical data, etc. ● User query → narrow down the search space.
  143. Óbidos Approach ● A hybrid of virtual and materialized data integration approaches. – Lazy loading of metadata: load the matching subset of metadata. – Store integrated data and query results → scalable storage. ● Track already loaded data. – Near-duplicate detection. – Download only updates (changesets). ● Efficient SQL queries on NoSQL storage. ● Share pointers to the datasets rather than the datasets themselves. ● Generic design; implementation for medical research data. A hedged sketch of the on-demand loading follows below.
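A hedged Java sketch of Óbidos-style on-demand loading with changeset tracking (the interfaces and names are illustrative assumptions, not the actual Óbidos API):

    import java.util.*;

    public final class OnDemandLoaderSketch {
        /** Tracks what has already been loaded: source entry ID → version. */
        private final Map<String, String> loaded = new HashMap<>();

        interface Source {
            Map<String, String> matchingEntries(String query); // ID → version/ETag
            byte[] fetch(String id);
        }

        /** Load only entries that match the researcher's query and are new or updated. */
        List<byte[]> loadOnDemand(Source source, String query) {
            List<byte[]> changesets = new ArrayList<>();
            for (Map.Entry<String, String> e : source.matchingEntries(query).entrySet()) {
                if (!e.getValue().equals(loaded.get(e.getKey()))) { // new or updated
                    changesets.add(source.fetch(e.getKey()));
                    loaded.put(e.getKey(), e.getValue());
                }
            }
            return changesets;
        }

        public static void main(String[] args) {
            Source demo = new Source() {   // in-memory stand-in for a remote source
                public Map<String, String> matchingEntries(String q) {
                    return Map.of("series-7", "v2");
                }
                public byte[] fetch(String id) { return id.getBytes(); }
            };
            OnDemandLoaderSketch loader = new OnDemandLoaderSketch();
            System.out.println(loader.loadOnDemand(demo, "lung CT").size()); // 1
            System.out.println(loader.loadOnDemand(demo, "lung CT").size()); // 0: already loaded
        }
    }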
  144. Óbidos Architecture
  146. Data Sharing with Óbidos
  148. Data Structures of the Replicaset Holder
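The slide itself is a figure; as a hedged sketch consistent with slides 140 and 155 (a replicaset is a named set of pointers into the remote sources, not copies of the data), the data structure could look like this in Java. All names and URLs below are hypothetical:

    import java.util.List;

    public final class ReplicasetSketch {
        /** A pointer identifies an entry at its source; the data itself stays remote. */
        record Pointer(String sourceUrl, String entryId) {}
        record Replicaset(String name, String owner, List<Pointer> entries) {}

        public static void main(String[] args) {
            Replicaset rs = new Replicaset("study-42", "lab-a", List.of(
                new Pointer("https://imaging.example.org", "DICOM/1.2.840/series-7"),
                new Pointer("https://clinical.example.org", "patient-314/labs")));
            System.out.println(rs);
        }
    }

Because only pointers are shared, a replicaset never goes stale and costs no duplicate storage, at the price of one extra indirection on access.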
  149. Evaluation ● Evaluation data: – Clinical data and TCIA DICOM imaging collections. ● Benchmark Óbidos against eager and lazy ETL. – Performance of loading and querying data. ● Óbidos (inter- and intra-organization) against binary data sharing. – Space/bandwidth efficiency of data sharing.
  150. Workload Characterization: Various Entries in the Evaluated Collections
  151. Data Load Time ● Change in total data volume (same query and same interest). ● Load time for eager & lazy ETL grows with the total volume. ● Load time for Óbidos remains constant.
  152. Data Load Time ● Change in studies of interest (same query and constant total data volume). ● Load time for eager and lazy ETL remains constant. ● Load time increases for Óbidos with the interest. – Converges to the load time of lazy ETL.
  153. Load Time from the Remote Data Sources ● Eager and lazy ETL take much longer. – To load more data and metadata over the Internet.
  154. Query Completion Time for the Integrated Data Repository ● The corresponding data is already loaded in Óbidos. ● The indexed, scalable NoSQL architecture of Óbidos → better performance.
  155. Efficiency in Sharing Medical Research Data ● Replicaset – pointers of marginal size, yet the size increases with the number of entries of the same granularity.
  156. Key Findings ● Óbidos offers on-demand, service-based big data integration. – Fast and resource-efficient data analysis. – SQL queries over a NoSQL data store for the integrated data. – Efficient data sharing without replicating the actual data. Future Work – Consume data from repositories beyond the medical domain (e.g., EUDAT). – Óbidos distributed virtual data warehouses. ● Leverage proximity in data integration and sharing.
  157. (8) *Mayan-DS* ● Software-Defined Data Services (SDDS). ● CCPE’19 and SDS’18 (Best Paper Award). ● Work in progress.
  158. Introduction ● Data services: service APIs to big data → interoperability. ● Related data and services distributed far from each other → bad performance at scale. ● Chaining of data services. – Composing chains of numerous data services. – Data access → data cleaning → data integration. ● How to scale out efficiently? – How to minimize communication overheads?
  159. Motivation ● Software-Defined Networking (SDN). – A unified controller for the data plane devices. – Brings network awareness to the applications. ● Data services. – Make big data executions interoperable. ● Can we bring SDN to data services? – Software-Defined Data Services (SDDS).
  160. Our Proposal: Software-Defined Data Services (SDDS) ● SDDS as a generic approach for data services. – Extending and leveraging SDN. ● Mayan-DS, an SDDS framework. – Efficient management of data services. – Interoperability and scalability.
  161. Solution Architecture
  162. SDDS Approach ● Define all the data operations as interoperable services. ● SDN for distributing data and service executions. – Inside a data center (e.g., software-defined data centers). – Beyond data centers (extend SDN with MOM). ● Optimal placement of data and service execution. – Minimize communication overhead and data movements. ● Execute a data service on the best-fit server, until interrupted.
  163. Efficient Data and Execution Placement. Notation: {i, j} – related data objects; D – datasets of interest; n – execution node; ξ – spread of the related data objects.
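The placement equation itself did not survive extraction from this slide; as a hedged reconstruction consistent with the legend above (the thesis formulation may well differ), the controller could pick the execution node n that minimizes the spread ξ of related data objects:

    \[
    n^{*} \;=\; \arg\min_{n}\ \xi_{n},
    \qquad
    \xi_{n} \;=\; \sum_{\{i,j\}\,\subseteq\,D} \mathrm{dist}_{n}(i, j)
    \]

where dist_n(i, j) is an assumed measure of how far apart the related objects i and j end up (e.g., in network hops) when the execution is placed on node n.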
  164. Prototype Implementation
  165. Simulated Environment (with Modeled Latency in ms)
  166. Ping Times (ms) Between Two Nodes: Regular Internet vs. Mayan-DS
  167. Latency: Ping Times of Mayan-DS ● Up to a 33% reduction in latency, – with a fraction of the path through a direct link. ● A 75% or greater reduction with a significant portion over a direct link.
  168. Key Findings ● Software-Defined Data Services (SDDS) for interoperability and scalability in big data executions. ● Mayan-DS leverages SDN for big data workflows at Internet scale. ● Limited focus of industrial offerings. – Storage, or one or a few specific services. Future Work ● Extend Mayan-DS for edge and IoT/CPS environments.
