Software-Defined Systems for
Network-Aware Service Composition and
Workflow Placement
Pradeeban Kathiravelu
Supervisors: Prof. Luís Veiga
Prof. Peter Van Roy
Lisboa, Portugal.
18/06/2018.
2/33
Introduction
● Network Softwarization: Making the networks “programmable”.
– Software-Defined Networking (SDN)
● Unifying the control plane away from network data plane devices.
● Global view and control of the data center network via a single controller.
– Network-Functions Virtualization (NFV)
● Virtualizing network middleboxes into network functions.
● Firewall, intrusion detection, Network Address Translation (NAT), ..
● Software-Defined Systems (SDS).
– Frameworks extending, or inspired by, SDN.
– Storage, Security, Data center, ..
● Improved configurability: Separation of mechanisms from
policies.
3/33
Motivation
● Software-Defined Systems to compose and place
service workflows beyond data center-scale.
– Bring the control back to the service user.
4/33
Research Questions
● Can we uniformly separate the infrastructure from the
network, at various stages of development, from
data centers to the cloud?
● Can such network softwarization offer economic and
performance benefits to the end users?
● Can we orchestrate the services for workflow
compositions efficiently, by extending SDN to the
cloud and edge environments?
● Can we improve the performance of big data
applications by scaling the execution environment
in a network-aware manner?
5/33
Current Contributions
● Network softwarization as an encompassing approach,
from design to cloud deployments (CoopIS’16, SDS’15, and
IC2E’16).
● Differentiating network flows based on user policies
with SDN and middleboxes (EI2N’16 and IM’17).
● Network-aware big data executions (SDS’18 and
CoopIS’15).
● Extending network softwarization to wide area for
service composition and workflow placement.
– Cloud-Assisted Networks as an alternative connectivity
provider (Networking’18).
– Composing service chains at the edge (ETT’18, ICWS’16,
and SDS’16).
6/33
I) Cloud-Assisted Networks
as an Alternative Connectivity Provider
Kathiravelu, P., Chiesa, M., Marcos, P., Canini, M., Veiga, L.
Moving Bits with a Fleet of Shared Virtual Routers.
In IFIP Networking 2018. (CORE Rank A). May 2018. pp. 370 – 378.
7/33
Introduction
● Increasing demand for bandwidth.
● Decreasing bandwidth prices.
● Pricing Disparity. E.g. IP Transit Price, 2014 (per Mbps)
– USA: 0.94 $
– Kazakhstan: 15 $
– Uzbekistan: 347 $
● What about latency?
– Online gaming.
– High-frequency trading.
– Remote surgery.
8/33
Motivation
● Cloud providers often have dedicated connectivity*.
– Increasing number of regions and points of presence.
– Well provisioned and maintained network.
● Can a network overlay over cloud instances be used
as an alternative connectivity provider?
– High-performance.
– Cost-effectiveness.
– Optional network services.
* James Hamilton, VP, AWS (AWS re:invent 2016).
9/33
Cloud-Assisted Networks
● Virtual/overlay networks over cloud environments
10/33
Our Proposal: NetUber
● Better control over the path, compared to the
Internet paths.
● A third-party virtual connectivity provider with no
fixed infrastructure.
– A cloud-assisted overlay network, leveraging multi-cloud
infrastructures.
11/33
Better Alternative to SaaS Replication
● Deploy Software-as-a-Service (SaaS) applications in
just one or a few regions.
– Use NetUber to access them from other regions.
● Access to more regions via multiple cloud providers.
– Ohio (AWS, but not GCP); London (both AWS
and GCP); Belgium (GCP, but not AWS).
12/33
Monetary Costs to Operate NetUber
A. Cost of Cloud Instances.
– Charged per second.
– Very high. [Spot instances: volatile, but up to 90% savings.]
B. Cost of Bandwidth.
– Charged per data transferred.
– Also very high. [No cheaper alternative.]
C. Cost to connect to the cloud provider.
– Often managed by the cloud provider. E.g.: AWS Direct Connect.
– Typically, the end user pays directly to the cloud provider.
13/33
Evaluation
● Cheaper point-to-point connectivity.
– AWS as the overlay cloud provider.
– Compared against a transit provider and another connectivity
provider with a large global backbone network.
● Better throughput or Reduced Latency.
– Compared to ISPs.
– Traffic sent from: RIPE Atlas Probes and distributed servers.
– Destination: AWS distributed servers from the AWS regions.
– ISPs vs. ISP to the nearest AWS region and then NetUber
overlay.
● Network Services: Compression, Encryption, ..
14/33
1) Cheaper Point-to-Point Connectivity
● Expense for 10 Gbps flat connectivity
– Measured for transfers from EU and USA.
– Cheaper for data transfers <50 TB/month.
15/33
2) Improve Latency with Cloud Routes
● Instead of sending traffic A → Z, can we send A → B → Z?
○ A is closer to B. B and Z are servers in cloud regions.
○ B and Z are connected by NetUber overlay.
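The detour decision can be sketched as picking the entry region B that minimizes the total A → B → Z latency. The region names and latency values below are illustrative, not measurements from the paper:

```python
def best_path(direct_ms, to_region_ms, overlay_ms):
    """Pick the direct Internet path or a cloud detour A -> B -> Z.

    direct_ms: ISP latency A -> Z.
    to_region_ms: {region B: ISP latency A -> B}.
    overlay_ms: {region B: overlay latency B -> Z}.
    """
    # End-to-end latency via each candidate entry region B.
    via = {b: to_region_ms[b] + overlay_ms[b] for b in to_region_ms}
    region, ms = min(via.items(), key=lambda kv: kv[1])
    if ms < direct_ms:
        return ("via " + region, ms)
    return ("direct", direct_ms)

# Illustrative numbers: the direct ISP path takes 200 ms, while
# entering the overlay at the nearest region cuts the latency.
choice, ms = best_path(
    direct_ms=200,
    to_region_ms={"eu-west-1": 20, "us-east-1": 90},
    overlay_ms={"eu-west-1": 120, "us-east-1": 140},
)
```

The traffic enters the overlay at the region closest to the sender, mirroring the "ISP to the nearest AWS region, then NetUber overlay" setup of the evaluation.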
16/33
Ping times: ISP vs. NetUber (via region,
% Improvement)
● NetUber cuts Internet latencies by up to 30%.
● The use of AWS Direct Connect would make this even faster.
17/33
Key Findings
• Previous research focuses on the technical side.
– Not on the economic aspects: more expensive.
● Industrial efforts on leveraging cloud or data center
infrastructure to offer connectivity.
– Teridion - Internet fast lanes for SaaS providers.
– Voxility - As an alternative to transit providers.
● NetUber - A cheaper alternative (< 50 TB/month).
– A connectivity provider that does not own the infrastructure
– “Internet Fast-routes” through cloud-assisted networks.
– Better than ISPs (< 100 Mbps, often with a cap) for end users.
18/33
II) Composing Network Service Chains at the Edge:
A Resilient and Adaptive Software-Defined Approach
Kathiravelu, P., Van Roy, P., & Veiga, L.
In Transactions on Emerging Telecommunications Technologies (ETT).
(JCR IF: 1.535, Q2). 2018. Wiley. Accepted for publication.
19/33
Motivation
● Increasingly, network services placed at the edge.
– Limitations in hosting all the network services on-premise.
– Closer to the users than centralized clouds.
● Network Service Chaining (NSC)
– Finding the optimal service chain for a user request.
– Service Level Objectives of the service chain users.
20/33
Our Proposal: Évora
● A graph-based algorithm to incrementally construct
and deploy service chains at the edge.
● An Orchestrator in the user device, to place and
migrate service chains, adhering to the user policies.
● An architecture extending SDN to wide area to
efficiently support the service chains at the edge.
21/33
Évora Approach
● Initialize once per user device:
– Step 1) Construct a service graph.
● Initialize once per user’s service chain.
– Step 2) Find matching subgraphs for the user’s service
chain as partial, potential chains.
– Step 3) Complete matches → Potential Chains.
● Initialize once per <nsc, policy> pair
– Step 4) Service chain placement at the best fit among
the possible chains, based on a user-defined policy.
● Execute the service chain.
22/33
1) Initialize the orchestrator
(Once per device)
● Construct a service graph in the user device.
― As a snapshot of the available service instances at
the edge.
23/33
2) Initialize Service Chain
(Once per each chain)
● Construct matching subgraphs as potential chains.
– while noting the individual service properties
● Incrementally calculate a “penalty value” for each
potential chain that is being constructed.
– with user-given weight to the properties.
● monthly cost (C), throughput (T), end-to-end
latency (L), ..
24/33
3) Complete matches →
Potential Service Chain Placements
● Ability to place the entire service chain in the
matching subgraph.
– Complete matching subgraph, i.e. a potential service
chain placement is found.
● Record.
● Stop procedure once all the nodes are traversed.
● Subsequent NSC executions require no initialization.
25/33
4) The Service Chain Placement
● Penalty function, with normalized values of C, L, and T.
– α,β,γ ← Non-negative integers specified by user.
● Solve this as a Mixed Integer Linear Problem.
● The penalty function can be extended with powers.
● Place the current NSC (<nsc,policy> pair) on the service
composition with minimal penalty value.
● Possible updates and migrations.
– Future service unavailability → choose the next.
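The placement step can be sketched as a weighted penalty over normalized attributes. The exact penalty form and normalization used by Évora may differ; this sketch assumes min-max normalization and a weighted sum in which higher throughput lowers the penalty:

```python
def normalize(values):
    # Min-max normalize to [0, 1]; a constant attribute normalizes to 0.
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

def place(chains, alpha, beta, gamma):
    """Pick the candidate chain with the minimal penalty value.

    chains: list of (name, monthly cost C, latency L, throughput T).
    alpha, beta, gamma: non-negative user-given weights.
    """
    names = [c[0] for c in chains]
    C = normalize([c[1] for c in chains])
    L = normalize([c[2] for c in chains])
    T = normalize([c[3] for c in chains])
    # Cost and latency are penalized directly; throughput is inverted
    # so that a higher-throughput chain gets a lower penalty.
    penalties = [alpha * c + beta * l + gamma * (1 - t)
                 for c, l, t in zip(C, L, T)]
    return min(zip(names, penalties), key=lambda p: p[1])[0]

# A latency-sensitive policy (beta dominant) prefers the
# low-latency chain-b over the cheaper chain-c.
chains = [("chain-a", 100, 50, 900), ("chain-b", 60, 20, 400),
          ("chain-c", 40, 80, 300)]
best = place(chains, alpha=3, beta=10, gamma=3)
```

In the full system this minimization is solved as a Mixed Integer Linear Problem; the exhaustive scan above only illustrates the objective.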
26/33
Solution Architecture
● Extending SDN to a multi-domain edge environment.
– With Message-Oriented Middleware (MOM).
27/33
Evaluation
● Microbenchmark how user policies are satisfied with
Évora for service chains among various alternatives.
– Algorithm effectiveness in satisfying user policies.
– Efficacy: closeness to optimal results.
● Minimizing the penalty function results in improved quality
of experience.
28/33
User policies with two attributes
● Location of the circles → Properties (C, L, and T).
● Darker circles – chains with minimal penalty, the
ones that we prefer (circled).
T ↑ and C ↓ T ↑ and L ↓ C ↓ and L ↓
● Results show user policies supported fairly well.
29/33
● Policies with three attributes: one given more prominence
(weight = 10) than the other two (weight = 3).
● Results show efficient
support for multiple
attributes with different
weights.
Radius of the circles –
Monthly Cost
30/33
Key Findings
● More and more services hosted at the edge.
● NSCs have more constraints than stand-alone VNFs.
● Évora supports efficient chaining of network services.
– Leveraging a software-defined approach for services
● Extending SDN with MOM.
31/33
III) Ongoing work
1) Software-Defined Cyber-Physical
Systems (CPS) workflows in the edge
● Can we tackle some design, operational, and
scalability challenges of CPS?
– By representing them as software-defined service
compositions at the edge?
SDS’17, M4IoT’15, and CLUSTER (Invited from SDS’17. Under
review).
32/33
2) A Service-Oriented Workflow for Big
Data Research at the Edge
● Analyse decentralized big data (TB-scale) with a
service based data access and virtual integration
approach.
– Addressing data related optimizations as service chains.
● Data cleaning, incremental data integration, and data analysis.
CoopIS’15, SDS’18, and DAPD (Distributed and Parallel
Databases. Invited from DMAH’17. Under Review).
33/33
Thank you!
pradeeban.kathiravelu@tecnico.ulisboa.pt
Acknowledgements:
Prof. Marco Canini (KAUST)
Prof. Ashish Sharma (Emory)
Prof. Helena Galhardas (IST)
Prof. Tihana Galinac Grbac (URijeka)
Prof. Marco Chiesa (KTH)
Ed Warnicke (Cisco)
34/33
~ Thanks ~
35/33
Additional Slides
36/33
Publications Overview
37/33
Thesis Overview
38/33
(1.1) *SDNSim*
● CoopIS’16 and SDS’15
39/33
Introduction
● Network architectures and algorithms simulated or
emulated at early stages of development.
● SDN is expanding in its scope.
– Programmable networks → continuous development.
– Native integration of network emulators into SDN
controllers.
40/33
How well do the SDN simulators fare?
● Network simulators supporting SDN and emulation
capabilities.
– NS-3.
● Cloud simulators extended for cloud networks with SDN.
– CloudSim → CloudSimSDN.
However..
● Lack of “SDN-Native” network simulators.
– Simulators not following the Software-Defined Systems
paradigm.
– Policy/algorithmic code locked in simulator-imperative code.
● Need for easy migration and programmability.
41/33
Goals
● A simulator for SDN Systems.
● Extend and leverage the SDN controllers in cloud
network simulations.
– Bring the benefits of SDN to its own simulations!
● Reusability, Scalability, Easy migration, . . .
– Run the control plane code in the actual controller
(portability).
– Simulate the data plane (scalability, resource efficiency).
● by programmatically invoking the southbound of SDN controller.
42/33
Our Proposal:
Software-Defined Simulations
● Separation of control plane and (simulated) data
plane.
● Integration with SDN controllers.
43/33
SDNSim: A Framework for
Software-Defined Simulations.
● Network system to be simulated.
– Expressed in “descriptors”.
● XML-based description language.
– Parsed and executed in SDNSim simulation sandbox.
● A Java middleware.
● Simulated application logic.
– Deployed into controller.
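For illustration, a descriptor of this style could be parsed as below. The element and attribute names are hypothetical, not SDNSim's actual XML schema:

```python
import xml.etree.ElementTree as ET

# A hypothetical descriptor; the tags and attributes are illustrative.
DESCRIPTOR = """
<simulation name="fat-tree-demo">
  <topology type="fat-tree" k="4"/>
  <hosts count="16"/>
  <controller class="org.opendaylight.Demo"/>
</simulation>
"""

def parse_descriptor(xml_text):
    """Parse a descriptor into the values a simulation sandbox needs."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.get("name"),
        "topology": root.find("topology").get("type"),
        "hosts": int(root.find("hosts").get("count")),
        "controller": root.find("controller").get("class"),
    }

sim = parse_descriptor(DESCRIPTOR)
```

Keeping the network description declarative like this is what lets the simulated data plane be regenerated or scaled without touching the control plane code deployed in the controller.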
44/33
Contributions and SDNSim Approach
1. Reusable simulation building blocks.
● Simulating complex and large-scale SDN systems.
– Network Service Chaining (NSC).
45/33
1. Reusable simulation building blocks.
● Simulating complex and large-scale SDN systems.
– Network Service Chaining (NSC).
– As a case of Network Function Virtualization (NFV).
46/33
2. Support for continuous development
and iterative deployment.
● Checkpointing and versioning of simulated
application logic.
– Incremental updates: changesets as OSGi bundles in the
control plane.
47/33
3. State-aware simulations.
● Adaptive scaling through shared state.
– Horizontal scalability through In-Memory Data Grids.
– State of the simulations for scaling decisions.
● Pause-and-resume simulations.
– Multi-tenanted parallel executions.
48/33
4. Expressiveness.
● Data plane: XML-based network representation.
● Control plane: Java API.
49/33
Prototype Implementation
● Oracle Java 1.8.0 - Development language.
● Apache Maven 3.1.1 - Build the bundles and execute
the scripts.
● Infinispan 7.2.0.Final - Distributed cluster.
● Apache Karaf 3.0.3 - OSGi run time.
● OpenDaylight Beryllium - Default controller.
● Multiple deployment options:
– As a stand-alone simulator.
– Distributed execution with an SDN controller.
– As a bundle in an OSGi-based SDN controller.
50/33
Evaluation Deployment Configurations
● Intel Core™ i7-4700MQ
– CPU @ 2.40 GHz, 8 logical processors.
– 8 GB memory.
– Ubuntu 14.04 LTS 64-bit operating system.
● A cluster of up to 5 identical computers.
51/33
Evaluation Strategy
● Benchmark against
CloudSimSDN.
– Cloud2Sim for distributed
execution.
● Simulating routing algorithms
in fat-tree topology.
● Experiments repeated 6
times.
● Data center simulations of up
to 100,000 nodes.
52/33
Performance and Problem Size
● SDNSim yields higher performance for larger
simulations.
53/33
Horizontal Scalability
● Smart scale-out.
● Higher horizontal scalability.
54/33
Performance with Incremental Updates
● Smaller simulations: up to 1000 nodes.
● SDNSim: controller and middleware execution
completion time.
55/33
Performance with Incremental Updates
● Initial execution takes longer - Initializations.
56/33
Performance with Incremental Updates
● Faster executions once the system is initialized.
57/33
Incremental Updates: Test-driven
development
● Faster executions once the system is initialized.
58/33
Incremental Updates: Test-driven
development
● Even faster executions for subsequent simulations.
59/33
Incremental Updates: Test-driven
development
● No change in simulated environment – Deploy
changesets to controller.
60/33
Incremental Updates: Test-driven
development
● No change in simulated environment - Revert
changeset.
61/33
Performance with Incremental Scaling
● No change in controller - scale the simulated
environment.
62/33
Network Construction with Mininet
and SDNSim
● Adaptive Emulation and Simulation.
– Simulate when resources are scarce for emulation.
63/33
Automated Code Migration:
Simulation → Emulation
● Time taken to programmatically convert an SDNSim
simulation script into a Mininet script.
64/33
Conclusion
Conclusions
● SDNSim is an SDN-aware network simulator
– Built following the SDN paradigm
● Separation of data layer from the control layer and application logic.
– Enabling an incremental modelling of cloud networks.
● Performance and scalability.
– Complex network systems simulations.
– Reuse the same controller code algorithm developers created
to simulate much larger scale deployments.
– Adaptive parallel and distributed simulations.
Future Work
● Extension points for easy migrations.
– More emulator and controller integrations.
65/33
(1.2) *SENDIM*
● Simulation, Emulation, aNd Deployment Integration
Middleware
● IC2E’16
66/33
SENDIM Integration
67/33
68/33
69/33
70/33
(2) *NetUber*
71/33
NetUber Application Scenarios
● Cheaper transfers between two endpoints.
● Higher throughput or reduced latency.
● Better alternative to SaaS replication.
● Network services (compression, encryption, ..).
72/33
Scenario (1 of 4): Cheaper Transfers
A) Cost of Cloud Instances: Observations
● 10 Gbps R4 instance (r4.8xlarge) pairs offered only a
maximum of 1.2 Gbps of inter-region data transfer.
– 10 Gbps only inside a placement group.
● We need more
pairs of instances!
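A back-of-the-envelope check of how many instance pairs that implies:

```python
import math

def pairs_needed(target_gbps, per_pair_gbps):
    # Number of instance pairs to aggregate the target inter-region rate.
    return math.ceil(target_gbps / per_pair_gbps)

# With ~1.2 Gbps of usable inter-region throughput per r4.8xlarge
# pair, a 10 Gbps overlay link needs 9 parallel pairs.
n = pairs_needed(10, 1.2)
```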
73/33
Spot Instances
● Cheaper (up to 90% savings), but volatile, instances.
● Price Fluctuations - Future price unpredictable (for
EC2).
● Differing prices among availability zones of a region.
– Buy from the cheapest availability zones at the moment.
– Maintain instances in the cheap availability zones.
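The availability-zone strategy reduces to ranking zones by their current spot price. The prices below are made up for illustration:

```python
def cheapest_zones(spot_prices, k=1):
    """Rank availability zones by current spot price, cheapest first."""
    return sorted(spot_prices, key=spot_prices.get)[:k]

# Hypothetical per-hour spot prices within one region; buy capacity
# in the cheapest zones and migrate when the ranking changes.
prices = {"us-east-1a": 0.62, "us-east-1b": 0.25, "us-east-1c": 0.41}
zone = cheapest_zones(prices)[0]
```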
74/33
Scenario (1 of 4): Cheaper Transfers
B) Cost of Bandwidth: Price disparity is real!
Regions 1 - 9 (US, Canada, and EU) remain much cheaper than the
others.
75/33
Scenario (1 of 4): Cheaper Transfers
C) Cost to Connect to the Cloud Provider
● Connect the end-user to the cloud servers.
● Often provided by the cloud provider.
● Example: AWS Direct Connect.
● Charged per port-hour (e.g. how many hours a 10
GbE port is used).
76/33
Scenario (2 of 4): Higher throughput
or reduced latency
● Cloud-Assisted Point-to-Point Connectivity
– Better control over the path, compared to the Internet
paths.
– Also cheaper than MPLS networks or transit providers.
● Thanks to spot instances.
77/33
Scenario (3 of 4): Better Alternative
to SaaS Replication
● See slide 8
78/33
Scenario (4 of 4): Network Services
● NetUber uses memory-optimized R4 spot instances.
– Each instance with 244 GB memory, 32 vCPU, and 10 GbE
interface.
● Possibility to deploy network services at the instances.
● Network services.
– Value-added services for the customer.
● Encryption, WAN-Optimizer, load balancer, ..
– Services for cost-efficiency.
● Compression.
79/33
Conclusion
● A connectivity provider that does not own the
infrastructure.
● “Internet Fast-routes” through cloud-assisted networks.
– Better than ISPs (~50 – 75 Mbps, often with a cap) for end users.
● Cheaper point-to-point connectivity.
– Cheaper than transit providers and similar offerings (for < 50
TB/month).
● Future work:
– Evaluate NetUber for more parameters (loss rate, jitter, ..)
– Evaluate the cost with more cloud providers and pairs of
regions.
80/33
(3) *SMART*
● EI2N’16 and IM’17
81/33
Introduction
● Cloud data centers consist of various tenants with multiple
roles.
● Differentiated Quality of Service (QoS) in multi-tenant
clouds.
– Service Level Agreements (SLA).
– Different priorities among tenant processes.
● Network is shared among the tenants.
– End-to-end delivery guarantee despite congestion for critical
flows.
82/33
SDN for Clouds
● Cross-layer optimization of clouds with SDN.
– Centralized control plane of the network-as-a-service.
83/33
Motivation
● How to offer differentiated QoS and SLA in multi-tenant
networks?
– Application-level user preferences and system policies.
– Performance guarantees at the network-level.
– More potential in having them both!
– SDN, Middleboxes, . . .
84/33
Goals
● How to offer differentiated QoS and SLA in multi-
tenant networks?
– Leverage SDN to offer a selective partial redundancy in
network flows.
– FlowTags - Software middlebox to tag the flows with
contextual information.
● Application-level preferences to the network control plane as
tags.
● Dynamic flow routing modifications based on the tags.
86/33
Our Proposal: SMART
● An SDN Middlebox Architecture for Reliable Transfers.
● An architectural enhancement for network flows
allocation, routing, and control.
● Timely delivery of priority flows by dynamically
diverting them to a less congested path.
● Cloning subflows of higher priority flows.
● An adaptive approach in cloning and diverting of the
flows.
87/33
Contributions
● A cross-layer architecture ensuring differentiated
QoS.
● A context-aware approach in load balancing the
network.
– Servers supporting multihoming, connected
topologies, . . .
88/33
SMART Approach
● Divert and clone subflows by setting breakpoints in
the flows in their route, to avert congestion.
– Trade-off of minimal redundancy to ensure the SLA of
priority flows.
– Adaptive execution with contextual information on the
network.
● Leverage FlowTags middlebox
– to pass application-level system and user preferences to
the network.
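The break/clone decision can be sketched as below. The thresholds and the exact decision logic are illustrative, not SMART's actual algorithm:

```python
def smart_action(flow_priority, path_load, alt_load, threshold=0.8):
    """Decide what to do with a subflow when congestion builds up.

    A simplified sketch of the SMART breakpoint decision: priority
    subflows on a congested path are diverted to a less congested
    alternative, and cloned when the alternative is also loaded,
    trading minimal redundancy for the SLA of priority flows.
    """
    if not flow_priority or path_load < threshold:
        return "route-normally"
    if alt_load < threshold:
        return "divert"   # break the flow; reroute the remainder
    return "clone"        # duplicate the subflow over both paths

a = smart_action(True, 0.9, 0.3)    # -> "divert"
b = smart_action(True, 0.9, 0.95)   # -> "clone"
c = smart_action(False, 0.9, 0.3)   # -> "route-normally"
```

In the actual architecture this decision is taken by the SMART Enhancer in the control plane, using the contextual information carried in the flow tags.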
89/33
SMART Enhancements
● When to break and when to merge?
– Clone destination.
90/33
SMART Deployment
91/33
SMART Workflow
92/33
I: Tag Generation for Priority Flows
● Tag generation query and
response.
– between the hosts and the FlowTags
controller.
● A centralized controller for FlowTags.
● Tag the flows at the origin.
● FlowTagger software middlebox.
– A generator of the tags.
– Invoked by the host application layer.
– Similar to the FlowTags-capable
middleboxes for NATs
93/33
II: Regular routing until the policies
(from the tags) are violated
94/33
III: When a threshold is met
● Controller is triggered through OpenFlow API.
● A series of control flows inside the control plane.
● Modify flow entries in the relevant switches.
95/33
SMART Control Flows: Rules Manager
● A software middlebox in the control plane.
● Consumes the tags from the packet.
– Similar to FlowTags-capable firewalls.
96/33
Rules Manager Tags Consumption
● Interprets the tags
– as input to the SMART Enhancer
97/33
SMART Enhancer
● Core of the SMART architecture.
● Gets the input to the enhancement algorithms.
● Decides the flow modifications.
– Breakpoint node and packet.
– Clone/divert decisions.
98/33
Prototype Implementation
● Developed in Oracle Java 1.8.0.
● OpenDaylight Beryllium as the core SDN controller.
● Enhancer and the Rules Manager middlebox as controller extensions.
– Developed as OSGi bundles.
– Deployed into Apache Karaf runtime of OpenDaylight.
● FlowTags middlebox controller deployed along the SDN controller.
– FlowTags, originally a POX extension.
● Network nodes and flows emulated with Mininet.
– Larger scale cloud deployments simulated.
99/33
Evaluation Strategy
● Data center network with 1024 nodes and leaf-spine topology.
– Path lengths of more than two-hops.
– Up to 100,000 short flows.
● Flow completion time < 1 s.
● A few non-priority elephant flows.
– SLA → maximum permitted flow completion time for priority flows
– Uniformly randomized congestion.
● hitting a few uplinks of nodes concurrently.
● an overwhelming number of flows through the same nodes and links.
● Benchmark: SMART enhancements over base routing
algorithms.
– Performance (SLA awareness), redundancy, and overhead.
100/33
SMART Adaptive Clone/Replicate
with Shortest-Path
● Replicate the subsequent flows once a previous flow
was cloned.
101/33
SMART Adaptive Clone/Replicate
with Equal-Cost Multi-Path (ECMP)
● Repeat the experiment with ECMP routing.
102/33
Related Work
● Multipath TCP (MPTCP) uses the available multiple
paths between the nodes concurrently to route the
flows across the nodes.
– Performance, bandwidth utilization, and congestion
control
– through a distributed load balancing.
● ProgNET leverages WS-Agreement and SDN for
SLA-aware cloud.
● pFabric for deadline-constrained data flows with
minimal completion time.
● QJump Linux traffic control module for latency-sensitive
applications.
103/33
Conclusion
Conclusions
● SMART leverages redundancy in the flows as a means to
improve the SLA of the priority flows.
● Opens an interesting research question leveraging SDN,
middleboxes, and redundancy.
– Cross-layer optimizations through tagging the flows.
– For differentiated QoS.
Future Work
● Implementation of SMART on a real data center network.
● Evaluate against the identified related work
quantitatively.
104/33
(4) *Mayan*
● ICWS’16 and SDS’16
105/33
Introduction
● eScience workflows
– Computation-intensive.
– Execute on highly distributed networks.
● Complex service compositions aggregating web
services
– To automate scientific and enterprise business processes.
106/33
Motivation
● Scalable Distributed Executions in wide area
networks.
– Better orchestration of service compositions.
● Multi-Tenant Environments.
– Isolation Guarantees.
– Differentiated Quality of Service (QoS).
● Increasing demand for geo-distribution (workflows
and service compositions).
107/33
Contributions
● Support for,
– Adaptive execution of scientific workflows.
– Flexible service composition.
– Reliable large-scale service composition.
– Efficient selection of service instances.
108/33
Our Proposal: Mayan
● Extensible SDN approach for cloud-scale service composition.
● An approach driven by,
– Loose coupling of service definitions and implementations.
– Message-oriented Middleware (MOM).
– Availability of a logically centralized control plane.
● Leveraging OpenDaylight SDN controller as the core.
– Modular, as OSGi bundles.
– Additional advanced features.
●
State of executions and transactions stored in the controller distributed data tree.
● Clustered and federated deployments.
109/33
Software-Defined Service Composition:
Services as the building blocks of
Mayan
110/33
Multiple Implementations and
Deployments of a Service
111/33
112/33
Mayan Services Registry:
Modelling Language
113/33
Service Composition Representation
● <Service3,(<Service1, Input1>, <Service2, Input2>)>
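The nested representation above can be evaluated recursively: each inner service consumes the outputs of its child compositions. The service implementations here (simple arithmetic) are stand-ins for actual web service calls:

```python
def invoke(name, *args):
    # Stand-in for an actual web-service invocation (hypothetical services).
    registry = {
        "Service1": lambda x: x + 1,
        "Service2": lambda x: x * 2,
        "Service3": lambda a, b: a + b,
    }
    return registry[name](*args)

def execute(composition):
    """Evaluate a nested <service, (<service, input>, ...)> composition.

    A leaf is (name, value); an inner node is (name, tuple of children).
    """
    name, arg = composition
    if isinstance(arg, tuple) and arg and isinstance(arg[0], tuple):
        # Inner node: recursively evaluate each child composition first.
        return invoke(name, *(execute(c) for c in arg))
    return invoke(name, arg)

# <Service3, (<Service1, 3>, <Service2, 4>)>
# -> Service3(Service1(3), Service2(4)) = Service3(4, 8) = 12
result = execute(("Service3", (("Service1", 3), ("Service2", 4))))
```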
114/33
Alternative Implementations and
Deployments
115/33
Multi-Domain Workflows
116/33
Connecting Services View with the
Network View
117/33
Connecting Services View with the
Network View
118/33
119/33
Evaluation System Configurations
● Evaluation Approach:
– Smaller physical deployments in a cluster.
– Larger deployments as simulations and emulations (Mininet).
● Evaluated Deployment:
– Service Composition Implementations.
● Web services frameworks.
● Apache Hadoop MapReduce.
● Hazelcast In-Memory Data Grid.
– OpenDaylight SDN Controller.
120/33
Preliminary Assessments
● A workflow performing distributed data cleaning and
consolidation.
– A distributed web service composition.
vs.
– Mayan approach with the extended SDN architecture.
121/33
Speedup and Horizontal Scalability
● No negative scalability in larger distributions.
● 100% more positive scalability for larger
deployments.
122/33
Throughput of the controller
● Measured as the number of messages entirely processed
by the controller, arriving from the publishers to be
forwarded towards a relevant receiver.
● 5,000 messages/s at a concurrency of 10 million messages.
123/33
Processing Time
● Total time taken to process the complete set of
messages at a Mayan controller, against the varying
number of messages.
● The controller scales linearly in processing time
with the number of parallel messages.
● It processes 10 million messages in 40 minutes.
124/33
Scalability of the Mayan Controller
● The results presented are for a single stand-alone
deployment of the controller.
● Mayan is designed as a federated deployment.
– Scales horizontally to
● manage a wider area with a more substantial number of service
nodes and improved latency.
● handle more concurrent messages in each controller domain.
125/33
Related Work
● MapReduce for efficient service compositions [SD
2014].
● Palantir: SDN for MapReduce performance with the
network proximity data [ZY 2014].
[SD 2014] Deng, Shuiguang, et al. "Top-k Automatic Service Composition: A Parallel Method for
Large-Scale Service Sets." IEEE Transactions on Automation Science and Engineering 11.3
(2014): 891-905.
[ZY 2014] Yu, Ze, et al. "Palantir: Reseizing network proximity in large-scale distributed computing
frameworks using SDN." 2014 IEEE 7th International Conference on Cloud Computing (CLOUD).
IEEE, 2014.
126/33
Conclusion
● SDN-based approach that enables large scale
flexibility with performance
– Components in eScience workflows as building blocks of
a distributed platform.
– Service composition with web services and distributed
execution frameworks.
– Multi-tenant and multi-domain executions.
127/33
(5) *Evora*
128/33
Services
● A core element of the Internet ecosystem.
● Various types of services:
– Web services and microservices
● key in modern cloud applications.
– Network services / Virtual Network Functions
● firewall, load balancer, proxy, ..
– Data services
● data cleaning, data integration, ..
● Interesting common research challenges:
– Service placement.
– Service instance selection.
– Service composition or “service chaining”.
129/33
Why Service-Oriented Architectures
for our systems?
● Beyond data center scale.
– Thanks to the fact that services are standardized.
● SOA and RESTful reference architectures.
– Multiple implementation approaches such as Message-
Oriented Middleware.
● Service endpoints to handover messages internally to the broker.
● Publish/subscribe to a message broker over the Internet.
● Flexibility, modularity, loose-coupling, and adaptability.
130/33
Challenges in achieving
Service Chaining at the Edge
● Dependencies among the network services.
– Need to be accessible from each other.
● Service Level Objectives of the service chain users.
– Latency, throughput, monthly cost, ..
● Finding the optimal service chain for a user request.
– In general, an NP-hard problem.
131/33
Service Chain: s1 → s2 → s3 → s4
● Goals
– Services close to the user.
– Services close to the following services in the chain.
– Satisfying user Service Level Objectives!
132/33
Alternative Representations
133/33
Problem Scale: Representation of the
service graph from the data center
graph
● The number of links in this service graph grows
– linearly with the number of edges or links between the edge nodes.
– exponentially with the average number of services per edge node.
134/33
What has Message-Oriented
Middleware got to do with the controller?
● Expose the internals from controller (e.g. OpenDaylight)
– Through a message-based northbound API
● e.g. AMQP (Advanced Message Queuing Protocol).
– Publish/Subscribe with a broker (e.g. ActiveMQ).
● What can be exposed
– Data tree (internal data structures of the controller)
– Remote procedure calls (RPCs)
– Notifications.
● Thanks to Model-Driven Service Abstraction Layer (MD-SAL) of
OpenDaylight.
– Compatible internal representation of data plane.
– Messaging4Transport Project.
135/33
MILP and Graph Matching can be
computation-intensive
● But initialization happens only once per user service chain
with a given policy.
– This procedure does not repeat once initialized,
– unless updates are received from the edge network.
● New data center with the service offering at the edge.
● An existing data center or a service offering fails to respond.
● The number of services in each NSC is typically 5 – 10.
– The Évora algorithm follows a greedy approach, rather than a
typical graph matching.
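A sketch of such a greedy chain construction, with hypothetical edge nodes and inter-node latencies; it is not Évora's actual algorithm, only an illustration of the greedy idea:

```python
def greedy_chain(request, instances, latency):
    """Greedily build a chain s1 -> s2 -> ... by always extending with
    the candidate instance that adds the least latency.

    request: ordered service types, e.g. ["firewall", "nat", "proxy"].
    instances: {service type: list of edge nodes offering it}.
    latency: {(a, b): latency between edge nodes a and b}.
    """
    chain, total = [], 0
    for service in request:
        candidates = instances[service]
        if chain:
            # Extend from the last placed instance at minimal extra cost.
            nxt = min(candidates, key=lambda c: latency[(chain[-1], c)])
            total += latency[(chain[-1], nxt)]
        else:
            nxt = candidates[0]  # start at any instance of the first service
        chain.append(nxt)
    return chain, total

latency = {("e1", "e2"): 5, ("e1", "e3"): 2,
           ("e3", "e4"): 7, ("e2", "e4"): 3}
instances = {"firewall": ["e1"], "nat": ["e2", "e3"], "proxy": ["e4"]}
chain, total = greedy_chain(["firewall", "nat", "proxy"],
                            instances, latency)
```

Note that greedy picks e1 → e3 → e4 at cost 9, while e1 → e2 → e4 would cost 8; with the short chains involved (5 – 10 services), this trade of optimality for speed is acceptable.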
136/33
● Two attributes given more prominence (weight = 10)
than the third (weight = 3).
● Results show efficient
support for multiple
attributes with different
weights.
Radius of the circles –
Monthly Cost
137/33
Performance and Scalability of
Évora Orchestrator Algorithms
138/33
Algorithm
139/33
Algorithm
140/33
(6) *SD-CPS*
● Work-in-Progress
● SDS’17, M4IoT’15, and CLUSTER (Under Review)
141/33
(7) *Obidos*
● Work-in-Progress
● CoopIS’15, DMAH’17, and DAPD (Under Review)
142/33
(8) *SDDS*
● SDS’18 (Best Paper Award)
Introduction
● Big data with increasing volume and variety.
– Volume requires scalability.
– Variety requires interoperability.
● Data Services
– Services that access and process big data.
– Unified web service interface to data → Interoperability!
● Chaining of data services.
– Composing chains of numerous data services.
– Data Access → Data cleaning → Data Integration.
Problem Statement
● Data services offer interoperability.
● But when related data and services are distributed
far from each other → poor performance at scale.
– How to scale out efficiently?
● How to minimize communication overheads?
145/33
Motivation
● Software-Defined Networking (SDN).
– A unified controller to the data plane devices.
– Brings network awareness to the applications.
● To make big data executions
– Interoperable.
– Network-aware.
146/33
Our Proposal: SDDS
● Can we bring SDN to the data services?
● Software-Defined Data Services (SDDS).
147/33
Contributions
● SDDS as a generic approach for data services.
– Extending and leveraging SDN in the data centers.
● A software-defined framework for data services.
– Efficient performance and management of data services.
– Interoperability and scalability.
148/33
Solution Architecture
● A bottom-up approach, extending SDN.
– Data Plane (SDN OpenFlow Switches)
– Storage Plane (SQL and NoSQL data stores)
– Control Plane (SDN Controller, In-Memory Data Grids (IMDGs), ..)
– Execution Plane (Orchestrator and Web Service Engines)
149/33
Network-Aware Service Executions
with SDN
150/33
SDDS Planes and Layered Architecture
151/33
SDDS Approach
● Define all the data operations as interoperable services.
● SDN for distributing data and service executions
– Inside a data center (e.g. Software-Defined Data Centers).
– Beyond data centers (extend SDN with Message-Oriented
Middleware).
● Optimal placement of data and service execution.
– Minimize communication overhead and data movements.
● Keep the related data and executions closer.
● Send the execution to data, rather than data to execution.
– Execute data service on the best-fit server, until interrupted.
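The "send the execution to data" principle can be sketched as choosing the node that already holds the largest share of the datasets a service needs (node and dataset identifiers are illustrative):

```python
def best_fit_node(required, placement):
    """Pick the node holding the largest share of the required datasets,
    so the execution moves to the data rather than the data to the execution.

    required  -- set of dataset ids the service needs
    placement -- {node: set of dataset ids stored there}
    """
    return max(placement, key=lambda node: len(required & placement[node]))

# Example cluster: n2 already holds both datasets the service needs.
nodes = {"n1": {"d1"}, "n2": {"d1", "d2", "d3"}, "n3": {"d4"}}
# best_fit_node({"d1", "d2"}, nodes) == "n2"
```

In SDDS the analogous decision also weighs network topology (via the SDN controller's global view), not just dataset overlap.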
152/33
Efficient Data and Execution Placement
153/33
Efficient Data and Execution Placement
{i, j} – related data objects
D – datasets of interest
n – execution node
Σ – spread of the related data objects
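The placement objective itself appears only as an image on the slide; under the legend above, one plausible reading (an assumption, not the exact slide equation, taking d(·,·) as a network distance) is to pick the execution node n that minimizes the spread of the related data objects:

```latex
n^{*} = \operatorname*{arg\,min}_{n}\; \Sigma(n),
\qquad
\Sigma(n) = \sum_{\{i,\,j\} \subseteq D} \bigl( d(n, i) + d(n, j) \bigr)
```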
154/33
Prototype Implementation
● Data services implemented with web service
engines.
– Apache Axis2 1.7.0 and Apache CXF 3.2.1.
● IMDG clusters – Hazelcast 3.9.2 and Infinispan 9.1.5.
● Persistent storage – MySQL Server and MongoDB.
● Core SDN Controller – OpenDaylight Beryllium.
155/33
Evaluation Environment
● A cluster of 6 servers.
– AMD A10-8700P Radeon R6, 10 Compute Cores 4C+6G
× 4.
– 8 GB of memory.
– Ubuntu 16.04 LTS 64 bit operating system.
– 1 TB disk space.
156/33
Evaluation
● How does SDDS comply as a network-aware big
data execution compared to network-agnostic
execution?
– SDDS vs data services on top of Infinispan IMDG.
– A data storage and update service with an increasing
volume of persistent data across the cluster, up to a total
of 6 TB of data.
● Measured the throughput from the service plane
– by the total amount of data processed through the data
services per unit time.
157/33
Evaluation
● SDDS outperforms the baseline.
– Better data locality
● by distributing data adhering to network topology.
– Better resource efficiency.
● by avoiding scaling out prematurely.
– Better throughput with minimal distribution when
there is no need to utilize all the 6 servers.
158/33
Related Work
● Software-Defined Systems.
– Software-Defined Service Composition.
– Software-Defined Cyber-Physical Systems and SDIoT.
● Industrial SDDS offerings.
– Many of them storage focused.
● PureStorage, PrimaryIO, HPE, RedHat, ..
– Many focus on specific data services.
● Containers and DevOps – Atlantix and Portworx.
● Data copying and sharing – IBM Spectrum Copy Data Management
and Catalogic ECX.
● We are the first to propose a generic SDDS
framework.
159/33
Conclusion
Summary
● Software-Defined Data Services (SDDS) offer both
interoperability and scalability to big data executions.
● SDDS leverages SDN in building a software-defined
framework for network-aware executions.
● SDDS caters to data services and compositions of data
services for an efficient execution.
Future Work
● Extend SDDS for edge and IoT/CPS environments.

Courier management system project report.pdf
Kamal Acharya
 
Event Management System Vb Net Project Report.pdf
Event Management System Vb Net  Project Report.pdfEvent Management System Vb Net  Project Report.pdf
Event Management System Vb Net Project Report.pdf
Kamal Acharya
 

Recently uploaded (20)

Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
 
Forklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella PartsForklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella Parts
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
 
Democratizing Fuzzing at Scale by Abhishek Arya
Democratizing Fuzzing at Scale by Abhishek AryaDemocratizing Fuzzing at Scale by Abhishek Arya
Democratizing Fuzzing at Scale by Abhishek Arya
 
The Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdfThe Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdf
 
Final project report on grocery store management system..pdf
Final project report on grocery store management system..pdfFinal project report on grocery store management system..pdf
Final project report on grocery store management system..pdf
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
 
Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.
 
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSE
TECHNICAL TRAINING MANUAL   GENERAL FAMILIARIZATION COURSETECHNICAL TRAINING MANUAL   GENERAL FAMILIARIZATION COURSE
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSE
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
 
ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdf
 
CME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional ElectiveCME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional Elective
 
Halogenation process of chemical process industries
Halogenation process of chemical process industriesHalogenation process of chemical process industries
Halogenation process of chemical process industries
 
Cosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdfCosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdf
 
block diagram and signal flow graph representation
block diagram and signal flow graph representationblock diagram and signal flow graph representation
block diagram and signal flow graph representation
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
 
Courier management system project report.pdf
Courier management system project report.pdfCourier management system project report.pdf
Courier management system project report.pdf
 
Event Management System Vb Net Project Report.pdf
Event Management System Vb Net  Project Report.pdfEvent Management System Vb Net  Project Report.pdf
Event Management System Vb Net Project Report.pdf
 

Software-Defined Systems for Network-Aware Service Composition and Workflow Placement

  • 1. Software-Defined Systems for Network-Aware Service Composition and Workflow Placement Pradeeban Kathiravelu Supervisors: Prof. Luís Veiga Prof. Peter Van Roy Lisboa, Portugal. 18/06/2018.
  • 2. 2/33 Introduction ● Network Softwarization: Making the networks “programmable”. – Software-Defined Networking (SDN) ● Unifying the control plane away from network data plane devices. ● Global view and control of the data center network via a single controller. – Network-Functions Virtualization (NFV) ● Virtualizing network middleboxes into network functions. ● Firewall, intrusion detection, Network Address Translation (NAT), .. ● Software-Defined Systems (SDS). – Frameworks extending, or inspired by, SDN. – Storage, Security, Data center, .. ● Improved configurability: Separation of mechanisms from policies.
  • 3. 3/33 Motivation ● Software-Defined Systems to compose and place service workflows beyond data center-scale. – Bring the control back to the service user.
  • 4. 4/33 Research Questions ● Can we uniformly separate the infrastructure from the network, at various stages of development, from data centers to the cloud? ● Can such network softwarization offer economic and performance benefits to the end users? ● Can we orchestrate the services for workflow compositions efficiently, by extending SDN to the cloud and edge environments? ● Can we improve the performance of big data applications by scaling the execution environment in a network-aware manner?
  • 5. 5/33 Current Contributions ● Network softwarization as an encompassing approach, from design to cloud deployments (CoopIS’16, SDS’15, and IC2E’16). ● Differentiating network flows based on user policies with SDN and middleboxes (EI2N’16 and IM’17). ● Network-aware big data executions (SDS’18 and CoopIS’15). ● Extending network softwarization to wide area for service composition and workflow placement. – Cloud-Assisted Networks as an alternative connectivity provider (Networking’18). – Composing service chains at the edge (ETT’18, ICWS’16, and SDS’16).
  • 6. 6/33 I) Cloud-Assisted Networks as an Alternative Connectivity Provider Kathiravelu, P., Chiesa, M., Marcos, P., Canini, M., Veiga, L. Moving Bits with a Fleet of Shared Virtual Routers. In IFIP Networking 2018. (CORE Rank A). May 2018. pp. 370 – 378.
  • 7. 7/33 Introduction ● Increasing demand for bandwidth. ● Decreasing bandwidth prices. ● Pricing Disparity. E.g. IP Transit Price, 2014 (per Mbps) – USA: 0.94 $ – Kazakhstan: 15 $ – Uzbekistan: 347 $ ● What about latency? – Online gaming. – High-frequency trading. – Remote surgery.
  • 8. 8/33 Motivation ● Cloud providers often have a dedicated connectivity* . – Increasing number of regions and points of presence. – Well provisioned and maintained network. ● Can a network overlay over cloud instances be used as an alternative connectivity provider? – High-performance. – Cost-effectiveness. – Optional network services. * James Hamilton, VP, AWS (AWS re:invent 2016).
  • 9. 9/33 Cloud-Assisted Networks ● Virtual/overlay networks over cloud environments
  • 10. 10/33 ● Better control over the path, compared to the Internet paths. Our Proposal: NetUber ● A third-party virtual connectivity provider with no fixed infrastructure. – A cloud-assisted overlay network, leveraging multi-cloud infrastructures.
  • 11. 11/33 Better Alternative to SaaS Replication ● Deploy Software-as-a-Service (SaaS) applications in just one or a few regions. – Use NetUber to access them from other regions. ● Access to more regions via multiple cloud providers. – Ohio (AWS, but not GCP); London (both AWS and GCP); Belgium (GCP, but not AWS).
  • 12. 12/33 Monetary Costs to Operate NetUber A. Cost of Cloud Instances. – Charged per second. – Very high. [Spot instances: volatile, but up to 90% savings.] B. Cost of Bandwidth. – Charged per data transferred. – Also very high. [No cheaper alternative.] C. Cost to connect to the cloud provider. – Often managed by the cloud provider. E.g., AWS Direct Connect. – Typically, the end user pays directly to the cloud provider.
  • 13. 13/33 Evaluation ● Cheaper point-to-point connectivity. – AWS as the overlay cloud provider. – Compared against a transit provider and another connectivity provider with a large global backbone network. ● Better throughput or Reduced Latency. – Compared to ISPs. – Traffic sent from: RIPE Atlas Probes and distributed servers. – Destination: AWS distributed servers from the AWS regions. – ISPs vs. ISP to the nearest AWS region and then NetUber overlay. ● Network Services: Compression, Encryption, ..
  • 14. 14/33 1) Cheaper Point-to-Point Connectivity ● Expense for 10 Gbps flat connectivity – Measured for transfers from EU and USA. – Cheaper for data transfers <50 TB/month.
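The break-even behind the "< 50 TB/month" claim can be sketched as a simple cost comparison. All prices below (egress per GB, instance rate, flat fee) are illustrative placeholders, not the figures measured in the paper:

```python
# Sketch of the cost crossover: a cloud-assisted overlay pays per GB plus
# instance uptime, while dedicated 10 Gbps connectivity is a flat fee.
# All numbers are made-up assumptions for illustration only.

def overlay_cost(tb_per_month, egress_per_gb=0.02, instance_hourly=0.5, hours=730):
    """Overlay cost: per-GB egress plus the relay instances' uptime."""
    return tb_per_month * 1024 * egress_per_gb + instance_hourly * hours

def flat_pipe_cost(monthly_fee=2000.0):
    """Dedicated connectivity: a flat monthly fee regardless of volume."""
    return monthly_fee

def cheaper_with_overlay(tb_per_month):
    """True when the overlay undercuts the flat pipe for this volume."""
    return overlay_cost(tb_per_month) < flat_pipe_cost()
```

With these placeholder prices, low monthly volumes favor the overlay and high volumes favor the flat pipe, mirroring the slide's crossover around a volume threshold.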
  • 15. 15/33 2) Improve Latency with Cloud Routes ● Instead of sending traffic A → Z, can we send A → B → Z? ○ A is closer to B. B and Z are servers in cloud regions. ○ B and Z are connected by NetUber overlay.
  • 16. 16/33 Ping times: ISP vs. NetUber (via region, % Improvement) ● NetUber cuts Internet latencies up to a factor of 30%. ● The use of AWS Direct Connect would make this even faster.
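The A → B → Z relay idea above reduces to choosing the cloud region B that minimizes first-hop latency plus overlay latency, falling back to the direct Internet path when no relay helps. A minimal sketch (region names and latencies are hypothetical):

```python
def best_relay(direct_ms, to_region_ms, overlay_ms):
    """Pick the relay region B minimizing latency(A->B) + latency(B->Z over
    the overlay); keep the direct Internet path if nothing beats it.
    Inputs are hypothetical round-trip times in milliseconds."""
    best = ("direct", direct_ms)
    for region, first_hop in to_region_ms.items():
        total = first_hop + overlay_ms[region]
        if total < best[1]:
            best = (region, total)
    return best
```

For example, a 120 ms direct path can lose to a 15 ms hop into a nearby region followed by an 80 ms overlay leg, which is the kind of improvement the ping-time comparison above reports.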
  • 17. 17/33 Key Findings ● Previous research focuses on the technical side. – Not on the economic aspects - more expensive. ● Industrial efforts on leveraging cloud or data center infrastructure to offer connectivity. – Teridion - Internet fast lanes for SaaS providers. – Voxility - An alternative to transit providers. ● NetUber - A cheaper alternative (< 50 TB/month). – A connectivity provider that does not own the infrastructure. – “Internet Fast-routes” through cloud-assisted networks. – Better than ISPs (< 100 Mbps, often with a cap) for end users.
  • 18. 18/33 II) Composing Network Service Chains at the Edge: A Resilient and Adaptive Software-Defined Approach Kathiravelu, P., Van Roy, P., & Veiga, L. In Transactions on Emerging Telecommunications Technologies (ETT). (JCR IF: 1.535, Q2). 2018. Wiley. Accepted for publication.
  • 19. 19/33 Motivation ● Increasingly, network services placed at the edge. – Limitations in hosting all the network services on-premise. – Closer to the users than centralized clouds. ● Network Service Chaining (NSC) – Finding the optimal service chain for a user request. – Service Level Objectives of the service chain users.
  • 20. 20/33 Our Proposal: Évora ● A graph-based algorithm to incrementally construct and deploy service chains at the edge. ● An Orchestrator in the user device, to place and migrate service chains, adhering to the user policies. ● An architecture extending SDN to wide area to efficiently support the service chains at the edge.
  • 21. 21/33 Évora Approach ● Initialize once per user device: – Step 1) Construct a service graph. ● Initialize once per a user’s service chain. – Step 2) Find matching subgraphs for the user’s service chain as partial, potential chains. – Step 3) Complete matches → Potential Chains. ● Initialize once per <nsc, policy> pair – Step 4) Service chain placement at the best fit among the possible chains, based on a user-defined policy. ● Execute the service chain.
  • 22. 22/33 1) Initialize the orchestrator (Once per device) ● Construct a service graph in the user device. ― As a snapshot of the available service instances at the edge.
  • 23. 23/33 2) Initialize Service Chain (Once per chain) ● Construct matching subgraphs as potential chains. – while noting the individual service properties. ● Incrementally calculate a “penalty value” for each potential chain that is being constructed. – with user-given weights to the properties. ● monthly cost (C), throughput (T), end-to-end latency (L), ..
  • 24. 24/33 3) Complete matches → Potential Service Chain Placements ● Ability to place the entire service chain in the matching subgraph. – Complete matching subgraph, i.e. a potential service chain placement is found. ● Record. ● Stop procedure once all the nodes are traversed. ● Subsequent NSC executions require no initialization.
  • 25. 25/33 4) The Service Chain Placement ● Penalty function, with normalized values of C, L, and T. – α,β,γ ← Non-negative integers specified by user. ● Solve this as a Mixed Integer Linear Problem. ● The penalty function can be extended with powers. ● Place the current NSC (<nsc,policy> pair) on the service composition with minimal penalty value. ● Possible updates and migrations. – Future service unavailability → choose the next.
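The placement step above can be illustrated with one plausible form of the penalty function: lower cost and latency are better, and higher throughput is better, so throughput enters inverted. The slides solve this as a MILP; the sketch below simply enumerates candidate placements, and the chain values and weights are made up:

```python
def penalty(chain, alpha, beta, gamma):
    """One plausible penalty over normalized C, L, T in [0, 1]:
    cost and latency add to the penalty, throughput subtracts (inverted).
    alpha, beta, gamma are the user-given non-negative weights."""
    return alpha * chain["C"] + beta * chain["L"] + gamma * (1 - chain["T"])

def place(chains, alpha, beta, gamma):
    """Place on the candidate chain with the minimal penalty
    (exhaustive enumeration standing in for the MILP solver)."""
    return min(chains, key=lambda c: penalty(c, alpha, beta, gamma))

# Two hypothetical candidate placements with normalized properties.
candidates = [
    {"id": "a", "C": 0.2, "L": 0.9, "T": 0.5},  # cheap but slow
    {"id": "b", "C": 0.6, "L": 0.3, "T": 0.8},  # pricier but fast
]
```

Shifting the weights moves the choice: a cost-heavy policy (large α) prefers the cheap chain, while a latency-heavy policy (large β) prefers the fast one, which is exactly the behavior evaluated in the two- and three-attribute policy experiments.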
  • 26. 26/33 Solution Architecture ● Extending SDN to a multi-domain edge environment. – With Message-Oriented Middleware (MOM).
  • 27. 27/33 Evaluation ● Microbenchmark how user policies are satisfied with Évora for service chains among various alternatives. – Algorithm effectiveness in satisfying user policies. – Efficacy: closeness to optimal results. ● Minimizing the penalty function results in improved quality of experience.
  • 28. 28/33 User policies with two attributes ● Location of the circles → Properties (C, L, and T). ● Darker circles – chains with minimal penalty, the ones that we prefer (circled). T ↑ and C ↓ T ↑ and L ↓ C ↓ and L ↓ ● Results show user policies supported fairly well.
  • 29. 29/33 ● Policies with three attributes: One given more prominence (weight = 10), than the other two (weight = 3). ● Results show efficient support for multiple attributes with different weights. Radius of the circles – Monthly Cost
  • 30. 30/33 Key Findings ● More and more services hosted at the edge. ● NSCs have more constraints than stand-alone VNFs. ● Évora supports efficient chaining of network services. – Leveraging a software-defined approach for services ● Extending SDN with MOM.
  • 31. 31/33 1) Software-Defined Cyber-Physical Systems (CPS) workflows at the edge ● Can we tackle some design, operational, and scalability challenges of CPS? – By representing them as software-defined service compositions at the edge? III) Ongoing work SDS’17, M4IoT’15, and CLUSTER (Invited from SDS’17. Under review).
  • 32. 32/33 2) A Service-Oriented Workflow for Big Data Research at the Edge ● Analyse decentralized big data (TB-scale) with a service based data access and virtual integration approach. – Addressing data related optimizations as service chains. ● Data cleaning, incremental data integration, and data analysis. CoopIS’15, SDS’18, and DAPD (Distributed and Parallel Databases. Invited from DMAH’17. Under Review).
  • 33. 33/33 Thank you! pradeeban.kathiravelu@tecnico.ulisboa.pt Acknowledgements: Prof. Marco Canini (KAUST) Prof. Ashish Sharma (Emory) Prof. Helena Galhardas (IST) Prof. Tihana Galinac Grbac (URijeka) Prof. Marco Chiesa (KTH) Ed Warnicke (Cisco)
  • 39. 39/33 Introduction ● Network architectures and algorithms simulated or emulated at early stages of development. ● SDN is expanding in its scope. – Programmable networks → continuous development. – Native integration of network emulators into SDN controllers.
  • 40. 40/33 How well do the SDN simulators fare? ● Network simulators supporting SDN and emulation capabilities. – NS-3. ● Cloud simulators extended for cloud networks with SDN. – CloudSim → CloudSimSDN. However.. ● Lack of “SDN-Native” network simulators. – Simulators not following the Software-Defined Systems paradigm. – Policy/algorithmic code locked in simulator-imperative code. ● Need for easy migration and programmability.
  • 41. 41/33 Goals ● A simulator for SDN Systems. ● Extend and leverage the SDN controllers in cloud network simulations. – Bring the benefits of SDN to its own simulations! ● Reusability, Scalability, Easy migration, . . . – Run the control plane code in the actual controller (portability). – Simulate the data plane (scalability, resource efficiency). ● by programmatically invoking the southbound of SDN controller.
  • 42. 42/33 Our Proposal: Software-Defined Simulations ● Separation of control plane and (simulated) data plane. ● Integration with SDN controllers.
  • 43. 43/33 SDNSim: A Framework for Software-Defined Simulations. ● Network system to be simulated. – Expressed in “descriptors”. ● XML-based description language. – Parsed and executed in SDNSim simulation sandbox. ● A Java middleware. ● Simulated application logic. – Deployed into controller.
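The slides do not show the actual schema of SDNSim's XML-based description language, but the descriptor-driven flow can be sketched with a hypothetical topology descriptor parsed before handing the data plane to the simulation sandbox. Element names (`topology`, `host`, `switch`, `link`) are assumptions for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical shape of a topology descriptor; the real SDNSim schema
# is not given in the slides, so these element names are assumed.
DESCRIPTOR = """
<topology>
  <host id="h1"/>
  <host id="h2"/>
  <switch id="s1"/>
  <link src="h1" dst="s1" bandwidth="10"/>
  <link src="s1" dst="h2" bandwidth="10"/>
</topology>
"""

def parse_descriptor(xml_text):
    """Extract the simulated nodes and links from a descriptor,
    as a sandbox would before building the simulated data plane."""
    root = ET.fromstring(xml_text)
    nodes = [e.get("id") for e in root if e.tag in ("host", "switch")]
    links = [(e.get("src"), e.get("dst")) for e in root.iter("link")]
    return nodes, links
```

The parsed data plane stays inside the simulator, while the application logic runs unchanged in the real controller, which is the separation the architecture above relies on.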
  • 44. 44/33 Contributions and SDNSim Approach 1. Reusable simulation building blocks. ● Simulating complex and large-scale SDN systems. – Network Service Chaining (NSC).
  • 45. 45/33 1. Reusable simulation building blocks. ● Simulating complex and large-scale SDN systems. – Network Service Chaining (NSC). – As a case of Network Function Virtualization (NFV).
  • 46. 46/33 2. Support for continuous development and iterative deployment. ● Checkpointing and versioning of simulated application logic. – Incremental updates: changesets as OSGi bundles in the control plane.
  • 47. 47/33 3. State-aware simulations. ● Adaptive scaling through shared state. – Horizontal scalability through In-Memory Data Grids. – State of the simulations for scaling decisions. ● Pause-and-resume simulations. – Multi-tenanted parallel executions.
  • 48. 48/33 4. Expressiveness. ● Data plane: XML-based network representation. ● Control plane: Java API.
  • 49. 49/33 Prototype Implementation ● Oracle Java 1.8.0 - Development language. ● Apache Maven 3.1.1 - Build the bundles and execute the scripts. ● Infinispan 7.2.0.Final - Distributed cluster. ● Apache Karaf 3.0.3 - OSGi run time. ● OpenDaylight Beryllium - Default controller. ● Multiple deployment options: – As a stand-alone simulator. – Distributed execution with an SDN controller. – As a bundle in an OSGi-based SDN controller.
  • 50. 50/33 Evaluation Deployment Configurations ● Intel Core™ i7-4700MQ – CPU @ 2.40GHz, 8 logical processors. – 8 GB memory. – Ubuntu 14.04 LTS 64-bit operating system. ● A cluster of up to 5 identical computers.
  • 51. 51/33 Evaluation Strategy ● Benchmark against CloudSimSDN. – Cloud2Sim for distributed execution. ● Simulating routing algorithms in fat-tree topology. ● Experiments repeated 6 times. ● Data center simulations of up to 100,000 nodes.
  • 52. 52/33 Performance and Problem Size ● SDNSim yields higher performance for larger simulations.
  • 53. 53/33 Horizontal Scalability ● Smart scale-out. ● Higher horizontal scalability.
  • 54. 54/33 Performance with Incremental Updates ● Smaller simulations: up to 1000 nodes. ● SDNSim: controller and middleware execution completion time.
  • 55. 55/33 Performance with Incremental Updates ● Initial execution takes longer - Initializations.
  • 56. 56/33 Performance with Incremental Updates ● Faster executions once the system is initialized.
  • 57. 57/33 Incremental Updates: Test-driven development ● Faster executions once the system is initialized.
  • 58. 58/33 Incremental Updates: Test-driven development ● Even faster executions for subsequent simulations.
  • 59. 59/33 Incremental Updates: Test-driven development ● No change in simulated environment – Deploy changesets to controller.
  • 60. 60/33 Incremental Updates: Test-driven development ● No change in simulated environment - Revert changeset.
  • 61. 61/33 Performance with Incremental Scaling ● No change in controller - scale the simulated environment.
  • 62. 62/33 Network Construction with Mininet and SDNSim ● Adaptive Emulation and Simulation. – Simulate when resources are scarce for emulation.
  • 63. 63/33 Automated Code Migration: Simulation → Emulation ● Time taken to programmatically convert an SDNSim simulation script into a Mininet script.
  • 64. 64/33 Conclusions ● SDNSim is an SDN-aware network simulator. – Built following the SDN paradigm. ● Separation of the data layer from the control layer and application logic. – Enabling an incremental modelling of cloud networks. ● Performance and scalability. – Complex network system simulations. – Reuse the same controller code algorithm developers created to simulate much larger-scale deployments. – Adaptive parallel and distributed simulations. Future Work ● Extension points for easy migrations. – More emulator and controller integrations.
  • 65. 65/33 (1.2) *SENDIM* ● Simulation, Emulation, aNd Deployment Integration Middleware ● IC2E’16
  • 67. 67/33
  • 68. 68/33
  • 69. 69/33
  • 71. 71/33 NetUber Application Scenarios ● Cheaper transfers between two endpoints. ● Higher throughput or reduced latency. ● Better alternative to SaaS replication. ● Network services (compression, encryption, ..).
  • 72. 72/33 Scenario (1 of 4): Cheaper Transfers A) Cost of Cloud Instances: Observations ● 10 Gbps R4 instance (r4.8xlarge) pairs offered only a maximum of 1.2 Gbps of inter-region data transfer. – 10 Gbps only inside a placement group. ● We need more pairs of instances!
  • 73. 73/33 Spot Instances ● Cheaper (up to 90% savings), but volatile, instances. ● Price Fluctuations - Future price unpredictable (for EC2). ● Differing prices among availability zones of a region. – Buy from the cheapest availability zones at the moment. – Maintain instances in the cheap availability zones.
  • 74. 74/33 B) Cost of Bandwidth: Price disparity is real! Scenario (1 of 4): Cheaper Transfers Regions 1 - 9 (US, Canada, and EU) remain much cheaper than the others.
  • 75. 75/33 C) Cost to Connect to the Cloud Provider Scenario (1 of 4): Cheaper Transfers ● Connect the end user to the cloud servers. ● Often provided by the cloud provider. ● Example: AWS Direct Connect. ● Charged per port-hour (e.g., how many hours a 10 GbE port is used).
  • 76. 76/33 Scenario (2 of 4): Higher throughput or reduced latency ● Cloud-Assisted Point-to-Point Connectivity – Better control over the path, compared to the Internet paths. – Also cheaper than MPLS networks or transit providers. ● Thanks to spot instances.
  • 77. 77/33 Scenario (3 of 4): Better Alternative to SaaS Replication ● See slide 11.
  • 78. 78/33 Scenario (4 of 4): Network Services ● NetUber uses memory-optimized R4 spot instances. – Each instance with 244 GB memory, 32 vCPU, and 10 GbE interface. ● Possibility to deploy network services at the instances. ● Network services. – Value-added services for the customer. ● Encryption, WAN-Optimizer, load balancer, .. – Services for cost-efficiency. ● Compression.
  • 79. 79/33 Conclusion ● A connectivity provider that does not own the infrastructure. ● “Internet Fast-routes” through cloud-assisted networks. – Better than ISPs (~50 - 75 Mbps, often with a cap) for end-users. ● Cheaper point-to-point connectivity. – Cheaper than transit providers and similar offerings (for < 50 TB/month). ● Future work: – Evaluate NetUber for more parameters (loss rate, jitter, ..). – Evaluate the cost with more cloud providers and pairs of regions.
  • 81. 81/33 Introduction ● Cloud data centers consist of various tenants with multiple roles. ● Differentiated Quality of Service (QoS) in multi-tenant clouds. – Service Level Agreements (SLA). – Different priorities among tenant processes. ● Network is shared among the tenants. – End-to-end delivery guarantee despite congestion for critical flows.
  • 82. 82/33 SDN for Clouds ● Cross-layer optimization of clouds with SDN. – Centralized control plane of the network-as-a-service.
  • 83. 83/33 Motivation ● How to offer differentiated QoS and SLA in multi-tenant networks? – Application-level user preferences and system policies. – Performance guarantees at the network-level. – More potential in having them both! – SDN, Middleboxes, . . .
  • 84. 84/33 Goals ● How to offer differentiated QoS and SLA in multi- tenant networks? – Leverage SDN to offer a selective partial redundancy in network flows. – FlowTags - Software middlebox to tag the flows with contextual information. ● Application-level preferences to the network control plane as tags. ● Dynamic flow routing modifications based on the tags.
  • 85. 85/33 Goals ● How to offer differentiated QoS and SLA in multi- tenant networks? – Leverage SDN to offer a selective partial redundancy in network flows. – FlowTags - Software middlebox to tag the flows with contextual information. ● Application-level preferences to the network control plane as tags. ● Dynamic flow routing modifications based on the tags.
  • 86. 86/33 Our Proposal: SMART ● An SDN Middlebox Architecture for Reliable Transfers. ● An architectural enhancement for network flows allocation, routing, and control. ● Timely delivery of priority flows by dynamically diverting them to a less congested path. ● Cloning subflows of higher priority flows. ● An adaptive approach in cloning and diverting of the flows.
  • 87. 87/33 Contributions ● A cross-layer architecture ensuring differentiated QoS. ● A context-aware approach in load balancing the network. – Servers supporting multihoming, connected topologies, . . .
  • 88. 88/33 SMART Approach ● Divert and clone subflows by setting breakpoints in the flows in their route, to avert congestion. – Trade-off of minimal redundancy to ensure the SLA of priority flows. – Adaptive execution with contextual information on the network. ● Leverage FlowTags middlebox – to pass application-level system and user preferences to the network.
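The divert/clone trade-off described above can be sketched as a per-flow decision: only priority flows trigger an action, and the action escalates with observed congestion. The thresholds below are made-up knobs, not values from the paper:

```python
def smart_action(flow, congestion, divert_threshold=0.7, clone_threshold=0.9):
    """Illustrative decision logic only: divert priority flows off congested
    paths, and clone subflows when congestion nears the SLA danger zone.
    `congestion` is a normalized load estimate in [0, 1]; the thresholds
    are hypothetical, not the paper's tuned values."""
    if not flow["priority"]:
        return "route"          # non-priority flows take the base route
    if congestion >= clone_threshold:
        return "clone"          # redundant subflow on a disjoint path
    if congestion >= divert_threshold:
        return "divert"         # re-route to a less congested path
    return "route"
```

Escalating from divert to clone keeps the redundancy minimal, which matches the trade-off the approach makes between duplicate traffic and the SLA of priority flows.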
  • 89. 89/33 SMART Enhancements ● When to break and when to merge? – Clone destination.
  • 92. 92/33 I: Tag Generation for Priority Flows ● Tag generation query and response. – between the hosts and the FlowTags controller. ● A centralized controller for FlowTags. ● Tag the flows at the origin. ● FlowTagger software middlebox. – A generator of the tags. – Invoked by the host application layer. – Similar to the FlowTags-capable middleboxes for NATs
  • 93. 93/33 II: Regular routing until the policies (from the tags) are violated
  • 94. 94/33 III: When a threshold is met ● Controller is triggered through OpenFlow API. ● A series of control flows inside the control plane. ● Modify flow entries in the relevant switches.
  • 95. 95/33 SMART Control Flows: Rules Manager ● A software middlebox in the control plane. ● Consumes the tags from the packet. – Similar to FlowTags-capable firewalls.
  • 96. 96/33 Rules Manager Tags Consumption ● Interprets the tags – as input to the SMART Enhancer
  • 97. 97/33 SMART Enhancer ● Core of the SMART architecture. ● Gets the input to the enhancement algorithms. ● Decides the flow modifications. – Breakpoint node and packet. – Clone/divert decisions.
  • 98. 98/33 Prototype Implementation ● Developed in Oracle Java 1.8.0. ● OpenDaylight Beryllium as the core SDN controller. ● Enhancer and the Rules Manager middlebox as controller extensions. – Developed as OSGi bundles. – Deployed into Apache Karaf runtime of OpenDaylight. ● FlowTags middlebox controller deployed along the SDN controller. – FlowTags, originally a POX extension. ● Network nodes and flows emulated with Mininet. – Larger scale cloud deployments simulated.
  • 99. 99/33 Evaluation Strategy ● Data center network with 1024 nodes and a leaf-spine topology. – Path lengths of more than two hops. – Up to 100,000 short flows. ● Flow completion time < 1 s. ● A few non-priority elephant flows. – SLA → maximum permitted flow completion time for priority flows. – Uniformly randomized congestion. ● hitting a few uplinks of nodes concurrently. ● an overwhelming number of flows through the same nodes and links. ● Benchmark: SMART enhancements over base routing algorithms. – Performance (SLA awareness), redundancy, and overhead.
  • 100. 100/33 SMART Adaptive Clone/Replicate with Shortest-Path ● Replicate the subsequent flows once a previous flow was cloned.
  • 101. 101/33 SMART Adaptive Clone/Replicate with Equal-Cost Multi-Path (ECMP) ● Repeat the experiment with ECMP routing.
  • 102. 102/33 Related Work ● Multipath TCP (MPTCP) uses the available multiple paths between the nodes concurrently to route the flows across the nodes. – Performance, bandwidth utilization, and congestion control – through a distributed load balancing. ● ProgNET leverages WS-Agreement and SDN for SLA-aware cloud. ● pFabric for deadline-constrained data flows with minimal completion time. ● QJump linux traffic control module for latency- sensitive applications.
  • 103. 103/33 Conclusions ● SMART leverages redundancy in the flows as a means to improve the SLA of the priority flows. ● Opens an interesting research question leveraging SDN, middleboxes, and redundancy. – Cross-layer optimizations through tagging the flows. – For differentiated QoS. Future Work ● Implementation of SMART on a real data center network. ● Evaluate against the identified related work quantitatively.
  • 105. 105/33 Introduction ● eScience workflows – Computation-intensive. – Execute on highly distributed networks. ● Complex service compositions aggregating web services – To automate scientific and enterprise business processes.
  • 106. 106/33 Motivation ● Scalable Distributed Executions in wide area networks. – Better orchestration of service compositions. ● Multi-Tenant Environments. – Isolation Guarantees. – Differentiated Quality of Service (QoS). ● Increasing demand for geo-distribution (workflows and service compositions).
  • 107. 107/33 Contributions ● Support for, – Adaptive execution of scientific workflows. – Flexible service composition. – Reliable large-scale service composition. – Efficient selection of service instances.
  • 108. 108/33 Our Proposal: Mayan ● Extensible SDN approach for cloud-scale service composition. ● An approach driven by, – Loose coupling of service definitions and implementations. – Message-oriented Middleware (MOM). – Availability of a logically centralized control plane. ● Leveraging OpenDaylight SDN controller as the core. – Modular, as OSGi bundles. – Additional advanced features. ● State of executions and transactions stored in the controller distributed data tree. ● Clustered and federated deployments.
  • 109. 109/33 Software-Defined Service Composition: Services as the building blocks of Mayan
  • 113. 113/33 Service Composition Representation ● <Service3,(<Service1, Input1>, <Service2, Input2>)>
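The nested representation above can be read as a composition tree: Service3 consumes the outputs of Service1 (applied to Input1) and Service2 (applied to Input2). A minimal sketch, assuming hypothetical stand-in service implementations rather than Mayan's actual API:

```python
# Hypothetical sketch: a nested tuple (service, args) mirrors the
# <Service3, (<Service1, Input1>, <Service2, Input2>)> representation.
# The service functions below are illustrative stand-ins.

def evaluate(node, services):
    """Recursively evaluate a composition tree.
    A node is (service_name, args); each arg is either a plain input
    value or another (service_name, args) node."""
    name, args = node
    resolved = [evaluate(a, services) if isinstance(a, tuple) else a
                for a in args]
    return services[name](*resolved)

services = {
    "Service1": lambda x: x.upper(),        # stand-in implementation
    "Service2": lambda x: x[::-1],          # stand-in implementation
    "Service3": lambda a, b: a + "|" + b,   # aggregates the two outputs
}

composition = ("Service3", (("Service1", ("in1",)), ("Service2", ("in2",))))
print(evaluate(composition, services))  # IN1|2ni
```

The same recursive evaluation applies to arbitrarily deep compositions, since any input position may itself hold a nested composition.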
  • 116. 116/33 Connecting Services View with the Network View
  • 119. 119/33 Evaluation System Configurations ● Evaluation Approach: – Smaller physical deployments in a cluster. – Larger deployments as simulations and emulations (Mininet). ● Evaluated Deployment: – Service Composition Implementations. ● Web services frameworks. ● Apache Hadoop MapReduce. ● Hazelcast In-Memory Data Grid. – OpenDaylight SDN Controller.
  • 120. 120/33 Preliminary Assessments ● A workflow performing distributed data cleaning and consolidation. – A distributed web service composition. vs. – Mayan approach with the extended SDN architecture.
  • 121. 121/33 Speedup and Horizontal Scalability ● No negative scalability in larger distributions. ● 100% more positive scalability for larger deployments.
  • 122. 122/33 Throughput of the controller ● Measured as the number of messages entirely processed by the controller, arriving from the publishers to be forwarded towards a relevant receiver. ● 5000 messages/s at a concurrency of 10 million messages.
  • 123. 123/33 Processing Time ● Total time taken to process the complete set of messages at a Mayan controller, against the varying number of messages. ● The controller scaled linearly in processing time with the number of parallel messages. ● It processes 10 million messages in 40 minutes.
  • 124. 124/33 Scalability of the Mayan Controller ● The results presented are for a single stand-alone deployment of the controller. ● Mayan is designed as a federated deployment. – Scales horizontally to ● manage a wider area with a more substantial number of service nodes and improved latency. ● handle more concurrent messages in each controller domain.
  • 125. 125/33 Related Work ● MapReduce for efficient service compositions [SD 2014]. ● Palantir: SDN for MapReduce performance with the network proximity data [ZY 2014]. [SD 2014] Deng, Shuiguang, et al. "Top-k Automatic Service Composition: A Parallel Method for Large-Scale Service Sets." IEEE Transactions on Automation Science and Engineering 11.3 (2014): 891-905. [ZY 2014] Yu, Ze, et al. "Palantir: Reseizing Network Proximity in Large-Scale Distributed Computing Frameworks Using SDN." 2014 IEEE 7th International Conference on Cloud Computing (CLOUD). IEEE, 2014.
  • 126. 126/33 Conclusion ● SDN-based approach that enables large scale flexibility with performance – Components in eScience workflows as building blocks of a distributed platform. – Service composition with web services and distributed execution frameworks. – Multi-tenant and multi-domain executions.
  • 128. 128/33 Services ● A core element of the Internet ecosystem. ● Various types of Services – Web services and microservices ● key in modern cloud applications. – Network services / Virtual Network Functions ● firewall, load balancer, proxy, .. – Data services ● data cleaning, data integration, .. ● Interesting common research challenges: – Service placement. – Service instance selection. – Service composition or “service chaining”.
  • 129. 129/33 Why Service-Oriented Architectures for our systems? ● Beyond data center scale. – Thanks to the fact that services are standardized. ● SOA and RESTful reference architectures. – Multiple implementation approaches such as Message-Oriented Middleware. ● Service endpoints to hand over messages internally to the broker. ● Publish/subscribe to a message broker over the Internet. ● Flexibility, modularity, loose coupling, and adaptability.
  • 130. 130/33 Challenges in achieving Service Chaining at the Edge ● Dependencies among the network services. – Need to be accessible from each other. ● Service Level Objectives of the service chain users. – Latency, throughput, monthly cost, .. ● Finding the optimal service chain for a user request. – In general, an NP-hard problem.
  • 131. 131/33 Service Chain: s1 → s2 → s3 → s4 ● Goals – Services close to the user. – Services close to the following services in the chain. – Satisfying user Service Level Objectives!
  • 133. 133/33 Problem Scale: Representation of the service graph from the data center graph ● The number of links in this service graph grows – linearly with the number of edges or links between the edge nodes. – exponentially with the average number of services per edge node.
  • 134. 134/33 What has Message-Oriented Middleware got to do with the controller? ● Expose the internals from controller (e.g. OpenDaylight) – Through a message-based northbound API ● e.g. AMQP (Advanced Message Queuing Protocol). – Publish/Subscribe with a broker (e.g. ActiveMQ). ● What can be exposed – Data tree (internal data structures of the controller) – Remote procedure calls (RPCs) – Notifications. ● Thanks to Model-Driven Service Abstraction Layer (MD-SAL) of OpenDaylight. – Compatible internal representation of data plane. – Messaging4Transport Project.
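The publish/subscribe pattern described above can be sketched with a minimal in-process broker; a real deployment would use an AMQP broker such as ActiveMQ, with the controller (e.g. OpenDaylight via MD-SAL) publishing data-tree changes and notifications. Topic names and the event payload here are hypothetical:

```python
from collections import defaultdict

# Minimal in-process sketch of the publish/subscribe pattern; stands in
# for an AMQP broker (e.g. ActiveMQ) on the controller's northbound API.
class Broker:
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback to receive messages on a topic."""
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver a message to every subscriber of the topic."""
        for callback in self.subscribers[topic]:
            callback(message)

broker = Broker()
received = []
# A remote service subscribes to controller notifications.
broker.subscribe("controller/notifications", received.append)
# The controller publishes a (hypothetical) data-plane event.
broker.publish("controller/notifications", {"event": "node-added", "node": "s1"})
print(received)  # [{'event': 'node-added', 'node': 's1'}]
```

Exposing the controller's data tree, RPCs, and notifications as topics in this style lets services beyond the data center consume them without a direct connection to the controller.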
  • 135. 135/33 MILP and Graph Matching can be computation-intensive ● But initialization happens once per user service chain with a given policy. – This procedure does not repeat once initialized, – unless updates are received from the edge network. ● A new data center with the service offering at the edge. ● An existing data center or a service offering fails to respond. ● The number of services in each network service chain (NSC) is typically 5 – 10. – The Évora algorithm follows a greedy approach, rather than a typical graph matching.
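A greedy, hop-by-hop selection in the spirit described above can be sketched as follows; the candidate instances, latency figures, and the single-attribute cost model are hypothetical, not the actual Évora implementation:

```python
# Greedy sketch: for each service in the chain, pick the candidate
# instance closest (by latency) to the previously selected node,
# starting from the user's location. All names and values are made up.

def greedy_chain(chain, candidates, latency, user_node):
    placement, prev = [], user_node
    for service in chain:
        best = min(candidates[service], key=lambda node: latency[(prev, node)])
        placement.append((service, best))
        prev = best  # the next hop is chosen relative to this node
    return placement

candidates = {"s1": ["dc1", "dc2"], "s2": ["dc2", "dc3"]}
latency = {("user", "dc1"): 5, ("user", "dc2"): 20,
           ("dc1", "dc2"): 3, ("dc1", "dc3"): 15}

print(greedy_chain(["s1", "s2"], candidates, latency, "user"))
# [('s1', 'dc1'), ('s2', 'dc2')]
```

Unlike an exact MILP or graph matching, each hop costs only one pass over the candidates, which is why short chains (5 – 10 services) stay cheap to place.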
  • 136. 136/33 ● Two attributes are given more prominence (weight = 10) than the third (weight = 3). ● Results show efficient support for multiple attributes with different weights. Radius of the circles – monthly cost.
  • 137. 137/33 Performance and Scalability of Évora Orchestrator Algorithms
  • 140. 140/33 (6) *SD-CPS* ● Work-in-Progress ● SDS’17, M4IoT’15, and CLUSTER (Under Review)
  • 141. 141/33 (7) *Obidos* ● Work-in-Progress ● CoopIS’15, DMAH’17, and DAPD (Under Review)
  • 142. 142/33 (8) *SDDS* ● SDS’18 (Best Paper Award)
  • 143. Introduction ● Big data with increasing volume and variety. – Volume requires scalability. – Variety requires interoperability. ● Data Services – Services that access and process big data. – Unified web service interface to data → Interoperability! ● Chaining of data services. – Composing chains of numerous data services. – Data Access → Data Cleaning → Data Integration.
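The chaining of data services above (Data Access → Data Cleaning → Data Integration) can be sketched as a simple linear pipeline; the stage functions here are hypothetical stand-ins, not actual SDDS services:

```python
# Illustrative sketch of chaining data services into a pipeline:
# each stage's output becomes the next stage's input.

def chain(*stages):
    def run(data):
        for stage in stages:
            data = stage(data)
        return data
    return run

access = lambda _: ["  Alice ", "BOB", "  Alice "]       # data access (stub)
clean = lambda rows: [r.strip().title() for r in rows]   # data cleaning
integrate = lambda rows: sorted(set(rows))               # data integration

pipeline = chain(access, clean, integrate)
print(pipeline(None))  # ['Alice', 'Bob']
```

Because each stage exposes the same uniform interface (data in, data out), stages can be reordered or swapped without changing the composition logic, which is the interoperability argument the slide makes.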
  • 144. Problem Statement ● Data services offer interoperability. ● But when related data and services are distributed far from each other → Bad performance with scale. – How to scale out efficiently? ● How to minimize communication overheads?
  • 145. 145/33 Motivation ● Software-Defined Networking (SDN). – A unified controller to the data plane devices. – Brings network awareness to the applications. ● To make big data executions – Interoperable. – Network-aware.
  • 146. 146/33 Our Proposal: SDDS ● Can we bring SDN to the data services? ● Software-Defined Data Services (SDDS).
  • 147. 147/33 Contributions ● SDDS as a generic approach for data services. – Extending and leveraging SDN in the data centers. ● A software-defined framework for data services. – Efficient performance and management of data services. – Interoperability and scalability.
  • 148. 148/33 Solution Architecture ● A bottom-up approach, extending SDN. – Data Plane (SDN OpenFlow Switches) – Storage Plane (SQL and NoSQL data stores) – Control Plane (SDN Controller, In-Memory Data Grids (IMDGs), ..) – Execution Plane (Orchestrator and Web Service Engines)
  • 150. 150/33 SDDS Planes and Layered Architecture
  • 151. 151/33 SDDS Approach ● Define all the data operations as interoperable services. ● SDN for distributing data and service executions – Inside a data center (e.g. Software-Defined Data Centers). – Beyond data centers (extend SDN with Message-Oriented Middleware). ● Optimal placement of data and service execution. – Minimize communication overhead and data movements. ● Keep the related data and executions closer. ● Send the execution to data, rather than data to execution. – Execute data service on the best-fit server, until interrupted.
  • 152. 152/33 Efficient Data and Execution Placement
  • 153. 153/33 Efficient Data and Execution Placement ● {i, j} – related data objects. ● D – datasets of interest. ● n – execution node. ● Σ – spread of the related data objects.
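One possible reading of the placement criterion above, as a minimal sketch: pick the execution node n that minimizes the spread Σ, taken here as the total network distance from n to the nodes holding the related data objects {i, j} of the datasets of interest D. Node names and distances are hypothetical:

```python
# Sketch: choose the execution node minimizing the spread (total
# distance) to the nodes holding the related data objects. This is an
# illustrative interpretation, not the exact SDDS placement algorithm.

def best_execution_node(nodes, data_location, related_objects, distance):
    def spread(n):
        return sum(distance[(n, data_location[obj])] for obj in related_objects)
    return min(nodes, key=spread)

nodes = ["n1", "n2"]
data_location = {"i": "n1", "j": "n1", "k": "n2"}  # where each object lives
distance = {("n1", "n1"): 0, ("n1", "n2"): 10,
            ("n2", "n1"): 10, ("n2", "n2"): 0}

# Objects i and j sit on n1, only k on n2, so n1 minimizes the spread.
print(best_execution_node(nodes, data_location, ["i", "j", "k"], distance))  # n1
```

This captures the "send the execution to data" bullet of the SDDS approach: the execution follows wherever the related data objects cluster, rather than moving the data.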
  • 154. 154/33 Prototype Implementation ● Data services implemented with web service engines. – Apache Axis2 1.7.0 and Apache CXF 3.2.1. ● IMDG clusters – Hazelcast 3.9.2 and Infinispan 9.1.5. ● Persistent storage – MySQL Server and MongoDB. ● Core SDN Controller – OpenDaylight Beryllium.
  • 155. 155/33 Evaluation Environment ● A cluster of 6 servers. – AMD A10-8700P Radeon R6, 10 Compute Cores 4C+6G × 4. – 8 GB of memory. – Ubuntu 16.04 LTS 64 bit operating system. – 1 TB disk space.
  • 156. 156/33 Evaluation ● How does SDDS perform as a network-aware big data execution, compared to a network-agnostic execution? – SDDS vs data services on top of Infinispan IMDG. – A data storage and update service with an increasing volume of persistent data across the cluster, up to a total of 6 TB of data. ● Measured the throughput from the service plane – as the total amount of data processed through the data services per unit time.
  • 157. 157/33 Evaluation ● SDDS outperforms the baseline. – Better data locality ● by distributing data adhering to the network topology. – Better resource efficiency ● by avoiding scaling out prematurely. – Better throughput with minimal distribution when there is no need to utilize all 6 servers.
  • 158. 158/33 Related Work ● Software-Defined Systems. – Software-Defined Service Composition. – Software-Defined Cyber-Physical Systems and SDIoT. ● Industrial SDDS offerings. – Many of them are storage-focused. ● PureStorage, PrimaryIO, HPE, RedHat, .. – Many focus on specific data services. ● Containers and DevOps – Atlantix and Portworx. ● Data copying and sharing – IBM Spectrum Copy Data Management and Catalogic ECX. ● We are the first to propose a generic SDDS framework.
  • 159. 159/33 Conclusion Summary ● Software-Defined Data Services (SDDS) offer both interoperability and scalability to big data executions. ● SDDS leverages SDN in building a software-defined framework for network-aware executions. ● SDDS caters to data services and compositions of data services for an efficient execution. Future Work ● Extend SDDS for edge and IoT/CPS environments.