
RECAP Project Overview



This is a RECAP project overview slide deck prepared by Thang Le Duc (UMU), P-O Östberg (UMU) and Tomas Brännström (Tieto). It starts with an introduction and continues with a section on challenges for a self-orchestrated, self-remediated cloud system. It then presents the RECAP vision and use cases and finishes with a conclusion.

Published in: Software


  1. 1. Reliable Capacity Provisioning and Enhanced Remediation for Distributed Cloud Applications recap2020 THIS PROJECT HAS RECEIVED FUNDING FROM THE EUROPEAN UNION’S HORIZON 2020 RESEARCH AND INNOVATION PROGRAMME UNDER GRANT AGREEMENT NUMBER 732667 REliable CApacity Provisioning for Distributed Cloud/Edge/Fog Computing Applications (RECAP) Thang Le Duc @ Umeå University, P-O Östberg @ Umeå University, Tomas Brännström @ Tieto
  2. 2. Agenda • Introduction • Challenges for a Self-Orchestrated, Self-Remediated Cloud System • The RECAP Vision • Use Cases for RECAP System • Conclusion 2
  3. 3. Introduction 3
  4. 4. What is RECAP? • Reliable Capacity Provisioning and Enhanced Remediation for Distributed Cloud/Edge/Fog Applications • Funded under the European Horizon 2020 framework • Jan 2017 – Dec 2019 • Consortium comprises academic and industrial partners ⁃ 9 partners from 5 countries (Germany, Ireland, Spain, Sweden, UK) 4
  5. 5. Partner Location 5
  6. 6. Project Motivation • Large-scale systems are typically built as distributed systems • Tradeoffs in the placement of application components: ⁃ Data center: high latency, high power ⁃ Fog/Edge: low latency, low power • Cloud computing capacity is provisioned using best-effort models and coarse-grained QoS mechanisms ⁃ Not sustainable as the number of connected "things" increases 6
  7. 7. Realisation Approach (1/2) • An architecture for cloud-edge computing capacity provisioning and remediation ⁃ Fine-grained and accurate models of application behaviour and deployment ⁃ Model of QoS requirements at application component-level ⁃ Model of workloads • To understand and predict the behaviour of applications (users) • To enhance the proactive remediation of systems 7
  8. 8. Realisation Approach (2/2) 8
  9. 9. Challenges for a Self-Orchestrated, Self-Remediated Cloud System 9
  10. 10. Challenges • Resource Management • Data Science and Analytics • Intelligent Automation 10
  11. 11. Resource Management • Becomes more challenging due to the complexity and the large scale of cloud computing systems ⁃ Software-defined infrastructure: seamless distribution of components ⁃ Fully abstracted physical components ⁃ Virtualization technologies: high flexibility in resource management • Requires an understanding of diverse system factors ⁃ User behaviours ⁃ Workload characteristics and distribution ⁃ Interactions among components ⁃ Performance bottlenecks 11
  12. 12. Data Science and Analytics • Data Science and Analytics: collect, clean, integrate, analyze, visualize, and interact with data in order to construct structured knowledge bases • Understanding complex and large systems is challenging ⁃ Big data of diverse factors ⁃ Distributed and real-time-oriented analytical models/techniques ⁃ Machine learning techniques 12
  13. 13. Intelligent Automation • Intelligent Automation: constructs intelligence in order to automatically and proactively provide orchestration and remediation ⁃ Automation in management ⁃ Optimization of the system ⁃ QoS • Challenges come from ⁃ The neglect of QoS in shared resource allocation schemes in the cloud ⁃ The uncertainties related to QoS provisioning, automated management, and remediation in cloud systems 13
  14. 14. The RECAP Vision 14
  15. 15. Architecture • Feedback Loop ⁃ Collector ⁃ Application Modeler ⁃ Workload Modeler ⁃ Optimizer ⁃ Simulator 15
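The feedback loop on this slide can be pictured with a small sketch. It is not the RECAP implementation: the data flow follows the component slides (the Collector supplies data to the Modelers, the Modelers supply models to the Optimizer, the Simulator evaluates candidate deployments and reports back to the Collector), while every function name, the 100 requests/s per-replica capacity, and the 0.8 utilisation target are assumptions made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Collector:
    """Holds observed load samples and simulation results (illustrative only)."""
    samples: list = field(default_factory=list)
    simulation_results: list = field(default_factory=list)

    def ingest(self, new_samples):
        self.samples.extend(new_samples)


def model_application(samples):
    # Hypothetical application model: one replica serves 100 requests/s.
    return {"capacity_per_replica": 100.0}


def model_workload(samples, window=3):
    # Naive workload model: predict the mean of the most recent samples.
    recent = samples[-window:]
    if not recent:
        raise ValueError("no samples collected yet")
    return sum(recent) / len(recent)


def simulate(replicas, predicted_load, app_model):
    # Toy simulator: estimated utilisation of a candidate deployment.
    return predicted_load / (replicas * app_model["capacity_per_replica"])


def optimize(predicted_load, app_model, target_utilisation=0.8, max_replicas=10):
    # Smallest replica count whose simulated utilisation meets the target.
    for replicas in range(1, max_replicas + 1):
        if simulate(replicas, predicted_load, app_model) <= target_utilisation:
            return replicas
    return max_replicas


def run_iteration(collector, new_samples):
    """One pass through the loop: collect, model, optimize, simulate, feed back."""
    collector.ingest(new_samples)
    app_model = model_application(collector.samples)
    predicted = model_workload(collector.samples)
    replicas = optimize(predicted, app_model)
    utilisation = simulate(replicas, predicted, app_model)
    collector.simulation_results.append(utilisation)  # feedback to the Collector
    return replicas, utilisation
```

Calling `run_iteration(Collector(), [120, 150, 180])` walks through every component once; in a real system each stage would be a separate service rather than a function call.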
  16. 16. The RECAP Collector • Gathers, synthesizes and analyzes metrics to be monitored across the infrastructure • Acquires, characterizes, and analyzes data ⁃ Workload patterns and their relationships ⁃ Status of the infrastructure ⁃ … • Visualizes, annotates, archives, and manages the collected data 16 Knowledge Discovery
  17. 17. The RECAP Application Modeler • With the knowledge provided by the Collector, the Modeler discovers and defines ⁃ The internal structure of cloud applications ⁃ The QoS requirements for each application • To support intelligent decision making ⁃ Application/Component placement and autoscaling 17 Application Modeling
  18. 18. The RECAP Workload Modeler • Decomposes, classifies, and predicts the workloads in the network, and the load propagation in applications ⁃ CPU, memory, network traffic, … • Models the workload distribution and load propagation patterns • To improve planning decisions and ensure QoS • To support the construction of an artificial workload generation tool ⁃ Validating and training the learning models 18 Workload Modeling
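Load propagation in an application can be illustrated with a small sketch. It assumes, purely for illustration, that an application is a directed acyclic graph of components and that each edge carries a fixed fan-out factor (downstream requests generated per incoming request). The function name, the graph encoding, and the fan-out idea are hypothetical and not taken from the RECAP models.

```python
from collections import deque


def propagate_load(edges, entry_load):
    """Propagate request rates through a component graph.

    edges:      {source: {target: fanout}}, assumed to form a DAG
    entry_load: {component: external request rate arriving at it}
    Returns the total request rate seen by every component.
    """
    nodes = set(edges) | set(entry_load)
    for targets in edges.values():
        nodes |= set(targets)

    indegree = {n: 0 for n in nodes}
    for targets in edges.values():
        for target in targets:
            indegree[target] += 1

    load = {n: float(entry_load.get(n, 0.0)) for n in nodes}

    # Kahn's algorithm: visit a component only after all its upstream
    # components have pushed their load into it.
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for target, fanout in edges.get(node, {}).items():
            load[target] += load[node] * fanout
            indegree[target] -= 1
            if indegree[target] == 0:
                queue.append(target)

    if visited != len(nodes):
        raise ValueError("component graph contains a cycle")
    return load
```

For example, with `edges = {"frontend": {"api": 2.0}, "api": {"db": 1.5, "cache": 0.5}}` and 100 requests/s entering at `frontend`, the sketch attributes 200 requests/s to `api`, 300 to `db`, and 100 to `cache`.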
  19. 19. The RECAP Optimizer • With the models provided by the Modelers, the Optimizer performs optimization tasks ⁃ Application/Component placement: scaling vs. migration ⁃ Infrastructure management decisions: energy, utilization rate, load balancing • To improve the efficiency of resource utilization while maintaining QoS 19 Optimization
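A component placement decision of the kind named here can be sketched as a greedy heuristic. This is a stand-in chosen for brevity, not the optimizer RECAP uses: each component is placed on the lowest-power site that satisfies its latency bound and still has capacity. The `Site` and `Component` types, their fields, and all numbers in the usage note are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    latency_ms: float      # latency from users to this site
    power_per_unit: float  # relative power cost per unit of demand
    capacity: float        # units of demand the site can host


@dataclass
class Component:
    name: str
    demand: float          # units of capacity required
    max_latency_ms: float  # QoS requirement for this component


def place(components, sites):
    """Greedy placement: tightest latency bound first, cheapest feasible site."""
    remaining = {s.name: s.capacity for s in sites}
    placement = {}
    for comp in sorted(components, key=lambda c: c.max_latency_ms):
        feasible = [
            s for s in sites
            if s.latency_ms <= comp.max_latency_ms
            and remaining[s.name] >= comp.demand
        ]
        if not feasible:
            raise ValueError(f"no feasible site for component {comp.name!r}")
        best = min(feasible, key=lambda s: s.power_per_unit)
        placement[comp.name] = best.name
        remaining[best.name] -= comp.demand
    return placement
```

With a small edge site (low latency, low power, capacity 4) and a large data center (high latency, high power), latency-sensitive components fill the edge first and latency-tolerant ones overflow to the data center. A greedy pass like this gives no optimality guarantee; it only makes the latency, power, and capacity trade-off concrete.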
  20. 20. The RECAP Simulator • Simulation is needed due to the size and complexity of the intended target systems ⁃ Simulate interactions of distributed cloud application behaviors ⁃ Emulate data center and network systems • To assist the Optimizer with the evaluation of different deployments of applications and infrastructures • To feed simulation results back to the Data Collector 20 Testing & Improvement
  21. 21. Use Cases 21
  22. 22. Use Cases from Industry • BT ⁃ NFV, QoS Management and Remediation • Linknovate ⁃ Complex Big Data Analytics Engine • Satec ⁃ Fog Computing and Large Scale IoT Scenario for supporting Smart Cities • Tieto ⁃ Infrastructure and Network Management 22
  23. 23. BT 23 • NFV infrastructure and virtual CDNs ⁃ Softwarization of network appliances ⁃ Multiple distributed applications per CDN operator • RECAP: ⁃ Automated decision making • Optimization of placement and scaling of vCDN systems • Monitoring and Remediation ⁃ Improves resource utilization while guaranteeing SLAs
  24. 24. Linknovate • Big data analytics engine ⁃ Data acquisition, data aggregation, data processing, data visualization • RECAP: ⁃ Characterizes workload and models workload distribution ⁃ Automatically and dynamically allocates computing resources ⁃ To reduce costs and improve performance 24
  25. 25. Satec • IoT, smart city ⁃ Data management system: to collect sensory data, and to provide data (in a distributed manner) • RECAP: ⁃ Optimizes the placement of IoT resources, such as computation and storage, for cost and latency ⁃ Automated reallocation of resources ⁃ Automated deployment of IoT applications 25
  26. 26. Tieto 26 • Telecommunication infrastructure systems and applications • RECAP: ⁃ Automated profiling and simulation of the infrastructure and VNFs ⁃ QoS and low latency
  27. 27. Conclusion 27
  28. 28. Expected Results / Summary • Distributed and Efficient Data Collection on Heterogeneous Infrastructures • Data Science Analysis for Automated Infrastructure and Application Modelling • Intelligent Automation by Automated Cloud Infrastructure Optimisation • End-to-end Component-level Quality Assurance by Capacity Provisioning • Application and Infrastructure Simulation for Cloud Optimisation • System Observability by Visualizing Data Collection and Modelling Results • Improved Resource Utilisation and User Satisfaction 28