The document discusses power-efficient scheduling in the Linux kernel. It proposes moving power management capabilities from cpufreq and cpuidle into the scheduler to allow it to make more informed decisions. Key points include:
- The scheduler currently lacks power/energy information to optimize task placement.
- Cpufreq and cpuidle are not well coordinated with the scheduler.
- A power driver would provide power/topology data to the scheduler.
- Feedback from a kernel summit highlighted the need for use cases and benchmarks to evaluate proposals.
- Patches have been prepared to implement task placement based on CPU suitability.
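The task-placement idea above can be sketched as a cost comparison: the scheduler, given power data from a driver, picks the CPU whose estimated energy increase is smallest. The CPU descriptions and the linear cost model below are hypothetical illustrations, not the kernel's actual interfaces.

```python
# Hypothetical sketch: the field names and the linear cost model are
# illustrative, not the kernel's actual data structures.
def place_task(task_load, cpus):
    """Pick the CPU whose estimated energy increase is smallest."""
    def energy_delta(cpu):
        new_util = cpu["util"] + task_load
        if new_util > cpu["capacity"]:
            return float("inf")  # CPU would be overloaded; not a candidate
        # Energy grows with added utilization, scaled by per-CPU power cost.
        return (new_util - cpu["util"]) * cpu["cost_per_util"]
    return min(cpus, key=energy_delta)

cpus = [
    {"name": "little0", "util": 0.2, "capacity": 0.6, "cost_per_util": 1.0},
    {"name": "big0", "util": 0.1, "capacity": 1.0, "cost_per_util": 3.0},
]
print(place_task(0.3, cpus)["name"])  # little CPU wins while it has headroom
```

A heavier task that would overload the little CPU falls through to the big one, which is the kind of suitability decision the patches aim at.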
LCU14-410: How to build an Energy Model for your SoC (Linaro)
LCU14-410: How to build an Energy Model for your SoC
---------------------------------------------------
Speaker: Morten Rasmussen
Date: September 18, 2014
---------------------------------------------------
★ Session Summary ★
- ARM to provide a quick overview of the current energy model
- Introduce the methodology/recipe used to build the energy model
- Discuss ways in which the model is used today and intended next steps
- Key outcomes:
  - Describe the
  - Identify gaps and limitations
Summary of EAS workshop (Amit)
- Summary of hacking sessions - plan to integrate Qualcomm-ARM-Linaro work to send upstream
- Key outcomes:
  - List of features and responsibilities
  - Dependencies between upstreaming of features, if any
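The energy-model methodology the session covers can be sketched as a per-CPU table of capacity states plus an idle power figure; estimating power means picking the shallowest performance state that serves the load. All numbers below are invented for illustration, not measurements for any SoC.

```python
# Illustrative sketch of a per-CPU energy model of the kind the session
# describes: one (capacity, power) entry per performance state, plus idle
# power. The values are made up, not measured numbers for any SoC.
CAP_STATES = [(0.4, 100), (0.7, 250), (1.0, 600)]  # (capacity, power in mW)
IDLE_POWER = 10

def estimated_power(util):
    """Estimated average power for a utilization in [0, 1]."""
    # Pick the shallowest performance state that can serve the utilization.
    cap, power = next((c, p) for c, p in CAP_STATES if c >= util)
    busy = util / cap  # fraction of time spent busy at that state
    return busy * power + (1 - busy) * IDLE_POWER
```

A scheduler holding such a table per scheduling group can compare placement candidates by predicted power rather than by load alone.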
---------------------------------------------------
★ Resources ★
Zerista: http://lcu14.zerista.com/event/member/137778
Google Event: https://plus.google.com/u/0/events/ck3ti7eurknnsq0a4e9ks5a1sbs
Video: https://www.youtube.com/watch?v=JfZt8W3NVgk&list=UUIVqQKxCyQLJS6xvSmfndLA
Etherpad: http://pad.linaro.org/p/lcu14-410
---------------------------------------------------
★ Event Details ★
Linaro Connect USA - #LCU14
September 15-19th, 2014
Hyatt Regency San Francisco Airport
---------------------------------------------------
http://www.linaro.org
http://connect.linaro.org
A Dynamic Programming Approach to Energy-Efficient Scheduling on Multi-FPGA b... (TELKOMNIKA JOURNAL)
This paper studies an important issue: energy-efficient scheduling on multi-FPGA systems. The main challenges are integral allocation, reconfiguration overhead and exclusiveness, and energy minimization under a deadline constraint. To tackle these challenges, we have designed and implemented an energy-efficient scheduler for multi-FPGA systems based on the theory of dynamic programming. In addition, we present an MLPF algorithm for task placement on FPGAs. The experimental results demonstrate that the proposed algorithm accommodates all tasks without violating the deadline constraint. Additionally, it achieves 13.3% and 26.3% higher energy reduction than Particle Swarm Optimization and a fully balanced algorithm, respectively.
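Deadline-constrained energy minimization lends itself to a small dynamic program: choose one option per task (each with its own time and energy) so that total time meets the deadline and total energy is minimal. The task options below are invented, and the paper's MLPF placement step is not reproduced here.

```python
# Toy dynamic program: each task has several options with different
# (time, energy); pick one option per task to minimize total energy
# subject to a deadline. Values are illustrative.
def min_energy(tasks, deadline):
    INF = float("inf")
    best = [INF] * (deadline + 1)  # best[t] = min energy with total time t
    best[0] = 0.0
    for options in tasks:          # options: list of (time, energy) tuples
        nxt = [INF] * (deadline + 1)
        for t, e in enumerate(best):
            if e == INF:
                continue
            for dt, de in options:
                if t + dt <= deadline and e + de < nxt[t + dt]:
                    nxt[t + dt] = e + de
        best = nxt
    return min(best)

tasks = [[(2, 5.0), (1, 9.0)], [(3, 4.0), (2, 7.0)]]
```

With a deadline of 4 the program picks the slow, cheap option for the first task and the fast, costly option for the second (total energy 12.0); relaxing the deadline to 5 lets both tasks take the cheap option (9.0).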
Profit based unit commitment for GENCOs using Parallel PSO in a distributed c... (IDES Editor)
In the deregulated electricity market, each generating company has to maximize its own profit by committing a suitable generation schedule, a problem termed profit-based unit commitment (PBUC). This article proposes a Parallel Particle Swarm Optimization (PPSO) solution to the PBUC problem. The method has better convergence characteristics in obtaining the optimum solution. The proposed approach uses a cluster of computers performing parallel operations in a distributed environment to obtain the PBUC solution. The time complexity and the solution quality with respect to the number of processors in the cluster are thoroughly tested. The method has been applied to a 10-unit system, and the results show that the proposed PPSO on a distributed cluster consistently outperforms the other methods available in the literature.
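For reference, a minimal serial PSO in textbook form (the paper distributes the particle updates across a cluster of processors); the inertia and acceleration coefficients are generic values, not the authors' settings.

```python
import random

# Minimal serial PSO sketch; the paper parallelizes the per-particle
# updates across a distributed cluster. Parameters are textbook values.
def pso(cost, dim, iters=200, n=20, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n)]
    vel = [[0.0] * dim for _ in range(n)]
    pbest = [p[:] for p in pos]               # each particle's best position
    gbest = min(pbest, key=cost)[:]           # swarm's best position
    for _ in range(iters):
        for i in range(n):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                             + 1.5 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            if cost(pos[i]) < cost(pbest[i]):
                pbest[i] = pos[i][:]
                if cost(pbest[i]) < cost(gbest):
                    gbest = pbest[i][:]
    return gbest

best = pso(lambda x: sum(v * v for v in x), dim=2)
```

In PBUC the cost function would encode negative profit for a candidate commitment schedule; here a simple quadratic stands in for it.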
(Slides) Task scheduling algorithm for multicore processor system for minimiz... (Naoki Shibata)
Shohei Gotoda, Naoki Shibata and Minoru Ito : "Task scheduling algorithm for multicore processor system for minimizing recovery time in case of single node fault," Proceedings of IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2012), pp.260-267, DOI:10.1109/CCGrid.2012.23, May 15, 2012.
In this paper, we propose a task scheduling algorithm for a multicore processor system which reduces the recovery time in case of a single fail-stop failure of a multicore processor. Many recently developed processors have multiple cores on a single die, so one failure of a computing node results in the failure of many processors. In the case of a failure of a multicore processor, all tasks which have been executed on the failed processor have to be recovered at once. The proposed algorithm is based on an existing checkpointing technique, and we assume that state is saved when nodes send results to the next node. If a series of computations that depends on former results is executed on a single die, all parts of that series must be executed again if the processor fails. The proposed scheduling algorithm therefore tries not to concentrate tasks on the processors of a single die. We designed our algorithm as a parallel algorithm that achieves O(n) speedup, where n is the number of processors. We evaluated our method using simulations and experiments with four PCs. Compared with an existing scheduling method, in simulation the execution time including recovery time in the case of a node failure is reduced by up to 50%, while the overhead in the case of no failure was a few percent in typical scenarios.
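The scheduling insight can be shown with a toy model, assuming (as the paper does) that state is checkpointed whenever a result crosses to another node: only the tasks on the failed die must re-run. Task times and placements below are invented.

```python
# Toy model of the paper's idea: if a chain of dependent tasks all sits on
# one die, a die failure forces re-running the whole chain; spreading the
# chain bounds the loss to the tasks on the failed die, because results
# that crossed dies were checkpointed. Numbers are illustrative.
def recovery_time(task_times, die_of, failed_die):
    # Work lost = tasks on the failed die; everything else was saved when
    # its result left the die.
    return sum(t for t, d in zip(task_times, die_of) if d == failed_die)

chain = [3, 3, 3, 3]               # four dependent tasks, 3 s each
concentrated = [0, 0, 0, 0]        # all on die 0
spread = [0, 1, 0, 1]              # alternate between two dies
print(recovery_time(chain, concentrated, 0))  # 12: whole chain re-runs
print(recovery_time(chain, spread, 0))        # 6: only half is lost
```

The real algorithm trades this recovery benefit against the communication cost of crossing dies, which this sketch ignores.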
With the rise of containerization, as well as the established adoption of virtualization technologies, run-time power and energy management is becoming one of the key challenges in modern cloud computing. This is also fundamental because power consumption contributes about 20% of the Total Cost of Ownership of a datacenter, and energy costs will exceed hardware costs in the near future. In this context, several goals towards power optimization can be pursued. On the one hand, power capping can be enforced, and on top of that the system should maximize performance. On the other hand, when performance is critical, the system should provide a minimum SLA and optimize power consumption without violating it. Within this context, we propose a common autonomic methodology based on the ODA control loop for containers and virtual machines. The proposed methodology achieves 25% power savings for containers and can improve performance under a power cap for virtual machines.
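The ODA (Observe-Decide-Act) loop can be sketched as a simple proportional controller driving a CPU-quota knob toward a power cap; the linear plant model, gain, and wattage below are stand-ins for a real system, not the paper's implementation.

```python
# Sketch of an Observe-Decide-Act loop enforcing a power cap by adjusting
# a CPU-quota knob. The plant model (power proportional to quota) is an
# invented stand-in for a real measured system.
def oda_step(power, cap, quota, gain=0.002):
    error = cap - power                  # observe: distance from the cap
    quota += gain * error                # decide: proportional adjustment
    return max(0.1, min(1.0, quota))     # act: clamp to the valid range

def simulate(cap, steps=50):
    quota = 1.0
    for _ in range(steps):
        power = 200 * quota              # fake plant: full quota draws 200 W
        quota = oda_step(power, cap, quota)
    return 200 * quota

print(round(simulate(cap=120)))
```

The loop converges to the quota whose modeled power equals the cap; a real controller would observe power from hardware counters instead of a linear model.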
Improvement of Scheduling Granularity for Deadline Scheduler (Yoshitake Kobayashi)
(Embedded Linux Conference Europe 2012)
https://github.com/ystk/sched-deadline/tree/dlmiss-detection-dev
Real-time systems need to meet deadlines. From this point of view, a system requires two functions to achieve determinism: interrupt-latency stabilization and processing-time reservation. SCHED_DEADLINE has a feature to reserve CPU time in advance to ensure predictable behavior. In our evaluation, the granularity of CPU reservation is on the order of milliseconds. In this presentation, we show evaluation results for the current implementation to clarify the issue, then explain how we overcome it and present the results.
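For context, SCHED_DEADLINE reserves CPU time as a (runtime, period) budget, and the kernel refuses reservations that would exceed the available bandwidth (by default roughly 95% of a CPU). A simplified single-CPU version of that admission test:

```python
# Simplified single-CPU SCHED_DEADLINE admission check: the total
# bandwidth (sum of runtime/period) must stay under a limit, by default
# about 95%. The real kernel test spans all CPUs in the root domain.
def admit(tasks, new_task, limit=0.95):
    """tasks, new_task: (runtime_us, period_us) reservations."""
    bw = sum(r / p for r, p in tasks + [new_task])
    return bw <= limit

existing = [(10_000, 100_000), (30_000, 100_000)]   # 10% + 30% reserved
print(admit(existing, (50_000, 100_000)))  # 0.9 total -> admitted
print(admit(existing, (60_000, 100_000)))  # 1.0 total -> rejected
```

The granularity question the talk raises is about how finely that runtime budget can be enforced, which this bandwidth check does not capture.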
Optimization of Continuous Queries in Federated Database and Stream Processin... (Zbigniew Jerzak)
The constantly increasing number of connected devices and sensors results in increasing volume and velocity of sensor-based streaming data. Traditional approaches to processing high-velocity sensor data rely on stream processing engines. However, the increasing complexity of continuous queries executed on top of high-velocity data has created growing demand for federated systems composed of data stream processing engines and database engines. One of the major challenges for such systems is to devise the optimal query execution plan to maximize the throughput of continuous queries.
In this paper we present a general framework for federated database and stream processing systems, and introduce the design and implementation of a cost-based optimizer for optimizing relational continuous queries in such systems. Our optimizer uses characteristics of continuous queries and source data streams to devise an optimal placement for each operator of a continuous query. This fine level of optimization, combined with the estimation of the feasibility of query plans, allows our optimizer to devise query plans which result in 8 times higher throughput as compared to the baseline approach which uses only stream processing engines. Moreover, our experimental results showed that even for simple queries, a hybrid execution plan can result in 4 times and 1.6 times higher throughput than a pure stream processing engine plan and a pure database engine plan, respectively.
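The optimizer's core decision can be illustrated with a toy exhaustive search over operator placements; the operator names, engines, and cost figures below are invented, and the real optimizer also estimates plan feasibility and uses stream statistics rather than fixed costs.

```python
from itertools import product

# Toy version of the cost-based decision: assign each operator of a
# continuous query to either the stream engine or the database engine and
# keep the plan with the lowest total cost. Costs are invented.
def best_plan(operators, cost):
    """cost(op, engine) -> per-operator cost; returns (placement, total)."""
    engines = ("stream", "db")
    best = None
    for placement in product(engines, repeat=len(operators)):
        total = sum(cost(op, eng) for op, eng in zip(operators, placement))
        if best is None or total < best[1]:
            best = (placement, total)
    return best

costs = {("filter", "stream"): 1, ("filter", "db"): 3,
         ("join", "stream"): 5, ("join", "db"): 2}
plan, total = best_plan(["filter", "join"], lambda o, e: costs[(o, e)])
print(plan, total)
```

Even this tiny example yields a hybrid plan (filter on the stream engine, join on the database engine), mirroring the paper's finding that hybrid plans can beat single-engine ones.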
Adaptive Replication for Elastic Data Stream Processing (Zbigniew Jerzak)
A major challenge for cloud-based systems is to be fault tolerant so as to cope with the increasing probability of faults in cloud environments. This is especially true for in-memory computing solutions like data stream processing systems, where a single host failure might result in unrecoverable information loss.
In state of the art data streaming systems either active replication or upstream backup are applied to ensure fault tolerance, which have a high resource overhead or a high recovery time respectively. This paper combines these two fault tolerance mechanisms in one system to minimize the number of violations of a user-defined recovery time threshold and to reduce the overall resource consumption compared to active replication. The system switches for individual operators between both replication techniques dynamically based on the current workload characteristics. Our approach is implemented as an extension of an elastic data stream processing engine, which is able to reduce the number of used hosts due to the smaller replication overhead. Based on a real-world evaluation we show that our system is able to reduce the resource usage by up to 19% compared to an active replication scheme.
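The switching rule can be sketched as a threshold decision per operator; the recovery-time estimate below (state size over restore rate) is a deliberately simplified model, not the system's actual workload-based estimator.

```python
# Sketch of the per-operator switching rule: prefer cheap upstream backup
# unless its estimated recovery time would break the user's threshold, in
# which case use active replication. The estimate model is illustrative.
def choose_mechanism(state_size_mb, restore_mb_per_s, threshold_s):
    est_recovery = state_size_mb / restore_mb_per_s  # replay/restore time
    return "active" if est_recovery > threshold_s else "upstream_backup"

print(choose_mechanism(100, 50, threshold_s=5))   # 2 s estimate -> backup
print(choose_mechanism(1000, 50, threshold_s=5))  # 20 s estimate -> active
```

Re-evaluating this decision as workload characteristics change is what lets the system keep resource usage below always-on active replication.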
In a FACTS-based transmission line, if the fault does not involve the FACTS device, the impedance calculation is like that of an ordinary transmission line; when the fault involves the FACTS device, the impedance calculation accounts for the impedances it introduces.
Task Scheduling Algorithm for Multicore Processor Systems with Turbo Boost an... (Naoki Shibata)
Yosuke Wakisaka, Naoki Shibata, Keiichi Yasumoto, Minoru Ito, and Junji Kitamichi : Task Scheduling Algorithm for Multicore Processor Systems with Turbo Boost and Hyper-Threading, In Proc. of The 2014 International Conference on Parallel and Distributed Processing Techniques and Applications(PDPTA'14), pp. 229-235
In this paper, we propose a task scheduling algorithm for multiprocessor systems with Turbo Boost and Hyper-Threading technologies. The proposed algorithm minimizes the total computation time, taking into account dynamic changes of processing speed caused by the two technologies, in addition to network contention among the processors. We constructed a clock speed model with which the changes of processing speed under Turbo Boost and Hyper-Threading can be estimated for various processor usage patterns. We then constructed a new scheduling algorithm that minimizes the total execution time of a task graph, considering network contention and the two technologies. We evaluated the proposed algorithm by simulations and experiments with a multiprocessor system consisting of 4 PCs. In the experiments, the proposed algorithm produced a schedule that reduces the total execution time by 36% compared to conventional methods, which are straightforward extensions of an existing method.
Featuring a brief overview of fault-tolerance mechanisms across various Big Data systems such as the Google File System (GFS), Amazon Dynamo, Bigtable, Hadoop MapReduce, and Facebook Cassandra, along with a description of an existing fault-tolerance model.
A Study on Task Scheduling in Cloud Data Centers for Energy Efficacy (Ehsan Sharifi)
Abstract: The increasing energy consumption of Physical Machines (PMs) in cloud data centers is a major problem: it has a negative impact on the environment while at the same time increasing the operational costs of data centers. This fosters the development of more energy-efficient scheduling approaches. In this study, we examine the barriers of knowledge in energy efficiency for cloud data centers.
An introduction to MapReduce (MR) and Hadoop, and a view of the opportunities to use MR with databases, e.g., SQL-MapReduce by Teradata and in-database MR by Oracle.
The presentation was used during a class of Datenbanken Implementierungstechniken in 2013.
The Linux Scheduler: a Decade of Wasted Cores (yeokm1)
The talk I gave at Papers We Love #20 (Singapore) about this academic paper "The Linux Scheduler: a Decade of Wasted Cores" by a few researchers.
The video of this talk can be found here: https://engineers.sg/v/758
Here are some relevant links:
Paper: http://www.ece.ubc.ca/~sasha/papers/eurosys16-final29.pdf
Reference Slides: http://www.i3s.unice.fr/~jplozi/wastedcores/files/extended_talk.pdf
Reference summary: https://blog.acolyer.org/2016/04/26/the-linux-scheduler-a-decade-of-wasted-cores/
Generic and Meta-Transformations for Model Transformation Engineering (Daniel Varro)
"Although many model transformation approaches exist, their focus is almost exclusively put on functional correctness and intuitive description language while the importance of engineering issues such as reusability, maintainability, performance or compactness are neglected. To tackle these problems following the MDA philosophy, we argue in the paper that model transformations should also be regarded as models (i.e., as data). We demonstrate (i) how generic transformations can provide a very compact description of certain transformation problems and (ii) how meta-transformations can be designed that yield efficient transformations as their output model."
This is an extract from the abstract of our UML 2004 paper. In the current talk, I provide (1) a brief summary of the paper itself, and (2) an overview of research results on higher-order transformations in the past 10 years. This paper was also the first to present the VIATRA2 model transformation framework based on Eclipse, so a brief history of the tool itself will conclude the talk.
A brief overview of the Linux scheduler: context switching, priorities, and scheduling classes, as well as new features. Also provides an overview of preemption models in Linux and how to use each model. All the examples are taken from http://www.discoversdk.com
- What we mean by EAS core and how it's distinct from the other components - also why it's so difficult to get it merged. (This is driven by key partner concerns).
- An update on misc work that's underway to unblock upstreaming.
- Misc load balance pathway enhancements
- Wakeup pathway mods (cleanups, basic big.LITTLE capacity awareness etc)
- Periodic load balancer mods.
- Energy model expression (why this is important, partner perspectives/experience and bottlenecks)
- Proposals to get an expression into the mainline
- Optional boot-time auto-detection of capacity, overridable by sysfs
- Leveraging the merged power coefficient bindings
- Leveraging the OPP bindings
.. to effectively get to EAS' struct sched_group_energy.
- How we are structuring things to ease upstream acceptance. What's helping, what's not, where partners can help.
AFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORS (cscpconf)
Our main aim of research is to find the limit of Amdahl's Law for multicore processors, i.e., the number of cores that gives the best efficiency for the overall architecture of the CMP (Chip Multiprocessor, a.k.a. multicore processor). We expect this limit to lie either in the architecture of the multicore processor or in the programming. We surveyed the architectures of multicore processors from various chip manufacturers, namely Intel, AMD, and IBM, and the various techniques they follow to improve the performance of multicore processors. We conducted cluster experiments to find this limit. In this paper we propose an alternate design of a multicore processor based on the results of our cluster experiment.
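The limit the abstract chases follows from Amdahl's Law: for parallel fraction p, speedup is bounded by 1/(1-p) no matter how many cores are added. A worked example:

```python
# Amdahl's Law: speedup on n cores for a program whose parallel fraction
# is p. Speedup plateaus at 1 / (1 - p) as n grows.
def amdahl_speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

# With 95% parallel code, even 1024 cores deliver under 20x, because the
# asymptotic bound is 1 / (1 - 0.95) = 20.
print(round(amdahl_speedup(0.95, 1024), 2))
```

This is why the abstract looks for the efficiency limit in the architecture or the programming rather than in raw core count.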
Stop Worrying and Keep Querying, Using Automated Multi-Region Disaster Recovery (DoKC)
Stop Worrying and Keep Querying, Using Automated Multi-Region Disaster Recovery - Shivani Gupta, Elotl & Sergey Pronin, Percona
Disaster Recovery (DR) is critical for business continuity in the face of widespread outages taking down entire data centers or cloud provider regions. DR relies on deployment to multiple locations, data replication, monitoring for failure, and failover. The process is typically manual, involves several moving parts, and, even in the best case, incurs some downtime for end-users. A multi-cluster K8s control plane presents the opportunity to automate the DR setup as well as the failure detection and failover. Such automation can dramatically reduce RTO and improve availability for end-users. This talk (and demo) describes one such setup using the open source Percona Operator for PostgreSQL and a multi-cluster K8s orchestrator. The orchestrator will use policy-driven placement to replicate the entire workload on multiple clusters (in different regions), detect failure using pluggable logic, and perform failover processing by promoting the standby as well as redirecting application traffic.
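The automated failover loop described here can be sketched with a pluggable health check; the cluster names and the role-swap logic below are invented for the demo and are not the orchestrator's actual API.

```python
# Minimal sketch of automated failover with a pluggable health check.
# Cluster names and structure are invented for illustration.
def failover(clusters, is_healthy):
    """Promote the first healthy standby if the primary is down."""
    primary = next(c for c in clusters if c["role"] == "primary")
    if is_healthy(primary):
        return primary["name"]                 # nothing to do
    standby = next(c for c in clusters
                   if c["role"] == "standby" and is_healthy(c))
    primary["role"], standby["role"] = "standby", "primary"
    return standby["name"]                     # redirect traffic here

clusters = [{"name": "us-east", "role": "primary"},
            {"name": "us-west", "role": "standby"}]
print(failover(clusters, lambda c: c["name"] != "us-east"))
```

In a real setup the health predicate would wrap region-level probes, and promotion would be carried out by the database operator rather than a field swap.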
In the last few years, energy efficiency of large-scale infrastructures has gained a lot of attention, as power consumption has become one of the most significant factors in the operating costs of a data center and in its Total Cost of Ownership. Power consumption can be observed at different layers of the data center: from the overall power grid, down to each rack, and finally to each machine and system. Given the rise of application containers in the cloud computing scenario, it becomes more and more important to measure power consumption at the application level as well, where power-aware schedulers and orchestrators can optimize the execution of workloads not only from a performance perspective but also considering performance/power trade-offs. DEEP-mon is a novel monitoring tool able to measure power consumption and attribute it to each thread and application container running in the system, without any prior knowledge of the application's characteristics and without any kind of workload instrumentation. DEEP-mon aggregates data for threads, application containers, and hosts with negligible impact on the monitored system and on the running workloads.
The information obtained with DEEP-mon opens the way for a wide set of applications exploiting the capabilities offered by the monitoring tool, from power (and hence cost) metering of new software components deployed in the data center, to fine-grained power capping and power-aware scheduling and co-location.
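The attribution idea, splitting a measured package power figure among threads in proportion to the CPU work each performed, can be sketched as follows. This is illustrative only; the function name, the cycle-count weighting, and the numbers are assumptions, not DEEP-mon's actual implementation:

```python
def attribute_power(package_watts, cycles_by_thread):
    """Split a measured package power reading across threads,
    proportionally to the CPU cycles each thread consumed."""
    total = sum(cycles_by_thread.values())
    if total == 0:
        return {t: 0.0 for t in cycles_by_thread}
    return {t: package_watts * c / total
            for t, c in cycles_by_thread.items()}

# e.g. a 10 W package reading split over sampled cycle counts
print(attribute_power(10.0, {"nginx": 6_000_000,
                             "postgres": 3_000_000,
                             "idle": 1_000_000}))
```

Real tools must also handle shared/static power and per-container aggregation, which this sketch deliberately omits.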
One of the biggest issues for a developer – whether they are an engineer at an OEM or working for a mobile AI application startup – is that their apps are at the mercy of pre-set power and performance settings defined by OEMs or silicon vendors. So how can a developer break through that barrier when their hands seem tied behind their backs? The Snapdragon Power Optimization SDK allows developers to control CPU and GPU frequency much more finely from their own application logic, giving them more control within the bounds of the power/thermal framework.
Session ID: SFO17-307
Session Name: WALT vs PELT: Redux - SFO17-307
Speaker: Pavan Kumar Kondeti
Track: LMG
★ Session Summary ★
New data on the comparison of the WALT and PELT load tracking schemes in the scheduler
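For context on the comparison: PELT decays a task's historical load contribution geometrically, with a decay factor y chosen so that y^32 = 0.5, i.e. contributions halve every 32 periods of roughly 1 ms; WALT instead sums demand over recent fixed windows. A rough floating-point sketch of the PELT side (illustrative only; the kernel uses fixed-point arithmetic):

```python
# PELT-style decay: y^32 = 0.5, so load contributions
# halve every 32 periods (~32 ms in the kernel).
Y = 0.5 ** (1.0 / 32)

def pelt_update(load, running):
    """Decay the old sum and add the latest period's contribution
    (1024 units if the task ran for the whole period)."""
    return load * Y + (1024 if running else 0)

# A task that runs continuously converges to 1024 / (1 - y) ~ 47788.
load = 0.0
for _ in range(1000):
    load = pelt_update(load, running=True)
print(round(load))
```

The slow geometric ramp-up visible here is exactly the behaviour WALT's windowed accounting was designed to avoid for bursty mobile workloads.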
---------------------------------------------------
★ Resources ★
Event Page: http://connect.linaro.org/resource/sfo17/sfo17-307/
Presentation:
Video: https://www.youtube.com/watch?v=r3QKEYpyetU
---------------------------------------------------
★ Event Details ★
Linaro Connect San Francisco 2017 (SFO17)
25-29 September 2017
Hyatt Regency San Francisco Airport
---------------------------------------------------
Keyword:
'http://www.linaro.org'
'http://connect.linaro.org'
---------------------------------------------------
Follow us on Social Media
https://www.facebook.com/LinaroOrg
https://twitter.com/linaroorg
https://www.youtube.com/user/linaroorg?sub_confirmation=1
https://www.linkedin.com/company/1026961
LAS16-105: Walkthrough of the EAS kernel adaptation to the Android Common KernelLinaro
LAS16-105: Walkthrough of the EAS kernel adaptation to the Android Common Kernel
Speaker: Juri Lelli
Date: September 26, 2016
★ Session Description ★
Walkthrough of the EAS kernel adaptation to the Android Common Kernel.
★ Resources ★
Etherpad: pad.linaro.org/p/las16-105
Presentations & Videos: http://connect.linaro.org/resource/las16/las16-105/
★ Event Details ★
Linaro Connect Las Vegas 2016 – #LAS16
September 26-30, 2016
http://www.linaro.org
http://connect.linaro.org
Fast switching of threads between cores - Advanced Operating SystemsRuhaim Izmeth
Fast switching of threads between cores is a published research paper on operating systems. This is our attempt to decode the research and present it to the class.
DYNAMIC TASK SCHEDULING ON MULTICORE AUTOMOTIVE ECUSVLSICS Design
Automobile manufacturers are bound by stringent government regulations for safety and fuel emissions, and are motivated to add more advanced features and sophisticated applications to the existing electronic system. Ever-increasing customer demand for a high level of comfort also necessitates even more sophistication in the vehicle electronics system. All of this directly makes the vehicle software system more complex and computationally more intensive, which in turn demands very high computational capability from the microprocessor used in the electronic control unit (ECU). In this regard, multicore processors have already been implemented in some of the task-intensive ECUs, such as powertrain, image processing, and infotainment. To achieve greater performance from these multicore processors, parallelized ECU software needs to be efficiently scheduled by the underlying operating system, so that all the computational cores are utilized to the maximum extent possible and real-time constraints are met. In this paper, we propose a dynamic task scheduler for a multicore engine control ECU that provides maximum CPU utilization, minimized preemption overhead, and minimum average waiting time, with all tasks meeting their real-time deadlines, when compared to the static priority scheduling suggested by the Automotive Open System Architecture (AUTOSAR).
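A dynamic scheduler of the kind proposed typically orders ready tasks by absolute deadline (earliest-deadline-first) rather than by the fixed priorities AUTOSAR's static scheme uses. A toy sketch of EDF ordering (plain Python; task names and the algorithm's simplicity are illustrative, not the paper's scheduler):

```python
import heapq

def edf_order(tasks):
    """Return task names in earliest-deadline-first execution order.
    tasks: list of (name, absolute_deadline_ms) tuples."""
    heap = [(deadline, name) for name, deadline in tasks]
    heapq.heapify(heap)
    order = []
    while heap:
        _deadline, name = heapq.heappop(heap)
        order.append(name)
    return order

print(edf_order([("fuel_injection", 5),
                 ("infotainment", 100),
                 ("abs_check", 2)]))
```

Because deadlines change as time advances, the heap is re-evaluated dynamically at each scheduling point, which is what distinguishes this from static priority assignment.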
Similar to LCU13: Power-efficient scheduling, and the latest news from the kernel summit
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloLinaro
Short
The growing amount of data captured by sensors and the real time constraints imply that not only big data analytics but also Machine Learning (ML) inference shall be executed at the edge. The multiple options for neural network acceleration in Arm-based platforms provide an unprecedented opportunity for new intelligent devices. It also raises the risk of fragmentation and duplication of efforts when multiple frameworks shall support multiple accelerators.
Andrea Gallo, Linaro VP of Segment Groups, will summarise the existing NN frameworks, accelerator solutions, and will describe the efforts underway in the Arm ecosystem.
Abstract
The dramatically growing amount of data captured by sensors and the ever more stringent requirements for latency and real time constraints are paving the way for edge computing, and this implies that not only big data analytics but also Machine Learning (ML) inference shall be executed at the edge. The multiple options for neural network acceleration in recent Arm-based platforms provide an unprecedented opportunity for new intelligent devices with ML inference. It also raises the risk of fragmentation and duplication of efforts when multiple frameworks shall support multiple accelerators.
Andrea Gallo, Linaro VP of Segment Groups, will summarise the existing NN frameworks, model description formats, accelerator solutions, low cost development boards and will describe the efforts underway to identify the best technologies to improve the consolidation and enable the competitive innovative advantage from all vendors.
Audience
The session will be useful for everyone from executives to engineers. Executives will gain a deeper understanding of the issues and opportunities. Engineers at NN acceleration IP design houses will take away ideas for how to collaborate with the open source community in their area of expertise, and how to evaluate the performance of, and accelerate, multiple NN frameworks without modifying them for each new IP, whether targeting edge computing gateways, smart devices or simple microcontrollers.
Benefits to the Ecosystem
The AI deep learning neural network ecosystem is just getting started, and it has implications for open source similar to those GPU and video accelerators had in the early days, with user space drivers, binary blobs, proprietary APIs, and every possible way of protecting IP. The session will outline a proposal for a collaborative ecosystem effort to create a common framework to manage multiple NN accelerators while avoiding the need to modify deep learning frameworks with multiple forks.
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraLinaro
Talk Title: Huawei’s requirements for the ARM based HPC solution readiness
Talk Abstract:
A high-level review of a wide range of requirements for architecting a competitive ARM-based HPC solution is provided. The review combines both industry and Huawei's unique views, with the intent to communicate openly not only the alignment with and support for ongoing efforts carried out by other key ARM players, but also to brief on the areas of differentiation in which Huawei is investing towards the research, development and deployment of homegrown ARM-based HPC solution(s).
Speaker: Joshua Mora
Speaker Bio:
20 years of experience in research and development of both software and hardware for high performance computing. Currently leading the architecture definition and development of ARM-based HPC solutions, both hardware and software, all the way to the applications (i.e. turnkey HPC solutions for the different compute-intensive markets where ARM will succeed!).
BUD17-113: Distribution CI using QEMU and OpenQALinaro
“Delivering a well working distribution is hard. There are a lot of different hardware platforms that need to be verified, and the software stack is in great flux during development phases. In rolling releases this gets even worse, as nothing ever stands still. The only sane answer to that problem is working Continuous Integration tests. The SUSE way to check whether any change breaks normal distribution behavior is OpenQA. Using OpenQA we can automatically run tests that hard working QA people did manually in the old days. That way we have fast enough turnaround times to find and reject breaking changes. This session shows how OpenQA works, what pitfalls we hit making ARM work with OpenQA, and what we’re doing to improve it for ARM-specific use cases.”
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018Linaro
Speaker: Renato Golin
Speaker Bio:
He started programming in the late 80s in C for PCs, after a few years playing with 8-bit computers, but only started programming professionally in the late 90s during the .com bubble. After many years working on Internet back-ends, he moved to the UK and worked for a few years on bioinformatics at EBI before joining ARM, where he worked on the DS-5 debugger and on the EDG-to-LLVM bridge, and where he became the LLVM Tech Lead. Recently, he worked with large clusters and big data at HPCC before moving to Linaro.
Talk Title: OpenHPC Automation with Ansible
Talk Abstract: "In order to test OpenHPC packages and components and to use it as a platform to benchmark HPC applications, Linaro is developing an automated deployment strategy, using Ansible, Mr-Provisioner and Jenkins, to install the OS and OpenHPC and prepare the environment on varied architectures (Arm, x86). This work is meant to replace the existing ageing Bash-based recipes upstream while still keeping the documentation intact. Our aim is to make it easier to vary hardware configuration, allow for different provisioning techniques, and adapt internal infrastructure logic to different labs, while still using the same recipes. We hope this will help more people use OpenHPC, with a better out-of-the-box experience and more robust results."
HPC network stack on ARM - Linaro HPC Workshop 2018Linaro
Speaker: Pavel Shamis
Company: Arm
Speaker Bio:
"Pavel is a Principal Research Engineer at ARM with over 16 years of experience in developing HPC solutions. His work is focused on co-design of software and hardware building blocks for high-performance interconnect technologies, development of communication middleware, and novel programming models. Prior to joining ARM, he spent five years at Oak Ridge National Laboratory (ORNL) as a research scientist in the Computer Science and Math Division (CSMD). In this role, Pavel was responsible for research and development on multiple projects in the high-performance communication domain, including Collective Communication Offload (CORE-Direct & Cheetah), OpenSHMEM, and OpenUCX. Before joining ORNL, Pavel spent ten years at Mellanox Technologies, where he led the Mellanox HPC team and was one of the key drivers in the enablement of the Mellanox HPC software stack, including the OFA software stack, Open MPI, MVAPICH, OpenSHMEM, and others.
Pavel is a recipient of the prestigious R&D100 award for his contribution to the development of the CORE-Direct collective offload technology, and he has published in excess of 20 research papers.
"
Talk Title: HPC network stack on ARM
Talk Abstract:
Applications, programming languages, and libraries that leverage sophisticated network hardware capabilities have a natural advantage in today's and tomorrow's high-performance and data center computing environments. Modern RDMA-based network interconnects provide incredibly rich functionality (RDMA, atomics, OS bypass, etc.) that enables low-latency and high-bandwidth communication services. The functionality is supported by a variety of interconnect technologies such as InfiniBand, RoCE, iWARP, Intel OPA, Cray's Aries/Gemini, and others. Over the last decade, the HPC community has developed a variety of user- and kernel-level protocols and libraries that enable high-performance applications over RDMA interconnects, including MPI, SHMEM, UPC, etc. With the emerging availability of HPC solutions based on the ARM CPU architecture, it is important to understand how ARM integrates with the RDMA hardware and the HPC network software stack. In this talk, we will give an overview of the ARM architecture and system software stack, including MPI runtimes, OpenSHMEM, and OpenUCX.
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...Linaro
Speaker: Jay Kruemcke
Speaker Company: SUSE
Bio:
"Jay is responsible for the SUSE Linux server products for High Performance Computing, 64-bit ARM systems, and SUSE Linux for IBM Power servers.
Jay has built an extensive career in product management including using social media for client collaboration, product positioning, driving future product directions, and evangelizing the capabilities and future directions for dozens of enterprise products.
"
Talk Title: It just keeps getting better - SUSE enablement for Arm
Talk Abstract:
SUSE has been delivering commercial Linux support for Arm-based servers since 2016. Initially the focus was on high-end servers for HPC and Ceph-based software-defined storage, but we have since enabled a number of other Arm SoCs and are even supporting the Raspberry Pi. This session will cover the SUSE products that are available for the Arm platform and give a view to the future.
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Linaro
Speakers: Gilad Shainer and Scot Schultz
Company: Mellanox Technologies
Talk Title: Intelligent Interconnect Architecture to Enable Next
Generation HPC
Talk Abstract:
The latest revolution in HPC interconnect architecture is the development of In-Network Computing, a technology that enables handling and accelerating application workloads at the network level. By placing data-related algorithms on an intelligent network, we can overcome the new performance bottlenecks and improve data center and application performance. The combination of In-Network Computing and ARM-based processors offers a rich set of capabilities and opportunities to build the next generation of HPC platforms.
Gilad Shainer Bio:
Gilad Shainer has served as Mellanox's vice president of marketing since March 2013. Previously, Mr. Shainer was Mellanox's vice president of marketing development from March 2012 to March 2013. Mr. Shainer joined Mellanox in 2001 as a design engineer and later served in senior marketing management roles between July 2005 and February 2012. Mr. Shainer holds several patents in the field of high-speed networking and contributed to the PCI-SIG PCI-X and PCIe specifications. Gilad Shainer holds a MSc degree (2001, Cum Laude) and a BSc degree (1998, Cum Laude) in Electrical Engineering from the Technion Institute of Technology in Israel.
Scot Schultz Bio:
Scot Schultz is an HPC technology specialist with broad knowledge of operating systems, high-speed interconnects, and processor technologies. Joining the Mellanox team in 2013, Schultz is a 30-year veteran of the computing industry. Prior to joining Mellanox, he spent 17 years at AMD in various engineering and leadership roles in the area of high performance computing. Scot has also been instrumental in the growth and development of various industry organizations, including the OpenFabrics Alliance, and continues to serve as a founding board member of the OpenPOWER Foundation, and as Director of Educational Outreach and founding member of the HPC-AI Advisory Council.
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Linaro
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Santa Clara 2018
Bio: "Yutaka Ishikawa is the project leader of the development of the post-K supercomputer. From 1987 to 2001, he was a member of AIST (the former Electrotechnical Laboratory), METI. From 1993 to 2001, he was the chief of the Parallel and Distributed System Software Laboratory at Real World Computing Partnership. He led the development of cluster system software called SCore, which was used in several large PC cluster systems around 2004. From 2002 to 2014, he was a professor at the University of Tokyo. He led a project to design a commodity-based supercomputer called the T2K open supercomputer. As a result, three universities, Tsukuba, Tokyo, and Kyoto, each obtained a supercomputer based on the specification in 2008. He was also involved in the design of the Oakleaf-PACS, the successor of the T2K supercomputer in both Tsukuba and Tokyo, whose peak performance is 25PF."
Session Title: Post-K and Arm HPC Ecosystem
Session Description:
"Post-K, a flagship supercomputer in Japan, is being developed by RIKEN and Fujitsu. It will be the first supercomputer with Armv8-A+SVE. This talk will give an overview of Post-K and how RIKEN and Fujitsu are currently working on the software stack for the Arm architecture."
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Linaro
Event: Arm Architecture HPC Workshop by Linaro and HiSilicon
Location: Santa Clara, CA
Speaker: Andrew J Younge
Talk Title: Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Supercomputing
Talk Desc: The Vanguard program looks to expand the potential technology choices for leadership-class High Performance Computing (HPC) platforms, not only for the National Nuclear Security Administration (NNSA) but for the Department of Energy (DOE) and the wider HPC community. Specifically, there is a need to expand the supercomputing ecosystem by investing in and developing emerging, yet-to-be-proven technologies, addressing both hardware and software challenges together, and proving out the viability of such novel platforms for production HPC workloads.
The first deployment of the Vanguard program will be Astra, a prototype petascale Arm supercomputer to be sited at Sandia National Laboratories during 2018. This talk will focus on the architectural details of Astra and the significant investments being made towards maturing the Arm software ecosystem. Furthermore, we will share initial performance results from our pre-general-availability testbed system and outline several planned research activities for the machine.
Bio: Andrew Younge is an R&D Computer Scientist at Sandia National Laboratories with the Scalable System Software group. His research interests include Cloud Computing, Virtualization, Distributed Systems, and energy efficient computing. Andrew has a Ph.D in Computer Science from Indiana University, where he was the Persistent Systems fellow and a member of the FutureGrid project, an NSF-funded experimental cyberinfrastructure test-bed. Over the years, Andrew has held visiting positions at the MITRE Corporation, the University of Southern California / Information Sciences Institute, and the University of Maryland, College Park. He received his Bachelor's and Master's of Science from the Computer Science Department at Rochester Institute of Technology (RIT) in 2008 and 2010, respectively.
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineLinaro
Session ID: HKG18-501
Session Name: HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
Speaker: Chris Redpath
Track: Mobile, Kernel
★ Session Summary ★
This session will introduce the changes to EAS planned for 4.14 kernel, and how Arm hopes that EAS will develop in future. EAS has already evolved from an Arm/Linaro joint project to involving a much wider community of SoC vendors, Google and interested device manufacturers. We will highlight the product-specific pieces remaining in the Android Common Kernel EAS implementation, and our plans to provide an upstreaming plan for each product feature. In particular, the new 'simplified energy model' is designed to provide mainline-friendliness and comparable performance using a simple DT expression of cpu power/performance.
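The "simple DT expression of cpu power/performance" mentioned above refers to per-CPU devicetree properties such as capacity-dmips-mhz and dynamic-power-coefficient. A minimal illustrative fragment (node names and values are invented examples, not taken from the session):

```dts
cpu0: cpu@0 {
        device_type = "cpu";
        compatible = "arm,cortex-a53";
        reg = <0x0>;
        /* relative compute capacity vs. the biggest CPU (1024) */
        capacity-dmips-mhz = <592>;
        /* C in P_dyn = C * V^2 * f, used to derive the energy model */
        dynamic-power-coefficient = <100>;
};

cpu4: cpu@100 {
        device_type = "cpu";
        compatible = "arm,cortex-a72";
        reg = <0x100>;
        capacity-dmips-mhz = <1024>;
        dynamic-power-coefficient = <450>;
};
```

Combined with the OPP tables, these two properties are enough for the kernel to build a per-CPU energy model without a vendor-specific table.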
---------------------------------------------------
★ Resources ★
Event Page: http://connect.linaro.org/resource/hkg18/hkg18-501/
Presentation: http://connect.linaro.org.s3.amazonaws.com/hkg18/presentations/hkg18-501.pdf
Video: http://connect.linaro.org.s3.amazonaws.com/hkg18/videos/hkg18-501.mp4
---------------------------------------------------
★ Event Details ★
Linaro Connect Hong Kong 2018 (HKG18)
19-23 March 2018
Regal Airport Hotel Hong Kong
---------------------------------------------------
Keyword: Mobile, Kernel
'http://www.linaro.org'
'http://connect.linaro.org'
---------------------------------------------------
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allLinaro
Session ID: HKG18-315
Session Name: HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
Speaker: Andrew Wafaa
Track: Ecosystem Day
★ Session Summary ★
The Arm ecosystem is a vibrant place, but it's not always smooth sailing. This presentation will go through the highs and lows of getting the ecosystem fully Arm enabled.
---------------------------------------------------
★ Resources ★
Event Page: http://connect.linaro.org/resource/hkg18/hkg18-315/
Presentation: http://connect.linaro.org.s3.amazonaws.com/hkg18/presentations/hkg18-315.pdf
Video: http://connect.linaro.org.s3.amazonaws.com/hkg18/videos/hkg18-315.mp4
---------------------------------------------------
★ Event Details ★
Linaro Connect Hong Kong 2018 (HKG18)
19-23 March 2018
Regal Airport Hotel Hong Kong
---------------------------------------------------
Keyword: Ecosystem Day
'http://www.linaro.org'
'http://connect.linaro.org'
---------------------------------------------------
HKG18-115 - Partitioning ARM Systems with the Jailhouse HypervisorLinaro
Session ID: HKG18-115
Session Name: HKG18-115 - Partitioning ARM Systems with the Jailhouse Hypervisor
Speaker: Jan Kiszka
Track: Security
★ Session Summary ★
The open source hypervisor Jailhouse provides hard partitioning of multicore systems to co-locate multiple Linux or RTOS instances side by side. It aims at low complexity and minimal footprint to achieve deterministic behavior and enable certifications according to safety or security standards. In this session, we would like to look at the ARM-specific status of Jailhouse and discuss applications, to-dos and possible collaborations around it with the ARM community. The session is intended to be half presentation, half Q&A / discussion.
---------------------------------------------------
★ Resources ★
Event Page: http://connect.linaro.org/resource/hkg18/hkg18-115/
Presentation: http://connect.linaro.org.s3.amazonaws.com/hkg18/presentations/hkg18-115.pdf
Video: http://connect.linaro.org.s3.amazonaws.com/hkg18/videos/hkg18-115.mp4
---------------------------------------------------
★ Event Details ★
Linaro Connect Hong Kong 2018 (HKG18)
19-23 March 2018
Regal Airport Hotel Hong Kong
---------------------------------------------------
Keyword: Security
'http://www.linaro.org'
'http://connect.linaro.org'
---------------------------------------------------
Session ID: HKG18-TR08
Session Name: HKG18-TR08 - Upstreaming SVE in QEMU
Speaker: Alex Bennée,Richard Henderson
Track: Enterprise
★ Session Summary ★
ARM's Scalable Vector Extension (SVE) is an innovative solution for processing highly data-parallel workloads. While several out-of-tree attempts at implementing SVE support for QEMU existed, we took a fundamentally different approach to solving the key challenges and therefore pursued a from-scratch QEMU SVE implementation at Linaro. Our strategic choice was driven by several factors. First, as an "upstream first" organisation we were focused on a solution that would be readily accepted by the upstream project. This entailed doing our development in the open on the project mailing lists, where early feedback and community consensus can be reached.
---------------------------------------------------
★ Resources ★
Event Page: http://connect.linaro.org/resource/hkg18/hkg18-tr08/
Presentation: http://connect.linaro.org.s3.amazonaws.com/hkg18/presentations/hkg18-tr08.pdf
Video: http://connect.linaro.org.s3.amazonaws.com/hkg18/videos/hkg18-tr08.mp4
---------------------------------------------------
★ Event Details ★
Linaro Connect Hong Kong 2018 (HKG18)
19-23 March 2018
Regal Airport Hotel Hong Kong
---------------------------------------------------
Keyword: Enterprise
'http://www.linaro.org'
'http://connect.linaro.org'
---------------------------------------------------
HKG18-113- Secure Data Path work with i.MX8MLinaro
Session ID: HKG18-113
Session Name: HKG18-113 - Secure Data Path work with i.MX8M
Speaker: Cyrille Fleury
Track: Digital Home
★ Session Summary ★
NXP presentation on Secure Data Path work with the i.MX8M SoC, demonstrating 4K PlayReady playback with Android 8.1 running on i.MX8M, with a focus on security (MS SL3000 and Widevine Level 1).
---------------------------------------------------
★ Resources ★
Event Page: http://connect.linaro.org/resource/hkg18/hkg18-113/
Presentation: http://connect.linaro.org.s3.amazonaws.com/hkg18/presentations/hkg18-113.pdf
Video: http://connect.linaro.org.s3.amazonaws.com/hkg18/videos/hkg18-113.mp4
---------------------------------------------------
★ Event Details ★
Linaro Connect Hong Kong 2018 (HKG18)
19-23 March 2018
Regal Airport Hotel Hong Kong
---------------------------------------------------
Keyword: Digital Home
'http://www.linaro.org'
'http://connect.linaro.org'
---------------------------------------------------
HKG18-120 - Devicetree Schema Documentation and Validation Linaro
Session ID: HKG18-120
Session Name: HKG18-120 - Structured Documentation and Validation for Device Tree
Speaker: Grant Likely
Track: Kernel
★ Session Summary ★
Devicetree has become the dominant hardware configuration language used when building embedded systems. Projects using Devicetree now include Linux, U-Boot, Android, FreeBSD, and Zephyr. However, it is notoriously difficult to write correct Devicetree data files. The dtc tools perform limited tests for valid data, and there is not yet a way to add validity tests for specific hardware descriptions. Neither is there a good way to document requirements for specific bindings. Work is underway to solve these problems. This session will present a proposal for adding Devicetree schema files to the Devicetree toolchain that can be used both to validate data and to produce usable documentation.
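The schema approach that grew out of this proposal (the dt-schema project) expresses bindings as json-schema documents written in YAML. A minimal illustrative binding; the compatible string and schema path here are invented for illustration:

```yaml
# Illustrative devicetree binding schema in the YAML/json-schema style.
$id: http://devicetree.org/schemas/example/vendor,sensor.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#

title: Example I2C sensor binding

properties:
  compatible:
    const: vendor,sensor
  reg:
    maxItems: 1

required:
  - compatible
  - reg
```

The same file doubles as validation input for the tooling and as the source from which human-readable binding documentation can be generated.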
---------------------------------------------------
★ Resources ★
Event Page: http://connect.linaro.org/resource/hkg18/hkg18-120/
Presentation: http://connect.linaro.org.s3.amazonaws.com/hkg18/presentations/hkg18-120.pdf
Video: http://connect.linaro.org.s3.amazonaws.com/hkg18/videos/hkg18-120.mp4
---------------------------------------------------
★ Event Details ★
Linaro Connect Hong Kong 2018 (HKG18)
19-23 March 2018
Regal Airport Hotel Hong Kong
---------------------------------------------------
Keyword: Kernel
'http://www.linaro.org'
'http://connect.linaro.org'
---------------------------------------------------
Session ID: HKG18-223
Session Name: HKG18-223 - Trusted Firmware M : Trusted Boot
Speaker: Tamas Ban
Track: LITE
★ Session Summary ★
An overview of the trusted boot concept and firmware update on ARMv8-M based platforms, and how MCUBoot acts as the BL2 bootloader for TF-M.
Trusted Firmware M
In October 2017, Arm announced the vision of Platform Security Architecture (PSA) - a common framework to allow everyone in the IoT ecosystem to move forward with stronger, scalable security and greater confidence. There are three key stages to the Platform Security Architecture: Analysis, Architecture and Implementation which are described at https://developer.arm.com/products/architecture/platform-security-architecture.
Trusted Firmware M, i.e. TF-M, is the Arm project to provide an open source reference implementation firmware that will conform to the PSA specification for M-Class devices. Early access to TF-M was released in December 2017 and it is being made public during Linaro Connect. The implementation should be considered a prototype until the PSA specifications reach release state and the code aligns.
---------------------------------------------------
★ Resources ★
Event Page: http://connect.linaro.org/resource/hkg18/hkg18-223/
Presentation: http://connect.linaro.org.s3.amazonaws.com/hkg18/presentations/hkg18-223.pdf
Video: http://connect.linaro.org.s3.amazonaws.com/hkg18/videos/hkg18-223.mp4
---------------------------------------------------
★ Event Details ★
Linaro Connect Hong Kong 2018 (HKG18)
19-23 March 2018
Regal Airport Hotel Hong Kong
---------------------------------------------------
Keyword: LITE
http://www.linaro.org
http://connect.linaro.org
---------------------------------------------------
Follow us on Social Media
https://www.facebook.com/LinaroOrg
https://www.youtube.com/user/linaroorg?sub_confirmation=1
https://www.linkedin.com/company/1026961
2. Topics Overview
Timeline
Towards a unified scheduler driven power policy
Task placement based on CPU suitability
Kernel Summit Feedback
Status
Questions?
3. Timeline
May – Ingo's response to the task packing patches from
VincentG reignited discussions on power-aware scheduling
Early July – Posted proposed patches for a power aware
scheduler based on a power driver running in conjunction
with the current scheduler
Avoid big changes to the already complex current scheduler
Migrate functionality back into the scheduler when we had worked
out the kinks
Sept – At Plumbers there was a relatively broad agreement
with the approach
October – Morten reposts patchset with refined APIs between
power driver and the scheduler
LKS – Reopened the discussion. More on this later
4. Unified scheduler driven power policy … Why?
big.LITTLE MP patches are tested, stable and performant
Take the principles learnt during the implementation and apply to
an upstream solution
Existing power management frameworks are not coordinated
(cpufreq, cpuidle) with the scheduler
E.g. the scheduler decides which cpu to wake up or idle without
having any knowledge about C-states. cpuidle is left to do its best
based on these uninformed choices.
The scheduler is the most obvious place to coordinate power
management as it has the best view of the overall system load.
The scheduler knows when tasks are scheduled and decides the
load balance. cpufreq has to wait until it can see the result of the
scheduler decisions before it can react.
Task packing in the scheduler needs P and C-state information to
make informed decisions.
5. Existing Power Policies
Frequency scaling: cpufreq
Generic governor + platform specific driver
Decides target frequency based on overall cpu load.
Idle state selection: cpuidle
Generic governor + platform specific driver
Attempts to predict idle time when cpus enter idle.
Scheduler:
Completely generic and unaware of cpufreq and cpuidle policies.
Determines when and where a task runs, i.e. on which cpu.
Task placement that considers CPU suitability is required.
6. Existing Power Policies
[Diagram: cpu0 and cpu1 run queues with tasks, each acted on by three independent policies: the scheduler (load balance), cpufreq (frequency from cpu load), and cpuidle (idle states). Tracked load feeds load balancing: current load pre-3.11 vs. current load in 3.11.]
No coordination between power policies to avoid
conflicting/suboptimal decisions.
Is it a problem?
7. Issues
Scheduler->cpufreq->scheduler cpu load feedback loop
From 3.11 the scheduler uses tracked load for load-balancing.
Tracked load is impacted by frequency scaling. Lower frequency
leads to higher tracked load for the same task.
Hindering new power-aware scheduling features
Task packing: Needs feedback from cpufreq to determine when cpus
are full.
Topology aware task placement: Needs topology information inside
the scheduler to determine the optimal cpus to use when the
system is partially loaded.
Heterogeneous systems (big.LITTLE): Needs topology information
and accurate load tracking.
Thermal also needs to be considered
8. Power scheduler proposal
[Block diagram of the proposed split across three components:]
Scheduler (fair.c):
sched_domain hierarchy (generic topology) + new generic info (pack, heterogeneous, ...)
Load balance algorithms + packing, P & C-state aware, heterogeneous
Load tracking + scale invariant
"Important tasks" cgroup
Power framework (power.c):
Helper function library
Driver registration
Abstract power driver/topology interface
Existing policy algorithms library (drivers/power/?.c)
Performance state selection
Sleep state selection
Power driver (drivers/*/?.c):
Platform HW driver
Detailed platform topology
Platform perf. and energy monitoring
9. Task placement based on CPU suitability
Part of the power scheduler proposal
sched_domain hierarchy
Load balance algorithm (Heterogeneous)
Existing big.LITTLE MP Patches
Definition: CFS scheduler optimization for heterogeneous platforms.
Attempts to select task affinity to optimize power and performance
based on task load and CPU type
Hosted at
http://git.linaro.org/gitweb?p=arm/big.LITTLE/mp.git
Co-exists with existing (CFS) scheduler code
Guarded by CONFIG_SCHED_HMP
Setup HMP domains as a dependency to topology code
Implement big.LITTLE MP functionality inside scheduler mainline code
10. Task placement scheduler architectural bricks
1) Additional sched domain data structures
2) Specify sched domain level for task placement
3) Unweighted instantaneous load signal
4) Task placement hook in select task
5) Task placement hook in load balance
6) Task placement idle pull
11. Brick 1: Additional sched domain data structures
big.LITTLE MP:
struct hmp_domain {
	struct cpumask cpus;
	struct cpumask possible_cpus;
	struct list_head hmp_domains;
};
Task placement based on CPU suitability:
Use the existing sched groups in CPU sched domain level
Add task load ranges into CPU, sched domain and group
12. Brick 2: Specify sched domain level
big.LITTLE MP:
No additional sched domain flag
Deletes SD_LOAD_BALANCE flag in CPU level
Task placement based on CPU suitability:
Adds SD_SUITABILITY flag to CPU level
13. Brick 3: Unweighted instantaneous load signal
big.LITTLE MP & Task placement based on CPU suitability:
For sched entity and cfs_rq
struct sched_avg {
	u32 runnable_avg_sum, runnable_avg_period;
	u64 last_runnable_update;
	s64 decay_count;
	unsigned long load_avg_contrib;
	unsigned long load_avg_ratio;
};
sched entity: runnable_avg_sum * NICE_0_LOAD / (runnable_avg_period + 1)
cfs_rq: set in [update/enqueue/dequeue]_entity_load_avg()
14. Brick 4: Task placement hook in select task
big.LITTLE MP:
Force new non-kernel tasks onto big CPUs until
load stabilises
Least loaded CPU of big cluster is used
Task placement based on CPU suitability:
Use task load ranges of previous CPU and
(initialized) task load ratio to set new CPU
15. Brick 5: Task placement hook in load balance
big.LITTLE MP:
Completely bypasses load_balance() in CPU level
hmp_force_up_migration() in run_rebalance_domains()
Calls hmp_up_migration() for migration to faster CPU
Calls hmp_offload_down() for using little CPUs when idle
Does not use env->imbalance or something equivalent
Task placement based on CPU suitability:
Happens inside load_balance()
Find most unsuitable queue (i.e. find source run-queue)
Move unsuitable tasks (counterpart to load balance)
Move one unsuitable task (counterpart to active load balance)
Cannot use env->imbalance to control load balance
Using grp_load_avg_ratio/(NICE_0_LOAD * sg->group_weight) <= THRESHOLD
Falling back to 'mainline load balance' in case the condition is not met (destination
group is already overloaded)
16. Brick 6: Task placement idle pull
big.LITTLE MP:
Big CPU pulls running task above the threshold from little CPU
Task placement based on CPU suitability:
Not necessary because idle_balance()->load_balance() is not
suppressed on CPU level by missing SD_LOAD_BALANCE flag
Idle pull happens inside load_balance
17. Kernel Summit Feedback
Good to get active discussion
First time with everybody in the same room
LWN article - “The power-aware scheduling mini-summit”
Key points made
Power benchmarks are needed for evaluation
Use-case descriptions are needed to define common ground.
The scheduler needs energy/power information to make power-aware
scheduling decisions.
Power-awareness should be moved into the scheduler.
cpufreq is not fit for its purpose and should go away.
cpuidle will be integrated into the scheduler, possibly supported by
new per-task properties, such as latency constraints
Are there ways to replay energy scenarios?
Linsched or perf sched
18. Kernel Summit feedback observations
All part of the open-source process
Discussions have raised awareness of the issues
Maintainers recognise the need for improved power management
Iterative approach necessary but the steps are clear
Maintainers have a clear server/desktop background
ARM community can help educate this audience on embedded
requirements
Benchmarking for power could be hard to do in a simple way
Cyclictest- and sysbench-type tests are unlikely to yield realistic results on real
systems
However, full accuracy not required
Power models necessarily complex and often closely guarded
secrets
Collection and reporting of meaningful metrics is probably sufficient
19. Status
Latest Power-aware scheduling patches on LKML
https://lkml.org/lkml/2013/10/11/547
Task placement based on CPU suitability patches prepared
Proof of concept done
Waiting for right time to post to lists
Feedback from the Linux Kernel Summit needs to be discussed