The document discusses using the Neural Engine on A11 and A12 devices. It provides log output showing the Neural Engine (ANE) being used on an iPhone Xs Max but not on an iPhone 8 Plus or iPhone 6s, which have A11 and earlier chips. It also shares code for checking the compute units and links to example projects that run Core ML models on the Neural Engine.
OSNoise Tracer: Who Is Stealing My CPU Time? (ScyllaDB)
In the context of high-performance computing (HPC), the Operating System Noise (osnoise) refers to the interference experienced by an application due to activities inside the operating system. In the context of Linux, NMIs, IRQs, softirqs, and any other system thread can cause noise to the application. Moreover, hardware-related jobs can also cause noise, for example, via SMIs.
HPC users and developers who care about every microsecond stolen by the OS need not only a precise way to measure osnoise but, above all, a way to figure out who is stealing CPU time, so that they can pursue the perfect tuning of the system. These users and developers are the inspiration for Linux's osnoise tracer.
The osnoise tracer runs an in-kernel loop measuring how much time is available. It does so with preemption, softirqs, and IRQs enabled, thus allowing all sources of osnoise during its execution. The tracer takes note of the entry and exit points of any source of interference. When noise happens without any interference at the operating system level, the tracer can safely point to hardware-related noise. In this way, osnoise can account for any source of interference. The osnoise tracer also adds new kernel tracepoints that help the user pinpoint the culprits of the noise in a precise and intuitive way.
At the end of a period, the osnoise tracer prints the sum of all noise, the max single noise, the percentage of CPU available for the thread, and the counters for the noise sources, serving as a benchmark tool.
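The per-period summary described above can be illustrated with a small sketch. This is not the kernel's implementation: the function name `summarize_period`, the `NoiseSummary` type, and the sample values are all hypothetical, and the noise samples are assumed to be given in microseconds.

```python
from dataclasses import dataclass


@dataclass
class NoiseSummary:
    total_noise_us: int        # sum of all noise in the period
    max_single_noise_us: int   # largest single noise event
    cpu_available_pct: float   # share of the period left to the thread


def summarize_period(runtime_us: int, noise_samples_us: list[int]) -> NoiseSummary:
    """Reduce one measurement period to the three headline numbers.

    runtime_us is the length of the period; noise_samples_us holds the
    duration of each interference observed during it (hypothetical input).
    """
    total = sum(noise_samples_us)
    worst = max(noise_samples_us, default=0)
    available = 100.0 * (runtime_us - total) / runtime_us
    return NoiseSummary(total, worst, available)


# Example: a 1-second period with four short interferences.
summary = summarize_period(1_000_000, [12, 5, 40, 3])
print(summary)
```

Keeping the maximum single noise separate from the sum matters for latency-sensitive workloads: two periods with the same total noise can differ a lot in their worst-case stall.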
This is a talk at AI Nextcon Seattle on Feb 12, 2020.
An overview of TensorFlow Lite and various resources for helping you deploy TFLite models to mobile and edge devices. Walk through an example of end to end on-device ML: train a model from scratch, convert to TFLite and deploy it.
Session ID: SFO17-307
Session Name: WALT vs PELT : Redux - SFO17-307
Speaker: Pavan Kumar Kondeti
Track: LMG
★ Session Summary ★
New data on the comparison of the WALT and PELT load tracking schemes in the scheduler
---------------------------------------------------
★ Resources ★
Event Page: http://connect.linaro.org/resource/sfo17/sfo17-307/
Presentation:
Video: https://www.youtube.com/watch?v=r3QKEYpyetU
---------------------------------------------------
★ Event Details ★
Linaro Connect San Francisco 2017 (SFO17)
25-29 September 2017
Hyatt Regency San Francisco Airport
---------------------------------------------------
Keyword:
'http://www.linaro.org'
'http://connect.linaro.org'
---------------------------------------------------
Follow us on Social Media
https://www.facebook.com/LinaroOrg
https://twitter.com/linaroorg
https://www.youtube.com/user/linaroorg?sub_confirmation=1
https://www.linkedin.com/company/1026961
The Linux Block Layer - Built for Fast Storage (Kernel TLV)
The arrival of flash storage introduced a radical change in the performance profiles of direct-attached devices. At the time, it was obvious that the Linux I/O stack needed to be redesigned in order to support devices capable of millions of IOPS with extremely low latency.
In this talk we revisit the changes to the Linux block layer over the last decade or so that made it what it is today: a performant, scalable, robust and NUMA-aware subsystem. In addition, we cover the new NVMe over Fabrics support in Linux.
Sagi Grimberg
Sagi is Principal Architect and co-founder at LightBits Labs.
Build a Deep Learning Video Analytics Framework | SIGGRAPH 2019 Technical Ses... (Intel® Software)
Explore how to build a unified framework based on FFmpeg and GStreamer to enable video analytics on all Intel® hardware, including CPUs, GPUs, VPUs, FPGAs, and in-circuit emulators.
About the author: Priya Autee is a software engineer at Intel working on various leading-edge IA features, and an Intel(R) RDT expert. She focuses on prototyping and researching open source APIs such as DPDK and Intel(R) RDT to support NFV and compute-sensitive requirements on Intel Architecture. She holds a Master's degree in Computer Science from Arizona State University.
Linux 4.x Tracing: Performance Analysis with bcc/BPF (Brendan Gregg)
Talk about bcc/eBPF for SCALE15x (2017) by Brendan Gregg. "BPF (Berkeley Packet Filter) has been enhanced in the Linux 4.x series and now powers a large collection of performance analysis and observability tools ready for you to use, included in the bcc (BPF Compiler Collection) open source project. BPF nowadays can do system tracing, software defined networks, and kernel fast path: much more than just filtering packets! This talk will focus on the bcc/BPF tools for performance analysis, which make use of other built in Linux capabilities: dynamic tracing (kprobes and uprobes) and static tracing (tracepoints and USDT). There are now bcc tools for measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more. These lead to performance wins large and small, especially when instrumenting areas that previously had zero visibility. Tracing superpowers have finally arrived, built in to Linux."
Delivered as plenary at USENIX LISA 2013. video here: https://www.youtube.com/watch?v=nZfNehCzGdw and https://www.usenix.org/conference/lisa13/technical-sessions/plenary/gregg . "How did we ever analyze performance before Flame Graphs?" This new visualization invented by Brendan can help you quickly understand application and kernel performance, especially CPU usage, where stacks (call graphs) can be sampled and then visualized as an interactive flame graph. Flame Graphs are now used for a growing variety of targets: for applications and kernels on Linux, SmartOS, Mac OS X, and Windows; for languages including C, C++, node.js, ruby, and Lua; and in WebKit Web Inspector. This talk will explain them and provide use cases and new visualizations for other event types, including I/O, memory usage, and latency.
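As a rough illustration of the "sample stacks, then visualize" step, the sketch below collapses sampled stacks into the semicolon-separated "folded" text form (one `frame;frame;frame count` line per unique stack) that flame graph tooling commonly takes as input. The function name `fold_stacks` and the sample data are hypothetical; real profilers produce the samples.

```python
from collections import Counter


def fold_stacks(samples: list[list[str]]) -> list[str]:
    """Collapse stack samples (root frame first) into folded lines.

    Identical stacks are merged and counted; the count becomes the
    width of the corresponding frames in the flame graph.
    """
    counts = Counter(";".join(stack) for stack in samples)
    return [f"{stack} {n}" for stack, n in sorted(counts.items())]


# Hypothetical samples: two hits in read(), one in render().
samples = [
    ["main", "parse", "read"],
    ["main", "parse", "read"],
    ["main", "render"],
]
folded = fold_stacks(samples)
for line in folded:
    print(line)
```

Sorting the folded stacks alphabetically, rather than by time, is what lets identical frames merge into wide boxes in the final picture.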
This is a “How to Build” manual for OpenCV with OpenCL on Android.
If you ONLY want to use OpenCL with OpenCV, please see:
http://github.com/noritsuna/OpenCVwithOpenCL4AndroidNDKSample
IRQs: the Hard, the Soft, the Threaded and the Preemptible (Alison Chaiken)
The Linux kernel supports a diverse set of interrupt handlers that partition work into immediate and deferred tasks. The talk introduces the major varieties and explains how IRQs differ in the real-time kernel.
syzkaller is an unsupervised, coverage-guided Linux syscall fuzzer.
The presentation covers the basic operation of the fuzzer and gives a tutorial on how to run it and how to extend it to fuzz new drivers.
Accelerated Linux Core Dump Analysis training public slides (Dmitry Vostokov)
The slides from Software Diagnostics Services Linux core dump analysis training. The training description: "Learn how to analyse Linux process crashes and hangs, navigate through process core memory dump space and diagnose corruption, memory leaks, CPU spikes, blocked threads, deadlocks, wait chains, and much more. This book uses a unique and innovative pattern-oriented diagnostic analysis approach to speed up the learning curve. The training consists of 13 practical step-by-step exercises using GDB debugger highlighting more than 25 memory analysis patterns diagnosed in 64-bit process core memory dumps. The training also includes source code of modelling applications, a catalogue of relevant patterns from Software Diagnostics Institute, and an overview of relevant similarities and differences between Windows and Linux user space memory dump analysis useful for engineers with Wintel background."
Linux Performance Analysis: New Tools and Old Secrets (Brendan Gregg)
Talk for USENIX/LISA2014 by Brendan Gregg, Netflix. At Netflix performance is crucial, and we use many high to low level tools to analyze our stack in different ways. In this talk, I will introduce new system observability tools we are using at Netflix, which I've ported from my DTraceToolkit, and are intended for our Linux 3.2 cloud instances. These show that Linux can do more than you may think, by using creative hacks and workarounds with existing kernel features (ftrace, perf_events). While these are solving issues on current versions of Linux, I'll also briefly summarize the future in this space: eBPF, ktap, SystemTap, sysdig, etc.
USENIX ATC 2017: Visualizing Performance with Flame Graphs (Brendan Gregg)
Talk by Brendan Gregg for USENIX ATC 2017.
"Flame graphs are a simple stack trace visualization that helps answer an everyday problem: how is software consuming resources, especially CPUs, and how did this change since the last software version? Flame graphs have been adopted by many languages, products, and companies, including Netflix, and have become a standard tool for performance analysis. They were published in "The Flame Graph" article in the June 2016 issue of Communications of the ACM, by their creator, Brendan Gregg.
This talk describes the background for this work, and the challenges encountered when profiling stack traces and resolving symbols for different languages, including for just-in-time compiler runtimes. Instructions will be included for generating mixed-mode flame graphs on Linux, and examples from our use at Netflix with Java. Advanced flame graph types will be described, including differential, off-CPU, chain graphs, memory, and TCP events. Finally, future work and unsolved problems in this area will be discussed."
Talk by Brendan Gregg for YOW! 2021. "The pursuit of faster performance in computing is the driving reason for many new technologies and updates. This talk discusses performance improvements now underway that you will likely be adopting soon, for processors (including 3D stacking and cloud vendor CPUs), memory (including DDR5 and high-bandwidth memory [HBM]), disks (including 3D Xpoint as a 3D NAND accelerator), networking (including QUIC and eXpress Data Path [XDP]), runtimes, hypervisors, and more. The future of performance is increasingly cloud-based, with hardware hypervisors and custom processors, meaningful observability of everything down to cycle stalls (even as cloud guests), and high-speed syscall-avoiding applications that use eBPF, FPGAs, and io_uring. The talk also discusses where future performance improvements might be expected, with predictions for new technologies."
HKG18-TR14 - Postmortem Debugging with Coresight (Linaro)
Session ID: HKG18-TR14
Session Name: HKG18-TR14 - Postmortem Debugging with Coresight
Speaker: Leo Yan
Track: Training
★ Session Summary ★
In most cases we can debug easily with the kernel's oops dump, but sometimes we need more information about the program's execution flow before the issue happens. We can rely on two tracing methods to reconstruct that flow: software tracing, using the kernel's pstore mechanism, and Coresight hardware tracing, which also avoids the extra workload introduced by tracing itself. Coresight provides two mechanisms for postmortem debugging. The first is the Coresight CPU debug module, which lets us extract CPU program counter information and makes debugging CPU lockup issues quite straightforward. The second is Coresight panic kdump, which connects to the kernel kdump mechanism to extract Coresight tracing data so that we can reconstruct the last execution flow before a panic (or even a hang, with some tweaking in the kernel). This session goes through these topics and demonstrates the debugging tools on a 96Boards HiKey in a 25-minute session.
---------------------------------------------------
★ Resources ★
Event Page: http://connect.linaro.org/resource/hkg18/hkg18-tr14/
Presentation: http://connect.linaro.org.s3.amazonaws.com/hkg18/presentations/hkg18-tr14.pdf
Video: http://connect.linaro.org.s3.amazonaws.com/hkg18/videos/hkg18-tr14.mp4
---------------------------------------------------
★ Event Details ★
Linaro Connect Hong Kong 2018 (HKG18)
19-23 March 2018
Regal Airport Hotel Hong Kong
---------------------------------------------------
Keyword: Training
'http://www.linaro.org'
'http://connect.linaro.org'
XPDDS17: Approach to Native Applications in XEN on ARM - Volodymyr Babchuk, E... (The Linux Foundation)
Today XEN is coming to embedded systems, where it needs to be much closer to the hardware. On the one hand, the hypervisor needs to mediate calls to TrustZone, control power, and provide drivers for coprocessors; on the other hand, it needs to remain as small and as secure as possible. So the natural approach is to offload all these tasks to something else (like a stubdomain or a native application).
The ARM platform allows the hypervisor to act as a common kernel by handling system calls from userspace.
In this talk Volodymyr will describe the idea of native applications, compare them with stubdomains and share the results of his Native Apps PoC.
Accelerated .NET Memory Dump Analysis training public slides (Dmitry Vostokov)
The slides from Software Diagnostics Services .NET memory dump analysis training. The training description: "Covers 22 .NET memory dump analysis patterns plus additional 11 unmanaged patterns. Learn how to analyze CLR 4 .NET application and service crashes and freezes, navigate through memory dump space (managed and unmanaged code) and diagnose corruption, leaks, CPU spikes, blocked threads, deadlocks, wait chains, resource contention, and much more. The training consists of practical step-by-step exercises using Microsoft WinDbg debugger to diagnose patterns in 64-bit and 32-bit process memory dumps. The training uses a unique and innovative pattern-oriented analysis approach to speed up the learning curve. The third edition was fully reworked to use the latest WinDbg version and Windows 10. It also includes 9 optional legacy exercises from the previous editions covering CLR 2 and 4, Windows Vista and Windows 7. Prerequisites: Basic .NET programming and debugging. Audience: Software technical support and escalation engineers, system administrators, DevOps, performance and reliability engineers, software developers and quality assurance engineers."
How can I be sure that my program or my computer is doing what it should do exactly when it should do it?
This is one of those questions that we have always asked ourselves in information technology.
The most common solution is to write other programs, or add other computers, to check the ones we need to check. We then typically add additional safety systems such as Uninterruptible Power Supply groups and backup communication lines. The security thus improves a little, but then we often start worrying about the security of the controller: indeed, who controls the controller?
Practical IoT Exploitation (DEFCON23 IoTVillage) - Lyon Yang (Lyon Yang)
This is a light training/presentation talk.
My name is Lyon Yang and I am an IoT hacker. I live in sunny Singapore where IoT is rapidly being deployed – in production. This walkthrough will aim to shed light on the subject of IoT, from finding vulnerabilities in IoT devices to getting shiny hash prompts.
Our journey starts with a holistic view of IoT security, the issues faced by IoT devices and the common mistakes made by IoT developers. Things will then get technical as we progress into both ARM and MIPS exploitation, followed by a ‘hack-along-with-us’ workshop where you will be exploiting a commonly found IoT daemon. Whether you are new to IoT or a seasoned professional, you will likely learn something new in this workshop.
https://www.iotvillage.org/#schedule
Practical virtual network functions with Snabb (SDN Barcelona VI) (Igalia)
By Andy Wingo.
SDN and Network Programmability Meetup in Barcelona (VI)
21 June 2017
https://www.meetup.com/es-ES/SDN-and-Network-Programmability-Meetup-in-Barcelona/events/239667457/?eventId=239667457
These slides were presented at a technical event at my organization. They give an overview of how to find the root cause of unexpected system-down events and are mainly useful for Linux or Unix system administrators. I tried to cover all aspects of the topic. It took me more than 2 hours to present these slides, but they can also be covered in a shorter time. The gray slide background hides the company logo and preserves the confidentiality of the private template. However, the knowledge is not restricted :)
XPDS13: Performance Optimization on Xen-based Android Device - Jack Ren, Inte... (The Linux Foundation)
Mobile devices, such as smartphones and tablets, are becoming the de facto everyday computing and communication devices, and virtualization can bring additional benefits to mobile devices for both security and manageability. An IT department may use a hypervisor, as a highly secure solution, to manage authorized mobile devices, for example for network traffic monitoring, filtering, scanning (for virus detection), and/or OS updates and patching even when the guest OS becomes completely dead. We insert Xen beneath the mobile OS Android to deprivilege Android as a guest for security and manageability purposes. However, the usage of a mobile device is quite different from that of a server; for example, mobile devices run completely different benchmarks (mostly multimedia focused) than servers (mostly responsiveness focused). We analyze the gaps in Xen as a mobile hypervisor and present how we improve the performance.
Smartphones, tablets, TVs, cars and smartwatches: Android is everywhere, providing users and developers with a rich set of applications, libraries and services. Android Things brings that power to virtually any object, any “thing”: using a low-cost (yet powerful) board, developers can add intelligence and connectivity to homes, industries, vehicles and even medical appliances. This presentation introduces practical concepts around the Android Things platform and how to have fun with it.
Large-Scale Optimization Strategies for Typical HPC Workloads (inside-BigData.com)
In this deck from PASC 2019, Liu Yu from Inspur presents: Large-Scale Optimization Strategies for Typical HPC Workloads.
"Ensuring performance of applications running on large-scale clusters is one of the primary focuses in HPC research. In this talk, we will show our strategies on performance analysis and optimization for applications in different fields of research using large-scale HPC clusters. Our strategies are designed to comprehensively analyze the runtime features of applications, the parallel mode of the physical model, the algorithm implementation and other technical details. These three levels of strategy cover platform optimization, technological innovation, and model innovation, with targeted optimization based on these features. State-of-the-art CPU instructions, network communication and other modules, and the innovative parallel modes of some applications have been optimized. After optimization, these applications are expected to outperform their non-optimized counterparts with a clear increase in performance."
Watch the video: https://wp.me/p3RLHQ-kwB
Learn more: http://en.inspur.com/en/2403285/2403287/2403295/index.html and https://pasc19.pasc-conference.org/program/keynote-presentations/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Analyzing OS X Systems Performance with the USE MethodBrendan Gregg
Talk for MacIT 2014. This talk is about systems performance on OS X, and introduces the USE Method to check for common performance bottlenecks and errors. This methodology can be used by beginners and experts alike, and begins by constructing a checklist of the questions we’d like to ask of the system, before reaching for tools to answer them. The focus is resources: CPUs, GPUs, memory capacity, network interfaces, storage devices, controllers, interconnects, as well as some software resources such as mutex locks. These areas are investigated by a wide variety of tools, including vm_stat, iostat, netstat, top, latency, the DTrace scripts in /usr/bin (which were written by Brendan), custom DTrace scripts, Instruments, and more. This is a tour of the tools needed to solve our performance needs, rather than understanding tools just because they exist. This talk will make you aware of many areas of OS X that you can investigate, which will be especially useful for the time when you need to get to the bottom of a performance issue.
Exploring Thermal Related Stuff in iDevices using Open-Source ToolKoan-Sin Tan
This is the era of so-called “dark silicon.” Thermal control is an important but seldom-talked topic. I could not find public information on how iOS does it. Recent checkm8 and follow-on checkra1n enable jailbreaking of iPhone 5s – iPhone X running iOS 12.3 and up. So that we can explore these devices with open-source tools
TensorFlow is the most popular machine learning framework nowadays. TensorFlow Lite (TFLite), open sourced in late 2017, is TensorFlow’s runtime designed for mobile devices, esp. Android cell phones. TFLite is getting more and more mature. One the most interesting new components introduced recently are its GPU delegate and new NNAPI delegate. The GPU delegate uses Open GL ES compute shader on Android platforms and Metal shade on iOS devices. The original NNAPI delegate is an all-or-nothing design (if one of the ops in the compute graph is not supported by NNAPI, the whole graph is not delegated). The new one is a per-op design. When an op in a graph is not supported by NNAPI, the op is automatically fell back to the CPU runtime. I’ll have a quick review TFLite and its interpreter, then walk the audience through example usage of the two delegates and important source code of them.
A peek into Python's Metaclass and Bytecode from a Smalltalk UserKoan-Sin Tan
Understanding object model and bytecode is a crucial part in understanding an interpreted object-oriented language. Smalltalk, one of the oldest object-oriented programming languages, has a great object model and has been used bytecode and VM since 1970s. It is interesting to compare Smalltalk's and Python's object model and bytecode. Guido once said "I remember being surprised by its use of metaclasses (which is quite different from that in Python or Ruby!) when I read about them much later. " and "Smalltalk's bytecode was a bigger influence of Python's bytecode though." It is interesting to compare Smalltalk's and Python's metacalss and bytecode.
Why You Cannot Use Neural Engine to Run Your NN Models on A11 Devices?
1. Can I use Neural Engine
to run my neural networks
on A11 devices?
Koan-Sin Tan
freedom@computer.org
Hsinchu Coding Serfs Meeting, Nov. 2018
2. https://www.anandtech.com/show/13392/the-iphone-xs-xs-max-review-unveiling-the-silicon-secrets/5
• AnandTech is one of my favorite tech sites. Usually, they provide good technical analysis
• E.g., Apple’s CPUs
• cache sizes
• execution units
• various instruction latencies
• Not good enough for NN accelerators on mobile phones
• floating-point VGG16, Inception V3, and ResNet34?
• come on, are you still in the Neolithic era?
ANE on A12, how about A11?
3. Why I said VGG16 is
Neolithic Era
• Lightweight models are there
• MobileNet V1 could have roughly the same top-1 accuracy even with quantized uint8
• MobileNet V2 could have better
top-1 accuracy
• MnasNet could be better than
MobileNet V1
• Classification, object detection,
segmentation, etc.
• 8-bit quantization is good enough for many cases
https://github.com/tensorflow/models/raw/master/research/slim/nets/mobilenet/madds_top1_accuracy.png
4. How to use Neural Engine
• According to Apple:
• A11: 600 G ops per second, A12: 5 T ops per second
• Yes, by default, it's enabled on A12 devices. If you have pre-iOS 12 apps built on top of Core ML, they should be able to use it automatically. But not on A11 devices.
• How to verify it?
• MLModelConfiguration [1]: property
@property(readwrite) MLComputeUnits computeUnits;
• there is usesCPUOnly for VNRequest in iOS 11, but nothing like MLComputeUnits
• See my example [2]
[1] https://developer.apple.com/documentation/coreml/mlmodelconfiguration?language=objc
[2] https://github.com/freedomtan/coremlbenchmark/
5. Why not VNRequest?
• Since I mentioned VNRequest in Vision.framework, why not VNCoreMLRequest?
• Yes, I wrote simple VNCoreMLRequest-based apps before, in both Swift and Objective-C [1][2].
• It gives you a simplified interface and crops and scales the image for you.
• Yes, image operations take time.
• This actually reminds us of an important system software issue.
• Modern cellphone SoCs use DVFS and all kinds of energy-saving techniques extensively. How can we get good performance?
• Inference with camera on is usually faster than with camera off!!!
[1] https://github.com/freedomtan/SimpleInceptionV3/
[2] https://github.com/freedomtan/SimpleInceptionV3-ObjC
6. Neural Engine in Action
• H11ANEServicesThread
• A12 is for iPhone11,x
• No H10ANEServicesThread
• So, who started H11ANEServicesThread? There is nothing named H11 in /System/Library/Frameworks/CoreML.framework/CoreML
• It seems it’s in /System/Library/PrivateFrameworks/ANEServices.framework/ANEServices
• A12 devices only
14. Mach-O Headers
• Mac OS X ABI Mach-O File Format Reference: no longer available on Apple's web site, so google it.
• headers: /usr/include/mach-o/loader.h
• objc runtime
• https://opensource.apple.com/source/objc4/objc4-723/, https://opensource.apple.com/tarballs/objc4/objc4-723.tar.gz
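To make the header layout in mach-o/loader.h concrete, here is a minimal Python sketch that decodes the fixed 32-byte `mach_header_64` at the start of a thin 64-bit Mach-O file. The field order and the constants follow loader.h; the names `MachHeader64` and `parse_mach_header_64` are made up for illustration, and the sketch assumes a little-endian, non-fat binary.

```python
import struct
from typing import NamedTuple

# Constants as defined in <mach-o/loader.h> and <mach/machine.h>.
MH_MAGIC_64 = 0xFEEDFACF
CPU_TYPE_ARM64 = 0x0100000C  # CPU_TYPE_ARM | CPU_ARCH_ABI64
FILETYPES = {0x1: "MH_OBJECT", 0x2: "MH_EXECUTE", 0x6: "MH_DYLIB", 0x8: "MH_BUNDLE"}


class MachHeader64(NamedTuple):
    """Hypothetical container mirroring struct mach_header_64."""
    magic: int
    cputype: int
    cpusubtype: int
    filetype: int
    ncmds: int
    sizeofcmds: int
    flags: int
    reserved: int


# uint32 magic, int32 cputype, int32 cpusubtype, then five uint32 fields.
_HEADER = struct.Struct("<IiiIIIII")


def parse_mach_header_64(data: bytes) -> MachHeader64:
    """Decode the first 32 bytes of a thin, little-endian 64-bit Mach-O."""
    if len(data) < _HEADER.size:
        raise ValueError("not enough data for a mach_header_64")
    header = MachHeader64(*_HEADER.unpack_from(data, 0))
    if header.magic != MH_MAGIC_64:
        raise ValueError("not a little-endian 64-bit Mach-O: 0x%08x" % header.magic)
    return header
```

To try it on a real binary, read the first 32 bytes of the file and pass them in; `FILETYPES.get(header.filetype)` then tells you whether you are looking at an executable or a dylib.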
15. Dive a bit deeper into Core
ML
• Frameworks and some binaries used to be shipped unstripped as part of the iPhoneOS SDK in Xcode. Not anymore; most framework binaries are now in dyld_shared_cache.
• Fortunately, it’s quite easy to inspect the iOS file system nowadays. Apple has not encrypted .ipsw files since the iOS 10 beta (more than 2 years ago). So, get a .ipsw, unzip it (remember it's a .zip file), then mount the largest .dmg (this needs extra steps on Windows and Linux though). E.g.,
1. get iOS 12.0 ipsw for iPhone Xs Max [1]. See [2] for other firmwares.
2. unzip it.
3. mount 048-10782-224.dmg, that's it. You can see the whole filesystem used by
iPhone Xs Max.
• Thus, we can get the /System/Library/Caches/com.apple.dyld/dyld_shared_cache_arm* we want
[1] http://updates-http.cdn-apple.com/2018FallFCS/fullrestores/091-65188/11BE19F6-AC8E-11E8-A312-F5CEDE149863/iPhone11,4,iPhone11,6_12.0_16A366_Restore.ipsw
[2] https://www.theiphonewiki.com/wiki/Firmware/iPhone/12.x
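Since an .ipsw is just a .zip file, the "find the largest .dmg" step can be scripted without unpacking anything. The following is a rough Python sketch using only the standard `zipfile` module; the function name `largest_dmg` is hypothetical, and mounting the image afterwards is still left to the OS tools.

```python
import zipfile


def largest_dmg(ipsw_path):
    """Return (name, uncompressed size) of the biggest .dmg inside an .ipsw.

    The biggest .dmg is assumed to be the root filesystem image, as the
    steps above suggest.
    """
    with zipfile.ZipFile(ipsw_path) as archive:
        dmgs = [info for info in archive.infolist()
                if info.filename.lower().endswith(".dmg")]
        if not dmgs:
            raise ValueError("no .dmg found in %s" % ipsw_path)
        biggest = max(dmgs, key=lambda info: info.file_size)
        return biggest.filename, biggest.file_size
```

Once you know the name, `zipfile.ZipFile.extract` can pull out just that one member instead of unzipping the whole multi-gigabyte archive.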
16. Dive a bit deeper into Core
ML
• If you are on macOS and have Xcode installed, there are some binaries with symbols in ~/Library/Developer/Xcode/iOS DeviceSupport/12.1 (16B92) arm64e/
• What do I mean by “some”? E.g., /System/Library/PrivateFrameworks/AppleNeuralEngine.framework/XPCServices/ANECompilerService.xpc/ANECompilerService exists on A12 devices, but not in Xcode’s support library
• Yes, we can find /System/Library/Frameworks/CoreML.framework/CoreML
• Even /System/Library/Caches/com.apple.dyld/dyld_shared_cache_arm* is there
17. extract binaries from
dyld_shared_cache
• jtool can do it for you. E.g.,
• list
~/work/ios-hacking/tools/jtool -l /Volumes/Peace16A366.D331OS/System/Library/Caches/com.apple.dyld/dyld_shared_cache_arm64e
• extract
~/work/ios-hacking/tools/jtool -e /System/Library/PrivateFrameworks/AppleNeuralEngine.framework/AppleNeuralEngine /Volumes/Peace16A366.D331OS/System/Library/Caches/com.apple.dyld/dyld_shared_cache_arm64e
Extracting /System/Library/PrivateFrameworks/AppleNeuralEngine.framework/AppleNeuralEngine at 0x2be22000 into dyld_shared_cache_arm64e.AppleNeuralEngine
• dyld source code
• https://opensource.apple.com/source/dyld/dyld-551.4/, https://opensource.apple.com/tarballs/dyld/dyld-551.4.tar.gz
• Read dyld source and [1] for more about dyld_shared_cache
[1] https://iphonedevwiki.net/index.php/Dyld_shared_cache
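For a feel of what a listing like `jtool -l` has to read, here is a small Python sketch that walks the image table of a dyld_shared_cache. It assumes the header layout from dyld_cache_format.h in the dyld-551 sources (a 16-byte magic, then mappingOffset, mappingCount, imagesOffset, imagesCount) and 32-byte `dyld_cache_image_info` entries; later cache formats may differ. The function name `list_cache_images` is made up for illustration, and this is not how jtool itself is implemented.

```python
import struct

# char magic[16]; uint32 mappingOffset, mappingCount, imagesOffset, imagesCount
_HEADER = struct.Struct("<16sIIII")
# uint64 address, modTime, inode; uint32 pathFileOffset, pad
_IMAGE_INFO = struct.Struct("<QQQII")


def list_cache_images(data):
    """Return [(load address, install path), ...] for every image in the cache.

    `data` can be a bytes object or an mmap of the cache file.
    """
    magic, _map_off, _map_count, images_offset, images_count = _HEADER.unpack_from(data, 0)
    if not magic.startswith(b"dyld_v1"):
        raise ValueError("unexpected magic: %r" % magic)
    images = []
    for index in range(images_count):
        entry_offset = images_offset + index * _IMAGE_INFO.size
        address, _mtime, _inode, path_offset, _pad = _IMAGE_INFO.unpack_from(data, entry_offset)
        end = data.find(b"\0", path_offset)  # paths are NUL-terminated C strings
        if end < 0:
            raise ValueError("unterminated path at offset %d" % path_offset)
        images.append((address, data[path_offset:end].decode("utf-8")))
    return images
```

For a real cache, which is well over a gigabyte, pass an `mmap.mmap` of the file rather than reading it all into memory; both `struct.unpack_from` and `find` work on it.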
19. kernel side
• So, how about extracting the ANE-related pieces and just putting them onto A11 devices?
• Well, look into the kernelcache of A11 and A12 devices:
• As expected, we can see lots of H11ANE information in the A12 kernelcache
• The A11 kernelcache does mention H11ANE several times, but it seems important modules are not there.
• So, I guess if we don’t jailbreak and root, we are out of luck!
21. Isn’t XNU (Darwin) source
code open?
• Well, there are more than 200 kernel modules, only some of them
are open
$ ~/work/ios-hacking/tools/jtool2 -k ../../iphonex/ipsw/kernelcache.release.iphone10b
0xfffffff00583c000:com.apple.kpi.mach
0xfffffff00583c080:com.apple.kpi.private
0xfffffff00583c100:com.apple.kpi.unsupported
0xfffffff00583c180:com.apple.kpi.iokit
0xfffffff00583c200:com.apple.kpi.libkern
0xfffffff00583c280:com.apple.kpi.bsd
0xfffffff00583c300:com.apple.iokit.IONetworkingFamily
0xfffffff00583de00:com.apple.iokit.IOTimeSyncFamily
0xfffffff0058416c0:com.apple.iokit.IOSlowAdaptiveClockingFamily
0xfffffff005841c40:com.apple.iokit.IOStorageFamily
0xfffffff005842e80:com.apple.iokit.IOReportFamily
0xfffffff005843680:com.apple.driver.AppleARMPlatform
0xfffffff00584cd80:com.apple.driver.AppleSamsungSPI
0xfffffff00584dd00:com.apple.kpi.dsep
0xfffffff00584dd80:com.apple.kec.corecrypto
…
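With more than 200 entries, a listing like the one above is easier to search with a few lines of script. Here is a minimal Python sketch that parses the `address:bundle-id` lines and filters them by keyword. The function names `parse_kext_list` and `find_kexts` are hypothetical, and the line format is assumed from the output shown here, not from any jtool2 documentation.

```python
def parse_kext_list(text):
    """Map bundle identifier -> address from 'address:bundle-id' lines."""
    kexts = {}
    for line in text.splitlines():
        address, sep, bundle_id = line.strip().partition(":")
        # Skip the command line, blank lines, and the trailing ellipsis.
        if not sep or not address.startswith("0x"):
            continue
        kexts[bundle_id] = int(address, 16)
    return kexts


def find_kexts(kexts, keyword):
    """Return the bundle identifiers containing keyword (case-insensitive)."""
    needle = keyword.lower()
    return sorted(name for name in kexts if needle in name.lower())
```

Running `find_kexts(parse_kext_list(output), "ANE")` on the saved listings of an A11 and an A12 kernelcache is one quick way to compare which ANE-related modules each one carries.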