This document summarizes a presentation by Chirag Patel and Tijmen Blankevoort of Qualcomm AI Research on model efficiency techniques for edge AI. They discuss why model efficiency matters for on-device AI, where power and thermal limits constrain what can run, and survey techniques such as quantization, conditional compute, neural architecture search, and compilation that shrink AI models and run them efficiently on hardware. In particular, they show that integer quantization, whether applied post-training or through quantization-aware training, can reach accuracy similar to floating-point models while delivering much better performance per watt. Overall, the presentation argues that integer quantization is the best approach for efficient AI inference on edge devices.
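As a rough illustration of the post-training flavor mentioned above, the sketch below maps a trained floating-point weight tensor to 8-bit integers with a single per-tensor scale and measures the round-trip error. It is a minimal, hypothetical example (the tensor size and bit width are assumptions), not the workflow shown in the presentation.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor post-training quantization of a weight tensor to int8."""
    scale = np.abs(w).max() / 127.0                      # map the largest magnitude to +/-127
    q = np.clip(np.round(w / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor from the int8 values and the scale."""
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)         # stand-in for a trained FP32 weight matrix
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("mean absolute quantization error:", np.abs(w - w_hat).mean())
```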
AI firsts: Leading from research to proof-of-concept (Qualcomm Research)
AI has made tremendous progress over the past decade, with many advancements coming from fundamental research from many decades ago. Accelerating the pipeline from research to commercialization has been daunting since scaling technologies in the real world faces many challenges beyond the theoretical work done in the lab. Qualcomm AI Research has taken on the task of not only generating novel AI research but also being first to demonstrate proof-of-concepts on commercial devices, enabling technology to scale in the real world. This presentation covers:
The challenges of deploying cutting-edge research on real-world mobile devices
How Qualcomm AI Research is solving system and feasibility challenges with full-stack optimizations to quickly move from research to commercialization
Examples where Qualcomm AI Research has had industrial or academic firsts
The need for intelligent, personalized experiences powered by AI is ever-growing. Our devices are producing more and more data that could help improve our AI experiences. How do we learn and efficiently process all this data from edge devices while maintaining privacy? On-device learning rather than cloud training can address these challenges. In this presentation, we’ll discuss:
- Why on-device learning is crucial for providing intelligent, personalized experiences without sacrificing privacy
- Our latest research in on-device learning, including few-shot learning, continuous learning, and federated learning
- How we are solving system and feasibility challenges to move from research to commercialization
Artificial Intelligence (AI), specifically deep learning, is revolutionizing industries, products, and core capabilities by delivering dramatically enhanced experiences. However, the deep neural networks of today use too much memory, compute, and energy. Plus, to make AI truly ubiquitous, networks need to run on the end device within a tight power and thermal budget. One approach to help address these issues is quantization, which attempts to reduce the number of bits used for weight parameters and activation calculations without sacrificing model accuracy. This presentation covers: why quantization is important, existing quantization challenges, Qualcomm AI Research's existing quantization research, and how developers and researchers can take advantage of quantization on Qualcomm Snapdragon.
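Quantization-aware training, by contrast, simulates the reduced precision during training so the network can adapt to it. Below is a minimal sketch of that idea using standard PyTorch ops and a straight-through estimator; the tensor shapes and 8-bit setting are illustrative assumptions, not code from the presentation.

```python
import torch

def fake_quant(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Simulate asymmetric integer quantization in the forward pass while letting
    gradients flow as if the op were the identity (straight-through estimator)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()).clamp(min=1e-8) / (qmax - qmin)
    zero_point = qmin - torch.round(x.min() / scale)
    q = torch.clamp(torch.round(x / scale + zero_point), qmin, qmax)
    x_hat = (q - zero_point) * scale                      # dequantized ("fake quantized") value
    return x + (x_hat - x).detach()                       # forward: x_hat, backward: identity

x = torch.randn(4, 16, requires_grad=True)
loss = fake_quant(x).sum()
loss.backward()                                           # gradients pass straight through the rounding
print(x.grad.abs().mean())                                # tensor(1.)
```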
Artificial Intelligence (AI), specifically deep learning, is revolutionizing industries, products, and core capabilities by delivering dramatically enhanced experiences. However, the deep neural networks of today use too much memory, compute, and energy. To make AI truly ubiquitous, it needs to run on the end device within tight power and thermal budgets. Advancements in multiple areas are necessary to improve AI model efficiency, including quantization, compression, compilation, and neural architecture search (NAS). In this presentation, we’ll discuss:
- Qualcomm AI Research’s latest model efficiency research
- Our new NAS research to optimize neural networks more easily for on-device efficiency (a toy search sketch follows this list)
- How the AI community can take advantage of this research through our open-source projects, such as the AI Model Efficiency Toolkit (AIMET) and AIMET Model Zoo
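To make the NAS bullet above concrete, here is a toy sketch of hardware-aware architecture search: randomly sample depth/width configurations, reject those over an assumed latency budget, and keep the best-scoring one. The search space, cost model, and accuracy proxy are invented for illustration; this is not AIMET's API and not the method discussed in the presentation.

```python
import random

# Hypothetical two-dimensional search space: depth and channel width of a backbone.
SEARCH_SPACE = {"depth": [2, 3, 4], "width": [16, 32, 64]}

def latency_proxy(cfg) -> float:
    """Assumed on-device cost model: latency grows with depth and quadratically with width."""
    return cfg["depth"] * (cfg["width"] ** 2) * 1e-4      # milliseconds, invented constants

def accuracy_proxy(cfg) -> float:
    """Placeholder for a trained accuracy predictor or a quick-train evaluation."""
    return 0.70 + 0.02 * cfg["depth"] + 0.001 * cfg["width"]

def random_search(budget_ms: float = 1.5, trials: int = 50):
    best = None
    for _ in range(trials):
        cfg = {k: random.choice(v) for k, v in SEARCH_SPACE.items()}
        if latency_proxy(cfg) > budget_ms:
            continue                                      # reject configurations over the latency budget
        score = accuracy_proxy(cfg)
        if best is None or score > best[0]:
            best = (score, cfg)
    return best

print(random_search())   # e.g. (0.824, {'depth': 3, 'width': 64})
```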
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI (Qualcomm Research)
How do you find the best solution when faced with many choices? Combinatorial optimization is a field of mathematics that seeks optimal solutions to complex problems involving many variables. Numerous business verticals can benefit from combinatorial optimization, whether in transport, supply chain, or the mobile industry.
More recently, we’ve seen gains from applying AI to combinatorial optimization, improving the scalability of the method and significantly reducing cost. This approach replaces the manual tuning of traditional heuristics with an AI agent that provides fast metric estimation.
In this presentation you will find out:
Why AI is crucial in combinatorial optimization
How it can be applied to two use cases: improving chip design and hardware-specific compilers
The state-of-the-art results achieved by Qualcomm AI Research
Data compression has improved by leaps and bounds over the years thanks to technical innovation, enabling the proliferation of streamed digital multimedia and voice over IP. For example, a regular cadence of technical advancement in video codecs has led to massive reductions in file size – up to 1000x when comparing a raw video file to a VVC-encoded file. However, with the rise of machine learning techniques and increasingly diverse data types to compress, AI may be a compelling tool for next-generation compression, offering a variety of benefits over traditional techniques. In this presentation we discuss:
- Why the demand for improved data compression is growing
- Why AI is a compelling tool for compression in general
- Qualcomm AI Research’s latest AI voice and video codec research
- Our future AI codec research work and challenges
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2023/09/3d-sensing-market-and-industry-update-a-presentation-from-the-yole-group/
Florian Domengie, Senior Technology and Market Analyst at Yole Intelligence (part of the Yole Group), presents the “3D Sensing: Market and Industry Update” tutorial at the May 2023 Embedded Vision Summit.
While the adoption of mobile 3D sensing has slowed in Android phones, the market has still been growing fast, thanks to Apple. Apple continues to adopt 3D cameras on both the front and rear of iPhones. Along the way, it has updated Face ID and simplified and shrunk the optical structures of its 3D cameras. Meanwhile, because Android phone OEMs have mostly chosen not to incorporate 3D cameras, sensor suppliers and integrators have had to work hard to open up other consumer markets.
In addition to consumer markets, 3D sensing has been blossoming in areas such as industrial applications and the nascent automotive market, where it is increasingly used for advanced driver assistance systems and driver monitoring systems. In this talk, Domengie provides an overview of the main application, market, industry, and technology trends of the 3D sensing industry.
With uCPE/SD-WAN taking center stage in enabling software-defined Cloud services to enterprise branch offices globally, this session will provide a uCPE review from a solution, deployment and reference design standpoint.
Speaker: Sab Gosal, Segment Manager
Network Platforms Group (NPG), September 2018
State of transformers in Computer Vision (Deep Kayal)
Transformers have rapidly emerged as a challenger to traditional convnets as a network architecture for computer vision. Here is a quick landscape analysis of the state of transformers in vision, as of 2021.
5G is going mainstream across the globe, and this is an exciting time to harness the low latency and high capacity of 5G to enable the metaverse. A distributed-compute architecture across device and cloud can enable rich extended reality (XR) user experiences. Virtual reality (VR) and mixed reality (MR) are ready for deployment in private networks, while augmented reality (AR) for wide area networks can be enabled in the near term with Wi-Fi powered AR glasses paired with a 5G-enabled phone. Device APIs that enable application adaptation are critical for a good user experience. 5G standards are evolving to support the deployment of AR glasses at large scale and are setting the stage for the 6G era, with the merging of the physical, digital, and virtual worlds. Techniques like perception-enhanced wireless offer significant potential to improve user experience. Qualcomm Technologies is enabling the XR industry with platforms, developer SDKs, and reference designs.
Check out this webinar to learn:
• How 5G and distributed-compute architectures enable the metaverse
• The latest results from our boundless XR 5G/6G testbed, including device APIs and perception-enhanced wireless
• 5G standards evolution for enhancing XR applications and the road to 6G
• How Qualcomm Technologies is enabling the industry with platforms, SDKs, and reference designs
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2023/06/accelerating-newer-ml-models-using-the-qualcomm-ai-stack-a-presentation-from-qualcomm/
Vinesh Sukumar, Senior Director and Head of AI/ML Product Management at Qualcomm Technologies, presents the “Accelerating Newer ML Models Using the Qualcomm AI Stack” tutorial at the May 2023 Embedded Vision Summit.
The Qualcomm AI Stack revolutionizes how Qualcomm thinks about AI software and provides the ultimate tool and user interface to enable ecosystem partners to create faster and smarter AI applications for all embedded form factors. Focusing on real user experience challenges centered around model deployment, Sukumar explains how the Snapdragon developer community leverages data types, quantization, and neural architecture search, among others, to optimize complex AI architectures for emerging use cases.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2022/06/optimization-techniques-with-intels-openvino-to-enhance-performance-on-your-existing-hardware-a-presentation-from-intel/
Nico Galoppo, Principal Engineer (substituting for Ansley Dunn, Product Marketing Manager), and Ryan Loney, Technical Product Manager, both of Intel, present the “Optimization Techniques with Intel’s OpenVINO to Enhance Performance on Your Existing Hardware” tutorial at the May 2022 Embedded Vision Summit.
Whether you’re using TensorFlow, PyTorch or another framework, Galoppo and Loney show you optimization techniques to enhance performance on your existing hardware. With the OpenVINO Toolkit, built on the foundation of OneAPI, developers can utilize their own AI model or leverage one of the hundreds of pre-trained models available across vision and audio use cases.
In this presentation, you’ll learn how the Neural Network Compression Framework provides optimal model training templates for performance boosts while preserving accuracy, and how the Model Optimizer reduces complexity and makes model conversion faster. Other areas explored by Galoppo and Loney include auto device discovery to enable automatic load balancing and how to optimize for latency or throughput based on your workload.
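For orientation, here is a minimal sketch of the OpenVINO flow described above, assuming you already have an IR file (model.xml, a placeholder path) produced by the Model Optimizer and a model with a single static-shape input. The "AUTO" device and the performance hint illustrate the auto device discovery and latency/throughput tuning mentioned in the talk; exact property names can vary between OpenVINO releases.

```python
import numpy as np
from openvino.runtime import Core

core = Core()
print(core.available_devices)                  # auto device discovery, e.g. ['CPU', 'GPU']

model = core.read_model("model.xml")           # placeholder path to an IR from the Model Optimizer
compiled = core.compile_model(
    model,
    device_name="AUTO",                        # let the runtime pick and balance devices
    config={"PERFORMANCE_HINT": "LATENCY"},    # or "THROUGHPUT" for batch-oriented workloads
)

# One synchronous inference on a dummy input (assumes a single static-shape input).
shape = [int(d) for d in compiled.inputs[0].shape]
result = compiled([np.zeros(shape, dtype=np.float32)])[compiled.outputs[0]]
print(result.shape)
```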
Transforming enterprise and industry with 5G private networks (Qualcomm Research)
The 3GPP put the spotlight on industry expansion in July with 5G NR Release 16 and set the stage for enterprise and industry verticals to look at how to provide high-performance wireless connectivity with 5G private networks. With a variety of options for spectrum, different network architectures, a rich feature set to meet the demanding needs of the industrial Internet of Things (IIoT), and the privacy and security required for business assurance, 5G private networks are poised to transform enterprise and industry.
Watch the webinar at: https://pages.questexnetwork.com/Webinar-Qualcomm-Registration-101520.html?source=Qualcomm
Vision Transformer (ViT) / An Image is Worth 16*16 Words: Transformers for Ima... (changedaeoh)
Without using any convolutional layers, which dominate computer vision, the authors take the pure Transformer architecture proposed in NLP as-is and build an image classification model at SOTA level using only attention and ordinary feed-forward NNs.
Presentation material for the TAVE research seminar, March 30, 2021
Presenter: 오창대 (Changdae Oh)
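The core trick in the ViT paper summarized above is turning an image into a sequence of 16x16 patch tokens that a plain Transformer encoder can consume. A small sketch of that patch-embedding step (shapes and dimensions are illustrative assumptions, not the paper's code):

```python
import torch

def image_to_patch_tokens(img: torch.Tensor, patch: int = 16, dim: int = 768) -> torch.Tensor:
    """Split an image into non-overlapping patch tokens and project each patch to an embedding,
    as in 'An Image is Worth 16x16 Words' (ViT)."""
    b, c, h, w = img.shape
    patches = img.unfold(2, patch, patch).unfold(3, patch, patch)          # B, C, H/p, W/p, p, p
    patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(b, -1, c * patch * patch)
    proj = torch.nn.Linear(c * patch * patch, dim)                         # learned in a real model;
    return proj(patches)                                                   # freshly initialized here

tokens = image_to_patch_tokens(torch.randn(1, 3, 224, 224))
print(tokens.shape)    # torch.Size([1, 196, 768]) -> 14x14 patch tokens fed to the Transformer encoder
```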
For the first time, the processor monitor includes FPGAs, CPUs, GPUs, and APUs, covering all the IDMs, fabless companies, and foundries in the business.
More information: https://www.i-micronews.com/products/application-processor-quarterly-market-monitor/
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis (taeseon ryu)
This paper presents a 3D-aware model. With StyleGAN, if you wanted to edit a single feature, you could find the latent vector corresponding to the input and modify that vector to change, say, the mouth. Building on that concept, the GANSpace paper tried to edit even spatial information for a given input. Looking at the results, the rotation information seems reasonably well learned, but the output is sometimes perceived as a different person. This is what we mean by not being disentangled: instead of changing only the desired feature, other features change as well. This paper was created to make the model understand 3D more efficiently and address that problem.
The Swin Transformer has recently drawn attention as one of the best-performing models in object detection and semantic segmentation. It applies the Transformer, widely used in NLP, to the vision domain and is characterized by hierarchical feature maps and window-based self-attention. You can think of the Swin Transformer as a model that improves on the limitations of the Vision Transformer, the method proposed by Google last year.
The limitations of the Transformer... impressive.
김선옥 (Kim Seon-ok) from the image processing team kindly provided a detailed review!!
Thank you in advance for your interest today as well!!
https://youtu.be/L3sH9tjkvKI
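The window-based self-attention highlighted in the Swin Transformer review above restricts attention to small local windows of the feature map, which keeps the cost roughly linear in image size. A rough sketch of the window-partitioning step (the feature-map and window sizes are illustrative assumptions, not the paper's code):

```python
import torch

def window_partition(x: torch.Tensor, window: int = 7) -> torch.Tensor:
    """Split a (B, H, W, C) feature map into non-overlapping window x window tiles;
    self-attention is then computed inside each window only."""
    b, h, w, c = x.shape
    x = x.view(b, h // window, window, w // window, window, c)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, window * window, c)     # (num_windows*B, window^2, C)

feat = torch.randn(1, 56, 56, 96)       # a typical first-stage Swin feature map
print(window_partition(feat).shape)     # torch.Size([64, 49, 96])
```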
With 5G comes network slicing, which will greatly benefit the automotive industry by providing virtual dedicated networks for different traffic types, from highly reliable safety traffic to entertainment or less critical sensor data collection.
PR-386: Light Field Networks: Neural Scene Representations with Single-Evalua... (Hyeongmin Lee)
The paper I'm introducing this time performs view synthesis, like NeRF. Since NeRF, many methods have been proposed to address its shortcomings, while other attempts take a different route from NeRF through a change of perspective. I'll explain the Neural Light Field Rendering approach, one of the most representative of these.
Paper link: https://arxiv.org/abs/2106.02634
Video link: https://youtu.be/gxag8uvA2Sc
3D perception is crucial for understanding the real world. It offers many benefits and new capabilities over 2D across diverse applications, from XR and autonomous driving to IoT, camera, and mobile. 3D perception with machine learning is setting the new state of the art (SOTA) in areas such as depth estimation, object detection, and neural scene representation. Making these SOTA neural networks feasible for real-world deployment on mobile devices constrained by power, thermal, and performance budgets has been a challenge. Qualcomm AI Research has developed not only novel AI techniques for 3D perception but also full-stack AI optimizations to enable real-world deployments and energy-efficient solutions. This presentation explores the latest research that is enabling efficient 3D perception while maintaining neural network model accuracy. You’ll learn about:
- The advantages of 3D perception over 2D and the need for 3D perception across applications
- Advancements in 3D perception research by Qualcomm AI Research
- Our future 3D perception research directions
Inject precise synchronization into open compute servers (ADVA)
For data centers, finance and 5G infrastructure, our new OSA 5400 TimeCard™ is the key to bringing packet time distribution to the edge or access network. Developed to the framework of the Open Compute Project’s (OCP) Time Appliance Project (TAP), it brings sophisticated timing capabilities to any open compute server, transforming it into a precise and stable PTP grandmaster, boundary clock, slave clock or NTP server.
Design Efficient Wireless Monitoring Platform for Recycling Point Spots (IJMTST Journal)
There is a growing demand for low-cost, very low-power, small monitoring systems with wireless communications for use in different kinds of industrial environments. In several countries, waste separation and recycling is a major issue, and the number of recycling spots has been steadily increasing. To ensure that recycle bins are properly maintained, several monitoring solutions have been proposed, but these still have limitations, such as requiring wires for power and/or communications and not fitting all existing types of bins. This paper presents WECO, a wireless embedded solution for monitoring the level of the bins located in recycling spots. The proposed system automatically alerts a remote central station when a bin reaches a programmable filling level, avoiding the need to spot-check whether a bin is full and ensuring that the recycling spot is kept clean. The prototype required hardware-software co-design and was developed to meet the above requirements, using the IEEE 802.15.4 protocol for wireless communications between all nodes in the network, each based on a Texas Instruments CC2530 System-on-Chip. Due to its wireless nature, the architecture requires a battery to power the nodes, with a lifetime of at least six years. The filling level of each bin in a recycling spot is read with an ultrasonic sensor, and the data collected by the monitoring platform is sent to the remote central station, which processes it to optimize routes and establish a scheduled collection of the recycling spots.
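The decision logic described in the abstract, alerting the central station once the ultrasonic reading indicates a programmable filling level, can be sketched as follows. This is purely an illustrative model of the behavior in Python; the actual WECO firmware targets a CC2530, and the bin depth and threshold here are assumed values.

```python
BIN_DEPTH_CM = 120      # assumed distance from the sensor to the bottom of an empty bin
ALERT_LEVEL = 0.80      # programmable filling level that triggers an alert (80%)

def filling_level(distance_cm: float) -> float:
    """Convert an ultrasonic reading (sensor to waste surface) into a 0..1 fill ratio."""
    level = 1.0 - distance_cm / BIN_DEPTH_CM
    return min(max(level, 0.0), 1.0)

def check_and_alert(distance_cm: float, send_alert) -> None:
    """Alert the remote central station when the bin reaches the configured level."""
    level = filling_level(distance_cm)
    if level >= ALERT_LEVEL:
        send_alert({"fill_level": round(level, 2)})   # in WECO this message travels over IEEE 802.15.4

check_and_alert(20.0, send_alert=print)               # 20 cm of headroom -> ~83% full -> alert is sent
```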
Implementation of 32 Bit Binary Floating Point Adder Using IEEE 754 Single Pr... (iosrjce)
Field Programmable Gate Arrays (FPGAs) are increasingly being used to design high-end, computationally intense microprocessors capable of handling both fixed- and floating-point mathematical operations. Addition is the most complex operation in a floating-point unit; it introduces major delay and takes significant area. Over the years, the VLSI community has developed many floating-point adder algorithms aimed mainly at reducing overall latency. The objective of this paper is to implement a 32-bit binary floating-point adder with minimum time. Floating-point numbers are used in various applications such as medical imaging, radar, and telecommunications. A pipelined architecture is used to increase performance, and the design achieves a higher operating frequency. The logic is designed in VHDL. The paper discusses the best possible FPGA implementation in detail and will act as an important design resource. The performance criterion is latency in all cases. The algorithms are compared for overall latency, area, and levels of logic, and analyzed specifically for one of the latest FPGA architectures provided by Xilinx.
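As background for the abstract above: single-precision addition operates on the sign, exponent, and mantissa fields of the IEEE 754 encoding, aligning exponents before the mantissas are added. The small sketch below only unpacks those fields to show the encoding; the paper's actual adder is a pipelined VHDL design, not Python.

```python
import struct

def float_to_fields(x: float):
    """Unpack an IEEE 754 single-precision float into its sign, biased exponent, and fraction fields."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF        # biased exponent (bias = 127)
    fraction = bits & 0x7FFFFF            # 23-bit fraction of the mantissa
    return sign, exponent, fraction

# Before the mantissas can be added, the operand with the smaller exponent must be
# shifted right by the exponent difference so both share the larger exponent.
print(float_to_fields(6.5))      # (0, 129, 5242880)  -> 6.5   = 1.625 * 2^(129-127)
print(float_to_fields(0.375))    # (0, 125, 4194304)  -> 0.375 = 1.5   * 2^(125-127)
```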
Welcome to International Journal of Engineering Research and Development (IJERD) (IJERD Editor)
This project describes a novel UART architecture based on a Recursive Running Sum (RRS) filter for wired and wireless data processing. UARTs are used for asynchronous serial data communication between remote embedded systems; if the physical channel is noisy, serial data bits get corrupted during transmission. The UART core described here uses a recursive running sum filter to remove noisy samples. The input data signal is sampled directly with the system clock, and samples are accumulated over a window. The window size is user programmable and should be set to one tenth of the required bit period. The intermediate data bit is decoded using a magnitude comparator. The advantage of this architecture is that the baud rate is determined by the window size, so no external timer module, normally required for standard UARTs, is needed. The RRS filter architecture with a programmable window size of M is designed and its modules are implemented in VHDL. The design has many applications in wireless data communication systems such as RF, Bluetooth, Wi-Fi, and ZigBee wireless sensor applications. All coding is written in VHDL, simulation is done in the ModelSim simulator, and synthesis is done with Xilinx ISE 9.2i and verified with ChipScope. The input signal is given from the keyboard and the output is observed with HyperTerminal.
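A simplified software model of the filtering idea described above: keep a running sum of the last M oversampled line values, updating it with one add and one subtract per sample, and decide each bit by comparing the sum against half the window (the magnitude comparison). The window size and sample stream below are illustrative assumptions; the real design is implemented in VHDL on an FPGA.

```python
from collections import deque

def rrs_filter(samples, window: int):
    """Recursive running sum over the last `window` oversampled line values: one add and
    one subtract per new sample, then a magnitude comparison against half the window."""
    buf = deque([0] * window, maxlen=window)
    total = 0
    filtered = []
    for s in samples:                    # s is a raw 0/1 sample of the serial line
        total += s - buf[0]              # recursive update: add the newest, drop the oldest
        buf.append(s)
        filtered.append(1 if total > window // 2 else 0)
    return filtered

# A stretch of a '1' bit oversampled 10x with two noise glitches; after the filter
# warms up, the isolated 0 samples no longer flip the decoded level.
raw = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]
print(rrs_filter(raw, window=5))         # [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
```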
International Journal of Engineering Inventions (IJEI) provides a multidisciplinary venue for researchers, managers, professionals, practitioners, and students around the globe to publish high-quality, peer-reviewed articles on all theoretical and empirical aspects of engineering and science.
The peer-reviewed International Journal of Engineering Inventions (IJEI) was started with a mission to encourage contributions to research in science and technology and to motivate researchers working in challenging areas of science and technology.
Find out more about Infineon on our homepage: www.infineon.com
Here you will find all information about battery-powered applications, package information, and the motor control selection matrix for OptiMOS™ and Small Signal.
Similar to Presentation - Model Efficiency for Edge AI (20)
Generative AI models, such as ChatGPT and Stable Diffusion, can create new and original content like text, images, video, audio, or other data from simple prompts, as well as handle complex dialogs and reason about problems with or without images. These models are disrupting traditional technologies, from search and content creation to automation and problem solving, and are fundamentally shaping the future user interface to computing devices. Generative AI can apply broadly across industries, providing significant enhancements for utility, productivity, and entertainment. As generative AI adoption grows at record-setting speeds and computing demands increase, on-device and hybrid processing are more important than ever. Just like traditional computing evolved from mainframes to today’s mix of cloud and edge devices, AI processing will be distributed between them for AI to scale and reach its full potential.
In this presentation you’ll learn about:
- Why on-device AI is key
- Full-stack AI optimizations to make on-device AI possible and efficient
- Advanced techniques like quantization, distillation, and speculative decoding (a simplified speculative-decoding sketch follows this list)
- How generative AI models can be run on device and examples of some running now
- Qualcomm Technologies’ role in scaling on-device generative AI
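For the speculative-decoding item above, the idea is that a small draft model proposes several tokens cheaply and the large target model verifies them, so many accepted tokens cost roughly one target-model step. The sketch below uses stand-in models and a simplified greedy-acceptance rule, so it illustrates the control flow rather than the exact sampling-based algorithm used in practice.

```python
def speculative_decode(target_next, draft_next, prompt, k=4, max_new=8):
    """Greedy speculative decoding sketch: the draft model proposes k tokens, the target
    model verifies them, and generation rolls back at the first disagreement.
    (May overshoot max_new by up to k - 1 tokens; a real decoder would trim.)"""
    out = list(prompt)
    while len(out) - len(prompt) < max_new:
        # 1) The cheap draft model proposes k tokens autoregressively.
        ctx, proposal = list(out), []
        for _ in range(k):
            t = draft_next(ctx)
            proposal.append(t)
            ctx.append(t)
        # 2) The target model checks each proposal; accept matches, and at the first
        #    mismatch keep the target model's own token instead and start over.
        for t in proposal:
            expected = target_next(out)
            out.append(expected)
            if expected != t:
                break
    return out

# Stand-in "models": the next token is a deterministic function of the last one.
target = lambda ctx: (ctx[-1] + 1) % 10
draft = lambda ctx: (ctx[-1] + 1) % 10 if ctx[-1] != 5 else 0   # disagrees right after a 5

print(speculative_decode(target, draft, prompt=[3]))            # [3, 4, 5, 6, 7, 8, 9, 0, ...]
```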
As generative AI adoption grows at record-setting speeds and computing demands increase, hybrid processing is more important than ever. But just like traditional computing evolved from mainframes and thin clients to today’s mix of cloud and edge devices, AI processing must be distributed between the cloud and devices for AI to scale and reach its full potential. In this talk you’ll learn:
• Why on-device AI is key
• Which generative AI models can run on device
• Why the future of AI is hybrid
• Qualcomm Technologies’ role in making hybrid AI a reality
- There is a rich roadmap of 5G technologies coming in the second half of the 5G decade with the 5G Advanced evolution
- 6G will be the future innovation platform for 2030 and beyond building on the 5G Advanced foundation
- 6G will be more than just a new radio design, expanding the role of AI, sensing and others in the connected intelligent edge
- Qualcomm is leading cutting-edge wireless research across six key technology vectors on the path to 6G
Bringing AI research to wireless communication and sensing (Qualcomm Research)
AI for wireless is already here, with applications in areas such as mobility management, sensing and localization, smart signaling and interference management. Recently, Qualcomm Technologies has prototyped the AI-enabled air interface and launched the Qualcomm 5G AI Suite. These developments are possible thanks to expertise in both wireless and machine learning from over a decade of foundational research in these complementing fields.
Our approach brings together the modeling flexibility and computational efficiency of machine learning and the out-of-domain generalization and interpretability of wireless domain expertise.
In this webinar, Qualcomm AI Research presents an overview of state-of-the-art research at the intersection of the two fields and offers a glimpse into the future of the wireless industry.
Qualcomm AI Research is an initiative of Qualcomm Technologies, Inc.
Speakers:
Arash Behboodi, Machine Learning Research Scientist (Senior Staff Engineer/Manager), Qualcomm AI Research
Daniel Dijkman, Machine Learning Research Scientist (Principal Engineer), Qualcomm AI Research
How will sidelink bring a new level of 5G versatility.pdf (Qualcomm Research)
Today, the 5G system mainly operates on a network-to-device communication model, exemplified by enhanced mobile broadband use cases where all data transmissions are between the network (i.e., base station) and devices (e.g., smartphone). However, to fully deliver on the original 5G vision of supporting diverse devices, services, and deployment scenarios, we need to expand the 5G topology further to reach new levels of performance and efficiency.
That is why sidelink communication was introduced in 3GPP standards, designed to facilitate direct communication between devices, independent of connectivity via the cellular infrastructure. Beyond automotive communication, it also benefits many other 5G use cases such as IoT, mobile broadband, and public safety.
5G is designed to serve an unprecedented range of capabilities with a single global standard. With enhanced mobile broadband (eMBB), massive IoT (mIoT), and mission-critical IoT, the three pillars of 5G represent extremes in performance and associated complexity. For IoT services, NB-IoT and eMTC devices prioritize low power consumption and the lowest complexity for wide-area deployments (LPWA), while enhanced ultra-reliable, low-latency communication (eURLLC), along with time-sensitive networking (TSN), delivers the most stringent use case requirements. But there exists an opportunity to more efficiently address a broad range of mid-tier applications with capabilities ranging between these extremes.
In 5G NR Release 17, 3GPP introduced a new tier of reduced capability (RedCap) devices, also known as NR-Light. It is a new device platform that bridges the capability and complexity gap between the extremes in 5G today with an optimized design for mid-tier use cases. With the recent standards completion, NR-Light is set to efficiently expand the 5G universe to connect new frontiers.
Download this presentation to learn:
• What NR-Light is and why it can herald the next wave of 5G expansion
• How NR-Light is accelerating the growth of the connected intelligent edge
• Why NR-Light is a suitable 5G migration path for mid-tier LTE devices
Realizing mission-critical industrial automation with 5G (Qualcomm Research)
Manufacturers seeking better operational efficiencies, with reduced downtime and higher yield, are at the leading edge of the Industry 4.0 transformation. With mobile system components and reliable wireless connectivity between them, flexible manufacturing systems can be reconfigured quickly for new tasks, to troubleshoot issues, or in response to shifts in supply and demand.
There is a long history of R&D collaboration between Bosch Rexroth and Qualcomm Technologies for the effective application of these 5G capabilities to industrial automation use cases. At the Robert Bosch Elektronik GmbH factory in Salzgitter, Germany, this collaboration has reached new heights.
Download this deck to learn how:
• Qualcomm Technologies and Bosch Rexroth are collaborating to accelerate the Industry 4.0 transformation
• 5G technologies deliver key capabilities for mission-critical industrial automation
• Distributed control solutions can work effectively across 5G TSN networks
• A single 5G technology platform solves connectivity and positioning needs for flexible manufacturing
3GPP Release 17: Completing the first phase of 5G evolution (Qualcomm Research)
This presentation summarizes the 5G NR Release 17 projects that were completed in March 2022. The release further enhances the 5G foundation and expands into new devices, use cases, and verticals.
Setting off the 5G Advanced evolution with 3GPP Release 18 (Qualcomm Research)
In December 2021, 3GPP has reached a consensus on the scope of 5G NR Release 18. This is a significant milestone marking the beginning of 5G Advanced — the second wave of wireless innovations that will fulfill the 5G vision. Release 18 will build on the solid foundation set by Releases 15, 16, and 17, and it sets the longer-term evolution direction of 5G and beyond. This release will encompass a wide range of new and enhancement projects, ranging from improved MIMO and application of AI/ML-enabled air interface to extended reality optimizations and broader IoT support.
Cellular networks have facilitated positioning in addition to voice or data communications from the beginning, since 2G, and we’ve since grown to rely on positioning technology to make our lives safer, simpler, more productive, and even fun. Cellular positioning complements other technologies to operate indoors and outdoors, including dense urban environments where tall buildings interfere with satellite positioning. It works whether we’re standing still, walking, or in a moving vehicle. With 5G, cellular positioning breaks new ground to bring robust precise positioning indoors and outdoors, to meet even the most demanding Industry 4.0 needs.
As we look to the future, the Connected Intelligent Edge will bring a new dimension of positional insight to a broad range of devices, improving wireless use cases still under development. We’re already charting the course to 5G Advanced and beyond by working on the evolution of cellular positioning technology to include RF sensing for situational awareness.
Download the deck to learn more.
This presentation outlines the synergistic nature of 5G and AI -- two disruptive areas of innovations that can change the world. It illustrates the benefits of adopting AI for the advancements of 5G, as well as showcases the latest progress made by Qualcomm Technologies, Inc.
How to build high performance 5G networks with vRAN and O-RAN (Qualcomm Research)
5G networks are poised to deliver an unprecedented amount of data from a richer set of use cases than we have ever seen. This makes efficient networking in terms of scalability, cost, and power critical for the sustainable growth of 5G. Cloud technologies such as virtualization, containerization and orchestration are now powering a surge of innovation in virtualized radio access network (vRAN) infrastructure with modular hardware and software components, and standardized interfaces. While commercial off-the-shelf (COTS) hardware platforms provide the compute capacity for running vRAN software, hardware accelerators will also play a major role in offloading real-time and complex signal processing functions. Together, COTS platforms and hardware accelerators provide the foundation for building the intelligent 5G network and facilitate innovative new use cases with the intelligent wireless edge.
This presentation takes a look at the technology roadmap for 5G NR millimeter wave (mmWave), including features such as integrated access and backhaul (IAB) and enhancements in beam management, mobility, coverage, and more. For more information, please visit www.qualcomm.com/mmwave
Video data is abundant and being generated at ever increasing rates. Analyzing video with AI can provide valuable insights and capabilities for many applications ranging from autonomous driving and smart cameras to smartphones and extended reality. However, as video resolution and frame rates increase while AI video perception models become more complex, running these workloads in real time is becoming more challenging. This presentation explores the latest research that is enabling efficient video perception while maintaining neural network model accuracy. You’ll learn about:
- How video perception is crucial for understanding the world and making devices smarter
- The challenges of on-device real-time video perception at high resolution through AI
- Qualcomm AI Research’s latest research and techniques for efficient video perception
Check out: https://www.qualcomm.com/AI
Enabling the rise of the smartphone: Chronicling the developmental history at... (Qualcomm Research)
Today’s smartphones are a marvel of modern technology — handheld devices with vast computing power, incredible multimedia and AI capabilities, and blazing fast data rates that support mobile browsing, social media interaction, and more. From humble beginnings as a cellphone focused purely on voice communication, the capability and functionality of modern smartphones have advanced tremendously. This presentation chronicles Qualcomm’s role in the rise of the smartphone from its initial beginnings to becoming the largest computing platform in the world. It includes:
- Key technology developments that led to today’s smartphones
- The role of Moore’s Law in driving new innovations and additional integration into mobile processors
- Qualcomm’s critical role in advancing the smartphone’s capabilities through groundbreaking innovations and key technology developments
This presentation provides an overview of important 5G innovations around new and enhanced use of spectrum. It also captures the current 5G spectrum status across the globe.
Today, we take it for granted that our mobile devices and applications just work out of the box — smartphones can roam virtually anywhere in the world, laptops can seamlessly connect to any Wi-Fi access point & Bluetooth peripheral, and the videos recorded on one device can be played back perfectly on any other device.
The magic behind all this? Technology standards. Not only do they power a wide range of systems and devices but also bring many benefits to the broader technology ecosystem. At Qualcomm Technologies, we are leading the standardization of many key technologies that will move the world forward.
Download this presentation to learn:
- The value of technology standards, specifically in the areas of cellular, Wi-Fi, Bluetooth, and video codecs
- Why standardized technologies are essential for industry growth and ecosystem development
- How standard bodies operate in a complex, challenging, and ever changing environment
- How Qualcomm is driving innovation in different technology standards
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... (UiPathCommunity)
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality (Inflectra)
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Search and Society: Reimagining Information Access for Radical Futures (Bhaskar Mitra)
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build inspired by diverse explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies needs to be explicitly articulated, and we need to develop theories of change in context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
Essentials of Automations: Optimizing FME Workflows with Parameters (Safe Software)
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf (91mobiles)
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
PHP Frameworks: I want to break free (IPC Berlin 2024) (Ralf Eggert)
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology pushes into IT, I was wondering, as an “infrastructure container Kubernetes guy”, how this fancy AI technology gets managed from an infrastructure operations point of view. Is it possible to apply our lovely cloud-native principles as well? What benefits could the two technologies bring to each other?
Let me take these questions and provide a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premise strategy we may need to apply it to our own infrastructure and make it work from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies and of what could benefit or limit your AI use cases in an enterprise environment. An interactive demo will give you some insights into approaches I have already gotten working for real.
DevOps and Testing slides at DASA ConnectKari Kakkonen
Slides by me and Rik Marselis from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps looks like. We closed with a lovely workshop in which participants explored different ways to think about quality and testing in different parts of the DevOps infinity loop.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring of JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence farming facilities, spread across over 85 cities around the world. In addition, Sectrio runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Presentation - Model Efficiency for Edge AI
1. The future of model efficiency for edge AI
Chirag Patel, Engineer, Principal/Manager, Qualcomm AI Research
Tijmen Blankevoort, Director of Engineering, Qualcomm AI Research
September 21, 2022
@QCOMResearch
Qualcomm AI Research is an initiative of Qualcomm Technologies, Inc.
2. Our presenters
Chirag Patel, Engineer, Principal/Manager, Qualcomm AI Research
Tijmen Blankevoort, Director, Engineering, Qualcomm AI Research
Agenda
1. Why model efficiency is important for on-device AI
2. Overview of integer quantization (INT) versus floating point (FP)
3. Improving low-bit quantization
4. Open-source tools: AI Model Efficiency Toolkit (AIMET) and AIMET Model Zoo
5. Questions?
AIMET and AIMET Model Zoo are products of Qualcomm Innovation Center, Inc.
3. AI is being used all around us
Increasing productivity, enhancing collaboration, and transforming industries
Smartphones, smart homes, smart cities, smart factories, video monitoring, video conferencing, extended reality, and autonomous vehicles
4. AI is being powered by the explosive growth of deep neural networks
Deep neural networks are energy hungry and growing fast
[Chart: weight parameter count by year, 1940-2030, log scale from 10^0 to 10^14]
1943: First NN (+/- N=10)
1988: NetTalk (+/- N=20K)
2009: Hinton's Deep Belief Net (+/- N=10M)
2013: Google/Y! (+/- N=1B)
2017: Very large neural networks (N=137B)
2021: Extremely large neural networks (N=1.6T)
2025: N = 100T = 10^14
Will we have reached the capacity of the human brain?
Energy efficiency of the human brain is estimated to be 100,000x better than current hardware
Source: Welling
5. Power and thermal efficiency are essential for on-device AI
The challenge of AI workloads:
- Very compute intensive
- Large, complicated neural network models
- Complex concurrencies
- Always-on
- Real-time
Constrained mobile environment:
- Must be thermally efficient for sleek, ultra-light designs
- Storage/memory bandwidth limitations
- Requires long battery life for all-day use
6. Holistic model efficiency research
Multiple axes to shrink AI models and efficiently run them on hardware:
- Quantization: learning to reduce bit-precision while keeping desired accuracy
- Conditional compute: learning to execute only parts of a large inference model based on the input
- Neural architecture search: learning to design smaller neural networks that are on par with or outperform hand-designed architectures on real hardware
- Compilation: learning to compile AI models for efficient hardware execution
7. Leading AI research and fast commercialization
Driving the industry towards integer inference and power-efficient AI
Quantization research:
- Relaxed Quantization (ICLR 2019)
- Data-free Quantization (ICCV 2019)
- AdaRound (ICML 2020)
- Joint Pruning and Quantization (ECCV 2020)
- Bayesian Bits (NeurIPS 2020)
- Transformer Quantization (EMNLP 2021)
- Overcoming Oscillations (ICML 2022)
- FP8 Quantization (NeurIPS 2022)
Quantization open-sourcing:
- AI Model Efficiency Toolkit (AIMET)
- AIMET Model Zoo
AIMET and AIMET Model Zoo are products of Qualcomm Innovation Center, Inc.
8. Leading research to efficiently quantize AI models
Promising results show that low-precision integer inference can become widespread
Virtually the same accuracy between an FP32 and a quantized AI model through:
- Automated, data-free, post-training methods
- Automated training-based mixed-precision methods
Significant performance-per-watt improvements through quantization: automated reduction in precision of weights and activations while maintaining accuracy
Models trained at high precision: 32-bit floating point (e.g., 3452.3194)
Inference at lower precision:
- 16-bit integer (e.g., 3452): up to 4X increase in performance per watt from savings in memory and compute1
- 8-bit integer (e.g., 255): up to 16X increase in performance per watt from savings in memory and compute1
- 4-bit integer (e.g., 15): up to 64X increase in performance per watt from savings in memory and compute1
1: FP32 model compared to quantized model
9. What does it mean to quantize a neural network?
Weight and activation quantization can have different bit-precisions to maintain accuracy
- Simulated quantization ops are added in the neural network after each usage of weights and activations, and after every 'operation'
- Quantization is generally simulated in floating point instead of running in integer math
- Weights and activations can be quantized with the same or different precisions within a model layer
- For example, W8A16 uses quantized 8-bit weights and 16-bit activations; INT8 means quantized 8-bit weights and 8-bit activations
[Diagram: Input -> Conv / FC (weights and biases pass through a weight quantizer with a chosen bit-width) -> ReLU -> activation quantizer with a chosen bit-width -> Output]
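To make the idea of simulated ("fake") quantization concrete, here is a minimal sketch of a uniform affine quantizer in PyTorch. It is an illustration only, not AIMET's implementation; the bit-width, the min-max range setting, and the per-tensor granularity are assumptions chosen for brevity.

```python
import torch

def fake_quantize(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Simulate uniform affine quantization in floating point (a 'fake quant' op).

    The tensor is scaled to the integer grid, rounded, clamped to the
    representable range, and mapped back to floating point.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = x.min(), x.max()
    scale = (x_max - x_min).clamp(min=1e-8) / (qmax - qmin)   # quantization step size
    zero_point = torch.round(qmin - x_min / scale)            # integer offset of real zero
    x_int = torch.round(x / scale) + zero_point               # quantize
    x_q = torch.clamp(x_int, qmin, qmax)                      # clip to the integer grid
    return (x_q - zero_point) * scale                         # dequantize back to float

# Example: quantize a weight tensor to 8 bits and to 4 bits and inspect the error
w = torch.randn(64, 32)
w_int8 = fake_quantize(w, num_bits=8)
w_int4 = fake_quantize(w, num_bits=4)
print((w - w_int8).abs().max(), (w - w_int4).abs().max())
```

Running the example shows the quantization error growing as the bit-width shrinks, which is exactly the trade-off the following slides explore.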
10. What algorithm to choose to improve accuracy?
Post-training quantization (PTQ):
- Takes a pre-trained FP32 model and converts it directly into a fixed-point network
- No need for the original training pipeline; data-free or a small (unlabeled) calibration set is needed
- Simple usage (a single API call)
- Might not reach as high accuracy as QAT
Quantization-aware training (QAT):
- Trains/fine-tunes the network with the simulated quantization operations in place
- Requires access to the training pipeline and labelled data
- Longer training times and hyper-parameter tuning
- Achieves higher accuracy, especially for lower bit-widths
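As an illustration of what "training with the simulated quantization operations in place" means, here is a minimal QAT sketch using a straight-through estimator (STE) in PyTorch. The symmetric quantizer, the fixed scale, and the single linear layer are simplifying assumptions; real QAT pipelines typically also learn the quantization ranges.

```python
import torch

class FakeQuantSTE(torch.autograd.Function):
    """Round to the integer grid in the forward pass; pass gradients straight through backward."""
    @staticmethod
    def forward(ctx, x, scale, qmin, qmax):
        return torch.clamp(torch.round(x / scale), qmin, qmax) * scale

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: ignore the rounding when computing gradients.
        return grad_output, None, None, None

def qat_linear(x, weight, scale=0.05, num_bits=4):
    # Symmetric fake quantization of the weights: the forward pass sees quantized values,
    # while the latent FP32 weights keep receiving gradients.
    qmax = 2 ** (num_bits - 1) - 1
    w_q = FakeQuantSTE.apply(weight, scale, -qmax - 1, qmax)
    return x @ w_q.t()

# One hypothetical fine-tuning step on random data.
w = torch.randn(16, 8, requires_grad=True)
x, target = torch.randn(4, 8), torch.randn(4, 16)
loss = torch.nn.functional.mse_loss(qat_linear(x, w), target)
loss.backward()  # gradients flow to the latent FP32 weights via the STE
```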
11. Which is the better format for quantizing neural networks?
Floating point vs integer
12. Formats
INT8 and FP8 have the same number of values but different distributions
Multiple FP8 formats exist, and they consume more power than INT8
INT: z = s · m (sign bit plus 7-bit mantissa)
FP: z = s · m · 2^e (sign bit plus mantissa and exponent bits)
where s: scale; m: mantissa; e: exponent; S: sign
FP8 variants split the remaining 7 bits between mantissa and exponent: FP8 5/2, FP8 4/3, FP8 3/4, and FP8 2/5 (mantissa/exponent); the variants with more exponent bits are the formats most commonly proposed in the industry
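To see how the same 256 code points spread out differently, here is a small sketch that enumerates the values of a simple sign/exponent/mantissa FP8 format (with an assumed IEEE-style bias, subnormals, and no infinities/NaNs) and compares the grid to INT8. The exact bias and special-value handling differ between proposed FP8 formats, so treat this as illustrative only.

```python
import numpy as np

def fp8_values(mantissa_bits, exponent_bits, bias=None):
    """Enumerate the values of a simple sign/exponent/mantissa FP8 format:
    z = (-1)^S * m * 2^e, with subnormals and no infinities/NaNs."""
    if bias is None:
        bias = 2 ** (exponent_bits - 1) - 1          # assumed IEEE-style exponent bias
    values = []
    for e in range(2 ** exponent_bits):
        for m in range(2 ** mantissa_bits):
            if e == 0:                                # subnormal: no implicit leading 1
                mag = (m / 2 ** mantissa_bits) * 2 ** (1 - bias)
            else:                                     # normal: implicit leading 1
                mag = (1 + m / 2 ** mantissa_bits) * 2 ** (e - bias)
            values += [mag, -mag]
    return np.unique(values)

int8_grid = np.arange(-128, 128)                      # 256 evenly spaced values
fp8_4m3e = fp8_values(mantissa_bits=4, exponent_bits=3)
# Roughly the same number of code points, but the FP8 grid clusters them near zero.
print(len(int8_grid), int8_grid.max(), len(fp8_4m3e), fp8_4m3e.max())
```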
13. Most layers of models do not have large outliers
FP8 may be useful for model layers with large outliers
[Charts: weight distributions (uniform, normal, and outlier-heavy with some outliers) and the resulting signal-to-noise ratio (SNR, higher is better) for INT8 and the FP8 formats 5M2E, 4M3E, 3M4E, and 2M5E]
14. Several FP8 formats are required to get the best PTQ inference results
For different networks, different formats are better; it depends on the amount of outliers
Supporting multiple formats in hardware is expensive
Model                   | FP32   | Best FP8 result | Worst FP8 result
ResNet18                | 69.72% | 69.66%          | 64.92%
MobileNetV2             | 71.70% | 71.06%          | 49.51%
BERT                    | 83.06  | 82.80           | 71.56
SalsaNext               | 55.80  | 55.67           | 55.12
HRNet                   | 81.05  | 81.04           | 80.77
DeepLabV3 (MobileNetV2) | 72.91  | 72.58           | 37.93
ViT                     | 77.75% | 77.71%          | 76.69%
The best and worst FP8 format (mantissa/exponent split) differs per model
"FP8 Quantization: The Power of the Exponent", NeurIPS 2022
15. INT8 has similar results as FP8 with QAT
Outliers can be suitably trained with QAT
Model       | FP32   | INT8 (PTQ) | Best FP8* (PTQ) | INT8 (QAT) | Best FP8* (QAT)
ResNet      | 69.72% | 69.55%     | 69.66%          | 70.43%     | 69.82%
MobileNetV2 | 71.70% | 70.94%     | 71.06%          | 71.82%     | 71.54%
BERT        | 83.06  | 71.03      | 82.80           | 83.26      | 83.70
DeepLabV3   | 72.91  | 71.24      | 72.58           | 73.99      | 72.41
Notes: no PTQ tricks, per-channel quantization for PTQ; all QAT results use per-tensor quantization; the FP8 mantissa and exponent format was optimized for this comparison
*: Best FP8 is the best result from testing the different FP8 formats.
"FP8 Quantization: The Power of the Exponent", NeurIPS 2022
16. No real gap between FP8 and INT8
INT W8A16 accuracy is better than FP8 for all models with PTQ
Model       | FP32   | INT (W8A8) | INT (W8A16) | Best FP8 result
ResNet18    | 69.72% | 69.55%     | 69.75%      | 69.66%
HRNet       | 81.05  | 80.93      | 81.08       | 81.04
BERT        | 83.06  | 71.03      | 82.90       | 82.80
SalsaNext   | 55.80  | 54.22      | 55.82       | 55.67
ViT         | 77.75% | 76.39%     | 77.73%      | 77.71%
MobileNetV2 | 69.72% | 69.55%     | 69.75%      | 69.66%
PTQ settings: min-max range setting, per-channel quantization
17. INT16 performs better than FP16 unless there are large outliers
- 1,000 samples of X ~ Normal(0, 1)
- We add one outlier and vary its value
- INT13 performs comparably to FP16 in terms of MSE
[Chart: quantization MSE (log scale, 10^-7 to 10^-4) versus outlier value (0 to 2000) for int16 and float16; no activation function, sigma = 1.0, 100 neurons]
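The experiment on this slide can be approximated in a few lines of NumPy: quantize draws from N(0, 1) plus a single outlier with a symmetric min-max INT16 quantizer and compare the error to a plain float16 cast. The absmax range setting and the specific outlier values are assumptions for illustration.

```python
import numpy as np

def int_quant_mse(x: np.ndarray, num_bits: int = 16) -> float:
    """MSE of symmetric min-max (absmax) integer quantization of x."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = np.abs(x).max() / qmax             # a single large outlier stretches this range
    x_q = np.clip(np.round(x / scale), -qmax - 1, qmax) * scale
    return float(np.mean((x - x_q) ** 2))

def fp16_mse(x: np.ndarray) -> float:
    """MSE of casting x to IEEE float16 and back."""
    return float(np.mean((x - x.astype(np.float16).astype(np.float64)) ** 2))

rng = np.random.default_rng(0)
base = rng.normal(0.0, 1.0, size=1000)
for outlier in (10.0, 100.0, 1000.0):
    x = np.append(base, outlier)
    print(outlier, int_quant_mse(x, 16), fp16_mse(x))
```

As the outlier grows, the INT16 error rises with the stretched range, while the float16 error stays roughly constant, mirroring the crossover shown in the chart.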
18. INT16 outperforms FP16 in accuracy and runs faster in hardware
Model           | FP32   | FP16   | INT16
MobileNetV2     | 71.74% | 71.69% | 71.74%
EfficientDet-D1 | 40.08  | 40.07  | 40.07
19. Integer quantization is the way to do AI inference
Enabled through PTQ and QAT techniques
Mixed precision gives the best of both worlds, using extra precision only when necessary
                             | INT4 | INT8   | INT16
Power efficiency and latency | Best | Better | Good
Accuracy                     | Good | Better | Best
20. Overcoming oscillations in quantization-aware training
Improving quantization-aware training at lower bit-widths
21. Validation accuracy for QAT is typically unstable
Why do we see the validation accuracy drop for 4-bit QAT?
Poor validation accuracy is consistent across various learning rates and epochs during QAT
[Charts: training accuracy (roughly 62-71%) and validation accuracy (roughly 42-70%) versus epoch (2-20) for different learning rates (LR)]
"Overcoming Oscillations in Quantization-Aware Training" (ICML 2022)
22. Oscillations are present in QAT
Example of MobileNetV2 training (last 1000 iterations of training)
[Charts: quantized weights q(w), showing the sign and lowest bit of a 4-bit weight flipping across QAT iterations, and the corresponding latent FP weights w, zoomed in on positive values around 0.5 (0.4996-0.5004)]
23. Oscillating weights are harmful when training a model
Corrupts batch norm statistics:
- At inference, BN uses running statistics from training
- Oscillations lead to big changes in the statistics, so the running statistics are not a good estimate
- Solution: BN re-estimation
Network     | Bits | Acc. before BN re-estimation | Acc. after BN re-estimation
MobileNetV2 | 8    | 71.79 ± 0.07                 | 71.89 ± 0.05
MobileNetV2 | 4    | 68.99 ± 0.44                 | 71.01 ± 0.05 (+2.02)
MobileNetV2 | 3    | 64.97 ± 1.23                 | 69.50 ± 0.04 (+4.53)
Disrupts model convergence:
- At the end of training, oscillating weights may not be on the correct 'side'
- Stochastic rounding (SR) and binary optimization (AdaRound) show that they are indeed not in the best possible state
- Oscillations prevent the network from converging to the best local minimum
Method          | Train Loss      | Val. Acc. (%)
Baseline        | 1.3566          | 69.50
SR (mean ± std) | 1.3547 ± 0.0053 | 69.58 ± 0.09
SR (best)       | 1.3391          | 69.85
AdaRound        | 1.3070          | 70.12
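Since the slide points to BN re-estimation as the fix for corrupted statistics, here is a minimal PyTorch sketch of how the running statistics can be recomputed after QAT. The number of calibration batches, the cumulative-average momentum setting, and the (images, labels) loader format are assumptions for illustration.

```python
import torch

@torch.no_grad()
def reestimate_bn_stats(model, data_loader, num_batches=50):
    """Recompute BatchNorm running statistics, which oscillating weights corrupt during QAT."""
    for m in model.modules():
        if isinstance(m, torch.nn.modules.batchnorm._BatchNorm):
            m.reset_running_stats()
            m.momentum = None          # None -> cumulative moving average over the batches seen
    model.train()                       # BN only updates running statistics in train mode
    for i, (images, _) in enumerate(data_loader):
        if i >= num_batches:
            break
        model(images)                   # forward passes refresh the statistics
    model.eval()
    return model
```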
24. Higher oscillation frequencies during QAT negatively impact accuracy
An oscillation occurs when the integer value of a weight changes and the change is in the opposite direction to the previous one
EMA = Exponential Moving Average
25. Oscillation dampening and iterative freezing fix the QAT issue
Dampening takes a regularizing approach: the weights are forced closer to the bin center
Freezing the oscillating weights stabilizes training and mitigates the unwanted effects of oscillations
[Histograms: distribution of wint − w/s over [−0.4, 0.4] for dampening, and for freezing with frozen versus not-frozen weights]
"Overcoming Oscillations in Quantization-Aware Training" (ICML 2022)
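As a rough sketch of the two ideas, not the exact algorithm from the paper (which tracks oscillation frequency with an exponential moving average and freezes weights to their most frequent integer state), the dampening term and a simple flip-count-based freezing rule could look like this:

```python
import torch

def dampening_loss(w, scale, strength=1e-3):
    """Regularizer that pulls latent weights towards the center of their quantization bin,
    discouraging oscillation between adjacent integer values during QAT."""
    w_int = torch.round(w / scale)                    # nearest integer grid point
    return strength * ((w / scale - w_int) ** 2).sum()

def update_freeze_mask(w, prev_int, flip_count, scale, threshold=5):
    """Count how often each weight's integer value flips; weights that flip too often are
    marked as frozen (their gradients can then be zeroed for the rest of training)."""
    cur_int = torch.round(w / scale)
    flip_count = flip_count + (cur_int != prev_int).int()
    frozen = flip_count >= threshold
    return cur_int, flip_count, frozen

# In a QAT step, the dampening term is simply added to the task loss:
#   total_loss = task_loss + dampening_loss(weight, scale)
```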
26. We achieve SOTA results for INT4 quantization1
- Train with learned step-size quantization (LSQ2) and re-estimation
- Dampening and freezing perform on par with each other
MobileNetV3
Method                      | W/A   | Val. Acc. (%)
Full-precision              | 32/32 | 65.1
LSQ* (Esser et al., 2020)   | 4/4   | 61.0 (−4.1)
LSQ + BR (Han et al., 2021) | 4/4   | 61.5 (−3.6)
LSQ + Dampen (ours)         | 4/4   | 63.7 (−1.4)
LSQ + Freeze (ours)         | 4/4   | 63.6 (−1.5)
MobileNetV2
Method                      | W/A   | Val. Acc. (%)
Full-precision              | 32/32 | 71.7
LSQ* (Esser et al., 2020)   | 4/4   | 69.5 (−2.3)
LSQ + BR (Han et al., 2021) | 4/4   | 70.4 (−1.4)
LSQ + Dampen (ours)         | 4/4   | 70.5 (−1.2)
LSQ + Freeze (ours)         | 4/4   | 70.6 (−1.1)
1: "Overcoming Oscillations in Quantization-Aware Training" (ICML 2022)
2: "Learned step size quantization" (ICLR 2020)
28. AIMET makes AI models small
Open-sourced GitHub project that includes state-of-the-art quantization and compression techniques from Qualcomm AI Research
Features:
- State-of-the-art network compression tools
- State-of-the-art quantization tools
- Support for both TensorFlow and PyTorch
- Benchmarks and tests for many models
- Developed by professional software developers
Workflow: trained AI model (TensorFlow or PyTorch) -> AI Model Efficiency Toolkit (AIMET): compression and quantization -> optimized AI model -> deployed AI model
If interested, please join the AIMET GitHub project: https://github.com/quic/aimet
29. AIMET: providing advanced model efficiency features and benefits
Features:
- Quantization: state-of-the-art INT8 and INT4 performance; quantization simulation; quantization-aware training (QAT); post-training quantization (PTQ) methods: Data-Free Quantization, Adaptive Rounding (AdaRound), Automatic Mixed Precision (AMP), and AutoQuant
- Compression: efficient tensor decomposition and removal of redundant channels in convolution layers; spatial singular value decomposition (SVD); channel pruning
- Visualization: analysis tools for drawing insights for quantization and compression; weight ranges; per-layer compression sensitivity
Benefits:
- Lower memory bandwidth
- Lower power
- Lower storage
- Higher performance
- Maintains model accuracy
- Simple ease of use
30. AIMET features and APIs are easy to use
Designed to fit naturally in the AI model development workflow for researchers, developers, and ISVs
Typical model training workflow: PyTorch model -> Train -> Evaluate
User-friendly QAT workflow in AIMET: the same PyTorch model, train, and evaluate steps (no change, same API), with a Create QuantSim step added to the pipeline
User-friendly APIs invoked directly from the existing model pipeline
Example Jupyter notebooks on AIMET GitHub
AIMET architecture: a model optimization library (techniques to compress and quantize models) exposed through framework-specific and algorithm APIs, with AIMET extensions for PyTorch, TensorFlow, and other frameworks
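For orientation, a rough sketch of how AIMET's quantization simulation is typically created and calibrated from an existing PyTorch pipeline is shown below. Exact import paths, argument names, and defaults vary across AIMET releases, so treat this as an outline rather than a definitive API reference.

```python
import torch
from torchvision.models import mobilenet_v2
# Import path follows the AIMET PyTorch package; it may differ between AIMET releases.
from aimet_torch.quantsim import QuantizationSimModel

model = mobilenet_v2().eval()                     # placeholder model; use your trained network
dummy_input = torch.randn(1, 3, 224, 224)

# Wrap the model with simulated quantization ops (8-bit weights and activations assumed here).
sim = QuantizationSimModel(model, dummy_input=dummy_input,
                           default_param_bw=8, default_output_bw=8)

# Calibrate the quantizer ranges by running representative, unlabeled data through the model.
def forward_pass(sim_model, _):
    with torch.no_grad():
        sim_model(dummy_input)                    # replace with a real calibration data loader

sim.compute_encodings(forward_pass, None)

# sim.model can now be evaluated as-is (PTQ), fine-tuned with the usual training loop (QAT),
# and exported for the target runtime.
```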
31. First 4K super-resolution demo at 100+ FPS on mobile
Our new machine-learning-based super-resolution method
8-bit quantized model created using AIMET QAT
[Images: low-resolution input versus super-resolution output]
32. With better PTQ and QAT techniques, more models will achieve better power efficiency
AIMET enables accurate INT W4A8 for a wide range of use cases
Task             | Model              | FP32     | INT W4A8
Classification   | ResNet50           | 76.10%   | 75.4%
Classification   | ResNet18           | 69.75%   | 68.96%
Classification   | EfficientNet-Lite  | 75.31%   | 74.33%
Classification   | Regnext            | 78.3%    | 77.2%
Segmentation     | DeepLabV3 (RN-50)  | 76.07%   | 75.91%
Super-resolution | ABPN               | 31.97 dB | 31.67 dB
Pose detection   | PoseNet (HRNet-32) | 0.765    | 0.763
33. AIMET quantizes transformers with high accuracy, comparable to FP32
Comparison between FP32 model and model quantized with AIMET
Model                     | Benchmark/metric | FP32  | Quantized
ViT base                  | Top-1 accuracy   | 81.30 | 80.88 (INT8/W8A16, PTQ)
RoBERTa base              | GLUE             | 84.99 | 84.60 (INT8, QAT)
BERT base (uncased)       | GLUE             | 82.73 | 81.95 (INT8, QAT)
DistilBERT base (uncased) | GLUE             | 79.21 | 78.61 (INT8, QAT)
34. AIMET Model Zoo
Accurate pre-trained 8-bit quantized models for image classification, object detection, semantic segmentation, pose estimation, super resolution, and speech recognition
35. AIMET Model Zoo includes popular quantized AI models
Accuracy is maintained for INT8 models (less than 1% loss*)
Model                | Metric          | FP32   | INT8
ResNet-50 (v1)       | Top-1 accuracy* | 75.21% | 74.96%
MobileNet-v2-1.4     | Top-1 accuracy* | 75%    | 74.21%
EfficientNet Lite    | Top-1 accuracy* | 74.93% | 74.99%
ResNet-50 (v1)       | mAP*            | 0.2469 | 0.2456
RetinaNet            | mAP*            | 0.35   | 0.349
Pose estimation      | mAP*            | 0.383  | 0.379
SRGAN                | PSNR*           | 25.45  | 24.78
MobileNetV2          | Top-1 accuracy* | 71.67% | 71.14%
EfficientNet-lite0   | Top-1 accuracy* | 75.42% | 74.44%
DeepLabV3+           | mIoU*           | 72.62% | 72.22%
MobileNetV2-SSD-Lite | mAP*            | 68.7%  | 68.6%
Pose estimation      | mAP*            | 0.364  | 0.359
SRGAN                | PSNR            | 25.51  | 25.5
DeepSpeech2          | WER*            | 9.92%  | 10.22%
ABPN                 | PSNR            | 32.75  | 32.69
*: Comparison between FP32 model and INT8 model quantized with AIMET. For further details, check out: https://github.com/quic/aimet-model-zoo/
36. Super resolution model suite
Wide variety of models, suited for fast, energy-efficient INT8 inference
- Virtually no accuracy loss compared to FP32
- Simple and convenient for developer integration
- Useful across diverse applications, from gaming and photography to XR and autonomous driving
INT8 PSNR and visual quality comparable to FP32*
Model    | PSNR (dB), FP32 | PSNR (dB), INT8
ABPN1    | 32.71           | 32.64
XLSR2    | 32.57           | 32.30
SESR-M33 | 32.41           | 32.25
SESR-M73 | 32.66           | 32.58
SESR-XL3 | 33.03           | 32.92
1: Anchor-based Plain Net (ABPN)
2: Robust Real-Time Single-Image Super Resolution (XLSR)
3: Super-Efficient Super Resolution (SESR)
*: Comparison between FP32 model and INT8 model quantized with AIMET. For further details, check out: https://github.com/quic/aimet-model-zoo/
37. Explore our open-source projects and tools
AIMET: state-of-the-art quantization and compression techniques (github.com/quic/aimet)
AIMET Model Zoo: accurate pre-trained 8-bit quantized models (github.com/quic/aimet-model-zoo)
Quantization whitepaper: arxiv.org/abs/2201.08442
38. [Diagram: the AI software stack, from frameworks and runtimes down to silicon and across platforms]
AI frameworks and runtimes: Qualcomm® Neural Processing SDK, TF Lite, TF Lite Micro, Direct ML, Qualcomm® AI Engine Direct (QNN)
Programming languages, virtual platforms, core libraries, math libraries, compilers, profilers & debuggers
System interface: SoC, accelerator drivers
Tools and infrastructure: emulation support, Qualcomm AI Model Studio, AIMET, AIMET Model Zoo, NAS, model analyzers
Platforms: smartphones, auto, XR, robotics, IoT, ACPC, cloud
Qualcomm Neural Processing SDK, Qualcomm AI Model Studio, and Qualcomm AI Engine Direct are products of Qualcomm Technologies, Inc. and/or its subsidiaries
39. Model efficiency is key for enabling on-device AI and accelerating the growth of the connected intelligent edge
INT8/16 perform better than FP8/16
Qualcomm AI Research is enabling 4-bit integer models
AIMET is making fixed-point quantization possible at scale without sacrificing accuracy