For the full video of this presentation, please visit:
https://www.embedded-vision.com/platinum-members/embedded-vision-alliance/embedded-vision-training/videos/pages/may-2017-embedded-vision-summit-leontiev
For more information about embedded vision, please visit:
http://www.embedded-vision.com
Anton Leontiev, Embedded Software Architect at ELVEES, JSC, presents the "Designing a Stereo IP Camera From Scratch" tutorial at the May 2017 Embedded Vision Summit.
As the number of cameras in an intelligent video surveillance system increases, server processing of the video quickly becomes a bottleneck. On the other hand, when computer vision algorithms are moved to a resource-limited camera platform, their output quality is often unsatisfactory.
The effectiveness of vision algorithms for surveillance can be greatly improved by using a depth map in addition to the regular image. Thus, using a stereo camera is a way to enable offloading of advanced algorithms from servers to IP cameras. This talk covers the main problems arising during the design of an embedded stereo IP camera, including capturing video streams from two sensors, frame synchronization between sensors, stereo calibration algorithms, and, finally, disparity map calculation.
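Disparity computation, the last step mentioned above, can be sketched with naive block matching (a toy illustration, not the camera's actual pipeline; a production system would use an optimized matcher such as OpenCV's StereoBM/SGBM):

```python
import numpy as np

def disparity_map(left, right, max_disp=16, block=5):
    """Naive block-matching stereo: for each left-image pixel, find the
    horizontal shift of the right image that minimizes the sum of
    absolute differences (SAD) over a small block."""
    h, w = left.shape
    half = block // 2
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(half, h - half):
        for x in range(half + max_disp, w - half):
            patch = left[y - half:y + half + 1, x - half:x + half + 1]
            costs = [np.abs(patch - right[y - half:y + half + 1,
                                          x - d - half:x - d + half + 1]).sum()
                     for d in range(max_disp + 1)]
            disp[y, x] = int(np.argmin(costs))
    return disp

# Synthetic rectified pair: a textured scene shifted 4 pixels between views.
rng = np.random.default_rng(0)
left = rng.random((32, 64))
right = np.roll(left, -4, axis=1)  # every scene point appears 4 px to the left
d = disparity_map(left, right, max_disp=8)
print(d[16, 30])  # → 4
```

With calibrated cameras, depth is then recovered from disparity as depth = focal_length * baseline / disparity.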
Extend Your Journey: Introducing Signal Strength into Location-based Applications (Chih-Chuan Cheng)
Reducing communication energy is essential to facilitating the growth of emerging mobile applications. In this paper, we introduce signal strength into location-based applications to reduce the energy consumption of mobile devices for data reception. First, we model the problem of data fetch scheduling, with the objective of minimizing the energy required to fetch location-based information without adversely impacting user experience. Then, we propose a dynamic-programming algorithm to solve the fundamental problem and prove its optimality in terms of energy savings. We also provide an optimality condition with respect to signal strength fluctuations. Finally, based on the algorithm, we consider implementation issues. We have also developed a virtual tour system integrated with existing web applications to validate the practicability of the proposed concept. The results of experiments conducted on real-world case studies are very encouraging.
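The paper's exact formulation is not reproduced here, but the flavor of such a dynamic program can be sketched with a made-up model: items with sizes and deadlines, a per-slot reception energy derived from signal strength, and a hypothetical fixed radio wake-up cost per active slot:

```python
import numpy as np

def min_fetch_energy(sizes, deadlines, e, wake=2.0):
    """Illustrative DP (a sketch, not the paper's formulation). Item i has
    sizes[i] units of data and must be fetched in some slot t <= deadlines[i];
    e[t] is the per-unit reception energy in slot t (weaker signal -> larger e)
    and `wake` is a hypothetical fixed cost for powering the radio in a slot.
    dp[k] = least energy with the first k deadline-ordered items fetched."""
    order = np.argsort(deadlines)
    sizes = np.asarray(sizes, dtype=float)[order]
    deadlines = np.asarray(deadlines)[order]
    n = len(sizes)
    INF = float("inf")
    dp = [0.0] + [INF] * n
    for t in range(len(e)):
        new = dp[:]                        # option: radio stays off in slot t
        for k in range(n):
            # item k has the tightest remaining deadline; it bounds the batch
            if dp[k] == INF or deadlines[k] < t:
                continue
            total = 0.0
            for j in range(k + 1, n + 1):  # fetch items k..j-1 together in slot t
                total += sizes[j - 1]
                new[j] = min(new[j], dp[k] + wake + e[t] * total)
        dp = new
    return dp[n]

# Three items; slot 1 has the best signal and all items can legally wait for it.
print(min_fetch_energy(sizes=[3, 1, 2], deadlines=[1, 2, 2], e=[5.0, 1.0, 4.0]))
# → 8.0 (wake once in slot 1 and fetch everything: 2.0 + 1.0 * 6)
```

Batching fetches into the slots with the strongest signal is exactly the behavior the wake-up cost and per-slot energies reward.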
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2021/10/introduction-to-simultaneous-localization-and-mapping-slam-a-presentation-from-gareth-cross/
Independent game developer (and former technical lead of state estimation at Skydio) Gareth Cross presents the “Introduction to Simultaneous Localization and Mapping (SLAM)” tutorial at the May 2021 Embedded Vision Summit.
This talk provides an introduction to the fundamentals of simultaneous localization and mapping (SLAM). Cross aims to provide foundational knowledge, and viewers are not expected to have any prerequisite experience in the field.
The talk consists of an introduction to the concept of SLAM, as well as practical design considerations in formulating SLAM problems. Visual inertial odometry is introduced as a motivating example of SLAM, and Cross explains how this problem is structured and solved.
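As a rough illustration of the kind of estimation problem SLAM solves (not material from the talk itself), a 1D toy version can be posed as a linear least-squares problem over robot poses and one landmark, combining odometry and range factors:

```python
import numpy as np

# Minimal 1D "graph SLAM" sketch: jointly estimate four robot poses and one
# landmark position from odometry and range measurements via least squares.
# The measurement values below are made-up noisy data.
u = [1.0, 1.0, 1.0]          # odometry: x_{i+1} - x_i  (true step = 1)
z = [5.0, 4.1, 2.9, 2.0]     # range to landmark: l - x_i (landmark near 5)

rows, rhs = [], []
rows.append([1, 0, 0, 0, 0]); rhs.append(0.0)    # prior: anchor x0 at 0
for i, ui in enumerate(u):                        # odometry factors
    r = [0] * 5; r[i] = -1; r[i + 1] = 1
    rows.append(r); rhs.append(ui)
for i, zi in enumerate(z):                        # landmark factors
    r = [0] * 5; r[i] = -1; r[4] = 1
    rows.append(r); rhs.append(zi)

A, b = np.array(rows, dtype=float), np.array(rhs)
v, *_ = np.linalg.lstsq(A, b, rcond=None)         # v = [x0, x1, x2, x3, l]
print(np.round(v, 2))
```

Real SLAM systems solve the same kind of joint optimization, but over nonlinear 3D pose and landmark variables, typically with iterative solvers over a factor graph.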
Scratch to Supercomputers: Bottoms-up Build of Large-scale Computational Lensing Software (inside-BigData.com)
In this deck from the 2018 Swiss HPC Conference, Gilles Fourestey from EPFL presents: Scratch to Supercomputers: Bottoms-up Build of Large-scale Computational Lensing Software.
"LENSTOOL is gravitational lensing software that models the mass distribution of galaxies and galaxy clusters. It was developed by Prof. Kneib, head of the LASTRO lab at EPFL, and collaborators, beginning in 1996. It is used to obtain sub-percent-precision measurements of the total mass in galaxy clusters and to constrain the dark matter self-interaction cross-section, a crucial ingredient for understanding its nature.
However, LENSTOOL lacks efficient vectorization and uses only OpenMP, which limits its execution to a single node and can lead to execution times exceeding several months. Therefore, LASTRO and the EPFL HPC group decided to rewrite the code from scratch. To minimize risk and maximize performance, they used a bottom-up approach focused on exposing parallelism at the hardware and instruction levels. The result is a high-performance code, fully vectorized on Xeon, Xeon Phi and GPU platforms, that currently scales to hundreds of nodes on CSCS's Piz Daint, one of the fastest supercomputers in the world."
Watch the video: https://wp.me/p3RLHQ-ili
Learn more: https://infoscience.epfl.ch/record/234382/files/EPFL_TH8338.pdf?subformat=pdfa
and
http://www.hpcadvisorycouncil.com/events/2018/swiss-workshop/agenda.php
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
For the full video of this presentation, please visit:
https://www.embedded-vision.com/platinum-members/embedded-vision-alliance/embedded-vision-training/videos/pages/may-2017-embedded-vision-summit-kim
For more information about embedded vision, please visit:
http://www.embedded-vision.com
Minyoung Kim, Senior Research Engineer at Panasonic Silicon Valley Laboratory, presents the "A Fast Object Detector for ADAS using Deep Learning" tutorial at the May 2017 Embedded Vision Summit.
Object detection has been one of the most important research areas in computer vision for decades. Recently, deep neural networks (DNNs) have led to significant improvements in several machine learning domains, including computer vision, achieving state-of-the-art performance thanks to their theoretically proven modeling and generalization capabilities. However, it is still challenging to deploy such DNNs on embedded systems, for applications such as advanced driver assistance systems (ADAS), where computation power is limited.
Kim and her team focus on reducing the size of the network and required computations, and thus building a fast, real-time object detection system. They propose a fully convolutional neural network that can achieve at least 45 fps on 640x480 frames with competitive performance. With this network, there is no proposal generation step, which can cause a speed bottleneck; instead, a single forward propagation of the network approximates the locations of objects directly.
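The proposal-free idea can be illustrated by decoding a dense detector output grid directly into boxes in one pass (a schematic sketch with a made-up tensor layout, not the team's actual network):

```python
import numpy as np

def decode_detections(pred, stride=16, score_thresh=0.5):
    """Decode a dense detector head: pred has shape (H, W, 5) holding
    per-cell [score, dx, dy, w, h]. Every cell predicts one box around
    its own grid location -- no separate proposal-generation stage."""
    H, W, _ = pred.shape
    ys, xs = np.mgrid[0:H, 0:W]
    cx = (xs + 0.5) * stride + pred[..., 1]   # cell center + learned offset
    cy = (ys + 0.5) * stride + pred[..., 2]
    w, h = pred[..., 3], pred[..., 4]
    boxes = np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=-1)
    keep = pred[..., 0] > score_thresh
    return boxes[keep], pred[..., 0][keep]

# One confident cell at grid position (2, 3) predicting a 32x32 box.
pred = np.zeros((4, 4, 5))
pred[2, 3] = [0.9, 0.0, 0.0, 32.0, 32.0]
boxes, scores = decode_detections(pred)
print(boxes)  # [[40. 24. 72. 56.]]
```

Because decoding is a fixed, cheap post-processing step, the detector's latency is dominated by the single forward pass, which is what makes real-time rates achievable on embedded hardware.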
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2021/10/efficient-deep-learning-for-3d-point-cloud-understanding-a-presentation-from-facebook/
Bichen Wu, Research Scientist at Facebook Reality Labs, presents the “Efficient Deep Learning for 3D Point Cloud Understanding” tutorial at the May 2021 Embedded Vision Summit.
Understanding the 3D environment is a crucial computer vision capability required by a growing set of applications such as autonomous driving, AR/VR and AIoT. 3D visual information, captured by LiDAR and other sensors, is typically represented by a point cloud consisting of thousands of unstructured points.
Developing computer vision solutions to understand 3D point clouds requires addressing several challenges, including how to efficiently represent and process 3D point clouds, how to design efficient on-device neural networks to process 3D point clouds, and how to easily obtain data to train 3D models and improve data efficiency. In this talk, Wu shows how his company addresses these challenges as part of its "SqueezeSeg" research and presents a highly efficient, accurate, and data-efficient solution for on-device 3D point-cloud understanding.
Intelligent Image Enhancement and Restoration: From Prior Driven Model to Advanced Deep Learning (Wanjin Yu)
ICME 2019 Tutorial: Intelligent Image Enhancement and Restoration - From Prior Driven Model to Advanced Deep Learning. Part 3: Prior-Embedding Deep Super-Resolution.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2021/10/an-introduction-to-single-photon-avalanche-diodes-a-new-type-of-imager-for-computer-vision-a-presentation-from-the-university-of-wisconsin-madison/
Sebastian Bauer, Postdoctoral Student at the University of Wisconsin – Madison, presents the “Introduction to Single-Photon Avalanche Diodes—A New Type of Imager for Computer Vision” tutorial at the May 2021 Embedded Vision Summit.
The single-photon avalanche diode (SPAD) is an emerging image sensing technology with unique capabilities relevant to computer vision applications. Originally designed for imaging in low-light conditions, the ultra-high time resolution of SPADs also helps to achieve extremely high dynamic range, motion blur-free images and even seeing around corners. The use of SPADs in recent iPhone models has spurred increased interest in the use of SPADs in commercial products.
In this talk, Bauer introduces SPAD-based imagers, explains how they work, presents their fundamental capabilities, and identifies their key strengths and weaknesses relative to conventional image sensors. He also shows how they can be used in a variety of applications.
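The principle behind direct time-of-flight depth sensing with SPADs can be sketched by histogramming photon timestamps (a simplified single-pixel model with assumed timing parameters):

```python
import numpy as np

C = 3e8                      # speed of light, m/s
BIN = 100e-12                # assumed 100 ps timing resolution

def depth_from_timestamps(stamps, n_bins=200):
    """Single-pixel direct time-of-flight: histogram photon arrival times
    over many laser pulses and take the peak bin as the round-trip time.
    Ambient and dark counts form a roughly flat background."""
    hist, edges = np.histogram(stamps, bins=n_bins, range=(0, n_bins * BIN))
    t_peak = edges[np.argmax(hist)] + BIN / 2
    return t_peak * C / 2    # round-trip time -> one-way distance

rng = np.random.default_rng(0)
true_depth = 1.5                                  # metres
t_signal = rng.normal(2 * true_depth / C, 30e-12, size=400)  # laser returns
t_noise = rng.uniform(0, 200 * BIN, size=2000)    # ambient/dark counts
depth = depth_from_timestamps(np.concatenate([t_signal, t_noise]))
print(round(depth, 2))
```

The picosecond-scale time resolution in this model is exactly what gives SPADs their distinctive capabilities: the same timestamp data supports low-light imaging, high dynamic range and transient (around-the-corner) imaging.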
Convolutional Neural Networks for Image Classification (Cape Town Deep Learning Meet-up) (Alex Conway)
Slides for my talk on:
"Convolutional Neural Networks for Image Classification"
...at the Cape Town Deep Learning Meet-up, 20 June 2017
https://www.meetup.com/Cape-Town-deep-learning/events/240485642/
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of engineering and technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of engineering and technology. We bring together scientists, academicians, field engineers, scholars and students of related fields of engineering and technology.
For the full video of this presentation, please visit:
https://www.embedded-vision.com/platinum-members/embedded-vision-alliance/embedded-vision-training/videos/pages/may-2018-embedded-vision-summit-benosman
For more information about embedded vision, please visit:
http://www.embedded-vision.com
Ryad B. Benosman, Professor at the University of Pittsburgh Medical Center, Carnegie Mellon University and Sorbonne Université, presents the "What is Neuromorphic Event-based Computer Vision? Sensors, Theory and Applications" tutorial at the May 2018 Embedded Vision Summit.
In this presentation, Benosman introduces neuromorphic, event-based approaches for image sensing and processing. State-of-the-art image sensors suffer from severe limitations imposed by their very principle of operation. These sensors acquire the visual information as a series of "snapshots" recorded at discrete points in time, hence time-quantized at a predetermined frame rate, resulting in limited temporal resolution, low dynamic range and a high degree of redundancy in the acquired data. Nature suggests a different approach: Biological vision systems are driven and controlled by events happening within the scene in view, and not – like conventional image sensors – by artificially created timing and control signals that have no relation to the source of the visual information.
Translating the frameless paradigm of biological vision to artificial imaging systems implies that control over the acquisition of visual information is no longer imposed externally on an array of pixels; rather, decision making is transferred to each individual pixel, which handles its own information independently. Benosman introduces the fundamentals underlying such bio-inspired, event-based image sensing and processing approaches, and explores their strengths and weaknesses. He shows that bio-inspired vision systems have the potential to outperform conventional, frame-based vision acquisition and processing systems and to establish new benchmarks in terms of data compression, dynamic range, temporal resolution and power efficiency in applications such as 3D vision, object tracking, motor control and visual feedback loops, in real time.
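The per-pixel, event-driven acquisition principle can be sketched with an idealized contrast-threshold model (a simplification built from frame differences; real event sensors operate asynchronously in continuous time):

```python
import numpy as np

def events_from_frames(frames, times, thresh=0.2):
    """Idealized event-pixel model: each pixel independently emits a
    (t, x, y, polarity) event whenever its log intensity has changed by
    more than a contrast threshold since that pixel's last event."""
    ref = np.log(frames[0] + 1e-6)              # per-pixel reference level
    events = []
    for t, frame in zip(times[1:], frames[1:]):
        logf = np.log(frame + 1e-6)
        diff = logf - ref
        ys, xs = np.where(np.abs(diff) >= thresh)
        for y, x in zip(ys, xs):
            events.append((t, int(x), int(y), int(np.sign(diff[y, x]))))
            ref[y, x] = logf[y, x]              # reset reference at the event
    return events

# A dark scene in which a single pixel brightens: only that pixel fires.
f0 = np.full((4, 4), 0.1)
f1 = f0.copy(); f1[1, 2] = 0.4
evts = events_from_frames([f0, f1], times=[0.0, 0.01])
print(evts)  # [(0.01, 2, 1, 1)]
```

Static pixels produce no output at all, which is the source of the data-compression and power-efficiency advantages described above.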
This presentation was for my Honours project proposal, presented to a panel of lecturers and peers.
It outlines the problem I aimed to tackle and the issues I discovered during my research.
Recent Progress on Object Detection, 2017-03-31 (Jihong Kang)
This slide deck provides a brief summary of recent progress on object detection using deep learning.
It introduces the concepts behind selected earlier works (the R-CNN series, YOLO and SSD) and six recent papers uploaded to arXiv between December 2016 and March 2017.
Most of these papers focus on improving small-object detection performance.
Implementation of digital image watermarking techniques using DWT and DWT-SVD (eSAT Journals)
Abstract
These days, digital content is used extensively in every field. Data handled on the internet and on multimedia network systems is in digital form. Digital watermarking is a technology for embedding additional information in digital content that we want to protect from illegal copying. Digital image watermarking hides information of any form (text, image, audio or video) in an original image without degrading its perceptual quality. In the case of the Discrete Wavelet Transform (DWT), the original image is decomposed in order to embed the watermark. In the case of the hybrid scheme (DWT-SVD), the image is first decomposed by DWT, and the watermark is then embedded in the singular values obtained by applying Singular Value Decomposition (SVD). DWT and SVD are used in combination to improve the quality of the watermarking. The techniques are compared on the basis of the Peak Signal-to-Noise Ratio (PSNR) at different values of the scaling factor; a high PSNR value is desired because it indicates good imperceptibility of the method.
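The SVD step of such a scheme, and the PSNR comparison, can be sketched as follows (SVD embedding only; a full DWT-SVD method would first apply the wavelet decomposition, and `alpha` is a hypothetical scaling factor):

```python
import numpy as np

def embed_svd(img, wm, alpha=0.05):
    """SVD-based watermark embedding sketch: perturb the image's singular
    values with the watermark, scaled by alpha. A hybrid DWT-SVD scheme
    would apply this to a wavelet sub-band rather than the raw image."""
    U, S, Vt = np.linalg.svd(img, full_matrices=False)
    return U @ np.diag(S + alpha * wm) @ Vt

def psnr(a, b, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB between two images."""
    mse = np.mean((a - b) ** 2)
    return 10 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(0)
img = rng.uniform(0, 255, size=(64, 64))   # stand-in cover image
wm = rng.uniform(0, 255, size=64)          # watermark as singular-value data
marked = embed_svd(img, wm, alpha=0.05)
print(round(psnr(img, marked), 1), "dB")   # higher PSNR -> less visible mark
```

Raising the scaling factor strengthens the watermark but lowers PSNR, which is exactly the trade-off the abstract describes.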
MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures (MLAI2)
Regularization and transfer learning are two popular techniques to enhance generalization on unseen data, which is a fundamental problem of machine learning. Regularization techniques are versatile, as they are task- and architecture-agnostic, but they do not exploit a large amount of data available. Transfer learning methods learn to transfer knowledge from one domain to another, but may not generalize across tasks and architectures, and may introduce new training cost for adapting to the target task. To bridge the gap between the two, we propose a transferable perturbation, MetaPerturb, which is meta-learned to improve generalization performance on unseen data. MetaPerturb is implemented as a set-based lightweight network that is agnostic to the size and the order of the input, which is shared across the layers. Then, we propose a meta-learning framework to jointly train the perturbation function over heterogeneous tasks in parallel. As MetaPerturb is a set function trained over diverse distributions across layers and tasks, it can generalize to heterogeneous tasks and architectures. We validate the efficacy and generality of MetaPerturb trained on a specific source domain and architecture, by applying it to the training of diverse neural architectures on heterogeneous target datasets against various regularizers and fine-tuning. The results show that the networks trained with MetaPerturb significantly outperform the baselines on most of the tasks and architectures, with a negligible increase in the parameter size and no hyperparameters to tune.
The 'Rubble of the North': a solution for modelling the irregular architecture of Ireland's historic monuments (3D ICONS Project)
The 'Rubble of the North' -a solution for modelling the irregular architecture of Ireland's historic monuments - a presentation given by Rob Shaw of the Discovery Programme, Ireland at the 3D ICONS workshop at the ISPRS Technical Commission V Symposium, which was held in Riva del Garda, Italy on 23-25 June 2014.
The presentation gives an overview of the digitisation process, the challenges faced, the solutions developed and the deliverables produced.
Structured Forests for Fast Edge Detection [Paper Presentation] (Mohammad Shaker)
A paper presentation for "Structured Forests for Fast Edge Detection" by Piotr Dollár and C. Lawrence Zitnick, published at the IEEE International Conference on Computer Vision (ICCV), 2013.
Improving Hardware Efficiency for DNN Applications (Chester Chen)
Speaker: Dr. Hai (Helen) Li, Clare Boothe Luce Associate Professor of Electrical and Computer Engineering and Co-director of the Duke Center for Evolutionary Intelligence at Duke University.
In this talk, I will introduce a few recent research spotlights from the Duke Center for Evolutionary Intelligence. The talk will start with the structured sparsity learning (SSL) method, which attempts to learn a compact structure from a bigger DNN to reduce computation cost. It generates a regularized structure with high execution efficiency. Our experiments on CPU, GPU and FPGA platforms show an average 3x to 5x speedup of convolutional-layer computation for AlexNet. Then, the implementation and acceleration of DNN applications on mobile computing systems will be introduced. MoDNN is a local distributed system which partitions DNN models onto several mobile devices to accelerate computation. ApesNet is an efficient pixel-wise segmentation network which understands road scenes in real time and has achieved promising accuracy. The talk will close with our outlook on the adoption of emerging technologies, offering the audience an alternative way of thinking about the future evolution and revolution of modern computing systems.
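The structured-sparsity idea can be sketched as a group regularizer over whole convolution filters (a generic group-lasso sketch, not the exact SSL formulation from the talk):

```python
import numpy as np

def group_lasso_penalty(W, lam=1e-3):
    """Structured-sparsity-style regularizer sketch: penalize the L2 norm
    of each output-channel group of a conv weight tensor with shape
    (out_ch, in_ch, k, k), so that whole filters -- not scattered
    individual weights -- are driven to zero and can be pruned."""
    norms = np.sqrt((W ** 2).sum(axis=(1, 2, 3)))   # one norm per filter
    return lam * norms.sum(), norms

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4, 3, 3))   # toy conv layer: 8 filters
W[2] = 0.0                          # a filter the regularizer has zeroed out
penalty, norms = group_lasso_penalty(W)
print((norms < 1e-9).sum(), "of", len(norms), "filters pruned")
```

Because entire filters vanish, the pruned layer maps onto smaller dense computations that run efficiently on CPUs, GPUs and FPGAs, unlike unstructured sparsity.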
AUTO AI 2021 talk: Real-world Data Augmentations for Autonomous Driving (Ravi Kiran B.)
Modern perception pipelines in autonomous driving (AD) systems are based on Deep Neural Networks (DNNs) which use multiple hyper-parameter configurations and training strategies. Data augmentation is now a well-established training strategy for improving the generalization of DNNs, especially in a low-data regime. Self-supervised and semi-supervised learning methods depend heavily on data augmentation strategies. In this study, we view data augmentations as aiding the generalization of DNNs because they implicitly model the geometric, viewpoint-based transformations present in images/point clouds due to noise, perspective and motion of the ego-vehicle. We briefly review current data augmentation strategies for perception tasks in AD, as well as recent developments in understanding their effects on model generalization.
In the talk we shall review data augmentation strategies through two case studies:
- Improving model performance of monocular 3D object detection model by using geometry preserving data augmentations on images
- Understand the role of data augmentation in reducing data redundancy and improving label efficiency within an active learning pipeline
IOSR Journal of Electronics and Communication Engineering(IOSR-JECE) is an open access international journal that provides rapid publication (within a month) of articles in all areas of electronics and communication engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in electronics and communication engineering. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
Similar to [AAAI2018] Multispectral Transfer Network: Unsupervised Depth Estimation for All-day Vision (20)
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
[AAAI2018] Multispectral Transfer Network: Unsupervised Depth Estimation for All-day Vision
1. Multispectral Transfer Network:
Unsupervised Depth Estimation for All-day Vision
AAAI 2018, New Orleans
Namil Kim*, Yukyung Choi*, Soonmin Hwang, In So Kweon
KAIST RCV Lab / All-day Vision Team
*Equal contributions
2. Problem definition
Why are we interested in depth?
"Crucial information" for understanding the world around us
*Image from NVIDIA
3D understanding is necessary for autonomous decision making
3. Problem definition
How do we usually get "dense depth" at any time of the day?
(Figure: comparison of RGB stereo and 3D LiDAR, day vs. night)
- RGB stereo: dense depth, but sensitive to illumination
- 3D LiDAR: works day and night, but sparse (e.g. 4 points on a target at ≤ 11.45 m, 2 points at ≥ 23.89 m; 0.16° angular resolution)
6. Idea to all-day depth estimation
(Figure: unsupervised learning from RGB works in the daytime (O) but fails at night (X) because of illumination change)
7. Idea to all-day depth estimation
(Figure: a thermal camera is added; thermal is robust to illumination change)
8. Idea to all-day depth estimation
(Figure: #1 align the thermal image with RGB; #2 learn thermal-to-depth with unsupervised learning)
9. Idea to all-day depth estimation
(Figure: the thermal-to-depth model learned in the daytime is adapted to night, since thermal is robust to illumination change)
10. Requirements #1
Multispectral (RGB-Thermal) dataset
RGB stereo pair
Alignment between thermal and RGB (left)
3D measurement
Yukyung Choi et al., KAIST Multispectral Recognition Dataset in Day and Night, TITS’18
11. Requirements #2
Multispectral (RGB-Thermal) Transfer Network
Aim: thermal-to-depth prediction
Data: thermal and aligned left RGB (+ right RGB, stereo pair)
Model: unsupervised method
(Figure: thermal aligned with RGB; thermal-to-depth learned with unsupervised learning)
12. Proposed framework
What is the Multispectral Transfer Network?
(Figure: comparison of the supervised, unsupervised, and MTN methods)
14. Key Ideas of Proposed MTN (Overview)
1) Efficient Multi-task Learning
Without annotated data: we propose an efficient multi-task methodology for depth and chromaticity.
Previous multi-task work (e.g. "Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture", ICCV 2015) relies on surface normals, semantic labels, or object-pose annotations, mostly indoors, since collecting such auxiliary supervision outdoors is difficult.
Chromaticity, by contrast, requires no human-intensive data, is relevant to depth, and carries contextual information.
16. Key Ideas of Proposed MTN (1/4)
1) Efficient Multi-task Learning
Without annotated data: we propose an efficient multi-task methodology.
Previous works (e.g. "Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture", ICCV 2015) use surface normals, semantic labels, or object-pose annotations as auxiliary tasks, mostly indoors, since collecting such supervision outdoors is difficult.
Our work: chromaticity, which requires no human-intensive data, is relevant to depth, and carries contextual information.
17. Key Ideas of Proposed MTN (2/4)
2) Novel Module for Multi-task Learning
Interleaver module: directly interleaves the chromaticity into the depth estimation.
"Skip-connection meets Interleaver for feature learning"
(Figure: Multispectral Transfer Network (MTN), an encoder-decoder with a thermal input and disparity/chromaticity outputs; legend: Conv., DeConv., Interleaver, Skip connect., Forward flow)
18. Key Ideas of Proposed MTN (2/4)
2) Novel Module for Multi-task learning
1. Global/un-pooling + L2 normalization: enlarges the receptive field [ParseNet] and transforms the features
2. Gating mechanism: controls the degree to which the auxiliary task affects the main task (especially during back-propagation)
3. Up-sampling and adding to the previous output
Equipped in every skip-connected flow (full connections between layers)
19. Key Ideas of Proposed MTN (2/4)
2) Novel Module for Multi-task learning
- No need to find an optimal split point or parameters <cf. (b), (c), (d)>
- Reduces adverse effects from the built-in sharing mechanism <cf. (a), (b)>
- Optimized end-to-end with the same strategy as general multi-task learning <cf. (d)>
- At inference, the Interleaver unit can be removed <cf. (d)>
(Figure: previous multi-task learning vs. ours; (a) fully shared, (b) partially split, (c) unshared, (d) connected architectures)
20. Key Ideas of Proposed MTN (3/4)
3) Photometric Correction
"Thermal crossover": a thermal-infrared image is not directly affected by changing lighting conditions, but it does suffer indirectly from cyclic illumination.
21. Key Ideas of Proposed MTN (4/4)
4) Adaptive scaled sigmoid function
We propose the adaptive scaled sigmoid to stably train the model through the bilinear activation function.
According to its derivative, the activation is not stable for large disparities in the initial stages of training.
Starting from a small initial maximum disparity β₀, we iteratively increase it by α at each epoch to cover the full disparity range by the end of training.
26. Conclusion
Interleaver in every skip-connected layer:
1. Pooling mechanism + L2 normalization (enlarges the receptive field)
2. Gated unit via convolution
3. Up-sampling
We employ multi-task learning for depth estimation.
The Interleaver is a novel architecture for multi-task learning.
Photometric correction helps when dealing with thermal images.
The adaptive sigmoid function helps achieve stable convergence.
Visual perception, such as object detection, semantic segmentation, and tracking, is one of the essential techniques for recognizing the objects around us.
For the advanced self-driving stage, with fully automatic decision making, depth is crucial for understanding the world through the objects we detect.
If you would like to know where you are, where you want to go, and how far away things are, then depth estimation in all-day conditions is one of the most important technical issues for autonomous systems.
The natural next question is: how do we get high-quality depth information from the world? Typically, we use a stereo camera or a 2D/3D LiDAR. The stereo camera works well in the daytime but, as you can see, fails completely at night, because an RGB camera depends on illumination. LiDAR sensors have advantages at night, but their depth measurements are very sparse and sensitive to weather conditions such as rain and snow.
From these observations, we argue that an alternative sensor is needed to overcome these limitations.
Therefore, we propose a thermal camera as an alternative sensor beyond RGB.
We can divide single-image depth estimation into three categories:
(Click) Given an input image and the corresponding depth image, we can train a model to predict the depth directly. However, in outdoor scenarios it is very difficult to obtain high-quality, dense depth images for training.
(Click) To overcome this, unsupervised methods have been proposed recently. Given rectified stereo images, the model is trained to learn the geometric relation between the two images, that is, the inverse depth as disparity. Without high-quality depth ground truth, this approach has the big advantage of requiring only a calibrated stereo pair.
// (Skipping the detailed methodology.) When we optimize the model, we first warp the input image using the predicted disparity to synthesize the right image, and then minimize the pixel-wise difference between this synthetic right image and the target right image.
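The warp-then-compare step just described can be sketched as a toy, one-scanline version. It uses nearest-neighbour sampling instead of the paper's differentiable bilinear sampler, and all names here are hypothetical, not from the paper's code:

```python
def warp_row(row, disparity):
    """Resample one image scanline using a per-pixel predicted
    disparity (nearest-neighbour sampling for simplicity)."""
    w = len(row)
    out = []
    for x in range(w):
        # sample from the position shifted by the disparity, clamped
        src = min(max(int(round(x + disparity[x])), 0), w - 1)
        out.append(row[src])
    return out

def photometric_loss(pred, target):
    """Mean absolute pixel-wise difference between the synthesized
    view and the real target view."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

# With the correct disparity, warping reproduces the target scanline
# and the photometric loss drops to zero.
row = [10, 20, 30, 40]
warped = warp_row(row, [1, 1, 1, 1])
```

In the actual network a bilinear sampler is used instead, so the loss remains differentiable with respect to the predicted disparity.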
(Click) The last approach is semi-supervised. The model is trained to minimize both an unsupervised loss and a supervised loss from additional supervision such as LiDAR.
Given these considerations, our model builds on the unsupervised framework to predict the depth image.
Now, we explain the key concept of all-day depth estimation.
As mentioned above, RGB-based unsupervised approaches work well in well-lit conditions,
but perform poorly in ill-lit conditions such as night, dawn, sunrise, and sunset.
Compared to RGB images, thermal images have the great advantage of capturing scene content regardless of the amount of lighting, because a thermal sensor measures the long-wavelength radiation emitted by objects. Therefore, thermal sensors are increasingly used in modern robotics and computer vision research on all-day recognition.
Let's suppose that we have a depth prediction model trained on RGB images in the daytime.
If we also have a thermal image aligned to the input RGB image, we can form pairs of thermal images and the corresponding daytime depth images.
So, if we train a model on these thermal-depth pairs in daytime conditions, we can "adapt" it to night conditions, since a thermal image is far less affected by illumination.
To sum up, we can train a model from thermal input images and an RGB stereo pair to estimate depth in all-day conditions.
This is the key concept of the proposed Multispectral Transfer Network.
For our purpose, there are two main requirements for estimating all-day depth from thermal images.
The first requirement is a large-scale multispectral dataset, so we designed a multispectral dataset for depth estimation.
Our dataset is captured with a calibrated RGB stereo pair, a thermal camera co-aligned with the left RGB view, and 3D measurements.
As shown in the example, we focus on real-world driving conditions and captured all scenarios in both well-lit and ill-lit conditions.
The other requirement is modeling a framework that predicts depth information from multispectral pairs.
Now, I will explain the proposed multispectral transfer network.
Based on the unsupervised framework and our multispectral dataset, we propose the MTN method for depth estimation from a single thermal image.
Instead of two separate models, an RGB-based unsupervised model and a thermal-based supervised model, we combine the two functions into a single model.
Now, I’ll introduce our four contributions in this work.
--
Our four contributions are as follows:
First, we propose efficient unsupervised multi-task learning that, for the first time, adapts color information to depth estimation.
Second, we propose a new type of multi-task architecture that prevents the adverse effects of general multi-task topologies.
Third, we propose a photometric correction that augments thermal images with respect to cyclic illumination for all-day adaptation.
Fourth, we propose an adaptive activation function for stable training.
In today's talk, due to time constraints, I will focus on the second contribution.
In general, surface normals, segmentation masks, and poses have been used as additional tasks for multi-tasking.
In outdoor environments, however, it is not easy to gather this kind of heavy supervision as auxiliary data.
Therefore, we chose "chromaticity" as the auxiliary task of multi-task learning, for the following three reasons.
First, obtaining color information does not require human-intensive annotation.
Second, color information has been used in many depth-related works to preserve geometric structure.
Third, it has been shown that color information carries contextual information that helps the network learn local structure.
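For concreteness, one common way to derive a chromaticity target from RGB is normalized rg-chromaticity, which discards intensity and keeps only colour. The talk does not spell out its exact formulation, so treat this as an illustrative choice:

```python
def chromaticity(r, g, b):
    """Normalized rg-chromaticity: each channel divided by the total
    intensity, so the result is invariant to uniform brightness."""
    s = r + g + b
    if s == 0:
        return (0.0, 0.0)   # define black as zero chromaticity
    return (r / s, g / s)
```

A grey pixel and a brighter grey pixel map to the same chromaticity, which is exactly the annotation-free, intensity-invariant signal the slides motivate.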
The second contribution is a new module, called the "Interleaver", for better multi-task learning. Skip-connected encoder-decoder models have been widely used in pixel-level prediction tasks; building on this model, we place the Interleaver in the middle of the skip-connection feature maps.
Our goal is to increase the representation power for depth estimation by explicitly emphasizing the important features of the auxiliary color-regression task and suppressing its unnecessary ones.
Since the skip-connected input carries mostly spatially informative features, we use our module to blend in the meaningful features from the color-regression task.
We assume that the depth and color tasks have different points of view in terms of receptive field, so we first pass the features through a global/un-pooling mechanism similar to [ParseNet], and then feed them into a gated convolution that controls the degree of influence of the auxiliary task. We place an Interleaver in every skip-connected path.
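A minimal, scalar-gated sketch of that blend is shown below. Real Interleavers act on feature maps with learned convolutional gates; everything here is simplified and hypothetical:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def l2_normalize(v):
    """L2-normalize a feature vector (unit length), as in the
    pooling + L2-norm step of the module."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def interleave(skip_feat, aux_feat, gate_logit):
    """Blend auxiliary-task (colour) features into a skip connection:
    normalize the auxiliary features, scale them by a learned gate,
    then add them to the depth features."""
    g = sigmoid(gate_logit)
    aux = l2_normalize(aux_feat)
    return [s + g * a for s, a in zip(skip_feat, aux)]
```

Driving the gate strongly negative effectively switches the auxiliary path off, which mirrors why the module can be dropped at inference time.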
Compared to previous multi-tasking topologies, the gating mechanism of the Interleaver lets the model find the optimal split position and reduces the adverse effects that commonly occur in conventional methods.
Therefore, we argue that our Interleaver is a generalization of the skip-connection topologies of encoder-decoder architectures; in each skip-connected layer, the Interleaver automatically learns its control parameters.
Moreover, unlike connected architectures, the Interleaver can be removed at inference time for depth estimation.
The third contribution concerns data augmentation for all-day adaptation. Thermal images are invariant to illumination but variant to temperature, which makes the contrast of a thermal image vary over time.
To relieve this effect, we propose a data-driven correction function that tunes the thermal image for temperature-variant augmentation during training.
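As a stand-in for that data-driven correction, whose exact form the talk does not give, a simple per-image gamma jitter conveys the idea of simulating temperature-driven contrast drift during training:

```python
def gamma_correct(pixels, gamma):
    """Adjust the contrast of a normalized thermal image (values in
    [0, 1]) by raising each pixel to the power gamma; sampling gamma
    per training image simulates contrast drift over the day."""
    return [p ** gamma for p in pixels]
```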
The final key contribution is an adaptive activation function. For disparity estimation, the network easily tends to predict larger disparities than the ground truth, which causes training to diverge when the prediction is used in the bilinear sampler.
Our trick is to increase the maximum disparity iteratively with a scaled sigmoid activation.
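The schedule can be sketched as follows; β₀ (the initial ceiling), α (the per-epoch increment), and the final cap are illustrative values here, not the paper's:

```python
import math

def adaptive_scaled_sigmoid(x, epoch, beta0=4.0, alpha=2.0, d_max=64.0):
    """Disparity activation whose output ceiling starts small (beta0)
    and grows by alpha per epoch up to d_max, so early training cannot
    emit the huge disparities that destabilize the bilinear sampler."""
    beta = min(beta0 + alpha * epoch, d_max)
    return beta / (1.0 + math.exp(-x))
```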
Here are our results in the daytime. (Pointing) This is the thermal input, (Pointing) the corresponding color image for visualization, and (Pointing) the ground truth.
To verify our contributions, we show several results. (Pointing)
The black-and-white error maps make these results easy to compare.
As you can see in the binary error map, our final MTN model gives the most accurate result.
In the table, our final MTN model also achieves the best performance on every metric.
All the multi-task models, LsMTN, DsMTN, and MTN, outperform the single-task model, so our multi-task learning is effective.
And LsMTN and DsMTN are in turn outperformed by our final model, which uses Interleaver modules; empirically, our novel module works well for multi-task learning.
Finally, the results show that our photometric correction is another important factor.
These are the results at night. As you can see, the thermal image retains good visibility even when the color image is totally dark.
In this situation it is very hard to get dense ground truth, so we propose a metric that measures performance at the locations of the LiDAR ground-truth points.
For a reasonable comparison, our metric measures not only the ordinal relations between points but also the depth accuracy simultaneously.
Under this new metric, our final model still performs well.
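In the spirit of that metric (the exact formula is in the paper; this sketch covers only the ordinal part), one can count how many LiDAR point pairs keep their depth ordering under the prediction:

```python
from itertools import combinations

def ordinal_agreement(pred, gt, tau=0.0):
    """Fraction of ground-truth point pairs whose predicted depth
    ordering matches the LiDAR ordering; tau treats near-equal
    depths as ties."""
    pairs = list(combinations(range(len(gt)), 2))
    if not pairs:
        return 1.0
    def sign(a, b):
        d = a - b
        return 0 if abs(d) <= tau else (1 if d > 0 else -1)
    agree = sum(1 for i, j in pairs
                if sign(pred[i], pred[j]) == sign(gt[i], gt[j]))
    return agree / len(pairs)
```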
This is a video result of our method. The colors are mapped only for visualization. Even though our method takes a single thermal image at inference, it achieves reasonable accuracy.
In conclusion, we employ multi-task learning for depth estimation, and we propose a novel architecture, the Interleaver, for effective multi-task learning. Our photometric correction and adaptive sigmoid function are useful for training, and we show that our Interleaver widens the network's effective receptive field.
That’s it. Thank you.
Please refer to the following website for the dataset and related codes used in this paper.