Slides for a paper reading session of the VietNam AI Community in Japan.
An explanation of MobileNetV2: Inverted Residuals and Linear Bottlenecks, a paper from CVPR 2018.
The document summarizes improvements made in MobileNetV3 models, including using complementary search techniques to find efficient building blocks, introducing more efficient nonlinearities such as h-swish, and improving expensive layers through techniques like removing unnecessary projections. It also describes experiments showing MobileNetV3 models achieving better performance than V1/V2 models on tasks like image classification, object detection, and semantic segmentation while maintaining high efficiency for mobile applications.
PR-169: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks (Jinwon Lee)
This is a review of paper #169 from the TensorFlow-KR paper reading group (PR12).
The paper covered this time is EfficientNet, published by Google. Research on efficient neural networks has usually focused on small networks for edge devices with limited computing power, such as mobile phones. To improve accuracy, however, networks are generally grown larger and larger; this paper studies how to scale a network up in the most efficient way. Please see the video for details.
Paper link: https://arxiv.org/abs/1905.11946
Video link: https://youtu.be/Vhz0quyvR7I
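The compound scaling rule at the heart of EfficientNet can be sketched in a few lines of Python. The coefficients below (alpha=1.2, beta=1.1, gamma=1.15) are the ones reported in the paper, found by a small grid search under the constraint alpha * beta^2 * gamma^2 ≈ 2; the function name is ours:

```python
def compound_scale(phi, alpha=1.2, beta=1.1, gamma=1.15):
    """Return (depth, width, resolution) multipliers for compound coefficient phi.

    Depth, width, and resolution are scaled together as alpha**phi, beta**phi,
    gamma**phi, so total FLOPs grow roughly by (alpha * beta**2 * gamma**2)**phi,
    i.e. about 2**phi with the default coefficients.
    """
    return alpha ** phi, beta ** phi, gamma ** phi

# phi = 0 is the EfficientNet-B0 baseline; each +1 in phi roughly doubles FLOPs.
d, w, r = compound_scale(2)
print(round(d, 2), round(w, 2), round(r, 2))  # 1.44 1.21 1.32
```

The point of the rule is that all three dimensions grow together, instead of only making the network deeper or only wider.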
The document summarizes research on developing efficient convolutional neural network architectures called MobileNets that are well-suited for mobile and embedded vision applications. The key ideas are using depthwise separable convolutions to factorize standard convolutions and using a width multiplier and resolution multiplier to control model size. Experiments show MobileNets achieve higher accuracy and speed than prior mobile networks on image classification and object detection tasks while having a smaller memory footprint.
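The saving from factorizing a standard convolution into a depthwise and a pointwise step is easy to verify with a back-of-the-envelope calculation; a sketch in plain Python (the layer sizes below are illustrative, not taken from the paper's tables):

```python
def conv_cost(dk, m, n, df):
    """Multiply-adds for a standard convolution: DK*DK * M * N * DF*DF
    (kernel DKxDK, M input channels, N output channels, DFxDF feature map)."""
    return dk * dk * m * n * df * df

def separable_cost(dk, m, n, df):
    """Depthwise (DK*DK * M * DF*DF) plus pointwise 1x1 (M * N * DF*DF) multiply-adds."""
    return dk * dk * m * df * df + m * n * df * df

# A typical mid-network layer: 3x3 kernel, 512 in/out channels, 14x14 feature map.
std = conv_cost(3, 512, 512, 14)
sep = separable_cost(3, 512, 512, 14)
print(round(std / sep, 1))  # about 8.8x fewer operations
```

The ratio works out to 1 / (1/N + 1/DK^2), so with a 3x3 kernel the saving approaches 9x as the number of output channels grows.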
Introduction to Convolutional Neural Networks (Hannes Hapke)
This document provides an introduction to machine learning using convolutional neural networks (CNNs) for image classification. It discusses how to prepare image data, build and train a simple CNN model using Keras, and optimize training using GPUs. The document outlines steps to normalize image sizes, convert images to matrices, save data formats, assemble a CNN in Keras including layers, compilation, and fitting. It provides resources for learning more about CNNs and deep learning frameworks like Keras and TensorFlow.
Recursive neural networks (RNNs) were developed to model recursive structures like images, sentences, and phrases. RNNs construct feature representations recursively from components. Later models like recursive autoencoders (RAEs), matrix-vector RNNs (MV-RNNs), and recursive neural tensor networks (RNTNs) improved on RNNs by handling unlabeled data, incorporating different composition rules, and reducing parameters. These recursive models achieved strong performance on tasks like image segmentation, sentiment analysis, and paraphrase detection.
Slides from Portland Machine Learning meetup, April 13th.
Abstract: You've heard all the cool tech companies are using them, but what are Convolutional Neural Networks (CNNs) good for and what is convolution anyway? For that matter, what is a Neural Network? This talk will include a look at some applications of CNNs, an explanation of how CNNs work, and what the different layers in a CNN do. There's no explicit background required so if you have no idea what a neural network is that's ok.
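Since the abstract asks "what is convolution anyway?", here is a minimal sketch in plain Python of the sliding-window operation a convolutional layer performs (strictly speaking, cross-correlation, which is what deep learning frameworks implement under that name):

```python
def conv2d(image, kernel):
    """'Valid' 2D convolution: slide the kernel over the image and take
    a weighted sum at every position (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(ow)]
            for i in range(oh)]

# A [-1, 1] filter fires exactly where intensity jumps left-to-right,
# which is how early CNN layers come to act as edge detectors.
img = [[0, 0, 1, 1]] * 4
edge = [[-1, 1]]
print(conv2d(img, edge))  # every row is [0, 1, 0]
```

In a real CNN the kernel values are not hand-chosen like this; they are learned by gradient descent.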
Convolutional neural networks (CNNs / ConvNets) are a machine learning algorithm and a core part of computer vision, used for image classification, image detection, digit recognition, and many more tasks. https://technoelearn.com
This presentation covers the applications of CNNs, a quick review of neural networks and their drawbacks, the convolution process, padding, striding, convolution over volume, the types of layers in a CNN, the max pool layer, the fully connected layer, and lastly the famous CNNs: LeNet-5, AlexNet, VGG-16, ResNet, and GoogLeNet.
In this presentation we discuss the convolution operation, the architecture of a convolutional neural network, and different layers such as pooling. This presentation draws heavily from Andrej Karpathy's Stanford course CS 231n.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2021/09/introduction-to-dnn-model-compression-techniques-a-presentation-from-xailient/
Sabina Pokhrel, Customer Success AI Engineer at Xailient, presents the “Introduction to DNN Model Compression Techniques” tutorial at the May 2021 Embedded Vision Summit.
Embedding real-time large-scale deep learning vision applications at the edge is challenging due to their huge computational, memory, and bandwidth requirements. System architects can mitigate these demands by modifying deep-neural networks to make them more energy efficient and less demanding of processing resources by applying various model compression approaches.
In this talk, Pokhrel provides an introduction to four established techniques for model compression. She discusses network pruning, quantization, knowledge distillation and low-rank factorization compression approaches.
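Of the four techniques, network pruning is the simplest to illustrate. A minimal sketch of magnitude-based pruning, which zeroes the smallest weights on the assumption they contribute least (real schemes vary in criteria, granularity, and retraining schedule):

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitudes."""
    k = int(len(weights) * sparsity)
    # Indices of the k smallest-magnitude weights.
    drop = set(sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

w = [0.9, -0.05, 0.4, 0.01, -0.7, 0.1]
print(magnitude_prune(w, 0.5))  # [0.9, 0.0, 0.4, 0.0, -0.7, 0.0]
```

In practice pruning is usually followed by fine-tuning, and the resulting sparse tensors only save time on hardware or kernels that exploit sparsity.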
Transfer Learning and Fine-tuning Deep Neural Networks (PyData)
This document outlines Anusua Trivedi's talk on transfer learning and fine-tuning deep neural networks. The talk covers traditional machine learning versus deep learning, using deep convolutional neural networks (DCNNs) for image analysis, transfer learning and fine-tuning DCNNs, recurrent neural networks (RNNs), and case studies applying these techniques to diabetic retinopathy prediction and fashion image caption generation.
This document provides an overview of convolutional neural networks (CNNs). It describes that CNNs are a type of deep learning model used in computer vision tasks. The key components of a CNN include convolutional layers that extract features, pooling layers that reduce spatial size, and fully-connected layers at the end for classification. Convolutional layers apply learnable filters in a local receptive field, while pooling layers perform downsampling. The document outlines common CNN architectures, such as types of layers, hyperparameters like stride and padding, and provides examples to illustrate how CNNs work.
Deep learning based object detection basics (Brodmann17)
The document discusses different approaches to object detection in images using deep learning. It begins with describing detection as classification, where an image is classified into categories for what objects are present. It then discusses approaches that involve separating detection into a classification head and localization head. The document also covers improvements like R-CNN which uses region proposals to first generate candidate object regions before running classification and bounding box regression on those regions using CNN features. This helps address issues with previous approaches like being too slow when running the CNN over the entire image at multiple locations and scales.
This document summarizes the evolution of convolutional neural networks from LeNet in 1998 to ResNet in 2015. It describes key networks like AlexNet, VGG, GoogleNet, and ResNet and their contributions to improving accuracy on tasks like the ImageNet challenge. The networks progressed from LeNet's basic convolutional layers to deeper networks enabled by techniques like dropout, ReLU activations, and residual connections, leading to substantially improved accuracy over time.
Artificial Intelligence (AI), specifically deep learning, is revolutionizing industries, products, and core capabilities by delivering dramatically enhanced experiences. However, the deep neural networks of today use too much memory, compute, and energy. Plus, to make AI truly ubiquitous, networks need to run on the end device within a tight power and thermal budget. One approach to help address these issues is quantization, which attempts to reduce the number of bits used for weight parameters and activation calculations without sacrificing model accuracy. This presentation covers: why quantization is important, existing quantization challenges, Qualcomm AI Research's existing quantization research, and how developers and researchers can take advantage of quantization on Qualcomm Snapdragon.
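The core idea of quantization can be sketched in a few lines. The 8-bit affine (asymmetric) scheme below is a common textbook formulation, not Qualcomm's specific implementation:

```python
def quantize(x, num_bits=8):
    """Affine quantization of a list of floats to unsigned integers in [0, 2**num_bits - 1]."""
    qmax = 2 ** num_bits - 1
    lo, hi = min(x), max(x)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    q = [round((v - lo) / scale) for v in x]
    return q, scale, lo

def dequantize(q, scale, lo):
    """Map the integer codes back to approximate float values."""
    return [v * scale + lo for v in q]

vals = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, scale, lo = quantize(vals)
approx = dequantize(q, scale, lo)
err = max(abs(a - b) for a, b in zip(vals, approx))
# Round-tripping loses at most about half a quantization step (~0.004 here).
print(err <= scale / 2 + 1e-9)  # True
```

Weights stored as 8-bit integers take a quarter of the memory of 32-bit floats, and integer arithmetic is typically cheaper and lower-power than floating point on edge hardware.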
Artificial Intelligence: Artificial Neural Networks (The Integral Worm)
This document summarizes artificial neural networks (ANN), which were inspired by biological neural networks in the human brain. ANNs consist of interconnected computational units that emulate neurons and pass signals to other units through connections with variable weights. ANNs are arranged in layers and learn by modifying the weights between units based on input and output data to minimize error. Common ANN algorithms include backpropagation for supervised learning to predict outputs from inputs.
This document discusses various regularization techniques for deep learning models. It defines regularization as any modification to a learning algorithm intended to reduce generalization error without affecting training error. It then describes several specific regularization methods, including weight decay, norm penalties, dataset augmentation, early stopping, dropout, adversarial training, and tangent propagation. The goal of regularization is to reduce overfitting and improve generalizability of deep learning models.
The document discusses convolutional neural networks (CNNs). It begins with an introduction and overview of CNN components like convolution, ReLU, and pooling layers. Convolution layers apply filters to input images to extract features, ReLU introduces non-linearity, and pooling layers reduce dimensionality. CNNs are well-suited for image data since they can incorporate spatial relationships. The document provides an example of building a CNN using TensorFlow to classify handwritten digits from the MNIST dataset.
This document discusses very deep convolutional networks for large-scale image recognition. It describes network configurations that use 3x3 convolutional filters with max pooling layers and fully connected layers. The networks have between 11 and 19 weight layers, and some configurations use 1x1 convolutional filters to introduce additional nonlinearity. Classification experiments on ImageNet data with over 1 million training images achieve competitive top-1 and top-5 error rates.
Handwritten Digit Recognition using Convolutional Neural Networks (IRJET Journal)
This document discusses using a convolutional neural network called LeNet to perform handwritten digit recognition on the MNIST dataset. It begins with an abstract that outlines using LeNet, a type of convolutional network, to accurately classify handwritten digits from 0 to 9. It then provides background on convolutional networks and how they can extract and utilize features from images to classify patterns with translation and scaling invariance. The document implements LeNet using the Keras deep learning library in Python to classify images from the MNIST dataset, which contains labeled images of handwritten digits. It analyzes the architecture of LeNet and how convolutional and pooling layers are used to extract features that are passed to fully connected layers for classification.
Introduction to Recurrent Neural Networks (Knoldus Inc.)
The document provides an introduction to recurrent neural networks (RNNs). It discusses how RNNs differ from feedforward neural networks in that they have internal memory and can use their output from the previous time step as input. This allows RNNs to process sequential data like time series. The document outlines some common RNN types and explains the vanishing gradient problem that can occur in RNNs due to multiplication of small gradient values over many time steps. It discusses solutions to this problem like LSTMs and techniques like weight initialization and gradient clipping.
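The vanishing gradient problem described above can be demonstrated numerically. A minimal sketch for a sigmoid activation around zero, where its derivative peaks at 0.25 (the scalar recurrence is a simplification of a real RNN):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gradient_through_time(w, steps, x=0.0):
    """Backpropagated gradient after `steps` time steps: a product of
    per-step factors w * sigmoid'(x), where sigmoid'(x) = s * (1 - s) <= 0.25."""
    s = sigmoid(x)
    factor = w * s * (1.0 - s)
    g = 1.0
    for _ in range(steps):
        g *= factor
    return g

# With |w * sigmoid'| < 1 the gradient decays exponentially in sequence length.
print(gradient_through_time(w=1.0, steps=20))  # 0.25**20, roughly 9.1e-13
```

LSTMs avoid this by routing the gradient through an additive cell state instead of a repeated multiplication, and gradient clipping addresses the mirror-image exploding case.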
This document discusses convolutional neural networks (CNNs). It explains that CNNs were inspired by research on the human visual system and take a similar approach to teach computers to identify objects in images. The document outlines the key components of CNNs, including convolutional and pooling layers to extract features from images, as well as fully connected layers to classify objects. It also notes that CNNs take pixel data as input and use many examples to generalize and make predictions, similar to how humans learn visual recognition.
Model Compression (NanheeKim)
@NanheeKim @nh9k
Please feel free to contact me anytime if you have any questions!
These slides were prepared based on my own study.
Sources are listed at the end of the slides!
github: https://github.com/nh9k
email: kimnanhee97@gmail.com
This document provides an overview of convolutional neural networks and summarizes four popular CNN architectures: AlexNet, VGG, GoogLeNet, and ResNet. It explains that CNNs are made up of convolutional and subsampling layers for feature extraction followed by dense layers for classification. It then briefly describes key aspects of each architecture like ReLU activation, inception modules, residual learning blocks, and their performance on image classification tasks.
The document discusses neural networks, including human neural networks and artificial neural networks (ANNs). It provides details on the key components of ANNs, such as the perceptron and backpropagation algorithm. ANNs are inspired by biological neural systems and are used for applications like pattern recognition, time series prediction, and control systems. The document also outlines some current uses of neural networks in areas like signal processing, anomaly detection, and soft sensors.
PowerGraph is a distributed graph processing system that is well-suited for analyzing large natural graphs. It introduces a vertex-cut partitioning approach that distributes vertices across machines rather than edges. This addresses limitations of previous systems when processing graphs with power-law degree distributions. PowerGraph uses a Gather-Apply-Scatter decomposition that allows vertex programs to be parallelized across machines. It also employs techniques like delta caching to optimize performance. Evaluation on real-world graphs demonstrated reduced communication, runtime, and storage compared to previous systems.
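The Gather-Apply-Scatter decomposition can be illustrated with PageRank, PowerGraph's usual running example. A minimal single-machine, synchronous sketch (PowerGraph itself distributes the gather across machines via its vertex-cuts; all names here are ours):

```python
def pagerank_gas(edges, n, iters=20, d=0.85):
    """PageRank in Gather-Apply-Scatter style.
    edges: list of (src, dst) pairs; n: vertex count.
    Assumes every vertex has at least one outgoing edge."""
    rank = [1.0 / n] * n
    out_deg = [0] * n
    for s, _ in edges:
        out_deg[s] += 1
    for _ in range(iters):
        # Gather: accumulate rank/out_degree over incoming edges
        # (this per-edge sum is what PowerGraph parallelizes across machines).
        acc = [0.0] * n
        for s, t in edges:
            acc[t] += rank[s] / out_deg[s]
        # Apply: update each vertex from its accumulated value.
        rank = [(1 - d) / n + d * a for a in acc]
        # Scatter would activate neighbours for the next round; omitted here,
        # since this synchronous sketch re-runs every vertex each iteration.
    return rank

# Tiny 3-cycle: by symmetry all ranks converge to 1/3.
r = pagerank_gas([(0, 1), (1, 2), (2, 0)], n=3)
print([round(v, 3) for v in r])  # [0.333, 0.333, 0.333]
```

The decomposition matters for power-law graphs because the gather over a high-degree vertex's millions of edges can be split across machines, which an edge-partitioned system cannot do.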
Garbage Classification Using Deep Learning Techniques (IRJET Journal)
The document discusses using deep learning techniques for garbage classification. It compares the performance of different models, including support vector machines with HOG features, simple convolutional neural networks (CNNs), CNNs with residual blocks, and a hybrid model combining CNN features with HOG features. The CNN models generally performed best, with the simple CNN achieving over 93% accuracy on test data. Residual blocks did not significantly improve performance over simple CNNs. Combining CNN and HOG features was also considered but did not clearly outperform CNNs alone. Overall, CNN models were shown to effectively classify garbage using these image datasets.
ICIAM 2019: A New Algorithm Model for Massive-Scale Streaming Graph Analysis (Jason Riedy)
Applications in many areas analyze an ever-changing environment. On billion-vertex graphs, providing snapshots imposes a large performance cost. We propose the first formal model for graph analysis running concurrently with streaming data updates. We consider an algorithm valid if its output is correct for the initial graph plus some implicit subset of the concurrent changes. We show theoretical properties of the model, demonstrate it on various algorithms, and extend it to updating results incrementally.
This document presents a traffic sign recognition system using a convolutional neural network (CNN) model. The authors train the CNN model on a German traffic sign dataset containing over 50,000 images across 43 classes. The proposed CNN architecture contains 4 VGGNet blocks with convolutional, max pooling, dropout and batch normalization layers. The model is trained for 45 epochs and achieves 96.9% accuracy and 11.4% test loss on the test set, outperforming other baseline models. The trained CNN model can accurately classify traffic sign images to assist with applications like self-driving cars.
This document summarizes a research paper on sparse graph attention networks (SGATs). SGATs apply an attention mechanism to only a subset of neighbors for each node to improve the scalability and memory efficiency of graph attention networks. The key ideas are a sparse attention mechanism using techniques like neighbor sampling and a binary gate attached to each edge. SGATs show advantages in scalability, memory usage, and performance on disassortative graphs by removing up to 80% of edges while maintaining classification accuracy. Evaluation on synthetic and real-world graphs demonstrates SGATs can identify and remove noisy edges.
IRJET: Design of Memristor-based Multiplier (IRJET Journal)
This document describes the design of a 4-bit multiplier circuit using memristors. It begins with an introduction to memristors and their advantages over CMOS technology. It then discusses the different window functions that can be used in memristor models and selects the Biolek window function. The document implements 2-bit and 4-bit array multiplier circuits using memristor-CMOS hybrid logic gates. It analyzes the results in LTspice and finds improvements in area and component count compared to traditional CMOS and other memristor-based designs. The document concludes that memristors can help reduce the area of multiplier circuits.
For the full video of this presentation, please visit:
https://www.edge-ai-vision.com/2021/02/improving-power-efficiency-for-edge-inferencing-with-memory-management-optimizations-a-presentation-from-samsung/
Nathan Levy, Project Leader at Samsung, presents the “Improving Power Efficiency for Edge Inferencing with Memory Management Optimizations” tutorial at the September 2020 Embedded Vision Summit.
In the race to power efficiency for neural network processing, optimizing memory use to reduce data traffic is critical. Many processors have a small local memory (typically SRAM) used as a scratch pad which can be used to reduce the expensive data traffic to and from a big remote memory (e.g., DRAM). The specific structure of neural networks allows for advanced optimization techniques to optimize the use of the local memory.
In this presentation, Levy describes the key aspects of memory management optimization for neural networks along with the trade-offs that must be managed in light of the processor architecture and the details of the network. In addition, he shows the importance of tailoring the memory management approach to the specific network, illustrated by analysis of a case study.
CE1009_Implementation of Civil IoT Architecture.pdfChenkai Sun
Billions of interconnected Internet-of-Things (IoT) devices collect huge amounts of real-time data. However, this massive stream of data presents technical challenges for processing and analysis and the digital gap between urban and rural areas is also a critical consideration. A powerful platform is crucial to cost-effectively and efficiently process such massive collections of messages. This work introduces the civil IoT architecture in Taiwan including the dedicated B20 spectrum, backbone network facilities, and a scalable data platform. The proposed system operates in Taiwan for IoT applications with real cases. In the experiment, we demonstrate the performance of signal coverage, throughput, real-time query and visualization, and a monitoring mechanism. The results showed that the presented architecture is efficient and effective for dealing with IoT scenarios in a cost-effective approach.
Using Graphs for Feature Engineering_ Graph Reduce-2.pdfWes Madrigal
GraphReduce is a solution for feature engineering on graph-structured enterprise data for machine learning. It represents tables as nodes in a graph and foreign keys as edges to flatten large datasets. It defines abstractions like cut dates and consideration periods to orient data in time. Nodes can be parameterized for primary keys, dates, file formats, and compute functions. This allows rapid development, testing, and deployment of feature pipelines across many tables for machine learning models. It was successfully used by FreightVerify to build daily updated models for supply chain monitoring from billions of events across over 20 tables.
IoT Workload Distribution Impact Between Edge and Cloud Computing in a Smart ...Otávio Carvalho
This document summarizes research into distributing the workload of an IoT smart grid application between edge and cloud computing resources. The researchers implemented a three-layer architecture with sensors, edge nodes (Raspberry Pis), and cloud VMs. Their evaluation found that edge processing achieved higher throughput than cloud alone by reducing data sent to the cloud. Moving more workload to edge nodes and aggregating data at the edge also improved scalability. Future work could explore adaptive scheduling and evolving the architecture for general IoT applications.
Semantic Segmentation on Satellite ImageryRAHUL BHOJWANI
This is an Image Semantic Segmentation project targeted on Satellite Imagery. The goal was to detect the pixel-wise segmentation map for various objects in Satellite Imagery including buildings, water bodies, roads etc. The data for this was taken from the Kaggle competition <https://www.kaggle.com/c/dstl-satellite-imagery-feature-detection>.
We implemented FCN, U-Net and Segnet Deep learning architectures for this task.
Stochastic Computing Correlation Utilization in Convolutional Neural Network ...TELKOMNIKA JOURNAL
In recent years, many applications have been implemented in embedded systems and mobile Internet of Things (IoT) devices that typically have constrained resources, smaller power budget, and exhibit "smartness" or intelligence. To implement computation-intensive and resource-hungry Convolutional Neural Network (CNN) in this class of devices, many research groups have developed specialized parallel accelerators using Graphical Processing Units (GPU), Field-Programmable Gate Arrays (FPGA), or Application-Specific Integrated Circuits (ASIC). An alternative computing paradigm called Stochastic Computing (SC) can implement CNN with low hardware footprint and power consumption. To enable building more efficient SC CNN, this work incorporates the CNN basic functions in SC that exploit correlation, share Random Number Generators (RNG), and is more robust to rounding error. Experimental results show our proposed solution provides significant savings in hardware footprint and increased accuracy for the SC CNN basic functions circuits compared to previous work.
Overview of the FlexPlan project. Focus on EU regulatory analysis and TSO-DSO...Leonardo ENERGY
Webinar recording at https://youtu.be/4s2GGlu-ylc
The FlexPlan project (https://flexplan-project.eu/) aims at establishing a new grid planning methodology making use of storage and flexible loads as an alternative to the build-up of new grid elements. After introducing the project, the webinar will focus on pan-European grid planning regulation and present practices of TSOs and DSOs.
QUILTS: Multidimensional Data Partitioning Framework Based on Query-Aware and...nishimurashoji
Presentation Slides at SIGMOD 2017
Talk video: https://www.youtube.com/watch?v=dHNsZnjwgww
My talk starts around 1:21:45.
Paper: https://dl.acm.org/citation.cfm?id=3035934&CFID=1010432390&CFTOKEN=34002366
IRJET- Single Precision Floating Point Arithmetic using VHDL CodingIRJET Journal
The document describes a VHDL implementation of single precision floating point arithmetic operations using an FPGA. It begins with an introduction to floating point arithmetic and FPGAs. It then discusses related work on floating point implementations and the IEEE 754 single precision format. The proposed algorithm and block diagram for a single precision floating point adder are presented. Simulation results demonstrating addition, subtraction, multiplication and division are also shown. The implementation of single precision floating point arithmetic using VHDL coding allows for low-cost and reprogrammable hardware. The design was synthesized using Xilinx tools and implemented on a Virtex-7 FPGA.
1. The document compares three models for predicting urban land prices: geographically weighted regression (GWR), hedonic regression, and boosted trees (XGBoost).
2. The results show that XGBoost had the highest percentage of predictions within 5%, 10%, and 20% error compared to the actual prices.
3. However, all three models still have limitations, such as only using Euclidean distance and not fully capturing local spatial effects. Improving the data quality and expanding the models could help increase prediction accuracy further.
Computational steering Interactive Design-through-Analysis for Simulation Sci...SURFevents
The document discusses computational steering and interactive design-through-analysis. It provides a vision of a unified computational framework that allows for rapid prototyping and accurate analysis of engineering designs. This framework would combine physics-informed machine learning for initial design exploration with isogeometric analysis for detailed analysis and optimization. The document then demonstrates some of the key concepts behind isogeometric analysis, including its use of B-spline basis functions to represent geometry, solutions, and right-hand sides, as well as its formulation as an abstract linear system.
IJCER (www.ijceronline.com) International Journal of computational Engineerin...ijceronline
This document summarizes a research paper that proposes a distributed Canny edge detection algorithm with the following key points:
1. The algorithm divides an input image into overlapping blocks that can be processed independently and in parallel to reduce memory requirements, latency, and increase throughput compared to the original Canny algorithm.
2. A novel method is proposed for calculating hysteresis thresholds based on an 8-bin non-uniform quantized gradient magnitude histogram to reduce computational complexity compared to previous methods.
3. An FPGA architecture is presented for implementing the proposed distributed Canny algorithm, along with simulation results demonstrating it can process an image 16 times faster than the original Canny algorithm with no loss in performance.
This document summarizes the design of a single edge triggered D flip flop using the Gate Diffusion Input (GDI) technique to reduce power consumption. GDI allows implementing logic functions using fewer transistors which can reduce power and delay. The proposed D flip flop uses a master-slave configuration with 4 GDI cells and samples data on the falling clock edge. Simulation results for a 180nm process show the GDI design uses 18 transistors, has average power of 1.77uw and maximum delay of 5.48ps, providing power reductions of 40% over conventional designs. Therefore, the GDI technique is suitable for low power applications requiring high performance.
Similar to CVPR 2018 Paper Reading MobileNet V2 (20)
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
Full-RAG: A modern architecture for hyper-personalizationZilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Communications Mining Series - Zero to Hero - Session 1DianaGray10
This session provides introduction to UiPath Communication Mining, importance and platform overview. You will acquire a good understand of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features
available on those devices, but many of the features provide convenience and capability but sacrifice security. This best practices guide outlines steps the users can take to better protect personal devices and information.
Building RAG with self-deployed Milvus vector database and Snowpark Container...Zilliz
This talk will give hands-on advice on building RAG applications with an open-source Milvus database deployed as a docker container. We will also introduce the integration of Milvus with Snowpark Container Services.
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
“An Outlook of the Ongoing and Future Relationship between Blockchain Technologies and Process-aware Information Systems.” Invited talk at the joint workshop on Blockchain for Information Systems (BC4IS) and Blockchain for Trusted Data Sharing (B4TDS), co-located with with the 36th International Conference on Advanced Information Systems Engineering (CAiSE), 3 June 2024, Limassol, Cyprus.
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...Zilliz
Join us to introduce Milvus Lite, a vector database that can run on notebooks and laptops, share the same API with Milvus, and integrate with every popular GenAI framework. This webinar is perfect for developers seeking easy-to-use, well-integrated vector databases for their GenAI apps.
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
CVPR 2018 Paper Reading MobileNet V2
1. MobileNet V2: Inverted Residuals and Linear Bottlenecks
Mark Sandler et al., CVPR 2018
Pham Quang Khang
2018/8/18 Paper Reading Fest 20180819 1
2. Agenda
1. Motivation of research
2. Key components of MobileNet V2
a. Depthwise Separable Convolutions
b. Linear bottlenecks and inverted residual
c. Effect of linear bottlenecks and inverted residual
3. Architecture of MobileNet V2
4. Experiments and results
3. Agenda
1. Motivation of research
2. Key components of MobileNet V2
a. Depthwise Separable Convolutions
b. Linear bottlenecks and inverted residual
c. Effect of linear bottlenecks and inverted residual
3. Architecture of MobileNet V2
4. Experiments and results
5. Evolution of ImageNet models
■ 2012: AlexNet, the major debut for the power of CNNs
– Conv channels (per GPU stream): 48, 128, 192, 192, 128 on a 3-channel input
– FC layers (per GPU stream): 2048, 2048
■ 2014: VGG-19, the power of very deep networks
– Conv layers: 16 conv3×3 (19 weight layers in total)
– FC layers: 4096, 4096
■ 2015: ResNet, a very, very deep network
– 152 layers of residual blocks with various conv widths
– No large FC layers
■ 2014–2016: Inception → Inception v4, Inception + ResNet
■ Xception (CVPR 2017)
■ MobileNet, ShuffleNet ⇒ it is time for architectures that can fit on mobile devices
6. Computation power requirements
■ Previous architectures required massive amounts of memory and computational power
■ To run image classification or detection on mobile devices, lighter models with sufficient accuracy are a must
Model | ImageNet Accuracy | Million Mult-Adds | Million Parameters
MobileNetV2 | 72.0% | 300 | 3.4
MobileNetV1 | 70.6% | 569 | 4.2
GoogleNet (Inception) | 69.8% | 1550 | 6.8
VGG 16 | 71.5% | 15300 | 138
Andrew G. Howard et al. 2017
Mark Sandler et al. 2018
7. Agenda
1. Motivation of research
2. Key components of MobileNet V2
a. Depthwise Separable Convolutions
b. Linear bottlenecks and inverted residual
c. Effect of linear bottlenecks and inverted residual
3. Architecture of MobileNet V2
4. Experiments and results
8. Depthwise Separable Conv
■ Conventional conv: transforms a DF × DF × M input (spatial size DF, M channels) into a DF × DF × N output, using a DK × DK × M × N kernel
– Cost to compute one output point: DK × DK × M
– Cost to compute the whole output: DK × DK × M × DF × DF × N
■ Conv = filtering + combination
■ New way: split into 2 steps, filtering then combination
– Depthwise conv (filtering): apply one DK × DK × 1 kernel per channel to first get an intermediate DF × DF × M output
Cost: DK × DK × M × DF × DF
– Pointwise conv (combination): use a 1 × 1 × M × N kernel to combine the channels of the intermediate output into the final DF × DF × N output
Cost: M × DF × DF × N
– Total cost: DF × DF × M × (DK × DK + N)
– With DK = 3, cost drops by roughly 8–9×
Andrew G. Howard et al. 2017
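The cost formulas above are easy to sanity-check in code. A quick sketch (not from the slides; the layer sizes are hypothetical examples):

```python
def standard_conv_cost(df, dk, m, n):
    """Mult-adds of a standard conv: DK*DK*M multiplications per output
    point, and DF*DF*N output points."""
    return dk * dk * m * df * df * n

def separable_conv_cost(df, dk, m, n):
    """Depthwise cost (DK*DK*M*DF*DF) plus pointwise cost (M*DF*DF*N)."""
    return dk * dk * m * df * df + m * df * df * n

# Hypothetical layer: 112x112 feature map, 3x3 kernel, 32 -> 64 channels
df, dk, m, n = 112, 3, 32, 64
ratio = standard_conv_cost(df, dk, m, n) / separable_conv_cost(df, dk, m, n)
print(round(ratio, 2))  # 9*64/(9+64) ~ 7.89, approaching 9x as N grows
```

The ratio simplifies to (DK² · N)/(DK² + N), which is why the slide's "around 9 times" holds for DK = 3 and large N.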
9. ReLU and information loss
■ Manifold of interest: each activation tensor of dims h_i × w_i × d_i can be treated as h_i × w_i pixels with d_i dimensions
■ If the manifold of interest can be embedded in a low-dimensional subspace, reducing the dimension of the layer should not cause information loss
■ Not so true with a non-linear transformation like ReLU:
– If the manifold of interest remains non-zero volume after the ReLU transformation, ReLU acts on it as a linear transformation
– ReLU is capable of preserving complete information about the input manifold, but only if the input manifold lies in a low-dimensional subspace of the input space
Use linear bottleneck layers
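A toy experiment (my own sketch, in the spirit of the paper's figure, not taken from the slides) makes the loss concrete: embed 2-D points into n dimensions with a random matrix, apply ReLU, and count points whose every coordinate gets zeroed, i.e. points ReLU destroys completely. With a low-dimensional embedding this happens often; with a high-dimensional one it almost never does:

```python
import random

random.seed(0)

def avg_fraction_lost(n, trials=200, points=500):
    """Average fraction of 2-D points that are fully zeroed after a random
    linear embedding into n dims followed by ReLU (all n coords <= 0)."""
    lost = 0.0
    for _ in range(trials):
        pts = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(points)]
        T = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]
        zeroed = sum(
            1 for (x1, x2) in pts
            if all(a * x1 + b * x2 <= 0 for (a, b) in T)
        )
        lost += zeroed / points
    return lost / trials

print(avg_fraction_lost(2))   # roughly a quarter of the points are wiped out
print(avg_fraction_lost(30))  # essentially none
```

This is only the crudest failure mode (total zeroing); the paper's argument also covers the partial distortion ReLU causes, but the trend is the same: the wider the embedding, the more of the manifold survives, which motivates expanding before ReLU and projecting back with a linear bottleneck.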
10. Inverted Residuals and Linear Bottlenecks
■ Residual connections: improve the ability of gradients to propagate through deep networks
■ Inverted residuals (shortcuts between the thin bottleneck layers): considerably more memory efficient
Kaiming He et al. 2015
11. Unit block of MobileNet V2
■ Combines depthwise separable convolutions, linear bottlenecks and the inverted residual block
■ Computational cost per block: h × w × d × t × (d′ + k² + d)
■ With this structure, input and output dimensions can be relatively small
Input | Operator | Output
h × w × d | 1×1 conv2d, ReLU6 | h × w × td
h × w × td | 3×3 dwise (stride s), ReLU6 | h/s × w/s × td
h/s × w/s × td | linear 1×1 conv2d | h/s × w/s × d′
12. Inverted residual bottleneck for memory saving
■ Transformation function: F(x) = [A ∘ N ∘ B](x)
– A: linear transformation (expansion): ℝ^(s×s×k) → ℝ^(s×s×n)
– N: ReLU6 ∘ dwise ∘ ReLU6: ℝ^(s×s×n) → ℝ^(s′×s′×n)
– B: linear transformation (projection): ℝ^(s′×s′×n) → ℝ^(s′×s′×k′)
■ Memory needed: |s² · k| + |s′² · k′| + O(max(s², s′²))
■ If the expansion layer can be separated into t tensors (whose concatenation makes up the full tensor):
F(x) = Σᵢ₌₁ᵗ (Aᵢ ∘ N ∘ Bᵢ)(x)
13. Agenda
1. Motivation of research
2. Key components of MobileNet V2
a. Depthwise Separable Convolutions
b. Linear bottlenecks and inverted residual
c. Effect of linear bottlenecks and inverted residual
3. Architecture of MobileNet V2
4. Experiments and results
14. Architecture of the model
■ Each line is a sequence of 1 or more identical layers, repeated n times
■ c: number of output channels
■ The first layer of each sequence has stride s; all others use stride 1
■ All spatial convs use 3×3 kernels
■ t: expansion factor of the bottleneck layer
■ Input resolution: 96 to 224
■ A width multiplier can be used for thinner models
Input | Operator | t | c | n | s
224² × 3 | conv2d | - | 32 | 1 | 2
112² × 32 | bottleneck | 1 | 16 | 1 | 1
112² × 16 | bottleneck | 6 | 24 | 2 | 2
56² × 24 | bottleneck | 6 | 32 | 3 | 2
28² × 32 | bottleneck | 6 | 64 | 4 | 2
14² × 64 | bottleneck | 6 | 96 | 3 | 1
14² × 96 | bottleneck | 6 | 160 | 3 | 2
7² × 160 | bottleneck | 6 | 320 | 1 | 1
7² × 320 | conv2d 1×1 | - | 1280 | 1 | 1
7² × 1280 | avgpool 7×7 | - | - | 1 | -
1 × 1 × 1280 | conv2d 1×1 | - | k | - | -
16. Agenda
1. Motivation of research
2. Key components of MobileNet V2
a. Depthwise Separable Convolutions
b. Linear bottlenecks and inverted residual
c. Effect of linear bottlenecks and inverted residual
3. Architecture of MobileNet V2
4. Experiments and results
17. ImageNet Classification
■ TensorFlow
■ RMSProp with decay and momentum of 0.9
■ Batch normalization after every layer
■ Weight decay of 0.00004
■ Initial learning rate of 0.045
■ Learning rate decay of 0.98 per epoch
■ 16 GPUs
■ Batch size of 96
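The learning-rate schedule above works out to a simple exponential decay; a minimal sketch (function name is mine, constants are from the slide):

```python
def learning_rate(epoch, base_lr=0.045, decay=0.98):
    """Per-epoch exponential schedule: lr is multiplied by 0.98 each epoch."""
    return base_lr * decay ** epoch

print(learning_rate(0))              # 0.045 at the start of training
print(round(learning_rate(100), 4))  # decayed to roughly 0.006 by epoch 100
```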
Model | ImageNet Accuracy | Million Mult-Adds | Million Parameters
MobileNetV2 | 72.0% | 300 | 3.4
MobileNetV1 | 70.6% | 569 | 4.2
GoogleNet (Inception) | 69.8% | 1550 | 6.8
VGG 16 | 71.5% | 15300 | 138
18. Comparison between models for mobile (ImageNet)
■ MobileNet, ShuffleNet, NasNet
■ MobileNetV2 with different input resolutions vs NasNet, MobileNetV1, ShuffleNet
Model | ImageNet Accuracy | Million Mult-Adds | Million Parameters
MobileNetV1 | 70.6% | 575 | 4.2
ShuffleNet (1.5) | 71.5% | 292 | 3.4
ShuffleNet (×2) | 73.7% | 524 | 5.4
NasNet-A | 74.0% | 564 | 5.3
MobileNetV2 | 72.0% | 300 | 3.4
MobileNetV2 (1.4) | 74.7% | 585 | 6.9
19. Object detection
■ MobileNet V2 used as the feature extractor for object detection with a modified version of the Single Shot Detector (SSD) on the COCO dataset
■ Compared with YOLOv2 and the original SSD
■ SSDLite: replace all normal convs with separable convs in the SSD prediction layers
■ MNetV2 + SSDLite timings measured on a Pixel 1 phone
Liu et al. 2016
Model | mAP (Ave. Precision) | Params (Millions) | MAdd | CPU
SSD300 | 23.2 | 36.1 | 35.2B | -
SSD512 | 26.8 | 36.1 | 99.5B | -
YOLOv2 | 21.6 | 50.7 | 17.5B | -
MNet V1 + SSDLite | 22.2 | 5.1 | 1.3B | 270ms
MNet V2 + SSDLite | 22.1 | 4.3 | 0.8B | 200ms
20. Thank you for listening. Time for Q&A