HighLoad++ 2017
Hall "Nairobi+Casablanca", November 7, 15:00
Abstract:
http://www.highload.ru/2017/abstracts/3044.html
We developed a face detection and recognition technology for Mail.ru products that shows strong results on well-known benchmarks. The technology is currently used in the Mobile Cloud@Mail.ru app to cluster photos by person, as well as in the company's internal services.
...
Face recognition using artificial neural network - Sumeet Kakani
This document provides an overview of a face recognition system that uses artificial neural networks. It describes the structure and processing of artificial neural networks, including convolutional networks. It discusses how the system works, including local image sampling, the self-organizing map, and the convolutional network. It then provides details about the implementation and applications of the system for face recognition, and concludes by discussing the benefits of the system.
This document summarizes a seminar presentation on face recognition using neural networks. It discusses face recognition, neural networks, and the steps involved, which include pre-processing, principal component analysis, and back propagation neural networks. The advantages of neural networks for face recognition are robustness to variations in faces and the ability to learn from data. Face recognition has applications in security and identification.
Face recognition using PCA and neural network ppt
KaoNet: Face Recognition and Generation App using Deep Learning - Van Huy
KaoNet is a face recognition and generation app using deep learning. It uses convolutional neural networks (CNNs) for face recognition and generative adversarial networks (GANs) for face generation. The app was trained on a dataset of celebrity faces collected from online sources. Initial results for face recognition were poor due to overfitting and limited data. Expanding the dataset improved validation accuracy to 98%. The GAN was also able to generate realistic looking faces after training.
This document discusses machine learning and neural networks. It provides information on supervised and unsupervised learning algorithms and applications such as speech recognition, driving cars, and data mining. It also describes the basic components of artificial neural networks including the input, hidden, and output layers. Finally, it summarizes the steps in a face recognition tool using neural networks including image acquisition, cropping, processing, and identification.
Abstract: Face recognition is a form of computer vision that uses faces to identify a person or verify a person's claimed identity. In this paper, a neural-network-based algorithm is presented to detect frontal views of faces. The dimensionality of the input face image is reduced by principal component analysis, and classification is performed by a back propagation neural network. The method is robust on a dataset of 300 face images, with a recognition rate of 80-90%.
This document summarizes a research paper that proposes a neural AdaBoost-based facial expression recognition system. The system uses Viola-Jones detection, Bessel transform downsampling, Gabor feature extraction, AdaBoost feature selection, and a multi-layer neural network classifier. The system was tested on the JAFFE and Yale facial expression databases, achieving average recognition rates of 96.83% and 92.2% respectively. Execution time for 100x100 pixel images was 14.5ms.
This document provides an overview of deep learning and common deep learning concepts. It discusses that deep learning uses complex neural networks to determine representations of data, rather than requiring humans to engineer features. It also describes convolutional neural networks and how they are better than fully connected networks for tasks like image recognition. Additionally, it covers transfer learning and how pre-trained models can be adapted to new tasks by retraining final layers, reducing data and computation needs. Common deep learning architectures mentioned include AlexNet, VGG16, Inception and MobileNets.
This document summarizes research on deep learning approaches for face recognition. It describes the DeepFace model from Facebook, which used a deep convolutional network trained on 4.4 million faces to achieve state-of-the-art accuracy on the Labeled Faces in the Wild (LFW) dataset. It also summarizes the DeepID2 and DeepID3 models from Chinese University of Hong Kong, which employed joint identification-verification training of convolutional networks and achieved performance comparable or superior to DeepFace on LFW. Evaluation metrics for face verification and identification tasks are also outlined.
The document discusses network design and training issues for artificial neural networks. It covers architecture of the network including number of layers and nodes, learning rules, and ensuring optimal training. It also discusses data preparation including consolidation, selection, preprocessing, transformation and encoding of data before training the network.
The document describes a scene understanding model that generates natural language descriptions of images. It discusses how humans understand scenes, then outlines the key components of the model: convolutional neural networks to extract image features, transfer learning from pre-trained models, and recurrent neural networks to generate captions. The presentation includes details on CNNs, LSTMs, training the model on Flickr 30k images and captions, and a demonstration of captions generated for sample images of varying complexity.
Data Science - Part VIII - Artifical Neural Network - Derek Kane
This lecture provides an overview of biological based learning in the brain and how to simulate this approach through the use of feed-forward artificial neural networks with back propagation. We will go through some methods of calibration and diagnostics and then apply the technique on three different data mining tasks: binary prediction, classification, and time series prediction.
This document summarizes a Kaggle competition on ultrasound nerve segmentation. It describes the data provided, which includes over 5000 training images and masks of the Brachial Plexus nerve. Several baselines are presented, with the top method being a U-Net model achieving a score of 0.62. The document then analyzes aspects of the winning solution in detail, which was based on a modified U-Net architecture with techniques like dropout, data augmentation, and an ensemble of models to achieve a final score of 0.70399. Other approaches tried like FCNs and Inception networks are also discussed.
Comparison of Learning Algorithms for Handwritten Digit Recognition - Safaa Alnabulsi
This document compares different machine learning algorithms for handwritten digit recognition on the MNIST dataset. Convolutional neural networks achieved the best results, with LeNet5 achieving 0.9% error and boosted LeNet4 achieving the lowest error rate of 0.7%. Neural networks required more training time but had faster recognition times and lower memory requirements compared to nearest neighbor classifiers. Overall, convolutional neural networks were best suited for handwritten digit recognition due to their ability to handle variations in size, position and orientation of digits.
This document provides an introduction to machine learning and neural networks. It discusses key concepts like supervised vs unsupervised learning, classification vs regression problems, and performance evaluation metrics. It also covers foundational machine learning techniques like k-nearest neighbors for classification and regression. Descriptive statistics concepts like mean, variance, correlation and covariance are introduced. Finally, it discusses visualizing data through scatter plots and histograms.
We seek to classify images into different emotions using a first 'intuitive' machine learning approach, then training models using convolutional neural networks and finally using a pretrained model for better accuracy.
This document discusses using a cascade correlation neural network (CCNN) to capture the drawing style of a caricaturist in order to automatically generate caricatures. It proposes extracting facial components from original images, mean faces, and caricatures to create training data. The CCNN is trained using this data to learn the exaggerations made by the caricaturist. Experiments show the CCNN can accurately predict nonlinear exaggerations to components. The approach aims to address limitations of existing caricature generation systems by learning an individual artist's unique style through training on their deformations of facial objects.
The document discusses using a convolutional neural network to recognize handwritten digits from the MNIST database. It describes training a CNN on the MNIST training dataset, consisting of 60,000 examples, to classify images of handwritten digits from 0-9. The CNN architecture uses two convolutional layers followed by a flatten layer and fully connected layer with softmax activation. The model achieves high accuracy on the MNIST test set. However, the document notes that the model may struggle with color images or images with more complex backgrounds compared to the simple black and white MNIST digits. Improving preprocessing and adapting the model for more complex real-world images is suggested for future work.
This document discusses the layers of convolutional neural networks (CNNs). It provides an overview of common CNN layers including convolutional layers, max pooling layers, padding, rectified linear unit (ReLU) nonlinearity, and fully connected layers. Convolutional layers extract features from input images using small filter matrices in a sliding window approach. Max pooling layers reduce the dimensionality of feature maps. Padding handles edge effects when filters are smaller than inputs. ReLU introduces nonlinearity. Fully connected layers flatten feature maps into vectors for classification. The document reviews the functions of these key CNN layers.
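The layer functions listed above can be illustrated with a minimal pure-Python sketch. It is not taken from the summarized document: the function names (`conv2d_valid`, `relu`, `max_pool2x2`), the toy 5x5 image, and the 2x2 edge filter are all assumptions chosen for illustration, and padding is omitted to keep the example short.

```python
# Illustrative sketch only: a sliding-window convolution without padding,
# a ReLU nonlinearity, and 2x2 max pooling on nested Python lists.
# As in most deep learning frameworks, the "convolution" here does not
# flip the kernel (strictly speaking it is a cross-correlation).

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` and return the feature map (no padding)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


def relu(fmap):
    """Replace negative activations with zero."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool2x2(fmap):
    """Non-overlapping 2x2 max pooling; a trailing odd row/column is dropped."""
    h = len(fmap) // 2 * 2
    w = len(fmap[0]) // 2 * 2
    return [
        [max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
         for c in range(0, w, 2)]
        for r in range(0, h, 2)
    ]


# Toy input: a vertical edge between column 1 and column 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical filter that responds to a left-to-right intensity increase.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d_valid(image, kernel)   # 4x4 map, peak at the edge
pooled = max_pool2x2(relu(feature_map))     # 2x2 map after pooling
```

The 5x5 input shrinks to 4x4 after the unpadded convolution and to 2x2 after pooling, which is the dimensionality reduction the summary refers to.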
Detection and recognition of face using neural network - Smriti Tikoo
This document describes research on face detection and recognition using neural networks. It discusses using the Viola-Jones algorithm for face detection and a backpropagation neural network for face recognition. The Viola-Jones algorithm uses Haar features, integral images, AdaBoost training, and cascading classifiers for real-time face detection. A backpropagation network with sigmoid activation functions is trained on facial images for recognition. Results show the network can accurately recognize faces after training. The document concludes the approach allows face recognition from an input image and discusses limitations and potential improvements.
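To make the integral-image idea mentioned above concrete, here is a small pure-Python sketch. It is an illustration under assumptions, not the summarized paper's code: the helper names (`integral_image`, `rect_sum`, `two_rect_haar`) and the toy 3x4 image are hypothetical. The point it shows is that, once the integral image is built, the sum over any rectangle (and therefore any Haar-like feature) costs only four lookups.

```python
# Illustrative sketch only: integral image and a two-rectangle Haar-like feature.

def integral_image(img):
    """Return a (h+1) x (w+1) table where ii[r][c] = sum of img[:r][:c]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle using four lookups in the integral image."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_haar(ii, top, left, height, width):
    """Hypothetical two-rectangle feature: left half minus right half."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12]]
ii = integral_image(img)

total = rect_sum(ii, 0, 0, 3, 4)        # whole image
inner = rect_sum(ii, 1, 1, 2, 2)        # 6 + 7 + 10 + 11
feature = two_rect_haar(ii, 0, 0, 3, 4) # left two columns minus right two
```

In a full Viola-Jones detector, many such features would be selected by AdaBoost and arranged in a cascade; that part is outside the scope of this sketch.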
Hand Gesture Recognition using OpenCV and Python - ijtsrd
Hand gesture recognition systems have developed rapidly in recent years because of their ability to interact with machines successfully. Gestures are considered the most natural means of communication between humans and PCs in a virtual framework. We often use hand gestures to convey something, as they are a non-verbal form of communication that allows free expression. In our system, we used background subtraction to extract the hand region. In this application, the PC's camera records a live video, from which a preview is taken with the assistance of its functionalities or activities. Surya Narayan Sharma | Dr. A Rengarajan "Hand Gesture Recognition using OpenCV and Python" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5 | Issue-2, February 2021, URL: https://www.ijtsrd.com/papers/ijtsrd38413.pdf Paper Url: https://www.ijtsrd.com/computer-science/other/38413/hand-gesture-recognition-using-opencv-and-python/surya-narayan-sharma
This document provides an overview of deep learning concepts including neural networks, supervised and unsupervised learning, and key terms. It explains that deep learning uses neural networks with many hidden layers to learn features directly from raw data. Supervised learning algorithms learn from labeled examples to perform classification or regression on unseen data. Unsupervised learning finds patterns in unlabeled data. Key terms defined include neurons, activation functions, loss functions, optimizers, epochs, batches, and hyperparameters.
IRJET- Efficient Face Detection from Video Sequences using KNN and PCA - IRJET Journal
1. The document proposes a new algorithm for efficient face detection from video sequences using K-Nearest Neighbors (KNN) and Principal Component Analysis (PCA).
2. PCA is used for feature extraction to reduce the dimensionality of the face images. KNN is then used for classification, where the k closest training examples are found based on Euclidean distance measures.
3. The proposed method achieves 99.47% accuracy on sample face images based on classification using 1NN, demonstrating the effectiveness of combining PCA for feature extraction with KNN for real-time face detection from video sequences.
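The classification step described in the list above (finding the k closest training examples by Euclidean distance) might look roughly like the following sketch. It is a minimal illustration, not the paper's implementation: the feature vectors are assumed to be already reduced by PCA, and the two-dimensional toy data, the labels, and the function names are made up for the example.

```python
# Illustrative sketch only: k-nearest-neighbour classification with
# Euclidean distance on (assumed) PCA-reduced feature vectors.
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=1):
    """Return the majority label among the k training items closest to `query`.

    `train` is a list of (feature_vector, label) pairs.
    """
    neighbours = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D projections of face images for two people.
train = [
    ((0.0, 0.1), "person_a"),
    ((0.2, 0.0), "person_a"),
    ((1.0, 1.1), "person_b"),
    ((0.9, 1.0), "person_b"),
]

pred_1nn = knn_predict(train, (0.1, 0.1), k=1)
pred_3nn = knn_predict(train, (0.95, 0.9), k=3)
```

With `k=1` this corresponds to the 1NN setting quoted in the summary; larger values of `k` trade sensitivity to single outliers for smoother decision boundaries.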
Face recognition technology uses biometrics to automatically recognize individuals or verify their identity based on unique measurable characteristics of the human face. It analyzes 80 landmarks on the face such as distance between eyes, width of nose, cheekbones, and jawline. Face recognition is commonly used for identification from large crowds, verification for credit cards and passports, and does not require physical contact or specialized interpretation of results. Common methods of face recognition include eigenface analysis using principal component analysis to extract features from faces and match new images to those in a database. Recent applications include uses for immigration, security, and targeted advertising based on facial analysis.
Automatic gender and age classification has become increasingly relevant with the rise of social media platforms. However, existing methods have not been completely successful at it. This project attempts to determine gender and age from a single frame of a person, using deep learning and OpenCV, which is capable of processing real-time frames. The frame is given as input, and the predicted gender and age are given as output. Predicting a person's exact age from one frame is difficult because of facial expressions, lighting, makeup, and so on, so several age ranges are defined and the predicted age falls into one of them. The Adience dataset is used because it is a benchmark for face photos and includes various real-world imaging conditions such as noise and lighting.
The document discusses machine learning algorithms used to predict personal income from census data. Three algorithms were tested: neural networks, support vector machines, and maximum entropy modeling. Maximum entropy modeling achieved the best results at 87.32% accuracy by using a selection of important features and excluding less predictive features like the third attribute. Voting the results of the three algorithms produced an accuracy of 85.57%.
Heuristic design of experiments w meta gradient search - Greg Makowski
Once you have started learning about predictive algorithms, and the basic knowledge discovery in databases process, what is the next level of detail to learn for a consulting project?
* Give examples of the many model training parameters
* Track results in a "model notebook"
* Use a model metric that combines both accuracy and generalization to rank models
* How to strategically search over the model training parameters - use a gradient descent approach
* One way to describe an arbitrarily complex predictive system is by using sensitivity analysis
This document provides an overview of deep learning concepts and techniques for computer vision applications using MATLAB. It discusses traditional machine learning versus deep learning, popular pretrained deep learning models, building and training convolutional neural networks (CNNs), and using transfer learning to fine-tune pretrained models on new datasets with fewer samples. The key techniques covered are loading pretrained networks, replacing the final layers for a new task, training the modified network on a smaller labeled dataset, and evaluating the trained model on test data. The document aims to explain deep learning workflows and enable readers to implement techniques like transfer learning using MATLAB.
This document discusses techniques for improving deep learning models and reducing overfitting, including regularization, batch normalization, and transfer learning. It provides explanations and examples of common regularization techniques like weight decay, dropout, and early stopping. It also explains batch normalization and how it helps speed up training and reduce internal covariate shift. Finally, it introduces transfer learning as a way to utilize pre-trained models on new tasks by freezing earlier layers and fine-tuning later layers.
This document provides an overview of deep learning including why it is used, common applications, strengths and challenges, common algorithms, and techniques for developing deep learning models. In 3 sentences: Deep learning methods like neural networks can learn complex patterns in large, unlabeled datasets and are better than traditional machine learning for tasks like image recognition. Popular deep learning algorithms include convolutional neural networks for image data and recurrent neural networks for sequential data. Effective deep learning requires techniques like regularization, dropout, data augmentation, and hyperparameter optimization to prevent overfitting on training data.
This document provides an overview of deep learning including:
1. Why deep learning performs better than traditional machine learning for tasks like image and speech recognition.
2. Common deep learning applications such as image recognition, speech recognition, and healthcare.
3. Challenges of deep learning like the need for large datasets and lack of interpretability.
The document outlines a presentation on implementing an optical face recognition module using MATLAB. It discusses the task requirements, system overview, implementation steps using various algorithms like Eigenface and Fisherface, and experiments testing recognition rates under different conditions. Future work ideas are also proposed to improve robustness and efficiency.
Learn to Build an App to Find Similar Images using Deep Learning - Piotr Teterwak
PyData
This document discusses using deep learning and deep features to build an app that finds similar images. It begins with an overview of deep learning and how neural networks can learn complex patterns in data. The document then discusses how pre-trained neural networks can be used as feature extractors for other domains through transfer learning. This reduces data and tuning requirements compared to training new deep learning models. The rest of the document focuses on building an image similarity service using these techniques, including training a model with GraphLab Create and deploying it as a web service with Dato Predictive Services.
The document provides an introduction to deep learning and how to compute gradients in deep learning models. It discusses machine learning concepts like training models on data to learn patterns, supervised learning tasks like image classification, and optimization techniques like stochastic gradient descent. It then explains how to compute gradients using backpropagation in deep multi-layer neural networks, allowing models to be trained on large datasets. Key steps like the chain rule and backpropagation of errors from the final layer back through the network are outlined.
Machine Learning Essentials Demystified part2 | Big Data Demystified
Omid Vahdaty
The document provides an overview of machine learning concepts including linear regression, artificial neural networks, and convolutional neural networks. It discusses how artificial neural networks are inspired by biological neurons and can learn relationships in data. The document uses the MNIST dataset example to demonstrate how a neural network can be trained to classify images of handwritten digits using backpropagation to adjust weights to minimize error. TensorFlow is introduced as a popular Python library for building machine learning models, enabling flexible creation and training of neural networks.
This document discusses using fully convolutional neural networks for defect inspection. It begins with an agenda that outlines image segmentation using FCNs and defect inspection. It then provides details on data preparation including labeling guidelines, data augmentation, and model setup using techniques like deconvolution layers and the U-Net architecture. Metrics for evaluating the model like Dice score and IoU are also covered. The document concludes with best practices for successful deep learning projects focusing on aspects like having a large reusable dataset, feasibility of the problem, potential payoff, and fault tolerance.
Kaggle review. Planet: Understanding the Amazon from Space
Eduard Tyantov
This document summarizes a Kaggle competition to detect deforestation in the Amazon rainforest using satellite images. It describes:
1. The competition involved classifying over 150,000 image chips into 17 land cover classes to detect deforestation.
2. The baseline model was a ResNet-18 pretrained on ImageNet with fine-tuning, which achieved a score of 90.06%. Several techniques like optimal class thresholds and hyperparameter tuning improved the score to 92.53%.
3. The top models combined RGB satellite images with a near-infrared channel and indexes, training separate branches on JPG and TIF data. The best single model scored 93.071% by ensembling different model
Machine learning lets you make better business decisions by uncovering patterns in your consumer behavior data that is hard for the human eye to spot. You can also use it to automate routine, expensive human tasks that were previously not doable by computers. In the business to business space (B2B), if your competitors can make wiser business decisions based on data and automate more business operations but you still base your decisions on guesswork and lack automation, you will lose out on business productivity. In this introduction to machine learning tech talk, you will learn how to use machine learning even if you do not have deep technical expertise on this technology.
Topics covered:
1. What is machine learning
2. What is a typical ML application architecture
3. How to start ML development with free resource links
4. Key decision factors in ML technology selection depending on use case scenarios
This document provides an introduction to computer vision with convolutional neural networks. It discusses what computer vision aims to address and gives a brief overview of neural networks and their basic building blocks. It then covers the history and evolution of convolutional neural networks, how and why they work on digital images, their limitations, and applications like object detection. Examples are provided of early CNNs from the 1980s and 1990s and of advances through the 2010s that improved accuracy, including deeper networks, inception modules, and residual connections, as well as efficiency-oriented designs like MobileNets. Training deep CNNs requires large datasets and may take weeks, but pre-trained networks can be fine-tuned for new tasks.
Similar to Face Recognition: From Scratch To Hatch / Eduard Tyantov (Mail.ru Group) (20)
One-cloud: a data center management system at Odnoklassniki / Oleg Anastasye...
Ontico
HighLoad++ 2017
Hall "Kaliningrad", November 8, 15:00
Abstract:
http://www.highload.ru/2017/abstracts/2964.html
Odnoklassniki runs on more than eight thousand bare-metal servers located in several data centers. Each of these machines was dedicated to a specific task, both to isolate failures and to enable automated infrastructure management.
...
Scaling DNS / Artyom Gavrichenkov (Qrator Labs)
Ontico
HighLoad++ 2017
Hall "Kaliningrad", November 8, 16:00
Abstract:
http://www.highload.ru/2017/abstracts/3032.html
The DNS protocol is seven years older than the World Wide Web. RFC 882 and RFC 883, which define the core functionality of the domain name system, appeared at the end of 1983, and the first implementation followed a year later. Naturally, a technology this old that is still in very active use today has accumulated quirks that are not obvious to ordinary users.
...
Building a BigData platform for Russian Post (FSUE) / Andrey Bashchenko (Luxoft)
Ontico
HighLoad++ 2017
Hall "Kaliningrad", November 8, 13:00
Abstract:
http://www.highload.ru/2017/abstracts/3010.html
In this talk I will explain how the BigData platform helps transform Russian Post and how we manage the construction and evolution of the platform. I will cover the successful solutions we found, for example how splitting the platform into products with clear SLAs and interfaces between them helped us keep the project manageable as it grew.
...
Preparing a test environment, or how many test instances do you need / Aleksa...
Ontico
HighLoad++ 2017
Hall "Cape Town", November 8, 10:00
Abstract:
http://www.highload.ru/2017/abstracts/2914.html
What does it take, it would seem, to set up a test environment? A test machine and a copy of the production environment, and the test server is ready. But what if the project is complex? Or large? What if many versions have to be tested at the same time? And what if all of that applies at once?
Organizing testing for a large, evolving project with around fifty features in development and testing at the same time is not an easy task. The situation is usually complicated by the fact that people sometimes want to try out functionality that is not fully ready yet. In such situations the question often comes up: "Where can this be deployed, and where can I click around?"
...
New data replication technologies in PostgreSQL / Aleksandr Alekseev (Postgre...
Ontico
HighLoad++ 2017
Hall "Cape Town", November 8, 18:00
Abstract:
http://www.highload.ru/2017/abstracts/2854.html
In this talk you will learn about PostgreSQL replication and automatic failover capabilities, including those that became available in PostgreSQL 10.
Among others, the following topics will be covered:
* Types of replication and the problems they solve.
* Setting up streaming replication.
* Setting up logical replication.
* Setting up automatic failover / HA with Stolon and Consul.
After the talk you will be able to set up PostgreSQL replication and automatic failover on your own.
PostgreSQL Configuration for Humans / Alvaro Hernandez (OnGres)
Ontico
HighLoad++ 2017
Hall "Cape Town", November 8, 17:00
Abstract:
http://www.highload.ru/2017/abstracts/3096.html
PostgreSQL is the world’s most advanced open source database. Indeed! With around 270 configuration parameters in postgresql.conf, plus all the knobs in pg_hba.conf, it is definitely ADVANCED!
How many parameters do you tune? 1? 8? 32? Anyone ever tuned more than 64?
No tuning means below par performance. But how to start? Which parameters to tune? What are the appropriate values? Is there a tool --not just an editor like vim or emacs-- to help users manage the 700-line postgresql.conf file?
Join this talk to understand the performance advantages of appropriately tuning your postgresql.conf file, showcase a new free tool to make PostgreSQL configuration possible for HUMANS, and learn the best practices for tuning several relevant postgresql.conf parameters.
Inexpensive Datamasking for MySQL with ProxySQL - Data Anonymization for Deve...
Ontico
HighLoad++ 2017
Hall "Cape Town", November 8, 16:00
Abstract:
http://www.highload.ru/2017/abstracts/3115.html
During this session we will cover the latest development in ProxySQL to support regular expressions (RE2 and PCRE) and how we can use this powerful technique together with ProxySQL's query rules to anonymize live data quickly and transparently. We will explain the mechanism and how to generate these rules quickly. We will show a live demo covering the challenges we received from the community, and we will finish the session with an interactive brainstorm, testing queries from the audience.
Experience developing a firewall module for MySQL / Oleg Broslavsky...
Ontico
HighLoad++ 2017
Hall "Cape Town", November 8, 15:00
Abstract:
http://www.highload.ru/2017/abstracts/2957.html
We will share our experience of developing a firewall module for MySQL using the ANTLR parser generator and the Kotlin language.
We will look in detail at the following questions:
- when and why it makes sense to use ANTLR;
- specifics of developing an ANTLR grammar for MySQL;
- a performance comparison of ANTLR runtimes for the task of parsing MySQL (C#, Java, Kotlin, Go, Python, PyPy, C++);
- auxiliary DSLs;
- the microservice architecture of the SQL firewall module;
- the results obtained.
ProxySQL Use Case Scenarios / Alkin Tezuysal (Percona)
Ontico
HighLoad++ 2017
Hall "Cape Town", November 8, 14:00
Abstract:
http://www.highload.ru/2017/abstracts/3114.html
ProxySQL aims to be the most powerful proxy in the MySQL ecosystem. It is protocol-aware and able to provide high availability (HA) and high performance with no changes in the application, using several built-in features and integration with clustering software. During this session we will quickly introduce its main features, so to better understand how it works. We will then describe multiple use case scenarios in which ProxySQL empowers large MySQL installations to provide HA with zero downtime, read/write split, query rewrite, sharding, query caching, and multiplexing using SSL across data centers.
MySQL Replication - Advanced Features / Peter Zaitsev (Percona)
Ontico
HighLoad++ 2017
Hall "Cape Town", November 8, 13:00
Abstract:
http://www.highload.ru/2017/abstracts/2954.html
MySQL Replication is powerful and has gained many advanced features over the years. In this presentation we will look into replication technology in MySQL 5.7 and its variants, focusing on advanced features: what they mean, when to use them, and when not to, including:
When should you use STATEMENT, ROW or MIXED binary log format?
What is GTID in MySQL and MariaDB and why do you want to use them?
What is semi-sync replication and how is it different from lossless semi-sync?
...
Internal open source. How to develop a mobile application with a large number...
Ontico
HighLoad++ 2017
Hall "Cape Town", November 8, 12:00
Abstract:
http://www.highload.ru/2017/abstracts/3120.html
The number of developers working on the Sberbank Online mobile applications has grown by an order of magnitude since the beginning of 2016. To keep shipping a quality product, we are radically restructuring the development process.
At some point the number of internal customers requesting changes grew so large that developers became the bottleneck. We introduced a development culture that can loosely be called "internal open source": we kept control over the architecture and quality of the project but allowed anyone who wishes to develop new features.
...
In detail: how Causal Consistency is implemented in MongoDB / Mikhail Tyulenev...
Ontico
HighLoad++ 2017
Hall "Mumbai", November 8, 18:00
Abstract:
http://www.highload.ru/2017/abstracts/2836.html
With eventually consistent distributed databases there is no guarantee that a read returns the results of the latest data changes if the read and the write are performed on different nodes. This limits the throughput of the system. Support for Causal Consistency removes this limitation, which improves scalability without requiring changes to application code.
...
Wire-speed load balancing. No ASIC, no limits. NFWare solutions ...
Ontico
HighLoad++ 2017
Hall "Mumbai", November 8, 16:00
Abstract:
http://www.highload.ru/2017/abstracts/2858.html
The Odnoklassniki audience exceeds 73 million people in Russia, the CIS, and countries beyond. OK.ru is also the number one social network by video views on the Russian-language internet (Runet) and the largest service platform.
The growth of DDoS attacks in both quality and quantity over recent years has made them one of the top-priority problems for the largest internet resources. Depending on the attack vector, one part of the infrastructure or another becomes the bottleneck. In particular, with a SYN flood the first blow falls on the traffic load-balancing system. Success in withstanding the attack depends on its performance.
...
Traffic interception: myths and reality / Evgeny Uskov (Qrator Labs)
Ontico
HighLoad++ 2017
Hall "Mumbai", November 8, 15:00
Abstract:
http://www.highload.ru/2017/abstracts/3008.html
It has never happened before, and here it is again! As a result of a traffic rerouting, Google made several thousand different services unavailable in Japan, most of which have nothing to do with Google itself. Incidents like this happen with enviable regularity, but they do not always make it into the mainstream media. They can have various causes, ranging from mistakes by network engineers to government regulation.
...
And then the clouds will surely start to dance! / Aleksey Sushkov (PETER-SERVICE)
Ontico
HighLoad++ 2017
Hall "Mumbai", November 8, 14:00
Abstract:
http://www.highload.ru/2017/abstracts/2925.html
Clouds and virtualization are current trends in IT. Telecom operators build their TelcoClouds on the NFV (Network Functions Virtualization) and SDN (Software-Defined Networking) standards. In the talk we will start with the basics of virtualization, then work out what NFV and SDN are used for, then fly up to the clouds and come back down to earth to solve practical problems!
...
How we made Druid work at Odnoklassniki / Yury Nevinitsin (OK.RU)
Ontico
HighLoad++ 2017
Hall "Mumbai", November 8, 10:00
Abstract:
http://www.highload.ru/2017/abstracts/3045.html
How we made Druid work at Odnoklassniki.
"Druid is a high-performance, column-oriented, distributed data store" http://druid.io.
We will describe how, by adopting Druid, we dealt with a situation in which our 50-terabyte MSSQL-based statistics system had become:
- slow: the average response speed was several times lower than required (and it has since increased 20-fold);
- unstable: at peak hours statistics lagged by up to half an hour (now nothing lags);
- expensive: Microsoft changed its licensing policy, and license costs could have reached millions of dollars.
...
Speeding up ASP.NET Core / Ilya Verbitsky (WebStoating s.r.o.)
Ontico
HighLoad++ 2017
Hall "Rio de Janeiro", November 8, 18:00
Abstract:
http://www.highload.ru/2017/abstracts/2905.html
More than a year has passed since Microsoft released the first version of ASP.NET Core, its new framework for web application development, and it gains more fans every day. ASP.NET Core is based on .NET Core, a cross-platform, open-source version of the .NET platform. C# developers can now use a Mac as a development environment and run applications on Linux or inside Docker containers.
...
100500 ways of caching in Oracle Database, or how to achieve maximum sp...
Ontico
HighLoad++ 2017
Hall "Rio de Janeiro", November 8, 14:00
Abstract:
http://www.highload.ru/2017/abstracts/2913.html
First, the talk will explain the basic reasons behind the emergence of the result cache as a part of the DBMS machinery, and why some DBMSs have it and others do not.
Various options for caching the results of both SQL queries and business logic stored in the database will be covered. Caching approaches (hand-written caches, built-in functionality) will be compared, with recommendations on when each approach is optimal and when it can be dangerous.
...
Apache Ignite Persistence: why In-Memory needs Persistence, and how it works...
Ontico
HighLoad++ 2017
Hall "Rio de Janeiro", November 8, 13:00
Abstract:
http://www.highload.ru/2017/abstracts/2947.html
Apache Ignite is an open-source platform for high-performance distributed processing of big data using SQL or Java/.NET/C++ APIs. Ignite is used in a wide range of industries. Sberbank, ING, RingCentral, Microsoft, e-Therapeutics: all of these companies use Ignite-based solutions. Cluster sizes range from a single node to several hundred, and nodes can be located in one data center or in several geographically distributed ones.
...
HighLoad++ 2017
Hall "Rio de Janeiro", November 8, 12:00
Abstract:
http://www.highload.ru/2017/abstracts/3005.html
When we talk about high-load systems and databases with a large number of concurrent connections, the practice of operating and maintaining such projects is of particular interest. This includes the DBMS tools and mechanisms that DBAs and DevOps engineers can use to monitor database health and diagnose potential problems early.
...
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
Victor Morales
K8sGPT is a tool that analyzes and diagnoses Kubernetes clusters. This presentation was used to share the requirements and dependencies to deploy K8sGPT in a local environment.
Literature Review Basics and Understanding Reference Management.pptx
Dr Ramhari Poudyal
Three-day training on academic research focuses on analytical tools at United Technical College, supported by the University Grant Commission, Nepal. 24-26 May 2024
Understanding Inductive Bias in Machine Learning
SUTEJAS
This presentation explores the concept of inductive bias in machine learning. It explains how algorithms come with built-in assumptions and preferences that guide the learning process. You'll learn about the different types of inductive bias and how they can impact the performance and generalizability of machine learning models.
The presentation also covers the positive and negative aspects of inductive bias, along with strategies for mitigating potential drawbacks. We'll explore examples of how bias manifests in algorithms like neural networks and decision trees.
By understanding inductive bias, you can gain valuable insights into how machine learning models work and make informed decisions when building and deploying them.
A SYSTEMATIC RISK ASSESSMENT APPROACH FOR SECURING THE SMART IRRIGATION SYSTEMS
IJNSA Journal
The smart irrigation system represents an innovative approach to optimize water usage in agricultural and landscaping practices. The integration of cutting-edge technologies, including sensors, actuators, and data analysis, empowers this system to provide accurate monitoring and control of irrigation processes by leveraging real-time environmental conditions. The main objective of a smart irrigation system is to optimize water efficiency, minimize expenses, and foster the adoption of sustainable water management methods. This paper conducts a systematic risk assessment by exploring the key components/assets and their functionalities in the smart irrigation system. The crucial role of sensors in gathering data on soil moisture, weather patterns, and plant well-being is emphasized in this system. These sensors enable intelligent decision-making in irrigation scheduling and water distribution, leading to enhanced water efficiency and sustainable water management practices. Actuators enable automated control of irrigation devices, ensuring precise and targeted water delivery to plants. Additionally, the paper addresses the potential threat and vulnerabilities associated with smart irrigation systems. It discusses limitations of the system, such as power constraints and computational capabilities, and calculates the potential security risks. The paper suggests possible risk treatment methods for effective secure system operation. In conclusion, the paper emphasizes the significant benefits of implementing smart irrigation systems, including improved water conservation, increased crop yield, and reduced environmental impact. Additionally, based on the security analysis conducted, the paper recommends the implementation of countermeasures and security approaches to address vulnerabilities and ensure the integrity and reliability of the system. 
By incorporating these measures, smart irrigation technology can revolutionize water management practices in agriculture, promoting sustainability, resource efficiency, and safeguarding against potential security threats.
ACEP Magazine edition 4th launched on 05.06.2024
Rahul
This document provides information about the third edition of the magazine "Sthapatya" published by the Association of Civil Engineers (Practicing) Aurangabad. It includes messages from current and past presidents of ACEP, memories and photos from past ACEP events, information on life time achievement awards given by ACEP, and a technical article on concrete maintenance, repairs and strengthening. The document highlights activities of ACEP and provides a technical educational article for members.
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Sinan KOZAK
Sinan from the Delivery Hero mobile infrastructure engineering team shares a deep dive into performance acceleration with Gradle build cache optimizations. Sinan shares their journey into solving complex build-cache problems that affect Gradle builds. By understanding the challenges and solutions found in our journey, we aim to demonstrate the possibilities for faster builds. The case study reveals how overlapping outputs and cache misconfigurations led to significant increases in build times, especially as the project scaled up with numerous modules using Paparazzi tests. The journey from diagnosing to defeating cache issues offers invaluable lessons on maintaining cache integrity without sacrificing functionality.
A review on techniques and modelling methodologies used for checking electrom...
nooriasukmaningtyas
The proper function of the integrated circuit (IC) in an inhibiting electromagnetic environment has always been a serious concern throughout the decades of revolution in the world of electronics, from disjunct devices to today’s integrated circuit technology, where billions of transistors are combined on a single chip. The automotive industry and smart vehicles in particular, are confronting design issues such as being prone to electromagnetic interference (EMI). Electronic control devices calculate incorrect outputs because of EMI and sensors give misleading values which can prove fatal in case of automotives. In this paper, the authors have non exhaustively tried to review research work concerned with the investigation of EMI in ICs and prediction of this EMI using various modelling methodologies and measurement setups.
Using recycled concrete aggregates (RCA) for pavements is crucial to achieving sustainability. Implementing RCA for new pavement can minimize carbon footprint, conserve natural resources, reduce harmful emissions, and lower life cycle costs. Compared to natural aggregate (NA), RCA pavement has fewer comprehensive studies and sustainability assessments.
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
IJECEIAES
Climate change's impact on the planet forced the United Nations and governments to promote green energies and electric transportation. The deployments of photovoltaic (PV) and electric vehicle (EV) systems gained stronger momentum due to their numerous advantages over fossil fuel types. The advantages go beyond sustainability to reach financial support and stability. The work in this paper introduces the hybrid system between PV and EV to support industrial and commercial plants. This paper covers the theoretical framework of the proposed hybrid system including the required equation to complete the cost analysis when PV and EV are present. In addition, the proposed design diagram which sets the priorities and requirements of the system is presented. The proposed approach allows setup to advance their power stability, especially during power outages. The presented information supports researchers and plant owners to complete the necessary analysis while promoting the deployment of clean energy. The result of a case study that represents a dairy milk farmer supports the theoretical works and highlights its advanced benefits to existing plants. The short return on investment of the proposed approach supports the paper's novelty approach for the sustainable electrical system. In addition, the proposed system allows for an isolated power setup without the need for a transmission line which enhances the safety of the electrical network
Comparative analysis between traditional aquaponics and reconstructed aquapon...
bijceesjournal
The aquaponic system of planting is a method that does not require soil usage. It is a method that only needs water, fish, lava rocks (a substitute for soil), and plants. Aquaponic systems are sustainable and environmentally friendly. Its use not only helps to plant in small spaces but also helps reduce artificial chemical use and minimizes excess water use, as aquaponics consumes 90% less water than soil-based gardening. The study applied a descriptive and experimental design to assess and compare conventional and reconstructed aquaponic methods for reproducing tomatoes. The researchers created an observation checklist to determine the significant factors of the study. The study aims to determine the significant difference between traditional aquaponics and reconstructed aquaponics systems propagating tomatoes in terms of height, weight, girth, and number of fruits. The reconstructed aquaponics system’s higher growth yield results in a much more nourished crop than the traditional aquaponics system. It is superior in its number of fruits, height, weight, and girth measurement. Moreover, the reconstructed aquaponics system is proven to eliminate all the hindrances present in the traditional aquaponics system, which are overcrowding of fish, algae growth, pest problems, contaminated water, and dead fish.
Batteries -Introduction – Types of Batteries – discharging and charging of battery - characteristics of battery –battery rating- various tests on battery- – Primary battery: silver button cell- Secondary battery :Ni-Cd battery-modern battery: lithium ion battery-maintenance of batteries-choices of batteries for electric vehicle applications.
Fuel Cells: Introduction- importance and classification of fuel cells - description, principle, components, applications of fuel cells: H2-O2 fuel cell, alkaline fuel cell, molten carbonate fuel cell and direct methanol fuel cells.
13. Viola-Jones algorithm: inference
For each patch: Stage 1 -> (Yes) -> Stage 2 -> (Yes) -> ... -> Stage N -> Face
Optimization
– Features are grouped into stages
– If a patch fails any stage => discard
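The early-rejection idea on this slide can be sketched as follows. This is a minimal illustration, not the Viola-Jones implementation itself: the stage functions, their thresholds, and the `evaluated` log are hypothetical stand-ins for real boosted Haar-feature stages.

```python
from typing import Callable, List, Sequence

Patch = Sequence[float]
Stage = Callable[[Patch], bool]


def cascade_accepts(patch: Patch, stages: List[Stage]) -> bool:
    """Run the stages in order; reject as soon as one stage fails."""
    for stage in stages:
        if not stage(patch):
            return False  # failed a stage => discard, later stages never run
    return True  # passed every stage => report a face


evaluated = []  # records which stages actually ran, to make the early exit visible


def make_stage(name: str, threshold: float) -> Stage:
    """Hypothetical toy stage: passes if the mean pixel value exceeds a threshold."""
    def stage(patch: Patch) -> bool:
        evaluated.append(name)
        return sum(patch) / len(patch) > threshold
    return stage


stages = [make_stage("stage 1", 0.2), make_stage("stage 2", 0.5), make_stage("stage N", 0.7)]

print(cascade_accepts([0.9, 0.8, 0.9], stages))  # goes through all three stages
print(cascade_accepts([0.1, 0.1, 0.1], stages))  # rejected by the first stage
```

Because most patches in an image contain no face, most of them are discarded by the first cheap stages, which is where the speed-up comes from.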
19. Comparison: MTCNN vs R-FCN
MTCNN
+ Faster
+ Landmarks
- Less accurate
- No batch processing
Model | GPU inference | FDDB precision (100 errors)
R-FCN | 40 ms | 92%
MTCNN | 17 ms | 90%
21. What is TensorRT
NVIDIA TensorRT is a high-performance deep learning inference optimizer
Features
– Improves performance for complex networks
– FP16 & INT8 support
– Effective at small batch-sizes
25. Batch processing
Results
– Single run
– Enables batch processing
Model | Inference, ms
MTCNN (Caffe, python) | 17
MTCNN (Caffe, C++) | 12.7
+ batch | 10.7
26. TensorRT: layers
Problem
No PReLU layer => the default pre-trained model can't be used
Retrained with ReLU from scratch
Model | GPU inference, ms | FDDB precision (100 errors)
MTCNN, batch | 10.7 | 90%
+ TensorRT | 8.8 | 91.2%
Inference time: -20%
36. Softmax
– Learned features are only separable, not discriminative
– The resulting features are not sufficiently effective
37. We need metric learning
– Tightness of the cluster
– Discriminative features
38. Triplet loss
Features
– Identity -> single point
– Enforces a margin between persons
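Triplet: Anchor, Positive, Negative
Constraint: d(Anchor, Positive) + α < d(Anchor, Negative)
– minimize the Anchor-Positive distance, maximize the Anchor-Negative distance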
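The margin constraint on this slide can be turned into a loss as in the sketch below. It assumes squared Euclidean distance between embeddings and a margin of 0.2; both are illustrative choices, and the function names are hypothetical rather than taken from the talk.

```python
from typing import Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two embeddings (an assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor: Sequence[float],
                 positive: Sequence[float],
                 negative: Sequence[float],
                 alpha: float = 0.2) -> float:
    """Hinge on the constraint d(anchor, positive) + alpha < d(anchor, negative)."""
    d_pos = squared_distance(anchor, positive)  # pushed down by training
    d_neg = squared_distance(anchor, negative)  # pushed up by training
    return max(0.0, d_pos - d_neg + alpha)


# Constraint satisfied: same-person pair is close, other person is far => zero loss.
print(triplet_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0]))
# Constraint violated: the "positive" is far and the negative is close => positive loss.
print(triplet_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0]))
```

A triplet that already satisfies the margin contributes zero loss and zero gradient, which is why the choice of triplets matters so much.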
39. Choosing triplets
Crucial problem
How to choose triplets? Useful triplets = hardest errors
– Positives: pick all
– Negatives: skip the ones that are too easy, keep the ones that are hard enough
Solution
Hard-mining within a large mini-batch (>1000)
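One way to mine triplets inside a mini-batch is sketched below. It is an assumption-laden illustration, not the authors' miner: it uses every anchor-positive pair, drops negatives that already satisfy the margin (too easy), and keeps the closest remaining negative. The function name, the squared Euclidean distance, and the margin value are all hypothetical.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mine_triplets(embeddings: List[Sequence[float]],
                  labels: List[int],
                  alpha: float = 0.2) -> List[Tuple[int, int, int]]:
    """Return (anchor, positive, negative) index triples that violate the margin."""
    triplets = []
    n = len(embeddings)
    for a in range(n):
        for p in range(n):
            if p == a or labels[p] != labels[a]:
                continue  # use every anchor-positive pair of the same identity
            d_pos = squared_distance(embeddings[a], embeddings[p])
            # Negatives that break d_pos + alpha < d_neg still produce a gradient.
            candidates = [
                (squared_distance(embeddings[a], embeddings[k]), k)
                for k in range(n)
                if labels[k] != labels[a]
            ]
            violating = [(d, k) for d, k in candidates if d < d_pos + alpha]
            if violating:
                _, hardest = min(violating)  # closest negative = hardest error
                triplets.append((a, p, hardest))
    return triplets


batch = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.0], [5.0, 5.0]]
identities = [0, 0, 1, 1]
print(mine_triplets(batch, identities))
```

With a batch this small few triplets survive the filter, which hints at why the slide recommends mining within a large mini-batch.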
47. Center loss: structure
– Without classification loss – collapses
CNN -> Embedding -> Classify -> Softmax Loss
Embedding -> Center Loss (pull), weighted by λ
– Final loss = Softmax loss + λ Center loss
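The combined objective can be sketched in plain Python as below. This is a forward-pass illustration only: it assumes the center loss is half the squared distance from each embedding to its class center, averaged over the batch, and it takes the centers as given, whereas in training they are updated alongside the network. The function names and the example λ are hypothetical.

```python
import math
from typing import List, Sequence


def softmax_cross_entropy(logits: List[Sequence[float]], labels: List[int]) -> float:
    """Mean cross-entropy of softmax(logits) against integer class labels."""
    total = 0.0
    for row, label in zip(logits, labels):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_sum_exp - row[label]
    return total / len(labels)


def center_loss(embeddings: List[Sequence[float]],
                labels: List[int],
                centers: List[Sequence[float]]) -> float:
    """Mean of 0.5 * ||x_i - c_{y_i}||^2: pulls embeddings toward their class center."""
    total = 0.0
    for emb, label in zip(embeddings, labels):
        total += 0.5 * sum((e - c) ** 2 for e, c in zip(emb, centers[label]))
    return total / len(labels)


def final_loss(logits, embeddings, labels, centers, lam: float) -> float:
    """Final loss = Softmax loss + lambda * Center loss."""
    return softmax_cross_entropy(logits, labels) + lam * center_loss(embeddings, labels, centers)


logits = [[0.0, 0.0], [0.0, 0.0]]
embeddings = [[1.0, 0.0], [0.0, 1.0]]
labels = [0, 1]
centers = [[0.0, 0.0], [0.0, 0.0]]
print(final_loss(logits, embeddings, labels, centers, lam=0.01))
```

The softmax term keeps classes apart; without it, the center term alone could be minimized by mapping every input to the same point, which is the collapse the slide mentions.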
51. Center loss: summary
Overview
– Intra-class compactness and inter-class separability
– Good performance at several other tasks
Opensource Code
– Caffe (original, Megaface - 65%)
Model | LFW, % | Megaface
Triplet Loss | 99.35 | 65
Center Loss (Torch, ours) | 99.60 | 71.7
52. Tricks: augmentation
Test time augmentation
– Flip image
– Compute 2 embeddings
– Average embeddings
[Diagram: Embedding + Flipped Embedding -> Average -> Final Embedding]
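The flip-and-average trick can be sketched as follows. Here `embed` stands for any function mapping an image to an embedding vector; `toy_embed` is a made-up stand-in so the example runs, not the network from the talk.

```python
from typing import Callable, List

Image = List[List[float]]  # rows of pixel values (illustrative)


def flip_horizontal(image: Image) -> Image:
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]


def tta_embedding(image: Image, embed: Callable[[Image], List[float]]) -> List[float]:
    """Average the embeddings of the image and its horizontal flip."""
    original = embed(image)
    flipped = embed(flip_horizontal(image))
    return [(a + b) / 2 for a, b in zip(original, flipped)]


def toy_embed(image: Image) -> List[float]:
    # Hypothetical embedding: first and last pixel of the first row.
    return [image[0][0], image[0][-1]]


final = tta_embedding([[1.0, 2.0, 3.0]], toy_embed)
```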
54. Shades on
At one point we used shades augmentation
How to
– Construct several sunglass textures
– Place them using landmarks
55. Tricks: adding Eye loss
– We can force the CNN to learn specific discriminative features
– For celebrities, eye colors are available on the Internet
[Diagram: CNN -> Embedding -> Person -> Softmax Loss + Center Loss; Embedding -> Eye Color -> Eye Loss]
56. Eye loss: summary
*Adding simple features, e.g. gender, doesn’t help
                        LFW, %    Megaface
Center Loss + Tricks    99.68     73
Center Loss + Eye       99.68     73.5
60. Angular softmax: summary
Overview
– As described in the paper: doesn’t work at all
– Works using the sum of losses (m = 1..N) during training
• only on small datasets!
                     LFW, %    Megaface
Center Loss          99.6      73
Center Loss + Eye    99.68     73.5
A-Softmax (Torch)    99.68     74.2
Opensource Code
– Caffe (original)
– Slight modification of the loss yields 74.2%
61. Metric learning: summary
Softmax < Triplet < Center < A-Softmax
A-Softmax
– With bells and whistles better than center loss
Center loss
Overall
– Rule of thumb: use Center loss
– Metric learning may improve classification performance
63. Errors after MSCeleb: children
Problem
Children all look alike
Result
Embeddings of children collapse to almost a single point in the space
64. Errors after MSCeleb: asian
Problem
The face recognition model performs poorly on Asian faces
Reason
The dataset doesn’t contain enough photos of these categories
65. How to fix these errors ?
It’s all about data: we need a diverse dataset!
Natural choice: avatars from social networks
66. A way to construct dataset
[Diagram: Face Detection -> Face Recognition + Clustering -> Pick largest]
Cleaning algorithm
1. Face detection
2. Face recognition -> embeddings
3. Hierarchical clustering algorithm
4. Pick the largest cluster as a person
Iterate after each model improvement
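Steps 3 and 4 of the cleaning algorithm can be sketched as below. The talk does not name the linkage or the threshold, so this assumes single-linkage clustering cut at a fixed Euclidean distance (implemented with union-find) and a made-up threshold; the input is the list of face embeddings found for one account.

```python
import math
from typing import Dict, List, Sequence


def largest_cluster(embeddings: List[Sequence[float]], threshold: float) -> List[int]:
    """Indices of the biggest group of embeddings linked by distances below threshold."""
    n = len(embeddings)
    parent = list(range(n))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Merge every pair that is closer than the threshold (single linkage).
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(embeddings[i], embeddings[j]) < threshold:
                parent[find(i)] = find(j)

    clusters: Dict[int, List[int]] = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return max(clusters.values(), key=len)


# Three faces of the account owner and two of somebody else (toy embeddings).
faces = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.0], [5.0, 5.0], [5.1, 5.0]]
owner = largest_cluster(faces, threshold=0.5)
```

Re-running this after each model improvement, as the slide suggests, lets better embeddings produce cleaner clusters.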
67. MSCeleb dataset’s errors
MSCeleb is constructed by leveraging search engines
Joe Eszterhas = Mel Gibson
The public confrontation between Joe Eszterhas and Mel Gibson leads to the error
74. Workaround
Algorithm
1. Construct dataset with children
2. Compute average embedding
3. Every point inside the sphere – a child
4. Tighten distance threshold there
Results
This allows softening the overall threshold
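A sketch of the workaround, assuming Euclidean distances between embeddings. The sphere radius and both thresholds are made-up numbers, and the function names are illustrative.

```python
import math
from typing import List, Sequence


def mean_embedding(embeddings: List[Sequence[float]]) -> List[float]:
    """Step 2: average embedding of the children dataset."""
    dim = len(embeddings[0])
    return [sum(e[k] for e in embeddings) / len(embeddings) for k in range(dim)]


def distance_threshold(embedding: Sequence[float],
                       child_center: Sequence[float],
                       child_radius: float,
                       base_threshold: float,
                       child_threshold: float) -> float:
    """Steps 3-4: use the tighter threshold for points inside the children sphere."""
    inside = math.dist(embedding, child_center) <= child_radius
    return child_threshold if inside else base_threshold


child_embeddings = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.2]]  # toy data
center = mean_embedding(child_embeddings)

near_children = distance_threshold([1.1, 1.0], center, child_radius=0.5,
                                   base_threshold=1.1, child_threshold=0.8)
far_from_children = distance_threshold([3.0, 3.0], center, child_radius=0.5,
                                       base_threshold=1.1, child_threshold=0.8)
```

Because the strict threshold applies only inside the sphere, the threshold used everywhere else can be relaxed.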
75. How to handle big dataset
It seems we can add more data infinitely, but no.
Problems
– Memory consumption (Softmax)
– Computational costs
– A lot of noise in gradients
77. Softmax Approximation
Algorithm
1. Perform K-Means clustering using current FR model
2. Two Softmax heads:
   1. Predicts cluster label
   2. Class within the true cluster
[Diagram: CNN -> Embedding -> Predict cluster -> Cluster Softmax; Embedding -> Predict person -> Person Softmax within the cluster (example cluster: Men)]
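A per-sample sketch of a two-head loss in the spirit of this slide: one cross-entropy over cluster labels plus one over the identities inside the true cluster. Summing the two terms without weights is an assumption, and the logits are toy values.

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Numerically stable -log softmax(logits)[target]."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_z - logits[target]


def two_head_loss(cluster_logits: Sequence[float], cluster_label: int,
                  in_cluster_logits: Sequence[float], class_in_cluster: int) -> float:
    """Head 1 predicts the cluster; head 2 predicts the person within the true cluster."""
    return (cross_entropy(cluster_logits, cluster_label)
            + cross_entropy(in_cluster_logits, class_in_cluster))


# 2 clusters, 4 identities inside the true cluster, uninformative logits.
loss = two_head_loss([0.0, 0.0], 0, [0.0, 0.0, 0.0, 0.0], 2)
```

The second head only scores identities from one cluster, so the full softmax over all identities is never materialized for a sample.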
85. Fixing trash clusters
«Trash» has small norm
[Figure labels: Faces, Trash, Softmax loss]
Motivation
Softmax encourages large embedding norms
Results
– ROC AUC 97%
– Better than Laplacian for blurry images
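A sketch of a norm-based trash filter following the observation above. The norm is taken on the raw, unnormalized embedding; the threshold value is a made-up placeholder that would have to be tuned on labeled data.

```python
import math
from typing import List, Sequence, Tuple


def embedding_norm(embedding: Sequence[float]) -> float:
    """Euclidean norm of the raw (unnormalized) embedding."""
    return math.sqrt(sum(x * x for x in embedding))


def split_trash(embeddings: List[Sequence[float]],
                norm_threshold: float) -> Tuple[List[int], List[int]]:
    """Return (face indices, trash indices); small-norm embeddings count as trash."""
    faces, trash = [], []
    for i, e in enumerate(embeddings):
        (trash if embedding_norm(e) < norm_threshold else faces).append(i)
    return faces, trash


raw = [[3.0, 4.0], [0.1, 0.1], [6.0, 8.0]]  # toy embeddings
face_ids, trash_ids = split_trash(raw, norm_threshold=1.0)
```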
90. Summary
1. Use TensorRT to speed up inference
2. Metric learning: use Center loss by default
3. Clean your data thoroughly
4. Understanding CNN helps to fight errors
99. Histogram loss
Idea
– Compute similarities for positive and negative pairs
– Minimize the probability that a randomly sampled positive pair has a smaller similarity than a negative one
Loss = the integral of the product between the negative distribution and the cumulative density function for the positive distribution (shown with a dashed line)
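The probability that this slide talks about can be estimated directly from two lists of similarities, as in the sketch below. This is only a plain empirical count over all positive/negative combinations; a trainable version needs a differentiable estimate of the two distributions, which is not reproduced here. The similarity values are toy numbers.

```python
from typing import Sequence


def prob_positive_below_negative(pos_sims: Sequence[float],
                                 neg_sims: Sequence[float]) -> float:
    """Fraction of (positive, negative) combinations where the positive pair is less similar."""
    total = len(pos_sims) * len(neg_sims)
    below = sum(1 for p in pos_sims for n in neg_sims if p < n)
    return below / total


# Similarities of same-person pairs and different-person pairs (toy values).
estimate = prob_positive_below_negative([0.9, 0.8], [0.1, 0.85])
```

A value of 0 means every positive pair is more similar than every negative pair, which is the ideal case.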
«Another approach – HOG features (dlib)» «not far from state-of-the-art (Google reports 94%)»
Off-topic:
the first network is trained on 12x12 patches, the second on 24x24, the third on 36x36; hard negative mining is used for the 2nd and 3rd to correct errors
the first network outputs 5x1x1, and on the image pyramid a tensor; then a batch is assembled for the 2nd and 3rd networks (we did not do batching over batches)
https://kpzhang93.github.io/MTCNN_face_detection_alignment/paper/spl.pdf
«Open source
Pre-trained model (Caffe)
No original training code
»
«but not Torch/PyTorch
We’ve retrained the CNN because of PReLU absence»
basic - actively developing
TODO: maybe show that we feed images into the network not one by one but all at once; not sure whether that overloads the slide
«All faces of a person are projected to a single point»
Chad Smith
TODO: is the part about alpha unnecessary?
https://ydwen.github.io/papers/WenECCV16.pdf, 65% is from the paper
TODO: remove the text?
(Better than dlib)
TODO: don’t cut this, but finish it properly: photo of a guy, landmark points + glasses
«gender, for example, is not discriminative; we tried it»
71.5% -> 75.5% or +2
71.5 vs 73; 75.5 is because of the dataset
WORDS
«say that the norm is division by the sum»
To discriminate better -> tighten clusters
http://proceedings.mlr.press/v48/liud16.pdf
“Tell about the idea inspired by the corporate party”
VK – easy API
OK – hard access to API, rate-limits
Facebook – just closed to mining
Cleaning algorithm
Face Detection
Face Recognition to get embeddings
Hierarchical clustering algorithm
with a softened distance threshold
Pick the largest cluster as a person
iterate after each model improvement
this leads to improved perf on megaface = 75%?
is the number better?
slightly less text, or split into 2 slides
TODO: we found that children are located compactly
https://arxiv.org/pdf/1609.04309.pdf - paper: “Efficient softmax approximation for GPUs”
«what is the benefit besides solving the problem»
https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_gradients/py_gradients.html
It’s a face, but without enough facial information
“If the detector makes a mistake, the recognizer goes crazy. Detection errors often form a cluster.”
«softmax tends to increase the scalar product»
maximizing scalar product <w_i, x> for i class
We normalize embeddings only for distance
“Social networks contain avatars from across users’ lives”