Report by Luís Mey (https://www.linkedin.com/in/lu%C3%ADs-gustavo-bernardo-mey-97b38927/) on the Udacity Machine Learning Course - Final Project: Use Deep Learning to Find Similar Dresses.
Using Deep Learning to Find Similar Dresses
Dress Similarity
03.20.2017
─
Luis Mey
Machine Learning Engineer Student
Overview
This project developed software capable of helping fashion companies find similar images in their databases, allowing these businesses to cluster their products or to provide product recommendations to their customers. The solution was made possible by current deep learning technology, which has shown huge potential for business applications in recent years. Using this technology, the resulting model is already trained to receive a set of images (or URLs pointing to these images) of different dresses and to retrieve the most similar dresses for each of the provided dresses.
Domain Background
Developing product recommendations based on image similarity is a hard challenge. Recently, with the advance of technology, this problem has been attacked with deep learning, using very recent tools such as TensorFlow and Keras (released in 2015). For example, in 2015 a paper published by researchers at the University of North Carolina at Chapel Hill and the University of Illinois at Urbana-Champaign used neural networks to retrieve similar fashion images [1]. In 2016 a start-up, Thread Genius, used similar technology to retrieve similar fashion images [2].
Problem Statement
Inspired by the recent applications of deep learning to image similarity, the proposal for this project is to develop simple deep learning software that, given the image of a dress listed by a retailer, returns images of similar listed dresses. No supervision is necessary to train this model, since the images themselves are used, characterizing the algorithm as unsupervised learning. For this project, image similarity is measured by the Euclidean distance between the feature vectors of the two images being compared. In the context of the Brazilian market, which is still young in data science, even a simple model would already be competitive with the product recommendation systems offered by large e-commerce retailers.
Datasets and Inputs
A realistic dataset from Brazilian retailers will be used. It consists of about 10,000 scraped page URLs of dresses with product image, name and type, as described by the retailer [3][4]. When loaded into memory, these images consume about 17 GB and come in different shapes, usually with a white or light grey background highlighting a colored dress, which a model is usually wearing. This dataset size was enough to build simple image-similarity software that other retailers could reproduce without requiring expensive processing power.
Solution Statement
A convolutional neural network was used to generate a feature vector for each image in the dataset. Each feature vector can be interpreted as a representation of the image in a dimension smaller than its pixel representation. In order to save computational effort, this convolutional neural network (CNN) took advantage of the pretrained layers of the Inception V3 architecture made available by the deep learning library Keras. The resulting feature vectors were used in the similarity calculations, where the Euclidean distance between feature vectors represented image similarity.
Benchmark Model and Evaluation Metric
Since the similarity between two (or n) dresses involves human subjectivity, the proposal for evaluation and benchmarking is to compare a sample of this project's results with what a large e-commerce retailer suggests as similar products that the customer would also like to see. For example, the first dress below is the searched product, the following four dresses are the ones recommended by the deep learning model, and the last three are similarity recommendations from the e-commerce website used as benchmark. In order to create a numeric metric, it is possible to vote and assign 1 point if the reference image is similar to the recommendation, and zero otherwise. After evaluating 10 randomly sampled images, the performance of the model versus the benchmark can be measured as the number of samples considered similar divided by the total number of sampled images (10); a small worked example is given after the figure caption below.
Figure 1 - Dress similarity example. The first dress on the left is used as reference, while the remaining ones are products recommended based on similarity.
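As a simple illustration of the metric computation, the sketch below uses the per-sample judgements that are reported later in the Results section (one vote per sampled reference image):

votes_model = [1, 0, 1, 1, 1, 1, 1, 1, 1, 0]      # 1 = recommendation judged similar, per sample
votes_benchmark = [1, 0, 0, 1, 1, 1, 1, 0, 0, 1]  # same judgement for the e-commerce benchmark
score_model = sum(votes_model) / len(votes_model)              # 8 / 10 = 0.8
score_benchmark = sum(votes_benchmark) / len(votes_benchmark)  # 6 / 10 = 0.6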
Project Design
In summary, this project is designed to have 5 steps:
1. Load the images from retailers;
Using the image URLs available in the dataset, it is possible to use the skimage library to load all the images and keep them in memory, as in the sketch below.
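A minimal sketch of this loading step, assuming the URLs sit in a CSV file (the file name dresses.csv and the column name image_url are hypothetical):

import pandas as pd
from skimage import io

urls = pd.read_csv("dresses.csv")["image_url"].tolist()  # hypothetical file and column names
images = []
for url in urls:
    try:
        images.append(io.imread(url))  # skimage reads straight from a URL into an (H, W, 3) array
    except Exception:
        pass  # skip broken links so one bad URL does not stop the whole load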
2. Preprocess the images to fit into the CNN;
Because the shape of each image does not match the input required by the pre-trained network, the images need to be reshaped and then stacked into an array with a shape like (n_images, image_height, image_width, image_channels), where the number of channels is 3 because the images are colored. A minimal preprocessing sketch follows.
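This sketch assumes the images list from the previous step and a 256x256 target size:

import numpy as np
from skimage.transform import resize

target_shape = (256, 256, 3)  # height, width, RGB channels
resized = [resize(img, target_shape) for img in images]  # also rescales pixel values to [0, 1]
X = np.stack(resized).astype("float32")  # shape: (n_images, 256, 256, 3)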
3. Build the CNN using a pre-trained network as a base model;
It is possible to use the Keras pre-trained Inception V3 model as a base model. This base model receives a (256, 256, 3) image as input and returns a flat vector. On top of this flat output vector it is possible to add further layers, such as a dense layer with 1024 nodes and a prediction layer with 2 nodes, as sketched below.
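A minimal Keras sketch of this architecture (the ReLU activation on the 1024-node layer is an assumption, since the text does not specify it):

from keras.applications.inception_v3 import InceptionV3
from keras.layers import GlobalAveragePooling2D, Dense
from keras.models import Model

base_model = InceptionV3(weights="imagenet", include_top=False, input_shape=(256, 256, 3))
x = GlobalAveragePooling2D()(base_model.output)  # flatten the convolutional output
x = Dense(1024, activation="relu")(x)            # 1024-node feature layer (activation assumed)
predictions = Dense(2, activation="softmax")(x)  # short vs. long dress
model = Model(inputs=base_model.input, outputs=predictions)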
4. Generate the feature vectors;
Having some simple information about the dresses, such as whether each one is long or short, it is possible to train the layers added after the base model, fine-tuning them to capture information about the dresses. The output of the 1024-node dense layer generated for each image can then be used as a feature vector representing the image. A minimal training sketch follows.
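Continuing the sketch above, the pre-trained base is frozen and only the new dense layers are trained on the short/long labels. The optimizer, loss and 30 epochs follow the choices reported later in the Implementation section; the batch size and the X_train/y_train split are assumptions:

for layer in base_model.layers:
    layer.trainable = False  # keep the ImageNet weights of Inception V3 fixed

model.compile(optimizer="rmsprop", loss="categorical_crossentropy", metrics=["accuracy"])
# y_train is assumed to be a one-hot array of shape (n_images, 2) encoding short vs. long
model.fit(X_train, y_train, epochs=30, batch_size=32, validation_data=(X_test, y_test))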
5. Retrieve the n-closest images given a reference image.
Finally, a distance metric can be used to calculate how different one feature vector is from the other feature vectors in the database. This distance metric can be as simple as the Euclidean distance, which has a NumPy implementation. Being able to calculate these distances allows anyone to provide an image as input; the model then calculates or retrieves its feature vector, computes the distances to the other images, and returns the n closest images, as in the sketch below.
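A minimal retrieval sketch with NumPy, assuming feature_matrix holds one stored feature vector per catalogue image:

import numpy as np

def n_closest(query_vector, feature_matrix, n=4):
    # feature_matrix: (n_images, 1024) array of stored feature vectors
    distances = np.linalg.norm(feature_matrix - query_vector, axis=1)  # Euclidean distance to every image
    order = np.argsort(distances)
    return order[1:n + 1]  # skip index 0 when the query itself is in the database (distance 0)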
Data Analysis
During the data exploration it became clear that the dataset required some preprocessing to ensure data quality before feeding it to the model. The dataset presented some issues regarding dress labeling. For example, some dresses were listed by the e-commerce site both in the short-dress section and in the long-dress section. This kind of ambiguity could harm the first step of model training, which consisted in training the final layers of a CNN to classify a dress as short or long so that the model would learn dress features. Because of this ambiguous labeling, the problematic examples were removed from the training set. Additionally, the major source of data, with 11,662 examples, contained duplicate records that were removed when detected during the data exploration. Moreover, the image resolution was about 300x400 pixels, higher than the input shape required by the pretrained Inception V3 model; this mismatch required reshaping the images in the preprocessing phase.
With the preprocessed data in hand, it was possible to train the deep learning algorithm, which basically consisted in feeding images of long and short dresses into a pre-trained neural network, illustrated in Figure 2, followed by additional dense layers. Each time a batch of images was fed into the network, each pixel was transformed through the network up to the last node, where a prediction was made. Based on the known labels, short dress or long dress, the network calculated the prediction error and updated the weights of the dense layers. These weights kept multiplying the output of the convolutional neural network, getting closer to a better prediction after several iterations. After processing all the images 30 times (the number of epochs), the output of the 1024-node dense layer was used as the feature vector representing each image, and by calculating the Euclidean distance between images it was possible to determine which images were closest to one another.
Figure 2 - Network used for training: schematic diagram of Inception V3 [5] flowing into dense layers specifically added for learning about the dresses.
The exploratory visualization also showed that most of the dresses only have a generic classification as long or short, as illustrated by the two largest bars in Figure 4. Even among the most common dress types, short dresses are 5.7 times more represented in the dataset. The next most common types are the printed dresses and the flat dresses, which together account for about 1,000 examples. Further illustration is provided by Figure 5, which shows one example of each of the five most common dress types.
Figure 4 - Unique dress type distribution.
Figure 5 - Top 5 dress types (from left to right: short, long, printed, flat, night).
Methodology
Implementation
The initial implementation was executed in the IPython Notebook named Model.ipynb. The notebook's first section, named “Load and Preprocess Data”, is responsible for steps 1 and 2 previously described in the Project Design section. The process starts by loading the data using the URL list available in a CSV file, followed by an image reshape to fit the Inception V3 model, and finally a train/test split required to train and test the model. In addition to Model.ipynb, the utils.py file was very helpful for organizing the code needed to load and preprocess the images.
Still in Model.ipynb, the third step mentioned in the Project Design section was executed (along with the training process mentioned in step 4). With the help of model.py it was possible to train an architecture that consists of an input layer receiving 256x256x3 arrays, which feeds into a pretrained Inception V3 model, followed by average pooling and two dense layers, where the first has 1024 nodes and the second only 2 nodes activated with a softmax function. This first model only trained the layers added after the Inception V3 base, predicting whether a dress was short or long using an RMSprop optimizer and a categorical cross-entropy loss, reaching an accuracy of 90% on the test set.
The fourth step in the Project Design was executed through the Scores Images IPython Notebook. This notebook used the model.py file to create another model that, instead of returning a binary classification, returned the entire 1024-node dense layer, representing the image feature vector. This feature-vector model was used to score, in batch, every unique image in the major e-commerce dataset and to save the results. Calculating the feature vectors and saving them in batch is important because new products may be added in the future, and with a batch process only these new products would require feature-vector calculation, not the entire dataset again. A minimal sketch of this step follows.
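This sketch reuses the trained model from the architecture sketched earlier; X_all (an array holding every unique catalogue image) and the output file name are assumptions:

from keras.models import Model
import numpy as np

# stop at the 1024-node dense layer, one layer before the softmax prediction
feature_model = Model(inputs=model.input, outputs=model.layers[-2].output)
features = feature_model.predict(X_all, batch_size=32)  # (n_images, 1024) feature vectors
np.save("feature_vectors.npy", features)                # stored for later similarity calculations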
The fifth step in the Project Design was executed with the IPython Notebook named Create Similarity Vector. This notebook used the feature vectors to calculate the distance between each pair of images, ranking the 4 closest ones and generating a matrix named the similarity matrix. With this similarity matrix, it is possible to know the 4 closest images for each of the images the retailer has, as in the sketch below.
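A minimal sketch of the similarity-matrix computation, assuming the feature vectors saved in the previous step (for a very large catalogue the distance matrix could be computed in chunks to save memory):

import numpy as np
from scipy.spatial.distance import cdist

features = np.load("feature_vectors.npy")                  # (n_images, 1024)
distances = cdist(features, features)                      # pairwise Euclidean distances
np.fill_diagonal(distances, np.inf)                        # ignore each image's distance to itself
similarity_matrix = np.argsort(distances, axis=1)[:, :4]   # indices of the 4 closest images per row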
Regarding model refinements, it is important to note that the exploratory data analysis was critical for identifying duplicated images in the database, which had resulted in poor performance of the recommendation system: it kept suggesting the same image instead of a similar one. Another refinement was an image augmentation step that consisted in feeding not only the original images but also their flipped versions. This additional process actually decreased the model accuracy from 93% to 90% for predicting whether a dress was long or short, but it can be argued that it is important for the model to learn about symmetry, because a dress that is just a mirror image should be very similar to its original version. A minimal sketch of the flip augmentation follows.
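This sketch assumes X_train holds images of shape (n, height, width, 3) and y_train their one-hot labels:

import numpy as np

X_flipped = X_train[:, :, ::-1, :]                        # mirror every image along the width axis
X_augmented = np.concatenate([X_train, X_flipped], axis=0)
y_augmented = np.concatenate([y_train, y_train], axis=0)  # labels are unchanged by flipping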
Results
The deep learning model was capable of providing suggestions of similar images given a reference image. In order to evaluate these results, 10 randomly sampled reference images were used. For each sampled image, the similarity recommendations from the original retailer were extracted using an anonymous browser session so that historical cookies would not influence the results. For the same sampled images, the similarity matrix provided the recommendations of the deep learning model. Organizing the outputs, it is possible to compare the results as presented below.
Sample 1)
It seems that both sets of recommendations show short dresses, but the deep learning model found similar shades of red, while the e-commerce solution recommended dresses with exposed shoulders. Cumulative points using the proposed metric: 1 (model) vs. 1 (benchmark).
Sample 2)
Considering this example, it seems that neither recommendation makes much sense, which could be a sign that there are no good similar products in the database. However, it could be argued that recommending other dresses is better than recommending boots and hair accessories. Cumulative points using the proposed metric: 1 (model) vs. 1 (benchmark).
Sample 3)
The deep learning recommendations seem similar to the reference image: very colorful, usually sleeveless, summer-like dresses. Keeping in mind the subjectivity of the comparison, it is possible to argue that the deep learning recommendation is better than the e-commerce solution for this example. Cumulative points using the proposed metric: 2 (model) vs. 1 (benchmark).
Sample 4)
The deep learning model recommended what seem to be very similar dresses due to their similar color and size. Considering the e-commerce recommendation, the suggestion was probably based on the dress category/style “Sommer” that it shares with the reference image. Cumulative points using the proposed metric: 3 (model) vs. 2 (benchmark).
Sample 5)
The deep learning recommendation appears good here because of the patterns in the dresses. The e-commerce recommendation makes sense for the grey dress, which has a color similar to the reference image. Cumulative points using the proposed metric: 4 (model) vs. 3 (benchmark).
Sample 6)
This example presents similar recommendations from both solutions, as both show long, formal-looking dresses. Cumulative points using the proposed metric: 5 (model) vs. 4 (benchmark).
Sample 7)
It seems that the deep learning recommendation provided neutral-colored options with complex patterns and a somewhat light look, similar to the reference image. The e-commerce suggestions are more aggressive on colors, but it is possible to say that the style is similar. Cumulative points using the proposed metric: 6 (model) vs. 5 (benchmark).
Sample 8)
This sample is very interesting because the deep learning model suggested long dresses, as in the reference image, but also dresses in which it is possible to see the feet or even the legs. In contrast, the e-commerce suggestions were short dresses. Cumulative points using the proposed metric: 7 (model) vs. 5 (benchmark).
Sample 9)
This is another example where the e-commerce site probably could not provide similar images, while the deep learning model could recommend dresses that share the common feature of having stripes. Cumulative points using the proposed metric: 8 (model) vs. 5 (benchmark).
Sample 10)
This last sampled example shows another probable behavior of the e-commerce site: it suggests dresses of the same model (“DF TOP MODA”) as similar. This behavior could be incorporated into the deep learning model, making it also consider description similarity when recommending dresses. Regarding the deep learning recommendation, it is hard to see similarity in the images, so this example is a win for the e-commerce site; even so, the deep learning model finished the test as the winner. Cumulative points using the proposed metric: 8 (model) vs. 6 (benchmark).
In summary, the evaluation resulted in 80% similarity versus 60% similarity according to the proposed metric. This result provides some confidence in the model's performance and, given that little preprocessing was required, indicates that the model is robust enough to compete with the e-commerce benchmark.
Conclusion
This project successfully accomplished the goal of creating a similarity model and a process to provide recommendations based on a given image. The final model is competitive with, and sometimes outperforms, the benchmark, and it has the additional advantage of not requiring historical user behavior to train the recommendation system. Further examples from visualizing the final model are encouraging and corroborate the quality achieved by the model.
However, the model can still face difficulties, mainly when there are few similar images in the database. It is remarkable that the feature vectors generated by the dense layer after the convolutional neural network were capable of finding similar images using a metric as simple as the Euclidean distance. For future work, it would be interesting to also use the product description, capturing similar brands, seasons or categories and extracting even more value from this exciting deep learning technology.
References
[1] M. H. Kiapour, X. Han, S. Lazebnik, A. C. Berg and T. L. Berg, “Where to Buy It: Matching Street Clothing Photos in Online Shops” (2015)
[2] Thread Genius, “Robo Bill Cunningham: Shazam for Fashion With Deep Neural Networks” (2016)
[3] https://www.dafiti.com.br/
[4] http://www.lojasrenner.com.br/
[5] Inception V3: https://research.googleblog.com/2016/03/train-your-own-image-classifier-with.html