Multimedia Data Mining
using deep learning
Peter Wlodarczak
wlodarczak@gmail.com
Agenda
 Aims
 Multimedia Data Mining
 Artificial Neural Networks
 Deep learning
 Challenges
 Discussion
Aims
 Analyze multimedia data for:
 Object/face recognition
 Voice commands
 Natural Language Processing
 Classification
 Automatic caption generation
 Record linkage (entity resolution)
Multimedia Data Mining I
 Multimedia data mining:
 Unprecedented amount of Multimedia data
since Web 2.0 and Social Media
 Prosumer data
 Uses algorithms to extract useful patterns
and relations from image, audio and video
data
 Traditional methods often not satisfactory
 Unsuitable for high dimensionality
Multimedia Data Mining II
 Multimedia data mining has been
improved using deep learning in:
 Visual data mining
 Natural Language Processing
 Deep learners are:
 Machine Learning schemes
 Usually multi-layered artificial neural
networks
Artificial Neural Networks I
 Artificial Neural Networks:
 Suitable to give good approximations for
complex problems
 Consist of perceptrons (the neurons) and
weighted connections (the axons)
Artificial Neural Networks II
 Perceptron (Neuron)
 Linear classifier
 Data linearly separable using a hyperplane:
w_0 a_0 + w_1 a_1 + w_2 a_2 + … + w_k a_k = 0
 Where w = weights, a = real-valued feature
vector, a_0 = 1 is the bias input
 Binary classifier f(a) that maps its input
vector a to a single, binary output value
Artificial Neural Networks III
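[Figure: perceptron with a bias input fixed at 1 (weight w_0) and attribute inputs a_1, a_2, a_3 (weights w_1, w_2, w_3)]
f(a) = Σ_k w_k a_k + b (b = w_0, the bias weight)
Class decided by the sign: f(a) > 0 or f(a) < 0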
Artificial Neural Networks III
Training data
Name      sex     mask  cape  tie  ears  smokes  class
Batman    male    yes   yes   no   yes   no      Good
Robin     male    yes   yes   no   no    no      Good
Alfred    male    no    no    yes  no    no      Good
Penguin   male    no    no    yes  no    yes     Bad
Catwoman  female  yes   no    no   yes   no      Bad
Joker     male    no    no    no   no    no      Bad
Test data
Name      sex     mask  cape  tie  ears  smokes  class
Batgirl   female  yes   yes   no   yes   no      ?
Riddler   male    yes   no    no   no    no      ?
 Supervised learning
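The training table can be fed to a single perceptron as a small supervised-learning exercise. The sketch below is illustrative only: the numeric encoding (male = 1, female = 0, yes = 1, no = 0, Good = +1, Bad = -1) and the names `TRAIN`, `TEST`, `train` and `predict` are assumptions, not part of the slides. It uses the classic perceptron update rule, which converges here because the six training rows happen to be linearly separable under this encoding.

```python
# Assumed encoding (not from the slides): male=1, female=0, yes=1, no=0.
# Feature order: sex, mask, cape, tie, ears, smokes. Labels: Good=+1, Bad=-1.
TRAIN = {
    "Batman":   ([1, 1, 1, 0, 1, 0], +1),
    "Robin":    ([1, 1, 1, 0, 0, 0], +1),
    "Alfred":   ([1, 0, 0, 1, 0, 0], +1),
    "Penguin":  ([1, 0, 0, 1, 0, 1], -1),
    "Catwoman": ([0, 1, 0, 0, 1, 0], -1),
    "Joker":    ([1, 0, 0, 0, 0, 0], -1),
}
TEST = {
    "Batgirl": [0, 1, 1, 0, 1, 0],
    "Riddler": [1, 1, 0, 0, 0, 0],
}


def f(w, a):
    """Weighted sum w0*1 + w1*a1 + ... + wk*ak; w[0] is the bias weight."""
    return w[0] + sum(wi * ai for wi, ai in zip(w[1:], a))


def predict(w, a):
    """Binary decision: +1 if f(a) > 0, otherwise -1."""
    return 1 if f(w, a) > 0 else -1


def train(examples, max_epochs=1000):
    """Perceptron rule: add (label * input) to the weights on every mistake."""
    w = [0.0] * 7  # bias weight plus six attribute weights
    for _ in range(max_epochs):
        mistakes = 0
        for a, label in examples:
            if predict(w, a) != label:
                w[0] += label  # bias input is fixed at 1
                for i, ai in enumerate(a):
                    w[i + 1] += label * ai
                mistakes += 1
        if mistakes == 0:  # every training row is classified correctly
            break
    return w


weights = train(list(TRAIN.values()))
for name, a in TEST.items():
    print(name, "->", "Good" if predict(weights, a) > 0 else "Bad")
```

The predictions for the two test rows depend on which separating hyperplane the update rule happens to reach, so they should be read as one possible answer rather than the correct one.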
Artificial Neural Networks IV
 Not all data is linearly separable
Artificial Neural Networks V
 Multilayer Perceptron
 Perceptrons organized in several layers
 A layer is fully interconnected with the next
layer
 All nodes except the input nodes are perceptrons
 Feedforward neural network
 Uses backpropagation for training
 Error propagated back to minimize loss function
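To make "error propagated back" concrete, here is a minimal sketch of backpropagation for a tiny network with 2 inputs, 2 tanh hidden units and 1 linear output. The architecture, the squared-error loss, the weight values and all function names are assumptions chosen for illustration. The analytic gradients from the chain rule are compared with finite-difference estimates as a sanity check.

```python
import math


def forward(params, x):
    """Forward pass: 2 inputs -> 2 tanh hidden units -> 1 linear output."""
    W1, b1, W2, b2 = params
    h = [math.tanh(W1[j][0] * x[0] + W1[j][1] * x[1] + b1[j]) for j in range(2)]
    y = W2[0] * h[0] + W2[1] * h[1] + b2
    return h, y


def loss(params, x, t):
    """Squared-error loss for a single example with target t."""
    _, y = forward(params, x)
    return 0.5 * (y - t) ** 2


def backprop(params, x, t):
    """Gradients of the loss with respect to all weights via the chain rule."""
    W1, b1, W2, b2 = params
    h, y = forward(params, x)
    dy = y - t                                   # dL/dy at the output
    dW2 = [dy * h[j] for j in range(2)]          # output-layer weights
    db2 = dy
    # Error sent back through the output weights and tanh'(z) = 1 - h^2
    dz = [dy * W2[j] * (1.0 - h[j] ** 2) for j in range(2)]
    dW1 = [[dz[j] * x[i] for i in range(2)] for j in range(2)]
    db1 = dz
    return dW1, db1, dW2, db2


def numeric_dW1(params, x, t, j, i, eps=1e-6):
    """Central finite-difference estimate of dL/dW1[j][i]."""
    W1, b1, W2, b2 = params
    plus = [row[:] for row in W1]
    minus = [row[:] for row in W1]
    plus[j][i] += eps
    minus[j][i] -= eps
    return (loss((plus, b1, W2, b2), x, t) - loss((minus, b1, W2, b2), x, t)) / (2 * eps)


# Arbitrary illustrative weights and a single training example
params = ([[0.5, -0.3], [0.8, 0.2]], [0.1, -0.1], [1.0, -1.5], 0.05)
x, t = [1.0, 2.0], 0.5
dW1, db1, dW2, db2 = backprop(params, x, t)
max_diff = max(
    abs(dW1[j][i] - numeric_dW1(params, x, t, j, i))
    for j in range(2) for i in range(2)
)
print("largest analytic vs numeric difference:", max_diff)
```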
Artificial Neural Networks VI
 Multilayer perceptron can be used for
non-linear, multiclass classification
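XOR is the textbook case of data that a single perceptron cannot separate but a two-layer network can. The sketch below uses hand-picked weights (an assumption for illustration, not learned values) and a step activation: one hidden unit computes OR, the other AND, and the output unit fires when OR is true but AND is not.

```python
def step(z):
    """Threshold activation: 1 if the weighted sum is positive, else 0."""
    return 1 if z > 0 else 0


def xor_mlp(x1, x2):
    """Two-layer perceptron with hand-chosen weights that computes XOR."""
    h_or = step(x1 + x2 - 0.5)    # hidden unit 1: fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)   # hidden unit 2: fires only if both inputs are 1
    return step(h_or - h_and - 0.5)  # output: OR and not AND


for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", xor_mlp(x1, x2))
```

Each hidden unit is still a linear classifier; the non-linear decision boundary only appears once their outputs are combined by the next layer.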
Artificial Neural Networks VII
 Gradient descent optimization method
for learning weights
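As a minimal sketch of gradient descent, the example below learns two weights of a linear model by repeatedly stepping against the gradient of a mean-squared-error loss. The toy data (points on y = 2x + 1), the learning rate and the step count are assumptions chosen so the example converges; a neural network applies the same update to many more weights.

```python
def gradient_descent(xs, ys, lr=0.1, steps=500):
    """Fit y ~ w*x + b by minimizing mean squared error with gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    history = []  # loss before each update, to watch it decrease
    for _ in range(steps):
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        history.append(sum(r * r for r in residuals) / n)
        grad_w = 2.0 / n * sum(r * x for r, x in zip(residuals, xs))
        grad_b = 2.0 / n * sum(residuals)
        w -= lr * grad_w  # step in the direction of steepest descent
        b -= lr * grad_b
    return w, b, history


xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * x + 1.0 for x in xs]  # toy data generated from w=2, b=1
w, b, history = gradient_descent(xs, ys)
print("learned w, b:", round(w, 4), round(b, 4))
print("loss first/last:", history[0], history[-1])
```

A learning rate that is too large makes the updates overshoot and diverge, while one that is too small makes convergence slow.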
Artificial Neural Networks VIII
 Model complexity has to be appropriate:
no more complex than necessary (Occam’s razor)
Schapire 2004
Artificial Neural Networks IX
Schapire 2004
Artificial Neural Networks X
 For building an accurate classifier:
 Enough training examples
 Good performance on training set
 Classifier that is not too complex, to
avoid overfitting
 ANNs can give approximate solutions for
very complex problems
 Support Vector Machines (SVM) are often a
simpler alternative to ANNs
Deep learning I
 Deep learning
 No clear-cut distinction from shallow learners
 Multiple layers of non-linear processing
units
 Each layer represents features at a higher
level
 Forms a hierarchical representation
 The majority of deep learners are ANNs
Deep learning II
 Deep learning neural networks
 Often use Rectified Linear Units (ReLU) as
activation function
 Typically learn faster than with sigmoid or
tanh units
 Half-wave rectifier
f(z) = max(z, 0)
 Use backpropagation for adjusting the
weights
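The rectifier and the derivative that backpropagation needs can be sketched in a few lines. Treating the derivative at exactly z = 0 as 0 is a common convention and an assumption here; the function names are illustrative.

```python
def relu(z):
    """Half-wave rectifier: f(z) = max(z, 0)."""
    return max(z, 0.0)


def relu_grad(z):
    """Derivative used in backpropagation: 1 for z > 0, else 0 (0 at z = 0 by convention)."""
    return 1.0 if z > 0 else 0.0


for z in (-2.0, 0.0, 3.0):
    print(z, relu(z), relu_grad(z))
```

For positive inputs the gradient stays at 1 instead of shrinking towards 0 as it does for saturating units, which is one commonly cited reason for the faster learning.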
Deep learning III - ConvNet
LeNet 2015
Deep learning IV - ConvNet
 Convolutional neural networks
 Inspired by the animal visual cortex
 Visual cortex is the most powerful visual
processing system in existence
 Typically two stages:
 Convolutional stage
 Pooling stage
 Characterized by
 sparse connectivity
 shared weights
Deep learning V - ConvNet
 Shared weights
 Subsets share weights and bias to form
feature map
 Replicated across entire visual field
Deep learning VI - ConvNet
 Each layer accepts a 3D input volume and
transforms it into a 3D output volume
 Filters activate when they detect a specific
feature in the input
CS231n 2015
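Shared weights and filter activation can be illustrated with one small kernel slid over a single-channel image. This is a simplified 2D sketch: the image, the vertical-edge kernel and the function name `conv2d` are assumptions, and it computes cross-correlation without padding, which is what deep learning libraries usually call convolution.

```python
def conv2d(image, kernel):
    """Slide one shared kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(kernel[u][v] * image[r + u][c + v]
                for u in range(kh) for v in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# 4x6 image: dark left half (0), bright right half (1)
image = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]
# 3x3 kernel that responds to a dark-to-bright vertical edge
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
feature_map = conv2d(image, kernel)
for row in feature_map:
    print(row)
```

The same nine weights are reused at every position, and the feature map is non-zero only where the window straddles the edge.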
Deep learning VII - ConvNet
 Receptive field spans all feature maps
LeNet 2015
Deep learning VIII - ConvNet
 MaxPooling
 Non-linear down-sampling
 Partitions input into non-overlapping
rectangles
 Outputs maximum value for each sub-
region
 Reduces computation for the next layer
 Reduces dimensionality of intermediate
representations
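Max pooling over non-overlapping rectangles can be sketched as follows. The 2x2 window and the sample feature map are assumptions for illustration.

```python
def max_pool(feature_map, size=2):
    """Partition the input into non-overlapping size x size blocks and keep each maximum."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + u][c * size + v]
                for u in range(size) for v in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
pooled = max_pool(feature_map)
print(pooled)  # [[4, 2], [2, 8]]
```

The 4x4 input shrinks to 2x2, so the next layer processes a quarter of the values.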
Deep learning IX - ConvNet
 Convolutional and subsampling sublayers
UFLDL 2015
Deep learning X - ConvNet
 Image passed through cascaded convolutional
and max-pooling layers
 Similar to edge detector
Deep learning XI - RNN
 Recurrent neural networks
 Contain directed cycles
 Take sequences as input, no fixed-size
input and output vectors, e.g. natural
speech
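A minimal sketch of a recurrent unit shows how the directed cycle lets one set of weights process sequences of any length. The scalar state, the tanh activation, the weight values and the name `rnn_forward` are assumptions chosen to keep the example small; real networks use weight matrices and vector states.

```python
import math


def rnn_forward(inputs, w_in=1.0, w_rec=0.5, w_out=2.0, h0=0.0):
    """Run a one-unit recurrent network over a sequence of any length."""
    h = h0
    outputs = []
    for x in inputs:
        # The previous state h feeds back into the unit: this is the directed cycle
        h = math.tanh(w_in * x + w_rec * h)
        outputs.append(w_out * h)
    return outputs, h


short_out, short_state = rnn_forward([1.0, 0.0])
long_out, long_state = rnn_forward([0.0, 1.0, 0.5, -1.0, 0.25])
print(len(short_out), len(long_out))

# Same inputs in a different order give a different final state
_, reversed_state = rnn_forward([0.0, 1.0])
print(short_state, reversed_state)
```

Because the state carries information forward, the output at each step depends on the whole history of inputs and not only on the current one.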
Deep learning XII - RNN
 No fixed size of computations
 Much simpler than ConvNets
 Maintain inner state exhibiting dynamic
temporal behavior
 Optimized through backpropagation
 Can be extended with long-term memory,
e.g. long short-term memory (LSTM) units
 Don’t necessarily need sequences as input
Deep learning XIII - RNN
 Training RNN is a non-linear global
optimization problem
 Trained using stochastic gradient descent
 Non-linear, differentiable activation
function, e.g. rectifier
 Trained through backpropagation through
time (BPTT)
 Genetic algorithms can be used for training
Deep learning XIV - RNN
 Many different architectures for RNN
Examples: Elman simple recurrent network (SRN), spiking neural network
Deep learning XV - RNN
RNN learns to read house numbers
RNN learns to paint house numbers
Karpathy 2015
Deep learning XVI - RNN
 RNN used for
 Speech-to-text transcription
 Speech synthesis
 Machine translation
Deep learning XVII
 Combining ConvNets and RNN for
image descriptions
 Regions described using language as label
space, using ConvNet
 Language synthesis using RNN
Karpathy & Fei-Fei 2014
Deep learning XVIII
 ConvNet and RNN can be combined
 Automated caption generation
Deep learning XIX
 Automatic feature extraction
 No closed vocabulary set
 Alignment of sentence segments to
regions of the image
Karpathy & Fei-Fei 2014
Deep learning XX
 Other applications
 Object recognition
 Movie classification
 Handwriting recognition
 Record linkage
Challenges I
 Main disadvantage: large volumes of
training data are needed
 Overfitting if not enough training data
 Optimization difficult
 Finding relevant information
 Privacy-preserving data mining
Challenges II
 Describing actions
Discussion
 Future research in
 Attention based models
 Finding relevant information
 Data democratization and Internet of
Things
 Unsupervised learning
 Semantic data modeling
 Reasoning
Thank you for your attention
 Questions?
References
 Zhao, X, Li, X & Zhang, Z 2015, 'Multimedia Retrieval via Deep Learning to Rank', IEEE Signal Processing Letters, vol. 22, no. 9, pp. 1487-91,
<http://ieeexplore.ieee.org.ezproxy.usq.edu.au/xpls/abs_all.jsp?arnumber=7054452>.
 Yu, W, Zhuang, F, He, Q & Shi, Z 2015, 'Learning deep representations via extreme learning machines', Neurocomputing, vol. 149, Part A,
pp. 308-15, <http://www.sciencedirect.com/science/article/pii/S0925231214011461>.
 Xu, K, Ba, J, Kiros, R, Cho, K, Courville, A, Salakhutdinov, R, Zemel, R & Bengio, Y 2015, 'Show, Attend and Tell: Neural Image Caption
Generation with Visual Attention', Proceedings of the 32nd International Conference on Machine Learning from Data: Artificial Intelligence
and Statistics, vol. 37.
 Xin, J, Wang, Z, Qu, L & Wang, G 2015, 'Elastic extreme learning machine for big data classification', Neurocomputing, vol. 149, Part A, pp.
464-71, <http://www.sciencedirect.com/science/article/pii/S0925231214011503>.
 Weston, J, Chopra, S & Bordes, A 2015, 'Memory Networks', in 3rd International Conference on Learning Representations: proceedings of
the 3rd International Conference on Learning Representations San Diego, viewed <http://arxiv.org/pdf/1410.3916v10.pdf>.
 Weilong, H, Xinbo, G, Dacheng, T & Xuelong, L 2015, 'Blind Image Quality Assessment via Deep Learning', Neural Networks and Learning
Systems, IEEE Transactions on, vol. 26, no. 6, pp. 1275-86.
 Wang, Y, Li, D, Du, Y & Pan, Z 2015, 'Anomaly detection in traffic using L1-norm minimization extreme learning machine', Neurocomputing,
vol. 149, Part A, pp. 415-25, <http://www.sciencedirect.com/science/article/pii/S0925231214011382>.
 Vinyals, O, Toshev, A, Bengio, S & Erhan, D 2015, 'Show and Tell: A Neural Image Caption Generator', Google,
<http://arxiv.org/pdf/1411.4555v1.pdf>.
 Noda, K, Yamaguchi, Y, Nakadai, K, Okuno, H & Ogata, T 2015, 'Audio-visual speech recognition using deep learning', Applied Intelligence,
vol. 42, no. 4, pp. 722-37, <http://dx.doi.org/10.1007/s10489-014-0629-7>.
 Mao, W, Zhao, S, Mu, X & Wang, H 2015, 'Multi-dimensional extreme learning machine', Neurocomputing, vol. 149, Part A, pp. 160-70,
<http://www.sciencedirect.com/science/article/pii/S0925231214011540>.
 Liu, X, Wang, L, Huang, G-B, Zhang, J & Yin, J 2015, 'Multiple kernel extreme learning machine', Neurocomputing, vol. 149, Part A, pp. 253-64,
<http://www.sciencedirect.com/science/article/pii/S0925231214011199>.
 LeCun, Y, Bengio, Y & Hinton, G 2015, 'Deep learning', Nature, vol. 521, no. 7553, pp. 436-44, <http://dx.doi.org/10.1038/nature14539>.
 Srivastava, N, Hinton, G, Krizhevsky, A, Sutskever, I & Salakhutdinov, R 2014, 'Dropout: a simple way to prevent neural networks from
overfitting', J. Mach. Learn. Res., vol. 15, no. 1, pp. 1929-58.
 Karpathy, A & Fei-Fei, L 2014, 'Deep visual-semantic alignments for generating image descriptions', arXiv preprint arXiv:1412.2306.

More Related Content

What's hot

Deep Learning of High-Level Representations
Deep Learning of High-Level RepresentationsDeep Learning of High-Level Representations
Deep Learning of High-Level Representations
Hamid Eghbal-zadeh
 
chalenges and apportunity of deep learning for big data analysis f
 chalenges and apportunity of deep learning for big data analysis f chalenges and apportunity of deep learning for big data analysis f
chalenges and apportunity of deep learning for big data analysis f
maru kindeneh
 
Introduction of Deep Learning
Introduction of Deep LearningIntroduction of Deep Learning
Introduction of Deep Learning
Myungjin Lee
 
Deep learning presentation
Deep learning presentationDeep learning presentation
Deep learning presentation
Tunde Ajose-Ismail
 
The Multimodal Learning Analytics Pipeline
The Multimodal Learning Analytics PipelineThe Multimodal Learning Analytics Pipeline
The Multimodal Learning Analytics Pipeline
Daniele Di Mitri
 
Read Between The Lines: an Annotation Tool for Multimodal Data
Read Between The Lines: an Annotation Tool for Multimodal DataRead Between The Lines: an Annotation Tool for Multimodal Data
Read Between The Lines: an Annotation Tool for Multimodal Data
Daniele Di Mitri
 
Deep learning 1.0 and Beyond, Part 1
Deep learning 1.0 and Beyond, Part 1Deep learning 1.0 and Beyond, Part 1
Deep learning 1.0 and Beyond, Part 1
Deakin University
 
31 34
31 3431 34
An Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
An Investigation of Data Privacy and Utility Using Machine Learning as a GaugeAn Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
An Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
Kato Mivule
 
Handwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with RHandwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with R
Poo Kuan Hoong
 
Multimodal Tutor for CPR presented at AIME'19
Multimodal Tutor for CPR presented at AIME'19Multimodal Tutor for CPR presented at AIME'19
Multimodal Tutor for CPR presented at AIME'19
Daniele Di Mitri
 
Deep learning tutorial 9/2019
Deep learning tutorial 9/2019Deep learning tutorial 9/2019
Deep learning tutorial 9/2019
Amr Rashed
 
Lit Review Talk - Signal Processing and Machine Learning with Differential Pr...
Lit Review Talk - Signal Processing and Machine Learning with Differential Pr...Lit Review Talk - Signal Processing and Machine Learning with Differential Pr...
Lit Review Talk - Signal Processing and Machine Learning with Differential Pr...
Kato Mivule
 
Deep Learning Explained
Deep Learning ExplainedDeep Learning Explained
Deep Learning Explained
Melanie Swan
 
Multilabel Image Retreval Using Hashing
Multilabel Image Retreval Using HashingMultilabel Image Retreval Using Hashing
Multilabel Image Retreval Using Hashing
Surbhi Bhosale
 
GUI based handwritten digit recognition using CNN
GUI based handwritten digit recognition using CNNGUI based handwritten digit recognition using CNN
GUI based handwritten digit recognition using CNN
Abhishek Tiwari
 
Proposed-curricula-MCSEwithSyllabus_24_...
Proposed-curricula-MCSEwithSyllabus_24_...Proposed-curricula-MCSEwithSyllabus_24_...
Proposed-curricula-MCSEwithSyllabus_24_...
butest
 
Kato Mivule - Towards Agent-based Data Privacy Engineering
Kato Mivule - Towards Agent-based Data Privacy EngineeringKato Mivule - Towards Agent-based Data Privacy Engineering
Kato Mivule - Towards Agent-based Data Privacy Engineering
Kato Mivule
 
Multispectral image analysis using random
Multispectral image analysis using randomMultispectral image analysis using random
Multispectral image analysis using random
ijsc
 
BLIND RECOVERY OF DATA
BLIND RECOVERY OF DATABLIND RECOVERY OF DATA
BLIND RECOVERY OF DATA
Ajinkya Nikam
 

What's hot (20)

Deep Learning of High-Level Representations
Deep Learning of High-Level RepresentationsDeep Learning of High-Level Representations
Deep Learning of High-Level Representations
 
chalenges and apportunity of deep learning for big data analysis f
 chalenges and apportunity of deep learning for big data analysis f chalenges and apportunity of deep learning for big data analysis f
chalenges and apportunity of deep learning for big data analysis f
 
Introduction of Deep Learning
Introduction of Deep LearningIntroduction of Deep Learning
Introduction of Deep Learning
 
Deep learning presentation
Deep learning presentationDeep learning presentation
Deep learning presentation
 
The Multimodal Learning Analytics Pipeline
The Multimodal Learning Analytics PipelineThe Multimodal Learning Analytics Pipeline
The Multimodal Learning Analytics Pipeline
 
Read Between The Lines: an Annotation Tool for Multimodal Data
Read Between The Lines: an Annotation Tool for Multimodal DataRead Between The Lines: an Annotation Tool for Multimodal Data
Read Between The Lines: an Annotation Tool for Multimodal Data
 
Deep learning 1.0 and Beyond, Part 1
Deep learning 1.0 and Beyond, Part 1Deep learning 1.0 and Beyond, Part 1
Deep learning 1.0 and Beyond, Part 1
 
31 34
31 3431 34
31 34
 
An Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
An Investigation of Data Privacy and Utility Using Machine Learning as a GaugeAn Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
An Investigation of Data Privacy and Utility Using Machine Learning as a Gauge
 
Handwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with RHandwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with R
 
Multimodal Tutor for CPR presented at AIME'19
Multimodal Tutor for CPR presented at AIME'19Multimodal Tutor for CPR presented at AIME'19
Multimodal Tutor for CPR presented at AIME'19
 
Deep learning tutorial 9/2019
Deep learning tutorial 9/2019Deep learning tutorial 9/2019
Deep learning tutorial 9/2019
 
Lit Review Talk - Signal Processing and Machine Learning with Differential Pr...
Lit Review Talk - Signal Processing and Machine Learning with Differential Pr...Lit Review Talk - Signal Processing and Machine Learning with Differential Pr...
Lit Review Talk - Signal Processing and Machine Learning with Differential Pr...
 
Deep Learning Explained
Deep Learning ExplainedDeep Learning Explained
Deep Learning Explained
 
Multilabel Image Retreval Using Hashing
Multilabel Image Retreval Using HashingMultilabel Image Retreval Using Hashing
Multilabel Image Retreval Using Hashing
 
GUI based handwritten digit recognition using CNN
GUI based handwritten digit recognition using CNNGUI based handwritten digit recognition using CNN
GUI based handwritten digit recognition using CNN
 
Proposed-curricula-MCSEwithSyllabus_24_...
Proposed-curricula-MCSEwithSyllabus_24_...Proposed-curricula-MCSEwithSyllabus_24_...
Proposed-curricula-MCSEwithSyllabus_24_...
 
Kato Mivule - Towards Agent-based Data Privacy Engineering
Kato Mivule - Towards Agent-based Data Privacy EngineeringKato Mivule - Towards Agent-based Data Privacy Engineering
Kato Mivule - Towards Agent-based Data Privacy Engineering
 
Multispectral image analysis using random
Multispectral image analysis using randomMultispectral image analysis using random
Multispectral image analysis using random
 
BLIND RECOVERY OF DATA
BLIND RECOVERY OF DATABLIND RECOVERY OF DATA
BLIND RECOVERY OF DATA
 

Viewers also liked

Data mining
Data miningData mining
Data mining
Akannsha Totewar
 
Ppt buyonlineindia case study
Ppt buyonlineindia case studyPpt buyonlineindia case study
Ppt buyonlineindia case study
GAURAV SHARMA
 
ECML-2015 Presentation
ECML-2015 PresentationECML-2015 Presentation
ECML-2015 Presentation
Anirban Santara
 
Eddl5131 assignment 1 march2013
Eddl5131 assignment 1 march2013Eddl5131 assignment 1 march2013
Eddl5131 assignment 1 march2013
gmorong
 
presentation
presentationpresentation
REPRESENTATION LEARNING FOR STATE APPROXIMATION IN PLATFORM GAMES
REPRESENTATION LEARNING FOR STATE APPROXIMATION IN PLATFORM GAMESREPRESENTATION LEARNING FOR STATE APPROXIMATION IN PLATFORM GAMES
REPRESENTATION LEARNING FOR STATE APPROXIMATION IN PLATFORM GAMES
Ramnandan Krishnamurthy
 
Multimodal Learning Analytics
Multimodal Learning AnalyticsMultimodal Learning Analytics
Multimodal Learning Analytics
Xavier Ochoa
 
Multimodal Residual Learning for Visual Question-Answering
Multimodal Residual Learning for Visual Question-AnsweringMultimodal Residual Learning for Visual Question-Answering
Multimodal Residual Learning for Visual Question-Answering
NAVER D2
 
Multimodal deep learning
Multimodal deep learningMultimodal deep learning
Multimodal deep learning
hoai_ln
 
Introduction to un supervised learning
Introduction to un supervised learningIntroduction to un supervised learning
Introduction to un supervised learning
Rishikesh .
 
Multimedia technology for websites
Multimedia technology for websitesMultimedia technology for websites
Multimedia technology for websites
Fiona McGuire
 
Weekly news from WCUMC 11-9-2014
Weekly news from WCUMC 11-9-2014Weekly news from WCUMC 11-9-2014
Weekly news from WCUMC 11-9-2014
Woodinville Community Church
 
Voice over IP: Issues and Protocols
Voice over IP: Issues and ProtocolsVoice over IP: Issues and Protocols
Voice over IP: Issues and Protocols
Videoguy
 
MAT Chapter 1
MAT Chapter 1MAT Chapter 1
MAT Chapter 1
IF Engineer 2
 
Image compression using singular value decomposition
Image compression using singular value decompositionImage compression using singular value decomposition
Image compression using singular value decomposition
PRADEEP Cheekatla
 
13584 27 multimedia mining
13584 27 multimedia mining13584 27 multimedia mining
13584 27 multimedia mining
Universitas Bina Darma Palembang
 
CBIR by deep learning
CBIR by deep learningCBIR by deep learning
CBIR by deep learning
Vigen Sahakyan
 
Deep Learning Primer - a brief introduction
Deep Learning Primer - a brief introductionDeep Learning Primer - a brief introduction
Deep Learning Primer - a brief introduction
ananth
 
Multimedia
MultimediaMultimedia
Multimedia
Ainun Syamila
 
Text mining, By Hadi Mohammadzadeh
Text mining, By Hadi MohammadzadehText mining, By Hadi Mohammadzadeh
Text mining, By Hadi Mohammadzadeh
Hadi Mohammadzadeh
 

Viewers also liked (20)

Data mining
Data miningData mining
Data mining
 
Ppt buyonlineindia case study
Ppt buyonlineindia case studyPpt buyonlineindia case study
Ppt buyonlineindia case study
 
ECML-2015 Presentation
ECML-2015 PresentationECML-2015 Presentation
ECML-2015 Presentation
 
Eddl5131 assignment 1 march2013
Eddl5131 assignment 1 march2013Eddl5131 assignment 1 march2013
Eddl5131 assignment 1 march2013
 
presentation
presentationpresentation
presentation
 
REPRESENTATION LEARNING FOR STATE APPROXIMATION IN PLATFORM GAMES
REPRESENTATION LEARNING FOR STATE APPROXIMATION IN PLATFORM GAMESREPRESENTATION LEARNING FOR STATE APPROXIMATION IN PLATFORM GAMES
REPRESENTATION LEARNING FOR STATE APPROXIMATION IN PLATFORM GAMES
 
Multimodal Learning Analytics
Multimodal Learning AnalyticsMultimodal Learning Analytics
Multimodal Learning Analytics
 
Multimodal Residual Learning for Visual Question-Answering
Multimodal Residual Learning for Visual Question-AnsweringMultimodal Residual Learning for Visual Question-Answering
Multimodal Residual Learning for Visual Question-Answering
 
Multimodal deep learning
Multimodal deep learningMultimodal deep learning
Multimodal deep learning
 
Introduction to un supervised learning
Introduction to un supervised learningIntroduction to un supervised learning
Introduction to un supervised learning
 
Multimedia technology for websites
Multimedia technology for websitesMultimedia technology for websites
Multimedia technology for websites
 
Weekly news from WCUMC 11-9-2014
Weekly news from WCUMC 11-9-2014Weekly news from WCUMC 11-9-2014
Weekly news from WCUMC 11-9-2014
 
Voice over IP: Issues and Protocols
Voice over IP: Issues and ProtocolsVoice over IP: Issues and Protocols
Voice over IP: Issues and Protocols
 
MAT Chapter 1
MAT Chapter 1MAT Chapter 1
MAT Chapter 1
 
Image compression using singular value decomposition
Image compression using singular value decompositionImage compression using singular value decomposition
Image compression using singular value decomposition
 
13584 27 multimedia mining
13584 27 multimedia mining13584 27 multimedia mining
13584 27 multimedia mining
 
CBIR by deep learning
CBIR by deep learningCBIR by deep learning
CBIR by deep learning
 
Deep Learning Primer - a brief introduction
Deep Learning Primer - a brief introductionDeep Learning Primer - a brief introduction
Deep Learning Primer - a brief introduction
 
Multimedia
MultimediaMultimedia
Multimedia
 
Text mining, By Hadi Mohammadzadeh
Text mining, By Hadi MohammadzadehText mining, By Hadi Mohammadzadeh
Text mining, By Hadi Mohammadzadeh
 

Similar to Multimedia data mining using deep learning

Unsupervised learning models of invariant features in images: Recent developm...
Unsupervised learning models of invariant features in images: Recent developm...Unsupervised learning models of invariant features in images: Recent developm...
Unsupervised learning models of invariant features in images: Recent developm...
IJSCAI Journal
 
UNSUPERVISED LEARNING MODELS OF INVARIANT FEATURES IN IMAGES: RECENT DEVELOPM...
UNSUPERVISED LEARNING MODELS OF INVARIANT FEATURES IN IMAGES: RECENT DEVELOPM...UNSUPERVISED LEARNING MODELS OF INVARIANT FEATURES IN IMAGES: RECENT DEVELOPM...
UNSUPERVISED LEARNING MODELS OF INVARIANT FEATURES IN IMAGES: RECENT DEVELOPM...
ijscai
 
UNSUPERVISED LEARNING MODELS OF INVARIANT FEATURES IN IMAGES: RECENT DEVELOPM...
UNSUPERVISED LEARNING MODELS OF INVARIANT FEATURES IN IMAGES: RECENT DEVELOPM...UNSUPERVISED LEARNING MODELS OF INVARIANT FEATURES IN IMAGES: RECENT DEVELOPM...
UNSUPERVISED LEARNING MODELS OF INVARIANT FEATURES IN IMAGES: RECENT DEVELOPM...
ijscai
 
Deep Learning - 인공지능 기계학습의 새로운 트랜드 :김인중
Deep Learning - 인공지능 기계학습의 새로운 트랜드 :김인중Deep Learning - 인공지능 기계학습의 새로운 트랜드 :김인중
Deep Learning - 인공지능 기계학습의 새로운 트랜드 :김인중
datasciencekorea
 
NumPyCNNAndroid: A Library for Straightforward Implementation of Convolutiona...
NumPyCNNAndroid: A Library for Straightforward Implementation of Convolutiona...NumPyCNNAndroid: A Library for Straightforward Implementation of Convolutiona...
NumPyCNNAndroid: A Library for Straightforward Implementation of Convolutiona...
Ahmed Gad
 
Week3-Deep Neural Network (DNN).pptx
Week3-Deep Neural Network (DNN).pptxWeek3-Deep Neural Network (DNN).pptx
Week3-Deep Neural Network (DNN).pptx
fahmi324663
 
Big Data Intelligence: from Correlation Discovery to Causal Reasoning
Big Data Intelligence: from Correlation Discovery to Causal Reasoning Big Data Intelligence: from Correlation Discovery to Causal Reasoning
Big Data Intelligence: from Correlation Discovery to Causal Reasoning
Wanjin Yu
 
Introduction to Deep learning
Introduction to Deep learningIntroduction to Deep learning
Introduction to Deep learning
leopauly
 
Deep Neural Networks (DNN)
Deep Neural Networks (DNN)Deep Neural Networks (DNN)
Deep Learning And Business Models (VNITC 2015-09-13)
Deep Learning And Business Models (VNITC 2015-09-13)Deep Learning And Business Models (VNITC 2015-09-13)
Deep Learning And Business Models (VNITC 2015-09-13)
Ha Phuong
 
Deep learning introduction
Deep learning introductionDeep learning introduction
Deep learning introduction
giangbui0816
 
Deep learning
Deep learningDeep learning
Deep learning
Mohamed Loey
 
Face Recognition - Deep Learning
Face Recognition - Deep LearningFace Recognition - Deep Learning
Face Recognition - Deep Learning
Aashish Chaubey
 
imageclassification-160206090009.pdf
imageclassification-160206090009.pdfimageclassification-160206090009.pdf
imageclassification-160206090009.pdf
KammetaJoshna
 
Image classification with Deep Neural Networks
Image classification with Deep Neural NetworksImage classification with Deep Neural Networks
Image classification with Deep Neural Networks
Yogendra Tamang
 
Deep convolutional neural networks and their many uses for computer vision
Deep convolutional neural networks and their many uses for computer visionDeep convolutional neural networks and their many uses for computer vision
Deep convolutional neural networks and their many uses for computer vision
Fares Al-Qunaieer
 
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Universitat Politècnica de Catalunya
 
IRJET - Deep Learning Applications and Frameworks – A Review
IRJET -  	  Deep Learning Applications and Frameworks – A ReviewIRJET -  	  Deep Learning Applications and Frameworks – A Review
IRJET - Deep Learning Applications and Frameworks – A Review
IRJET Journal
 
IRJET- Deep Learning Techniques for Object Detection
IRJET-  	  Deep Learning Techniques for Object DetectionIRJET-  	  Deep Learning Techniques for Object Detection
IRJET- Deep Learning Techniques for Object Detection
IRJET Journal
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
Amr Rashed
 

Similar to Multimedia data mining using deep learning (20)

Unsupervised learning models of invariant features in images: Recent developm...
Unsupervised learning models of invariant features in images: Recent developm...Unsupervised learning models of invariant features in images: Recent developm...
Unsupervised learning models of invariant features in images: Recent developm...
 
UNSUPERVISED LEARNING MODELS OF INVARIANT FEATURES IN IMAGES: RECENT DEVELOPM...
UNSUPERVISED LEARNING MODELS OF INVARIANT FEATURES IN IMAGES: RECENT DEVELOPM...UNSUPERVISED LEARNING MODELS OF INVARIANT FEATURES IN IMAGES: RECENT DEVELOPM...
UNSUPERVISED LEARNING MODELS OF INVARIANT FEATURES IN IMAGES: RECENT DEVELOPM...
 
UNSUPERVISED LEARNING MODELS OF INVARIANT FEATURES IN IMAGES: RECENT DEVELOPM...
UNSUPERVISED LEARNING MODELS OF INVARIANT FEATURES IN IMAGES: RECENT DEVELOPM...UNSUPERVISED LEARNING MODELS OF INVARIANT FEATURES IN IMAGES: RECENT DEVELOPM...
UNSUPERVISED LEARNING MODELS OF INVARIANT FEATURES IN IMAGES: RECENT DEVELOPM...
 
Deep Learning - 인공지능 기계학습의 새로운 트랜드 :김인중
Deep Learning - 인공지능 기계학습의 새로운 트랜드 :김인중Deep Learning - 인공지능 기계학습의 새로운 트랜드 :김인중
Deep Learning - 인공지능 기계학습의 새로운 트랜드 :김인중
 
NumPyCNNAndroid: A Library for Straightforward Implementation of Convolutiona...
NumPyCNNAndroid: A Library for Straightforward Implementation of Convolutiona...NumPyCNNAndroid: A Library for Straightforward Implementation of Convolutiona...
NumPyCNNAndroid: A Library for Straightforward Implementation of Convolutiona...
 
Week3-Deep Neural Network (DNN).pptx
Week3-Deep Neural Network (DNN).pptxWeek3-Deep Neural Network (DNN).pptx
Week3-Deep Neural Network (DNN).pptx
 
Big Data Intelligence: from Correlation Discovery to Causal Reasoning
Big Data Intelligence: from Correlation Discovery to Causal Reasoning Big Data Intelligence: from Correlation Discovery to Causal Reasoning
Big Data Intelligence: from Correlation Discovery to Causal Reasoning
 
Introduction to Deep learning
Introduction to Deep learningIntroduction to Deep learning
Introduction to Deep learning
 
Deep Neural Networks (DNN)
Deep Neural Networks (DNN)Deep Neural Networks (DNN)
Deep Neural Networks (DNN)
 
Deep Learning And Business Models (VNITC 2015-09-13)
Deep Learning And Business Models (VNITC 2015-09-13)Deep Learning And Business Models (VNITC 2015-09-13)
Deep Learning And Business Models (VNITC 2015-09-13)
 
Deep learning introduction
Deep learning introductionDeep learning introduction
Deep learning introduction
 
Deep learning
Deep learningDeep learning
Deep learning
 
Face Recognition - Deep Learning
Face Recognition - Deep LearningFace Recognition - Deep Learning
Face Recognition - Deep Learning
 
imageclassification-160206090009.pdf
imageclassification-160206090009.pdfimageclassification-160206090009.pdf
imageclassification-160206090009.pdf
 
Image classification with Deep Neural Networks
Image classification with Deep Neural NetworksImage classification with Deep Neural Networks
Image classification with Deep Neural Networks
 
Deep convolutional neural networks and their many uses for computer vision
Deep convolutional neural networks and their many uses for computer visionDeep convolutional neural networks and their many uses for computer vision
Deep convolutional neural networks and their many uses for computer vision
 
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
Neural Architectures for Still Images - Xavier Giro- UPC Barcelona 2019
 
IRJET - Deep Learning Applications and Frameworks – A Review
IRJET -  	  Deep Learning Applications and Frameworks – A ReviewIRJET -  	  Deep Learning Applications and Frameworks – A Review
IRJET - Deep Learning Applications and Frameworks – A Review
 
IRJET- Deep Learning Techniques for Object Detection
IRJET-  	  Deep Learning Techniques for Object DetectionIRJET-  	  Deep Learning Techniques for Object Detection
IRJET- Deep Learning Techniques for Object Detection
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
 

Multimedia data mining using deep learning

  • 1. Multimedia Data Mining using deep learning Peter Wlodarczak wlodarczak@gmail.com
  • 2. Agenda  Aims  Multimedia Data Mining  Artificial Neural Networks  Deep learning  Challenges  Discussion
  • 3. Aims  Analyze multimedia data for:  Object/face recognition  Voice commands  Natural Language Processing  Classification  Automatic caption generation  Record linkage (entity resolution)
  • 4. Multimedia Data Mining I  Multimedia data mining:  Unprecedented amount of Multimedia data since Web 2.0 and Social Media  Prosumer data  Uses algorithms to extract useful patterns and relations from image, audio and video data  Traditional methods often not satisfactory  Unsuitable for high dimensionality
  • 5. Multimedia Data Mining II  Multimedia data mining has been improved using deep learning in:  Visual data mining  Natural Language Processing  Deep learners are:  Machine Learning schemes  Usually multi-layered artificial neural networks
  • 6. Artificial Neural Networks I  Artificial Neural Networks:  Suitable to give good approximations for complex problems  Consist of perceptrons (the neurons) and weighted connections (the axons)
  • 7. Artificial Neural Networks II  Perceptron (Neuron)  Linear classifier  Data linearly separable using a hyperplane  Where w = weights, a = real-valued feature vector, a_0 = bias  Binary classifier f(a) that maps its input vector a to a single, binary output value  Separating hyperplane: $w_0 a_0 + w_1 a_1 + w_2 a_2 + \dots + w_k a_k = 0$
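  • 8. Artificial Neural Networks III  [Figure: perceptron with a constant bias input 1 (weight w_0) and attribute inputs a_1, a_2, a_3 (weights w_1, w_2, w_3)]  $f(a) = \sum_k w_k a_k + b$  Output class given by $f(a) > 0$ or $f(a) < 0$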
  • 9. Artificial Neural Networks III  Supervised learning
      Training data
      name      sex     mask  cape  tie  ears  smokes  class
      Batman    male    yes   yes   no   yes   no      Good
      Robin     male    yes   yes   no   no    no      Good
      Alfred    male    no    no    yes  no    no      Good
      Penguin   male    no    no    yes  no    yes     Bad
      Catwoman  female  yes   no    no   yes   no      Bad
      Joker     male    no    no    no   no    no      Bad
      Test data
      name      sex     mask  cape  tie  ears  smokes  class
      Batgirl   female  yes   yes   no   yes   no      ?
      Riddler   male    yes   no    no   no    no      ?
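The training table on slide 9 can be fed to the perceptron from slides 7 and 8. The sketch below is only an illustration: the 0/1 encoding of the attributes, the +1/-1 encoding of the classes, and the function names are assumptions that do not appear in the slides. It uses the classic perceptron update rule, which adjusts the weights only when an example is misclassified.

```python
# Assumed encoding (not from the slides): male/yes -> 1, female/no -> 0,
# Good -> +1, Bad -> -1. Feature order: sex, mask, cape, tie, ears, smokes.
TRAIN = {
    "Batman":   ([1, 1, 1, 0, 1, 0], +1),
    "Robin":    ([1, 1, 1, 0, 0, 0], +1),
    "Alfred":   ([1, 0, 0, 1, 0, 0], +1),
    "Penguin":  ([1, 0, 0, 1, 0, 1], -1),
    "Catwoman": ([0, 1, 0, 0, 1, 0], -1),
    "Joker":    ([1, 0, 0, 0, 0, 0], -1),
}
TEST = {
    "Batgirl": [0, 1, 1, 0, 1, 0],
    "Riddler": [1, 1, 0, 0, 0, 0],
}


def score(weights, bias, features):
    """f(a) = sum_k w_k * a_k + b, as on slide 8."""
    return sum(w * a for w, a in zip(weights, features)) + bias


def train_perceptron(examples, max_epochs=1000):
    """Perceptron rule: update the weights only on misclassified examples."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in examples:
            if label * score(weights, bias, features) <= 0:
                weights = [w + label * a for w, a in zip(weights, features)]
                bias += label
                mistakes += 1
        if mistakes == 0:  # every training example is on the correct side
            break
    return weights, bias


def classify(weights, bias, features):
    """Map the sign of f(a) to a class label."""
    return "Good" if score(weights, bias, features) > 0 else "Bad"


weights, bias = train_perceptron(list(TRAIN.values()))
for name, features in TEST.items():
    print(name, classify(weights, bias, features))
```

Under this encoding the six training rows are linearly separable, so the loop stops once all of them are classified correctly. The labels printed for Batgirl and Riddler depend on the hyperplane the procedure happens to find, so they should be read as one possible answer rather than the correct one.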
  • 10. Artificial Neural Networks IV  Not all data is linearly separable
  • 11. Artificial Neural Networks V  Multilayer Perceptron  Perceptrons organized in several layers  Each layer is fully connected to the next layer  All nodes except the input nodes are perceptrons  Feedforward neural network  Uses backpropagation for training  Error is propagated back to minimize the loss function
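As a rough sketch of the multilayer perceptron on slide 11, the following code builds one fully connected hidden layer, runs a feedforward pass, and propagates the output error back to obtain the gradients. The tanh hidden units, sigmoid output, cross-entropy loss, learning rate, and the XOR toy data are all assumptions chosen for illustration; the slides do not prescribe them.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_params(n_in, n_hidden, rng):
    """Small random weights for one hidden layer and a single output unit."""
    return {
        "W1": [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(params, x):
    """Feedforward pass: input -> tanh hidden layer -> sigmoid output."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(params["W1"], params["b1"])
    ]
    out = sigmoid(sum(w * h for w, h in zip(params["W2"], hidden)) + params["b2"])
    return hidden, out


def loss(params, x, y):
    """Cross-entropy loss for one example with target y in {0, 1}."""
    _, out = forward(params, x)
    return -(y * math.log(out) + (1 - y) * math.log(1 - out))


def backward(params, x, y):
    """Backpropagation: push the output error back through the layers."""
    hidden, out = forward(params, x)
    d_out = out - y  # gradient w.r.t. the output pre-activation
    # Error at each hidden unit: output error * outgoing weight * tanh'.
    d_hidden = [d_out * w * (1 - h * h) for w, h in zip(params["W2"], hidden)]
    return {
        "W1": [[d * xi for xi in x] for d in d_hidden],
        "b1": d_hidden,
        "W2": [d_out * h for h in hidden],
        "b2": d_out,
    }


def sgd_step(params, grads, lr):
    """Move every parameter a small step against its gradient."""
    params["W1"] = [
        [w - lr * g for w, g in zip(w_row, g_row)]
        for w_row, g_row in zip(params["W1"], grads["W1"])
    ]
    params["b1"] = [b - lr * g for b, g in zip(params["b1"], grads["b1"])]
    params["W2"] = [w - lr * g for w, g in zip(params["W2"], grads["W2"])]
    params["b2"] -= lr * grads["b2"]


# XOR is a standard example of data that is not linearly separable.
XOR = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
params = init_params(2, 4, random.Random(0))
for _ in range(2000):
    for x, y in XOR:
        sgd_step(params, backward(params, x, y), lr=0.5)
for x, y in XOR:
    print(x, y, round(forward(params, x)[1], 3))
```

With many initializations the printed outputs move toward the targets, but gradient-based training of such a small network is not guaranteed to converge, so the result may vary with the seed and the learning rate.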
  • 12. Artificial Neural Networks VI  Multilayer perceptron can be used for non-linear, multiclass classification
  • 13. Artificial Neural Networks VII  Gradient descent optimization method for learning weights
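Gradient descent itself can be shown on a one-dimensional toy problem. The loss L(w) = (w - 3)^2, the learning rate, and the step count below are assumptions picked to keep the sketch small; in a neural network the same update is applied to every weight using the gradient of the loss function.

```python
def gradient_descent(grad, w0, learning_rate=0.1, steps=200):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(steps):
        w -= learning_rate * grad(w)
    return w


# Toy loss L(w) = (w - 3)^2 has gradient 2 * (w - 3) and its minimum at w = 3.
w_min = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(w_min)
```

Each step shrinks the distance to the minimum by a constant factor here because the loss is a simple parabola. Real loss surfaces are not this well behaved, which is one reason the learning rate usually needs tuning.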
  • 14. Artificial Neural Networks VIII  Model complexity has to be appropriate (Occam’s razor)  Schapire 2004
  • 15. Artificial Neural Networks IX Schapire 2004
  • 16. Artificial Neural Networks X  For building an accurate classifier:  Enough training examples  Good performance on the training set  A classifier that is not too complex (to avoid overfitting)  Allows approximate solutions to very complex problems  Support Vector Machines (SVM) are a much simpler alternative to ANN
  • 17. Deep learning I  Deep learning  No clear distinction from shallow learners  Multiple layers of non-linear processing units  Each layer represents features at a higher level  Forms a hierarchical representation  Majority of deep learners are ANNs
  • 18. Deep learning II  Deep learning neural networks  Use Rectified Linear Units (ReLU)  Learn faster  Half-wave rectifier f(z) = max(z, 0)  Use backpropagation for adjusting the weights
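The half-wave rectifier on slide 18 is short enough to write out directly. The function names below are illustrative. The derivative is included because backpropagation needs it; setting it to 0 at exactly z = 0 is a common convention and an assumption here.

```python
def relu(z):
    """Rectified Linear Unit: f(z) = max(z, 0)."""
    return max(z, 0.0)


def relu_grad(z):
    """Derivative used during backpropagation: 1 for z > 0, otherwise 0."""
    return 1.0 if z > 0 else 0.0


for z in (-2.0, 0.0, 3.5):
    print(z, relu(z), relu_grad(z))
```

For positive inputs the gradient stays at 1 instead of shrinking as it does for saturating units such as the sigmoid, which is a commonly cited reason why ReLU networks tend to learn faster.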
  • 19. Deep learning III - ConvNet LeNet 2015
  • 20. Deep learning IV - ConvNet  Convolutional neural networks  Inspired by the animal visual cortex  Visual cortex is the most powerful visual processing system in existence  Typically two stages:  Convolutional stage  Pooling stage  Characterized by  sparse connectivity  shared weights
  • 21. Deep learning V - ConvNet  Shared weights  Subsets share weights and bias to form feature map  Replicated across entire visual field
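Weight sharing can be sketched by sliding one small kernel over an image: the same weights and bias are reused at every position, and the grid of responses is the feature map. The function name, the toy image, and the vertical-edge kernel below are assumptions for illustration. As in most deep learning libraries, the operation is implemented as a cross-correlation without padding.

```python
def conv2d_valid(image, kernel, bias=0.0):
    """Slide one shared kernel over the image and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            ) + bias
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# The same four weights are applied at every position of the visual field.
vertical_edge = [[-1, 1], [-1, 1]]
feature_map = conv2d_valid(image, vertical_edge)
print(feature_map)  # [[0.0, 2.0, 0.0], [0.0, 2.0, 0.0]]
```

The feature map responds only in the column where the dark-to-bright transition occurs. The kernel has four weights regardless of the image size, whereas a fully connected layer would need one weight per pixel per unit.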
  • 22. Deep learning VI - ConvNet  Each layer accepts a 3D input volume and transforms it into a 3D output volume  Filters activate when a specific feature is detected  CS231n 2015
  • 23. Deep learning VII - ConvNet  Receptive field spans all feature maps LeNet 2015
  • 24. Deep learning VIII - ConvNet  MaxPooling  Non-linear down-sampling  Partitions the input into non-overlapping rectangles  Outputs the maximum value for each sub-region  Reduces computation for the next layer  Reduces dimensionality of intermediate representations
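Max pooling as described on slide 24 can be sketched in a few lines. The function name, the 2x2 window, and the sample values are assumptions. Rows and columns that do not fill a complete window are simply dropped in this version.

```python
def max_pool(feature_map, size=2):
    """Partition into non-overlapping size x size blocks and keep each maximum."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(
                feature_map[r * size + dr][c * size + dc]
                for dr in range(size)
                for dc in range(size)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
pooled = max_pool(feature_map)
print(pooled)  # [[4, 2], [2, 8]]
```

The 4x4 input shrinks to 2x2, so the next layer processes a quarter of the values while the strongest response in each region is preserved.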
  • 25. Deep learning IX - ConvNet  Convolutional and sampling sublayers UFLDL 2015
  • 26. Deep learning X - ConvNet  Image after cascading a convolutional layer with max-pooling  Similar to an edge detector
  • 27. Deep learning XI - RNN  Recurrent neural networks  Contain directed cycles  Take sequences as input, no fixed-size input and output vectors, e.g. natural speech
  • 28. Deep learning XII - RNN  No fixed size of computations  Much simpler than ConvNets  Maintain an inner state exhibiting dynamic temporal behavior  Optimized through backpropagation  Can be extended with long-term memory extensions  Do not necessarily need sequences of inputs
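The inner state of a recurrent network can be sketched with a simple Elman-style recurrence, h_t = tanh(W_x x_t + W_h h_{t-1} + b). The weight names, the layer sizes, and the random initialization are assumptions for illustration only, and no training is performed. The same function handles sequences of any length because the weights are reused at every time step.

```python
import math
import random


def rnn_forward(inputs, W_x, W_h, b):
    """Run the recurrence over a sequence and return the state after each step."""
    hidden = [0.0] * len(b)  # inner state, carried from step to step
    states = []
    for x in inputs:
        hidden = [
            math.tanh(
                sum(wx * xi for wx, xi in zip(W_x[k], x))
                + sum(wh * hj for wh, hj in zip(W_h[k], hidden))
                + b[k]
            )
            for k in range(len(b))
        ]
        states.append(hidden)
    return states


rng = random.Random(0)
n_in, n_hidden = 2, 3
W_x = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
W_h = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_hidden)]
b = [0.0] * n_hidden

# Sequences of different lengths go through the same network.
short = rnn_forward([[1.0, 0.0]], W_x, W_h, b)
longer = rnn_forward([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]], W_x, W_h, b)
print(len(short), len(longer))  # 1 4
```

The last input of the longer sequence is all zeros, yet the final state is generally non-zero because it still depends on the earlier steps through W_h. That dependence on history is the dynamic temporal behavior the slide refers to.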
  • 29. Deep learning XIII - RNN  Training an RNN is a non-linear global optimization problem  Trained using stochastic gradient descent  Non-linear, differentiable activation function, e.g. rectifier  Trained through backpropagation through time (BPTT)  Genetic algorithms can be used for training
  • 30. Deep learning XIV - RNN  Many different architectures for RNN Elman SRN Spiking neural network
  • 31. Deep learning XV - RNN RNN learns to read house numbers RNN learns to paint house numbers Karpathy 2015
  • 32. Deep learning XVI - RNN  RNN used for  Transcribing speech to text  Voice synthesis  Machine translation
  • 33. Deep learning XVII  Combining ConvNets and RNN for image descriptions  Regions described using language as label space using ConvNet  Language synthesizing using RNN Karpathy & Fei-Fei 2014
  • 34. Deep learning XVIII  ConvNet and RNN can be combined  Automated caption generation
  • 35. Deep learning XIX  Automatic feature extraction  No closed vocabulary set  Alignment of segments of sentences to region on the image Karpathy & Fei-Fei 2014
  • 36. Deep learning XX  Other applications  Object recognition  Movie classification  Handwriting recognition  Record linkage
  • 37. Challenges I  Main disadvantage: large volumes of training data needed  Overfitting if not enough training data  Optimization difficult  Finding relevant information  Privacy-preserving data mining
  • 39. Discussion  Future research in  Attention based models  Finding relevant information  Data democratization and Internet of Things  Unsupervised learning  Semantic data modeling  Reasoning
  • 40. Thank you for the attention  Questions?
  • 41. References
       Zhao, X, Li, X & Zhang, Z 2015, 'Multimedia Retrieval via Deep Learning to Rank', IEEE Signal Processing Letters, vol. 22, no. 9, pp. 1487-91, <http://ieeexplore.ieee.org.ezproxy.usq.edu.au/xpls/abs_all.jsp?arnumber=7054452>.
       Yu, W, Zhuang, F, He, Q & Shi, Z 2015, 'Learning deep representations via extreme learning machines', Neurocomputing, vol. 149, Part A, pp. 308-15, <http://www.sciencedirect.com/science/article/pii/S0925231214011461>.
       Xu, K, Ba, J, Kiros, R, Cho, K, Courville, A, Salakhutdinov, R, Zemel, R & Bengio, Y 2015, 'Show, Attend and Tell: Neural Image Caption Generation with Visual Attention', Proceedings of the 32nd International Conference on Machine Learning from Data: Artificial Intelligence and Statistics, vol. 37.
       Xin, J, Wang, Z, Qu, L & Wang, G 2015, 'Elastic extreme learning machine for big data classification', Neurocomputing, vol. 149, Part A, pp. 464-71, <http://www.sciencedirect.com/science/article/pii/S0925231214011503>.
       Weston, J, Chopra, S & Bordes, A 2015, 'Memory Networks', in 3rd International Conference on Learning Representations: proceedings of the 3rd International Conference on Learning Representations San Diego, viewed <http://arxiv.org/pdf/1410.3916v10.pdf>.
       Weilong, H, Xinbo, G, Dacheng, T & Xuelong, L 2015, 'Blind Image Quality Assessment via Deep Learning', Neural Networks and Learning Systems, IEEE Transactions on, vol. 26, no. 6, pp. 1275-86.
       Wang, Y, Li, D, Du, Y & Pan, Z 2015, 'Anomaly detection in traffic using L1-norm minimization extreme learning machine', Neurocomputing, vol. 149, Part A, pp. 415-25, <http://www.sciencedirect.com/science/article/pii/S0925231214011382>.
       Vinyals, O, Toshev, A, Bengio, S & Erhan, D 2015, 'Show and Tell: A Neural Image Caption Generator', Google, <http://arxiv.org/pdf/1411.4555v1.pdf>.
       Noda, K, Yamaguchi, Y, Nakadai, K, Okuno, H & Ogata, T 2015, 'Audio-visual speech recognition using deep learning', Applied Intelligence, vol. 42, no. 4, pp. 722-37, <http://dx.doi.org/10.1007/s10489-014-0629-7>.
       Mao, W, Zhao, S, Mu, X & Wang, H 2015, 'Multi-dimensional extreme learning machine', Neurocomputing, vol. 149, Part A, pp. 160-70, <http://www.sciencedirect.com/science/article/pii/S0925231214011540>.
       Liu, X, Wang, L, Huang, G-B, Zhang, J & Yin, J 2015, 'Multiple kernel extreme learning machine', Neurocomputing, vol. 149, Part A, pp. 253-64, <http://www.sciencedirect.com/science/article/pii/S0925231214011199>.
       LeCun, Y, Bengio, Y & Hinton, G 2015, 'Deep learning', Nature, vol. 521, no. 7553, pp. 436-44, <http://dx.doi.org/10.1038/nature14539>.
       Srivastava, N, Hinton, G, Krizhevsky, A, Sutskever, I & Salakhutdinov, R 2014, 'Dropout: a simple way to prevent neural networks from overfitting', J. Mach. Learn. Res., vol. 15, no. 1, pp. 1929-58.
       Karpathy, A & Fei-Fei, L 2014, 'Deep visual-semantic alignments for generating image descriptions', arXiv preprint arXiv:1412.2306.