Department of Information Technology
Seminar presentation on
CHALLENGES AND OPPORTUNITIES OF ANALYSING
COMPLEX DATA USING DEEP LEARNING
By: Maru Kindeneh
To: Alemu.K
tibeyinmaru@gmail.com
Content
1. Introduction
2. Background
2.1. Machine learning and computer aided analysis
2.2. Different types of data
2.3. Feature engineering – Creating structured data from unstructured data
2.4. Deep learning
2.5. Challenges and open problems in deep learning
3. Problem specification
3.1. Related work (literature review)
3.2. Delimitations
4. Method
5. Conclusion
1. Introduction
The era of big data and data analysis is here.
Data generation has been growing exponentially, and the world has
started to go through a big data revolution.
Before the big data revolution, a lot of effort was put into designing
different schemes and surveys for data collection.
One of the main challenges with big data is that the data
are often very heterogeneous and do not follow a predefined
structure.
The data can also be stored in different formats, have different
quality and granularity, and come from many different sources,
all factors that make the data more difficult to analyse.
A research field that has shown promising results in solving the problems
that arise from this type of unstructured data from multiple sources is
deep learning.
Deep learning is a new sub-field of machine learning that has
revolutionized several fields, such as image processing, speech
recognition and natural language processing.
How to initialize and train a deep learning model is often a non-trivial
problem that requires expert knowledge.
How to select, configure and train deep learning models is still seen by
many as a “black art”.
2. Background
Machine learning (ML) is a sub-field within artificial intelligence (AI) that is
focused on how machines may learn and draw conclusions from data.
In this research there is a special focus on complex data. This type of
data is often hard to analyse with conventional ML methods.
To solve this problem, the data are often first manually crafted into a
new dataset with new features. This is called feature engineering.
In the last few years, a new sub-field of machine learning that can
handle complex data without feature engineering has emerged. This
sub-field is called deep learning (DL).
2.1. Machine Learning And Computer Aided Analysis
Machine learning is a sub-field of computer science which aims to
make computers learn.
When a lot of training samples are presented to a machine, it should be
able to extrapolate knowledge from the observed data.
The machine can then use the gained knowledge to draw
conclusions about new examples and act as a support for human
decision processes.
2.1.1. Supervised Learning
Supervised learning can be regarded as similar to the process of a
teacher teaching a student.
Examples include the classification of images, where a lot of images
and their corresponding classes are presented to the computer, which
generalizes from them.
With this generalized knowledge the computer can later correctly
classify new images that were not presented during the training
phase.
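The train-then-classify idea above can be sketched in a few lines. This is a minimal illustration only: it uses a nearest-centroid classifier (chosen here for brevity, not taken from the slides), and the two-value "images", the labels and the function names are all hypothetical.

```python
from collections import defaultdict


def train_centroids(samples, labels):
    """Average the feature vectors of each class into one centroid per class."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared Euclidean distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(centroids, key=lambda y: dist(centroids[y]))


# Hypothetical training set: tiny "images" of two brightness values, with labels.
train_x = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
train_y = ["bright", "bright", "dark", "dark"]

centroids = train_centroids(train_x, train_y)

# An example that was not presented during training.
prediction = predict(centroids, [0.7, 0.95])
print(prediction)
```

The labels play the role of the teacher: the computer only sees them during training and must produce them itself for new inputs.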
2.1.2. Unsupervised Learning
In unsupervised learning, the computer tries to generalize from the data
and to learn its underlying patterns, without being given any labels.
This is useful when dividing the data into different clusters or trying
to find samples with similar meanings.
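As a sketch of the clustering use case, the following shows plain k-means on a handful of 2-D points. The slides do not name a specific clustering algorithm, so k-means, the sample points and the hand-picked starting centers are assumptions made for illustration.

```python
def kmeans(points, centers, iterations=10):
    """Plain k-means: alternate between assigning points and moving centers."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            # Index of the closest center (squared Euclidean distance).
            nearest = min(
                range(len(centers)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])),
            )
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [
            [sum(col) / len(cluster) for col in zip(*cluster)] if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers, clusters


# Hypothetical unlabeled data: two visibly separate groups of points.
points = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.1], [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
centers, clusters = kmeans(points, centers=[points[0], points[3]])
print(centers)
```

No class information is supplied anywhere; the grouping comes only from the distances between the samples.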
2.1.3. Feature Learning
In feature learning, machine learning algorithms are trained to
create new representations of the data.
Feature learning can be conducted in both a supervised and an
unsupervised manner.
The most common purpose of learning new representations of
data is to reduce the number of dimensions.
A second reason for learning new representations of the data is to
find a more interpretable representation of the data.
Methods for this use models such as Restricted Boltzmann Machines
(RBM), Deep Belief Networks (DBN) and autoencoders (AE).
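To make the dimension-reduction purpose concrete, here is a small sketch that learns a one-dimensional representation of three-dimensional data. It deliberately swaps in principal component analysis (PCA) computed by power iteration, a much simpler linear technique than the RBMs, DBNs and autoencoders named above; the data and function names are hypothetical.

```python
import math


def first_principal_component(rows, iterations=100):
    """Estimate the direction of largest variance with power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d) of the centered data.
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0] * d  # arbitrary start vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v


def encode(row, means, v):
    """Project a d-dimensional row onto the learned direction: d numbers become 1."""
    return sum((x - m) * c for x, m, c in zip(row, means, v))


# Hypothetical 3-dimensional data lying close to the line (t, 2t, 3t).
rows = [[1.0, 2.1, 2.9], [2.0, 3.9, 6.1], [3.0, 6.0, 9.0], [4.0, 8.1, 11.9]]
means, component = first_principal_component(rows)
codes = [encode(r, means, component) for r in rows]
print(component, codes)
```

Each row is now described by a single number instead of three, while the ordering of the samples along the dominant direction is preserved.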
2.1.4. Multimodal Learning
In multimodal learning, a model learns from data with multiple
modalities.
A classical approach to analysing multimodal data is multiple
kernel learning, in which several kernels are combined.
This approach is beneficial in at least two ways:
First, it reduces the importance of the choice of kernel and
hyper-parameters, since kernels with better performance will be given more
significance when the kernels are combined.
Second, different kernels may handle different formats of
input.
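The two benefits can be illustrated with a combined kernel that uses one kernel per modality. This is only a sketch under stated assumptions: in multiple kernel learning the weights are learned from data, whereas here they are fixed by hand, and the kernels, field names and example records are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Similarity between two numeric vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def word_overlap_kernel(s, t):
    """Similarity between two texts: number of shared distinct words."""
    return float(len(set(s.lower().split()) & set(t.lower().split())))


def combined_kernel(a, b, weights=(0.7, 0.3)):
    """Weighted sum of one kernel per modality.

    The weights are fixed by hand here purely for illustration; a multiple
    kernel learning method would learn them from data.
    """
    w_num, w_text = weights
    return (w_num * rbf_kernel(a["measurements"], b["measurements"])
            + w_text * word_overlap_kernel(a["note"], b["note"]))


# Hypothetical instances with two modalities each: numbers and free text.
p1 = {"measurements": [1.0, 2.0], "note": "mild fever and cough"}
p2 = {"measurements": [1.0, 2.5], "note": "high fever and cough"}
p3 = {"measurements": [9.0, 0.0], "note": "broken arm"}

print(combined_kernel(p1, p2), combined_kernel(p1, p3))
```

Each kernel only has to understand its own input format, and a kernel that turns out to be less useful can simply receive a smaller weight.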
2.2. Different Types Of Data
To avoid any further confusion, the following definitions of different
data types will be used:
1. Structured data:
Structured data are data that are structured in a tabular form.
Relational databases are often used to store structured data.
2. Unstructured data:
All data that cannot be stored in tabular form, where each row is
independent of all other rows, are unstructured.
3. Multi-levelled data:
Multi-levelled data are data that are measured at different levels of
granularity.
2.2. Different Types Of Data…
4. Multimodal data:
Multimodal data are data concerning multiple and diverse modalities.
For example, data where both text and images are stored for the same
instance are multimodal.
Another typical example of multimodal data is video streams, which contain
both a sequence of images and audio.
5. Complex data:
Complex data is the same as high dimensional data.
The complexity of a dataset may be due either to the dataset “contain[ing]
many rows as well as many attributes” or to the dataset containing
“non-trivial interactions between attributes”.
2.3. Feature Engineering – Creating Structured Data From Unstructured Data
Most machine learning algorithms require structured data, and thus
they cannot be applied directly to unstructured data.
To bypass this problem, the machine learning algorithm can instead be
applied to a fixed set of features that are extracted from the raw data.
This is called feature engineering.
Manual feature engineering often misses complex higher-order
dependencies between variables.
Because of this, deep learning methods that are capable of automatic
feature creation and selection are of great utility.
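A minimal sketch of manual feature engineering on raw text is given below. The four features and the sample documents are hypothetical choices made for illustration; a real project would pick features based on domain knowledge.

```python
def extract_features(text):
    """Map a raw string to a fixed-length list of hand-chosen features."""
    words = text.split()
    return [
        len(words),                                        # number of words
        sum(len(w) for w in words) / max(len(words), 1),   # mean word length
        text.count("!"),                                   # exclamation marks
        sum(ch.isdigit() for ch in text),                  # digit characters
    ]


# Unstructured input: free text of arbitrary length.
raw_documents = ["Win 1000 dollars now!!!", "Meeting moved to Tuesday"]

# Structured output: one row per document, one column per feature.
table = [extract_features(doc) for doc in raw_documents]
print(table)
```

Every document becomes a row with the same four columns, so a conventional algorithm can be applied. Anything the chosen features do not capture, such as word order, is lost, which is the limitation the slide points out.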
2.4. Deep Learning
Deep learning is a type of representation learning where the machine itself
learns several internal representations from raw data in order to perform
regression or classification.
2.4.1. Artificial Neural Networks
Artificial neural networks have been around for some time.
2.4.2. Feed Forward Neural Networks
A feed forward neural network is an artificial neural network where
information only moves in one direction.
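The one-directional flow of information can be sketched as a forward pass through a small network. The 2-3-1 layout, the sigmoid activation and the hand-picked weights are assumptions for illustration; no training is shown.

```python
import math


def dense(inputs, weights, biases):
    """One fully connected layer: weighted sums followed by a sigmoid."""
    return [
        1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(row, inputs)) + b)))
        for row, b in zip(weights, biases)
    ]


def forward(x, layers):
    """Pass x through the layers in order; no output is ever fed back."""
    activation = x
    for weights, biases in layers:
        activation = dense(activation, weights, biases)
    return activation


# Hypothetical 2-3-1 network with hand-picked weights.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),  # hidden layer
    ([[1.0, -1.0, 0.5]], [0.2]),                                   # output layer
]
output = forward([1.0, 2.0], layers)
print(output)
```

Each layer only reads the output of the layer before it, so the computation is a simple loop from input to output.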
2.4.3. Convolutional Neural Networks
Convolutional neural networks (CNNs) are mainly used in image analysis,
although some authors have successfully used CNNs for natural language processing.
They build on the idea that basic features in a small area of an image can be
analysed independently of their position and of the rest of the image.
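The position independence of local features can be sketched by sliding one small kernel over two images. The edge kernel and the tiny images are hypothetical; the operation is implemented as cross-correlation, which is what deep learning libraries usually call convolution.

```python
def convolve2d(image, kernel):
    """Slide one small kernel over the image ('valid' mode, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# A 1x2 kernel that responds to a dark-to-bright step from left to right.
edge_kernel = [[-1, 1]]

# Two hypothetical 2x5 images with the same step at different positions.
image_a = [[0, 1, 1, 1, 1], [0, 1, 1, 1, 1]]
image_b = [[0, 0, 0, 1, 1], [0, 0, 0, 1, 1]]

response_a = convolve2d(image_a, edge_kernel)
response_b = convolve2d(image_b, edge_kernel)
print(response_a, response_b)
```

The same two weights detect the step in both images; only the location of the response changes, which is why a CNN needs far fewer parameters than a fully connected network on the same input.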
2.4. Deep Learning…
2.4.4. Recurrent Neural Networks
Unlike feed forward networks, which are acyclic, recurrent neural
networks (RNNs) are cyclic.
One big problem with RNNs, which also occurs in deep feed forward
networks, is that the gradients in the backpropagation tend to either
go to zero (vanish) or go to infinity (explode).
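The gradient problem can be seen in the simplest possible case. The sketch below assumes a scalar, linear recurrence h_t = w * h_(t-1), which is a deliberate simplification of a real RNN; the function name and the chosen values of w are illustrative.

```python
def gradient_through_time(w, steps):
    """d h_T / d h_0 for the scalar linear recurrence h_t = w * h_(t-1).

    By the chain rule this is a product of `steps` identical factors w.
    """
    grad = 1.0
    for _ in range(steps):
        grad *= w
    return grad


# A weight slightly below or above 1 is enough to lose or blow up the signal.
for w in (0.5, 1.0, 1.5):
    print(w, gradient_through_time(w, steps=50))
```

With 50 time steps the gradient is practically zero for w = 0.5 and very large for w = 1.5; only w = 1.0 keeps it stable. Real networks multiply matrices rather than scalars, but the same repeated product is behind the behaviour.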
2.4.5. Generative Adversarial Networks
The idea behind a generative adversarial network (GAN) is that two
artificial neural networks compete.
The generator tries to generate examples following the same distribution as
the collected data.
The discriminator tries to distinguish between examples that are generated
by the generator and data that are sampled from the real data
distribution.
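The competition can be expressed through the two losses that the networks minimize. The slides do not specify a loss, so this sketch assumes the commonly used binary cross-entropy formulation with the non-saturating generator loss; the discriminator outputs are made-up numbers, and no actual networks are trained.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: real examples should score 1, generated ones 0."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating generator loss: low when the discriminator is fooled."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


# Hypothetical discriminator outputs (probability that an example is real).
d_real = [0.9, 0.8]          # scores on real data
d_fake_early = [0.1, 0.2]    # generated examples are easy to spot
d_fake_later = [0.5, 0.6]    # generated examples have become more convincing

print(generator_loss(d_fake_early), generator_loss(d_fake_later))
print(discriminator_loss(d_real, d_fake_early),
      discriminator_loss(d_real, d_fake_later))
```

As the generated examples become more convincing, the generator's loss falls while the discriminator's loss rises, which is the sense in which the two networks compete.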
2.5. Challenges And Open Problems In Deep Learning
One of the biggest problems is that the gradient-based learning used to
train the networks is computationally demanding.
The most obvious drawbacks are the loss of interpretability and the
difficulty of training the networks.
Another big challenge in the training of deep neural networks is the
large number of hyper-parameters.
Even though deep learning methods are more flexible and easier to
modify than many classical methods, there are still some limitations on
how they can be applied to data that are very heterogeneous and
complex.
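The hyper-parameter challenge mentioned above can be made concrete by counting the configurations in even a small search space. The hyper-parameter names and candidate values below are hypothetical and only serve to show how quickly the number of combinations grows.

```python
from itertools import product

# Hypothetical search space; the names and values are illustrative only.
search_space = {
    "learning_rate": [0.1, 0.01, 0.001],
    "hidden_layers": [1, 2, 4],
    "units_per_layer": [32, 64, 128, 256],
    "dropout": [0.0, 0.25, 0.5],
    "batch_size": [16, 64],
}

# Every combination of one value per hyper-parameter.
configurations = [
    dict(zip(search_space, values))
    for values in product(*search_space.values())
]
print(len(configurations))
```

Five hyper-parameters with two to four candidate values each already give 216 configurations, and each one requires a full, computationally demanding training run to evaluate.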
3. Problem Specification
Complex data are here regarded as high dimensional data that are heterogeneous.
Much of the research today within the field of deep learning (and machine
learning in general) is focused on developing new methods to analyse data.
However, it is seldom reflected upon how the type, quality and complexity of
the data affect the analysis.
The different types of data that should be investigated are:
Data that are structured as a graph.
Sequential data where both long and short time dependencies exist.
The initial focus in the study of data properties will be:
The granularity of the data, that is, how good the measures in the data are.
Errors in the data and how they affect the result of an analysis.
The number of interacting agents or parts in the system that has generated the data.
Skewness of different classes in the data.
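One of the listed properties, the skewness of classes, is easy to quantify. The two measures below (imbalance ratio and normalized entropy of the class distribution) are common choices assumed here for illustration; the slides do not prescribe a particular metric, and the label lists are made up.

```python
import math
from collections import Counter


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest."""
    counts = Counter(labels).values()
    return max(counts) / min(counts)


def normalized_entropy(labels):
    """Shannon entropy of the class distribution, scaled to [0, 1].

    1.0 means perfectly balanced classes; values near 0 mean strong skew.
    """
    counts = Counter(labels)
    if len(counts) < 2:
        return 0.0
    n = len(labels)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return entropy / math.log(len(counts))


# Hypothetical label sets for two datasets of 100 samples each.
balanced = ["a"] * 50 + ["b"] * 50
skewed = ["a"] * 95 + ["b"] * 5

print(imbalance_ratio(balanced), normalized_entropy(balanced))
print(imbalance_ratio(skewed), normalized_entropy(skewed))
```

Measures of this kind give a number that can be compared across datasets, which is the sort of metric the first research question asks for.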
3. Problem Specification…
To conclude, the research questions of this work are:
Q1 What are complex data and how can they be defined in terms of
metrics?
Q2 What properties of complex data are problematic for the current
deep learning methods to handle, and why?
Q3 How can the current deep learning methods be refined to handle
these properties of the data, or do new methods have to be
developed?
3.1. Literature review
Toubiana et al. define complex data as data
generated by complex interactions in the studied system.
Haken gives a definition of system complexity: “Systems which are
composed of many parts, or elements, or components which may
be of the same or of different kinds. The components or parts may
be connected in a more or less complicated fashion.”
Haken also states that: “The data to be collected often seem to be quite
inexhaustible. In addition it is often impossible to decide which
aspect to choose a priori, and we must instead undergo a learning
process in order to know how to cope with a complex system.”
3.2. Delimitations
This work will be focused on the analysis of complex data, using
deep learning methods, and will therefore only consider and
analyse models for such data.
4. Method
The first question posed in this research aims at finding a rigorous
definition of complex data, for use within the field of data
analysis and deep learning.
The first step of this research will be to conduct a literature study
to survey current definitions and opinions of complex data.
Several case studies will then be conducted to validate, refine and
consolidate the produced definition, as well as to further study the
properties of complex data and thus also answer the second
research question.
When a sufficient number of case studies have been conducted, a
framework for generating data will be created.
5. Conclusion
Unlike data analysis just some decades ago, analysis today does
not only comprise data that are stored in well-organized tables.
Instead, the data are much more diverse and may, for example,
consist of images or text. This has given rise to the term complex data.
However, there is no rigorous definition of complex data, and
the term is often used to highlight that an analysis of the data is
non-trivial.
A sub-field within machine learning that has shown promising
results in analysing complex data in recent years is deep learning.
Even though deep learning has been successful in many fields,
there are still several open problems that need to be solved.
tibeyinmaru@gmail.com
22
tibeyinmaru@gmail.com
23

More Related Content

What's hot

Proposed-curricula-MCSEwithSyllabus_24_...
Proposed-curricula-MCSEwithSyllabus_24_...Proposed-curricula-MCSEwithSyllabus_24_...
Proposed-curricula-MCSEwithSyllabus_24_...butest
 
Proposing a new method of image classification based on the AdaBoost deep bel...
Proposing a new method of image classification based on the AdaBoost deep bel...Proposing a new method of image classification based on the AdaBoost deep bel...
Proposing a new method of image classification based on the AdaBoost deep bel...TELKOMNIKA JOURNAL
 
Hot Topics in Machine Learning For Research and thesis
Hot Topics in Machine Learning For Research and thesisHot Topics in Machine Learning For Research and thesis
Hot Topics in Machine Learning For Research and thesisWriteMyThesis
 
Handwriting identification using deep convolutional neural network method
Handwriting identification using deep convolutional neural network methodHandwriting identification using deep convolutional neural network method
Handwriting identification using deep convolutional neural network methodTELKOMNIKA JOURNAL
 
USING ONTOLOGIES TO OVERCOMING DRAWBACKS OF DATABASES AND VICE VERSA: A SURVEY
USING ONTOLOGIES TO OVERCOMING DRAWBACKS OF DATABASES AND VICE VERSA: A SURVEYUSING ONTOLOGIES TO OVERCOMING DRAWBACKS OF DATABASES AND VICE VERSA: A SURVEY
USING ONTOLOGIES TO OVERCOMING DRAWBACKS OF DATABASES AND VICE VERSA: A SURVEYcseij
 
IRJET- Deep Learning Techniques for Object Detection
IRJET-  	  Deep Learning Techniques for Object DetectionIRJET-  	  Deep Learning Techniques for Object Detection
IRJET- Deep Learning Techniques for Object DetectionIRJET Journal
 
Front End Data Cleaning And Transformation In Standard Printed Form Using Neu...
Front End Data Cleaning And Transformation In Standard Printed Form Using Neu...Front End Data Cleaning And Transformation In Standard Printed Form Using Neu...
Front End Data Cleaning And Transformation In Standard Printed Form Using Neu...ijcsa
 
Pattern recognition using context dependent memory model (cdmm) in multimodal...
Pattern recognition using context dependent memory model (cdmm) in multimodal...Pattern recognition using context dependent memory model (cdmm) in multimodal...
Pattern recognition using context dependent memory model (cdmm) in multimodal...ijfcstjournal
 
Test PDF
Test PDFTest PDF
Test PDFAlgnuD
 
USING THE MANDELBROT SET TO GENERATE PRIMARY POPULATIONS IN THE GENETIC ALGOR...
USING THE MANDELBROT SET TO GENERATE PRIMARY POPULATIONS IN THE GENETIC ALGOR...USING THE MANDELBROT SET TO GENERATE PRIMARY POPULATIONS IN THE GENETIC ALGOR...
USING THE MANDELBROT SET TO GENERATE PRIMARY POPULATIONS IN THE GENETIC ALGOR...csandit
 
05012013150050 computerised-paper-evaluation-using-neural-network
05012013150050 computerised-paper-evaluation-using-neural-network05012013150050 computerised-paper-evaluation-using-neural-network
05012013150050 computerised-paper-evaluation-using-neural-networknimmajji
 
Multimedia data mining using deep learning
Multimedia data mining using deep learningMultimedia data mining using deep learning
Multimedia data mining using deep learningPeter Wlodarczak
 
IRJET- Factoid Question and Answering System
IRJET-  	  Factoid Question and Answering SystemIRJET-  	  Factoid Question and Answering System
IRJET- Factoid Question and Answering SystemIRJET Journal
 
"An Introduction to Machine Learning and How to Teach Machines to See," a Pre...
"An Introduction to Machine Learning and How to Teach Machines to See," a Pre..."An Introduction to Machine Learning and How to Teach Machines to See," a Pre...
"An Introduction to Machine Learning and How to Teach Machines to See," a Pre...Edge AI and Vision Alliance
 
Intuidex - To be or not to be iid by William M. Pottenger (NYC Machine Learni...
Intuidex - To be or not to be iid by William M. Pottenger (NYC Machine Learni...Intuidex - To be or not to be iid by William M. Pottenger (NYC Machine Learni...
Intuidex - To be or not to be iid by William M. Pottenger (NYC Machine Learni...Hakka Labs
 
Predicting the future with social media
Predicting the future with social mediaPredicting the future with social media
Predicting the future with social mediaPeter Wlodarczak
 
Machine Learning: Machine Learning: Introduction Introduction
Machine Learning: Machine Learning: Introduction IntroductionMachine Learning: Machine Learning: Introduction Introduction
Machine Learning: Machine Learning: Introduction Introductionbutest
 
Sulthan's DBMS for_Computer_Science
Sulthan's DBMS for_Computer_ScienceSulthan's DBMS for_Computer_Science
Sulthan's DBMS for_Computer_ScienceSULTHAN BASHA
 

What's hot (20)

Proposed-curricula-MCSEwithSyllabus_24_...
Proposed-curricula-MCSEwithSyllabus_24_...Proposed-curricula-MCSEwithSyllabus_24_...
Proposed-curricula-MCSEwithSyllabus_24_...
 
Proposing a new method of image classification based on the AdaBoost deep bel...
Proposing a new method of image classification based on the AdaBoost deep bel...Proposing a new method of image classification based on the AdaBoost deep bel...
Proposing a new method of image classification based on the AdaBoost deep bel...
 
Hot Topics in Machine Learning For Research and thesis
Hot Topics in Machine Learning For Research and thesisHot Topics in Machine Learning For Research and thesis
Hot Topics in Machine Learning For Research and thesis
 
Handwriting identification using deep convolutional neural network method
Handwriting identification using deep convolutional neural network methodHandwriting identification using deep convolutional neural network method
Handwriting identification using deep convolutional neural network method
 
USING ONTOLOGIES TO OVERCOMING DRAWBACKS OF DATABASES AND VICE VERSA: A SURVEY
USING ONTOLOGIES TO OVERCOMING DRAWBACKS OF DATABASES AND VICE VERSA: A SURVEYUSING ONTOLOGIES TO OVERCOMING DRAWBACKS OF DATABASES AND VICE VERSA: A SURVEY
USING ONTOLOGIES TO OVERCOMING DRAWBACKS OF DATABASES AND VICE VERSA: A SURVEY
 
IRJET- Deep Learning Techniques for Object Detection
IRJET-  	  Deep Learning Techniques for Object DetectionIRJET-  	  Deep Learning Techniques for Object Detection
IRJET- Deep Learning Techniques for Object Detection
 
Front End Data Cleaning And Transformation In Standard Printed Form Using Neu...
Front End Data Cleaning And Transformation In Standard Printed Form Using Neu...Front End Data Cleaning And Transformation In Standard Printed Form Using Neu...
Front End Data Cleaning And Transformation In Standard Printed Form Using Neu...
 
Pattern recognition using context dependent memory model (cdmm) in multimodal...
Pattern recognition using context dependent memory model (cdmm) in multimodal...Pattern recognition using context dependent memory model (cdmm) in multimodal...
Pattern recognition using context dependent memory model (cdmm) in multimodal...
 
Test PDF
Test PDFTest PDF
Test PDF
 
Report
ReportReport
Report
 
USING THE MANDELBROT SET TO GENERATE PRIMARY POPULATIONS IN THE GENETIC ALGOR...
USING THE MANDELBROT SET TO GENERATE PRIMARY POPULATIONS IN THE GENETIC ALGOR...USING THE MANDELBROT SET TO GENERATE PRIMARY POPULATIONS IN THE GENETIC ALGOR...
USING THE MANDELBROT SET TO GENERATE PRIMARY POPULATIONS IN THE GENETIC ALGOR...
 
05012013150050 computerised-paper-evaluation-using-neural-network
05012013150050 computerised-paper-evaluation-using-neural-network05012013150050 computerised-paper-evaluation-using-neural-network
05012013150050 computerised-paper-evaluation-using-neural-network
 
Multimedia data mining using deep learning
Multimedia data mining using deep learningMultimedia data mining using deep learning
Multimedia data mining using deep learning
 
IRJET- Factoid Question and Answering System
IRJET-  	  Factoid Question and Answering SystemIRJET-  	  Factoid Question and Answering System
IRJET- Factoid Question and Answering System
 
"An Introduction to Machine Learning and How to Teach Machines to See," a Pre...
"An Introduction to Machine Learning and How to Teach Machines to See," a Pre..."An Introduction to Machine Learning and How to Teach Machines to See," a Pre...
"An Introduction to Machine Learning and How to Teach Machines to See," a Pre...
 
Intuidex - To be or not to be iid by William M. Pottenger (NYC Machine Learni...
Intuidex - To be or not to be iid by William M. Pottenger (NYC Machine Learni...Intuidex - To be or not to be iid by William M. Pottenger (NYC Machine Learni...
Intuidex - To be or not to be iid by William M. Pottenger (NYC Machine Learni...
 
Predicting the future with social media
Predicting the future with social mediaPredicting the future with social media
Predicting the future with social media
 
Machine Learning: Machine Learning: Introduction Introduction
Machine Learning: Machine Learning: Introduction IntroductionMachine Learning: Machine Learning: Introduction Introduction
Machine Learning: Machine Learning: Introduction Introduction
 
Sulthan's DBMS for_Computer_Science
Sulthan's DBMS for_Computer_ScienceSulthan's DBMS for_Computer_Science
Sulthan's DBMS for_Computer_Science
 
5th sem
5th sem5th sem
5th sem
 

Similar to Deep Learning Challenges of Complex Data Analysis

ANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCE
ANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCEANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCE
ANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCEijesajournal
 
ANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCE
ANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCEANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCE
ANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCEijesajournal
 
ANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCE
ANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCEANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCE
ANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCEijesajournal
 
Machine Learning On Big Data: Opportunities And Challenges- Future Research D...
Machine Learning On Big Data: Opportunities And Challenges- Future Research D...Machine Learning On Big Data: Opportunities And Challenges- Future Research D...
Machine Learning On Big Data: Opportunities And Challenges- Future Research D...PhD Assistance
 
The Survey of Data Mining Applications And Feature Scope
The Survey of Data Mining Applications  And Feature Scope The Survey of Data Mining Applications  And Feature Scope
The Survey of Data Mining Applications And Feature Scope IJCSEIT Journal
 
ML crash course
ML crash courseML crash course
ML crash coursemikaelhuss
 
Demystifying Ml, DL and AI
Demystifying Ml, DL and AIDemystifying Ml, DL and AI
Demystifying Ml, DL and AIGreg Werner
 
Current clustering techniques
Current clustering techniquesCurrent clustering techniques
Current clustering techniquesPoonam Kshirsagar
 
Unlocking the Potential of Artificial Intelligence_ Machine Learning in Pract...
Unlocking the Potential of Artificial Intelligence_ Machine Learning in Pract...Unlocking the Potential of Artificial Intelligence_ Machine Learning in Pract...
Unlocking the Potential of Artificial Intelligence_ Machine Learning in Pract...eswaralaldevadoss
 
Introduction of DSA||DATA STRUCTURE AND ALGORITHUM
Introduction of DSA||DATA STRUCTURE AND ALGORITHUMIntroduction of DSA||DATA STRUCTURE AND ALGORITHUM
Introduction of DSA||DATA STRUCTURE AND ALGORITHUMamjadrasoolbadrani
 
Dm sei-tutorial-v7
Dm sei-tutorial-v7Dm sei-tutorial-v7
Dm sei-tutorial-v7CS, NcState
 
Machine Learning Ch 1.ppt
Machine Learning Ch 1.pptMachine Learning Ch 1.ppt
Machine Learning Ch 1.pptARVIND SARDAR
 
Introduction to feature subset selection method
Introduction to feature subset selection methodIntroduction to feature subset selection method
Introduction to feature subset selection methodIJSRD
 
IRJET- Comparative Study of Efficacy of Big Data Analysis and Deep Learni...
IRJET-  	  Comparative Study of Efficacy of Big Data Analysis and Deep Learni...IRJET-  	  Comparative Study of Efficacy of Big Data Analysis and Deep Learni...
IRJET- Comparative Study of Efficacy of Big Data Analysis and Deep Learni...IRJET Journal
 
التنقيب في البيانات - Data Mining
التنقيب في البيانات -  Data Miningالتنقيب في البيانات -  Data Mining
التنقيب في البيانات - Data Miningnabil_alsharafi
 
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...Editor IJCATR
 
Machine Learning
Machine LearningMachine Learning
Machine Learningbutest
 

Similar to Deep Learning Challenges of Complex Data Analysis (20)

Fake News Detection using Deep Learning
Fake News Detection using Deep LearningFake News Detection using Deep Learning
Fake News Detection using Deep Learning
 
Machine learning
Machine learningMachine learning
Machine learning
 
ANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCE
ANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCEANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCE
ANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCE
 
ANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCE
ANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCEANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCE
ANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCE
 
ANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCE
ANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCEANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCE
ANALYSIS OF SYSTEM ON CHIP DESIGN USING ARTIFICIAL INTELLIGENCE
 
Machine learning
Machine learningMachine learning
Machine learning
 
Machine Learning On Big Data: Opportunities And Challenges- Future Research D...
Machine Learning On Big Data: Opportunities And Challenges- Future Research D...Machine Learning On Big Data: Opportunities And Challenges- Future Research D...
Machine Learning On Big Data: Opportunities And Challenges- Future Research D...
 
The Survey of Data Mining Applications And Feature Scope
The Survey of Data Mining Applications  And Feature Scope The Survey of Data Mining Applications  And Feature Scope
The Survey of Data Mining Applications And Feature Scope
 
ML crash course
ML crash courseML crash course
ML crash course
 
Demystifying Ml, DL and AI
Demystifying Ml, DL and AIDemystifying Ml, DL and AI
Demystifying Ml, DL and AI
 
Current clustering techniques
Current clustering techniquesCurrent clustering techniques
Current clustering techniques
 
Unlocking the Potential of Artificial Intelligence_ Machine Learning in Pract...
Unlocking the Potential of Artificial Intelligence_ Machine Learning in Pract...Unlocking the Potential of Artificial Intelligence_ Machine Learning in Pract...
Unlocking the Potential of Artificial Intelligence_ Machine Learning in Pract...
 
Introduction of DSA||DATA STRUCTURE AND ALGORITHUM
Introduction of DSA||DATA STRUCTURE AND ALGORITHUMIntroduction of DSA||DATA STRUCTURE AND ALGORITHUM
Introduction of DSA||DATA STRUCTURE AND ALGORITHUM
 
Dm sei-tutorial-v7
Dm sei-tutorial-v7Dm sei-tutorial-v7
Dm sei-tutorial-v7
 
Machine Learning Ch 1.ppt
Machine Learning Ch 1.pptMachine Learning Ch 1.ppt
Machine Learning Ch 1.ppt
 
Introduction to feature subset selection method
Introduction to feature subset selection methodIntroduction to feature subset selection method
Introduction to feature subset selection method
 
IRJET- Comparative Study of Efficacy of Big Data Analysis and Deep Learni...
IRJET-  	  Comparative Study of Efficacy of Big Data Analysis and Deep Learni...IRJET-  	  Comparative Study of Efficacy of Big Data Analysis and Deep Learni...
IRJET- Comparative Study of Efficacy of Big Data Analysis and Deep Learni...
 
التنقيب في البيانات - Data Mining
التنقيب في البيانات -  Data Miningالتنقيب في البيانات -  Data Mining
التنقيب في البيانات - Data Mining
 
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 

Recently uploaded

Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Pigging Solutions in Pet Food Manufacturing
Deep Learning Challenges of Complex Data Analysis

  • 1. Department of Information Technology. Seminar presentation on CHALLENGES AND OPPORTUNITIES OF ANALYSING COMPLEX DATA USING DEEP LEARNING. By: Maru Kindeneh. To: Alemu.K. tibeyinmaru@gmail.com
  • 2. Content: 1. Introduction 2. Background 2.1. Machine learning and computer aided analysis 2.2. Different types of data 2.3. Feature engineering – Creating structured data from unstructured data 2.4. Deep learning 2.5. Challenges and open problems in deep learning 3. Problem specification 3.1. Related work (literature review) 3.2. Delimitation 4. Method 5. Conclusion
  • 3. 1. Introduction The era of big data and data analysis is here. Data generation has growing exponentially and the world has started to go through a big data revolution Before the big data revolution, a lot of effort were put in designing different data collection schemes and surveys for data collection. One of the main challenges with this type of data , is that the data often are very heterogeneous and do not follow a predefined structure. The data can also be stored in different formats , have different quality and granularity and come from many different sources , factors making it more difficult to analyse the data. tibeyinmaru@gmail.com 3
  • 4. Continue… A research field that has shown promising results solving problems arising due to this type of unstructured data from multiple sources is deep learning. Deep learning is a new sub-field of machine learning that has revolutionized several fields such as image processing, speech recognition and natural language processing. How to initialize and train a deep learning model is often a non trivial problem, that requires expert knowledge.  How to select, configure and train deep learning models is by many still seen as a “black art”. tibeyinmaru@gmail.com
  • 5. 2. Background Machine learning is sub-field within artificial intelligence (AI) that is focused on how machines may learn and draw conclusions from data. In this research there is a special focus on complex data. This type of data are often hard to analyse with conventional ML methods. To solve this problem, the data is often first manually crafted into a new dataset with new features. This is called feature learning. In the last few years, a new sub-field of machine learning that can handle complex data without feature engineering has emerge. This sub-field is called Deep learning (DL). tibeyinmaru@gmail.com
  • 6. 2.1. Machine Learning And Computer Aided Analysis Machine learning is a sub-field of computer science which aims to make computers learn. By presenting a lot of training samples to a machine, it should be able to extrapolate knowledge from the observed data. The machine can then use the gained knowledge to draw conclusions about new examples and so support humans in decision processes.
  • 7. 2.1.1. Supervised Learning Supervised learning can be regarded as similar to the learning process of a teacher teaching a student. Examples include the classification of images, where a lot of images and their corresponding classes are presented to the computer. With this generalized knowledge the computer can later on correctly classify new images that were not presented during the training phase.
  • 8. 2.1.2. Unsupervised Learning In unsupervised learning, the computer tries to generalize the data and to learn its underlying patterns. This is useful when dividing the data into different clusters or trying to find samples with similar meanings.
  • 9. 2.1.3. Feature Learning In feature learning, machine learning algorithms are trained to create new representations of the data. Feature learning can be conducted in both a supervised and an unsupervised manner. The most common purpose for learning new representations of data is to reduce the number of dimensions. A second reason is to find a more interpretable representation of the data. The methods use models such as Restricted Boltzmann Machines (RBM), Deep Belief Networks (DBN) and autoencoders (AE).
  • 10. 2.1.4. Multimodal Learning In multimodal learning a model learns from data with multiple modalities. A classical approach to analysing multimodal data is multiple kernel learning, in which several kernels are combined. This approach is beneficial in at least two ways. First, it reduces the importance of the choice of kernel and hyperparameters, since kernels with better performance are given more significance when the kernels are combined. Second, different kernels may handle different formats of inputs.
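
The weighting idea behind multiple kernel learning can be sketched in a few lines. This is only an illustration under stated assumptions: the slides do not say how the kernel weights are obtained, so the sketch uses a simple kernel-target alignment heuristic rather than a full optimisation-based solver, and the toy samples, labels and function names are all hypothetical.

```python
import math

def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def gram(kernel, samples):
    """Kernel (Gram) matrix of a list of samples."""
    return [[kernel(a, b) for b in samples] for a in samples]

def alignment(K, labels):
    """Kernel-target alignment for labels in {-1, +1}."""
    n = len(labels)
    num = sum(K[i][j] * labels[i] * labels[j]
              for i in range(n) for j in range(n))
    norm_k = math.sqrt(sum(v * v for row in K for v in row))
    return num / (norm_k * n)  # the Frobenius norm of y y^T equals n

def combine(grams, labels):
    """Weight each kernel by its (non-negative) alignment and sum them."""
    scores = [max(alignment(K, labels), 0.0) for K in grams]
    total = sum(scores)
    if total == 0.0:  # assumption: fall back to uniform weights
        weights = [1.0 / len(grams)] * len(grams)
    else:
        weights = [s / total for s in scores]
    n = len(labels)
    combined = [[sum(w * K[i][j] for w, K in zip(weights, grams))
                 for j in range(n)] for i in range(n)]
    return weights, combined

# Hypothetical toy data: two well-separated classes.
samples = [[0.0, 0.1], [0.2, 0.0], [3.0, 3.1], [3.2, 2.9]]
labels = [-1, -1, 1, 1]
grams = [gram(linear_kernel, samples), gram(rbf_kernel, samples)]
weights, combined = combine(grams, labels)
print(weights)
```

On this toy data the RBF kernel matches the labels better than the linear kernel, so it receives the larger weight. In a multimodal setting each kernel could instead be computed on a different modality, for example one on text features and one on image features.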
  • 11. 2.2. Different Types Of Data To avoid any further confusion, the following definitions of different data types will be used: 1. Structured data: Structured data are data that are structured in a tabular form. Relational databases are often used to store structured data. 2. Unstructured data: All data that cannot be stored in tabular form, where each row is independent of all other rows, are unstructured. 3. Multi-levelled data: Multi-levelled data are data that are measured with different granularity.
  • 12. 2.2. Different Types Of Data (continued) 4. Multimodal data: Multimodal data are data concerning multiple and diverse modalities. For example, data where both text and images are stored for the same instance are multimodal. Another typical example of multimodal data is video streams, which contain both a sequence of images and audio. 5. Complex data: Complex data is the same as high dimensional data. The complexity of a dataset may be due either to the dataset “contain[ing] many rows as well as many attributes” or to the dataset containing “non-trivial interactions between attributes”.
  • 13. 2.3. Feature Engineering – Creating Structured Data From Unstructured Data Most machine learning algorithms require structured data, and thus they cannot be applied to unstructured data. To bypass this problem, the machine learning algorithm can instead be applied to a fixed set of features that are extracted from the raw data. This is called feature engineering. Manual feature engineering often misses complex higher-order dependencies between variables. Because of this, deep learning methods that are capable of automatic feature creation and selection are of great utility.
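
As a minimal sketch of what manual feature engineering looks like in practice, the snippet below turns raw text (unstructured) into a fixed set of hand-picked features (one structured row per document). The chosen features and the function name `extract_features` are illustrative assumptions, not taken from the slides.

```python
def extract_features(text):
    """Map a raw string to a fixed, hand-crafted feature row."""
    words = text.split()
    return {
        "n_chars": len(text),
        "n_words": len(words),
        "n_digits": sum(ch.isdigit() for ch in text),
        "mean_word_len": (sum(len(w) for w in words) / len(words)
                          if words else 0.0),
    }

# Hypothetical raw documents become rows of a table.
documents = ["Deep learning 2024", "Complex data are heterogeneous"]
table = [extract_features(doc) for doc in documents]
for row in table:
    print(row)
```

Every document now has the same four columns, so a conventional algorithm can be applied. Anything the hand-picked features do not capture, such as word order, is lost, which is the limitation the slide points to.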
  • 14. 2.4. Deep Learning Deep learning is a type of representation learning where the machine itself learns several internal representations from raw data in order to perform regression or classification. 2.4.1. Artificial Neural Networks Artificial neural networks have been around for some time. 2.4.2. Feed Forward Neural Networks A feed forward neural network is an artificial neural network where information only moves in one direction. 2.4.3. Convolutional Neural Networks Convolutional neural networks (CNNs) are mainly used in image analysis, although some authors have successfully used CNNs for natural language processing. They rely on the idea that basic features in a small area of an image can be analysed independently of their position and of the rest of the image.
  • 15. 2.4. Deep Learning (continued) 2.4.4. Recurrent Neural Networks Unlike feed forward networks, which are acyclic, recurrent neural networks (RNNs) contain cycles. One big problem with RNNs, which also occurs in deep feed forward networks, is that the gradients in backpropagation tend to go either to zero or to infinity. 2.4.5. Generative Adversarial Networks The idea behind a GAN is that two artificial neural networks compete. The generator tries to generate examples that follow the same distribution as the collected data. The discriminator tries to distinguish between examples that are generated by the generator and data that are sampled from the real data distribution.
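
The gradient problem mentioned for RNNs can be illustrated with a deliberately simplified model. The sketch assumes a scalar linear recurrence h_t = w * h_(t-1), which is not a realistic network; in that case the gradient of the last state with respect to the first is just w multiplied by itself once per time step. The function name is hypothetical.

```python
def gradient_through_time(w, steps):
    """Gradient of h_T with respect to h_0 for h_t = w * h_(t-1)."""
    grad = 1.0
    for _ in range(steps):
        grad *= w  # chain rule: one factor of w per time step
    return grad

for w in (0.5, 1.0, 1.5):
    print(w, gradient_through_time(w, steps=50))
```

With a recurrent weight below 1 in magnitude the gradient shrinks towards zero, and with a weight above 1 it grows very quickly. In a real RNN the scalar is replaced by products of Jacobian matrices, but the same multiplicative effect is the source of vanishing and exploding gradients.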
  • 16. 2.5. Challenges And Open Problems In Deep Learning One of the biggest problems is that the gradient-based learning used to train the networks is computationally demanding. The most obvious drawbacks are the loss of interpretability and the difficulty of training the networks. Another big challenge in the training of deep neural networks is the large number of hyperparameters. Even though deep learning methods are more flexible and easier to modify than many classical methods, there are still some limitations on how they can be applied to data that are very heterogeneous and complex.
  • 17. 3. Problem Specification Complex data is the same as high dimensional data that are heterogeneous. Much of the research today within the field of deep learning (and machine learning in general), are focused on developing new methods to analyse data. However, it is seldom reflected upon how the type, quality and complexity of the data affects the analysis. The different types of data that should be investigated are: Data that are structured as a graph. Sequential data where both long and short time dependencies exist. The initial focus in the study of data properties will be: The granularity of the data, thus how good the measures in the data are. Errors in the data and how they affect the result of an analysis. The number of interacting agents or parts in the system that has generated the data. Skewness of different classes in the data. tibeyinmaru@gmail.com 17
  • 18. 3. Problem Specification (continued) To conclude, the research questions of this work are: Q1: What are complex data and how can they be defined in terms of metrics? Q2: What properties of complex data are problematic for the current deep learning methods to handle, and what is the reason for this? Q3: How can the current deep learning methods be refined to handle these properties of the data, or do new methods have to be developed?
  • 19. 3.1. Literature review Toubiana et al defines complex data ,That complex data are data generated by complex interactions in the studied system. Haken gives definition fro system complexity “Systems which are composed of many parts, or elements, or components which may be of the same or of different kinds. The components or parts may be connected in a more or less complicated fashion.” Haken states that: “The data to be collected often seem to be quite inexhaustible. In addition it is often impossible to decide which aspect to choose a priori, and we must instead undergo a learning process in order to know how to cope with a complex system.” tibeyinmaru@gmail.com 19
  • 20. 3.2. Delimitations This work will be focused on the analysis of complex data, using deep learning methods, and will therefore only consider and analyse models for such data. tibeyinmaru@gmail.com 20
  • 21. 4. Method The first question posed in this research aims to finding a profound definition of complex data, for the use within the field of data analysis and deep learning. The first step of this research will be to conduct a literature study, to survey current definitions and opinions of complex data. Several case studies will then be conducted to validate, refine and consolidate the produced definition, as well as to further study the properties of complex data and thus also answer the second research question. When a sufficient amount of case studies have been conducted, a framework for generating data will be created. tibeyinmaru@gmail.com 21
  • 22. 5. Conclusion Unlike data analysis just some decades ago, the analysis today does not only comprise data that are stored in well organized tables. Instead the data are much more diverse and may, for example, consist of images or text. This implies the term complex data. However, there are no profound definition for complex data and this term is often used to highlight that an analysis of the data is non-trivial. A sub-field within machine learning that has shown promising results analyzing complex data in the last years is deep learning. Even though deep learning has been successful in many fields, there are still several open problems that need to be solved. tibeyinmaru@gmail.com 22