MACHINE LEARNING ON BIG
DATA: OPPORTUNITIES AND
CHALLENGES - FUTURE RESEARCH
DIRECTION FOR PHD SCHOLARS
An Academic presentation by
Dr. Nancy Agnes, Head, Technical Operations, Phdassistance Group
www.phdassistance.com
Email: info@phdassistance.com
TODAY'S DISCUSSION
Outline
In-brief
Introduction
Machine learning
Big data
Data preprocessing opportunities and challenges
Evaluation opportunities and challenges
Future research
Conclusion
In-Brief
Machine Learning (ML) is increasingly used in a wide variety of applications. It has risen to
prominence in recent years, owing in part to the emergence of big data. ML algorithms have
never been more promising than they are in the big data setting. Big data allows ML
algorithms to discover finer-grained patterns and make more timely and
precise predictions than ever before; however, it also poses significant challenges to
machine learning, such as model scalability and distributed computing.
INTRODUCTION
In fields such as computer vision, speech recognition,
natural language understanding, neuroscience, fitness,
and the Internet of Things, ML techniques have had
enormous societal impact.
The arrival of the big data era has stirred up broad interest
in machine learning. Big data has never offered ML
algorithms more promise, or more challenge, in gaining new
insights into a variety of business applications and human
behaviours.
On the one hand, big data provides ML algorithms with unparalleled amounts of data
from which to derive underlying patterns and create predictive models; on the other
hand, conventional ML algorithms must overcome critical challenges such as scalability
to fully unlock the value of big data.
As the world of big data keeps expanding, ML must develop and grow in order to turn
big data into actionable intelligence.
ML aims to answer the question of how to build a computer system that improves itself
with experience.
An ML problem is the problem of learning from experience with respect to certain tasks
and performance metrics.
ML techniques let users infer underlying structure and make predictions from
large datasets.
ML thrives on strong computational environments, efficient learning techniques
(algorithms), and rich and/or large data.
As a result, ML has great potential and is an essential part of big data analytics.
Fig. 1. A framework of machine learning on big data (MLBiD)
MACHINE LEARNING
Data pre-processing, learning, and evaluation are
the typical stages of machine learning.
Data pre-processing transforms raw
data into the "right form" for the subsequent learning steps.
Through data cleaning, extraction, transformation, and fusion,
the pre-processing phase turns raw data into a
form that can be used as input to learning.
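The four pre-processing operations named above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the two record sources, their field names, the mean-imputation rule, and the min-max rescaling are all hypothetical choices, not steps taken from the presentation.

```python
from statistics import mean

# Hypothetical raw records from two sources, keyed by a shared "id".
sensor_rows = [
    {"id": 1, "temp_c": "21.5"},
    {"id": 2, "temp_c": ""},        # missing reading
    {"id": 3, "temp_c": "23.5"},
]
profile_rows = [
    {"id": 1, "site": " north "},
    {"id": 2, "site": "South"},
    {"id": 3, "site": "NORTH"},
]

def clean(rows):
    """Cleaning: parse numbers and fill missing readings with the column mean."""
    fill = mean(float(r["temp_c"]) for r in rows if r["temp_c"])
    return [{"id": r["id"], "temp_c": float(r["temp_c"]) if r["temp_c"] else fill}
            for r in rows]

def extract(rows):
    """Extraction: derive a normalised categorical field from free text."""
    return [{"id": r["id"], "site": r["site"].strip().lower()} for r in rows]

def transform(rows):
    """Transformation: rescale temperature to the [0, 1] range (min-max)."""
    lo = min(r["temp_c"] for r in rows)
    hi = max(r["temp_c"] for r in rows)
    span = (hi - lo) or 1.0
    return [{**r, "temp_scaled": (r["temp_c"] - lo) / span} for r in rows]

def fuse(left, right):
    """Fusion: join the two sources on their shared id."""
    by_id = {r["id"]: r for r in right}
    return [{**row, **by_id[row["id"]]} for row in left if row["id"] in by_id]

features = fuse(transform(clean(sensor_rows)), extract(profile_rows))
```

Each function maps a list of records to a new list, so the stages compose in any order that the data allows; on real big data the same stages would typically run on a distributed platform rather than in memory.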
Using the pre-processed input data, the learning step selects learning algorithms and
tunes model parameters to produce the desired outputs.
Some learning methods, especially representation learning, can also perform
data pre-processing.
The trained models are then evaluated to measure how well they perform.
Machine learning can be characterised by the nature of the learning feedback, the goal
of the learning task, and the timing of data availability.
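The learning and evaluation steps described above can be sketched as follows. The sketch assumes a toy one-dimensional k-nearest-neighbour classifier and hypothetical train, validation, and test splits; the presentation does not prescribe any particular algorithm or tuning procedure.

```python
from collections import Counter

def knn_predict(train, x, k):
    """Predict the majority label among the k training points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, data, k):
    """Fraction of (x, y) pairs in data that the k-NN model labels correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in data)
    return hits / len(data)

# Hypothetical pre-processed (feature, label) pairs; (0.25, "b") is label noise.
train = [(0.1, "a"), (0.2, "a"), (0.25, "b"), (0.3, "a"), (0.4, "a"),
         (0.6, "b"), (0.7, "b"), (0.8, "b"), (0.9, "b")]
valid = [(0.24, "a"), (0.35, "a"), (0.65, "b"), (0.85, "b")]
test = [(0.05, "a"), (0.27, "a"), (0.55, "b"), (0.95, "b")]

# Learning: tune the hyperparameter k on the validation split.
best_k = max((1, 3, 5), key=lambda k: accuracy(train, valid, k))

# Evaluation: measure accuracy on data the tuning step never saw.
test_accuracy = accuracy(train, test, best_k)
```

Keeping a separate test split means the reported accuracy is not inflated by the choice of k, which was made on the validation data only.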
Based on the nature of the feedback available to a learning system, ML can be divided
into three major categories: supervised learning, unsupervised learning, and
reinforcement learning.
Depending on whether the learning goal is to learn particular tasks using input features
or to learn the features themselves, ML can be divided into two types: representation
learning and task learning.
A given ML algorithm can therefore be classified along more than one of these dimensions.
Fig. 2. A multi-dimensional taxonomy of machine learning
BIG DATA
Volume, velocity, variety, veracity, and value are the five
dimensions of big data.
Starting from the bottom, we organised the five dimensions into
a stack of big, data, and value layers.
The data layer is central to big data, and the value layer
characterises the impact of big data on real-world applications.
The lower layer is more reliant on technical advancements, while the higher layer is
more focused on applications that leverage big data's strategic strength.
Established machine learning paradigms and algorithms must be adapted to
realise the potential of big data analytics and to process big data efficiently.
This section identifies the key opportunities and challenges.
We go through them individually for each of the three phases of machine learning:
preprocessing, learning, and evaluation.
Fig. 3. Big data stack
DATA PREPROCESSING OPPORTUNITIES AND CHALLENGES
DATA REDUNDANCY
Duplication occurs when two or more data samples represent the
same object.
Data duplication or inconsistency can have a
significant impact on machine learning.
Although a variety of techniques for detecting
duplicates have been developed over the last 20 years,
traditional methods such as pairwise similarity
comparison are no longer feasible for big data.
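To illustrate why all-pairs comparison stops scaling, the sketch below groups records by a cheap blocking key and compares only records that share a key. Blocking is a swapped-in, commonly used technique, not one named in the presentation; the records, the normalisation rule, and the three-character key are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records; 0 and 2 describe the same person, as do 1 and 4.
records = [
    "Ann Lee, 12 Oak St",
    "Bob Ray, 9 Elm Rd",
    "ann  lee, 12 oak st.",
    "Cy Moss, 3 Pine Ave",
    "BOB RAY, 9 Elm Rd",
]

def normalise(text):
    """Lower-case, drop punctuation, and collapse whitespace."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())

def blocking_key(text):
    """Hypothetical blocking key: first three characters of the normalised record."""
    return normalise(text)[:3]

def find_duplicates(rows):
    """Return duplicate index pairs and the number of comparisons performed."""
    blocks = defaultdict(list)
    for index, row in enumerate(rows):
        blocks[blocking_key(row)].append(index)
    pairs, compared = [], 0
    for members in blocks.values():
        # Pairwise comparison happens only inside each block.
        for i, j in combinations(members, 2):
            compared += 1
            if normalise(rows[i]) == normalise(rows[j]):
                pairs.append((i, j))
    return pairs, compared

duplicates, comparisons = find_duplicates(records)
all_pairs = len(records) * (len(records) - 1) // 2  # cost of the naive approach
```

The naive approach grows quadratically with the number of records, while the blocked version only pays for pairs inside each block. The trade-off is that duplicates whose keys differ are missed, so the choice of key matters.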
Furthermore, the conventional assumption that duplicated pairs are rarer than
non-duplicated pairs no longer holds.
In this regard, Dynamic Time Warping can be much faster than current Euclidean distance
algorithms.
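For readers unfamiliar with Dynamic Time Warping (DTW), the following is a minimal sketch of the textbook dynamic-programming formulation with an absolute-difference cost. It is the plain quadratic-time version, not the accelerated variant that the speed claim above refers to, and the example series are hypothetical.

```python
def dtw_distance(a, b):
    """Textbook O(len(a) * len(b)) DTW with absolute-difference step cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # repeat b[j-1]
                                    cost[i][j - 1],      # repeat a[i-1]
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]

series = [0, 1, 2, 1, 0]
delayed = [0, 0, 1, 2, 1, 0]   # same shape, shifted by one step
other = [2, 2, 2, 2, 2]

same_shape = dtw_distance(series, delayed)
different = dtw_distance(series, other)
```

Because DTW may stretch either series along the time axis, the shifted copy aligns with no cost, whereas a point-by-point Euclidean comparison is not even defined for series of different lengths.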
DATA HETEROGENEITY
Big data brings together multi-view data from a variety of repositories, in a
variety of formats, and from a variety of population samples, and is therefore highly
heterogeneous.
The value of these multi-view heterogeneous data to a learning task varies from view
to view. As a result, combining all of the features and treating them as equally
relevant is unlikely to result in optimal learning outcomes.
Big data offers the possibility of learning from different views simultaneously and
then combining the multiple results by learning the relevance of each feature view to
the task.
Such an approach is expected to be robust to data outliers and able to address
optimization and convergence problems.
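One simple way to learn the relevance of each view is to train a separate model per view and weight its vote by how well it does on held-out data. The sketch below assumes a toy nearest-centroid learner, two hypothetical views named "a" and "b", and validation accuracy as the relevance score; none of these choices come from the presentation.

```python
from collections import defaultdict

# Hypothetical two-view dataset: one feature per view plus a label.
# View "a" separates the classes well; view "b" is mostly noise.
train = [
    ({"a": 0.1, "b": 0.5}, 0), ({"a": 0.2, "b": 0.9}, 0), ({"a": 0.3, "b": 0.1}, 0),
    ({"a": 0.7, "b": 0.4}, 1), ({"a": 0.8, "b": 0.8}, 1), ({"a": 0.9, "b": 0.2}, 1),
]
valid = [
    ({"a": 0.15, "b": 0.7}, 0), ({"a": 0.25, "b": 0.3}, 0),
    ({"a": 0.75, "b": 0.6}, 1), ({"a": 0.85, "b": 0.3}, 1),
]

def fit_centroids(data, view):
    """Per-view learner: the mean feature value of each class."""
    sums, counts = defaultdict(float), defaultdict(int)
    for features, label in data:
        sums[label] += features[view]
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}

def predict_view(centroids, value):
    """Label of the class centroid closest to the given feature value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

models = {view: fit_centroids(train, view) for view in ("a", "b")}

# Learn view relevance as validation accuracy, normalised to sum to 1.
raw = {
    view: sum(predict_view(models[view], f[view]) == y for f, y in valid) / len(valid)
    for view in models
}
total = sum(raw.values())
weights = {view: score / total for view, score in raw.items()}

def predict(features):
    """Weighted vote over the per-view predictions."""
    votes = defaultdict(float)
    for view, centroids in models.items():
        votes[predict_view(centroids, features[view])] += weights[view]
    return max(votes, key=votes.get)
```

With these weights the informative view dominates the vote, so a noisy view can disagree without flipping the final prediction.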
DATA DISCRETIZATION
Most current discretization methods, however, would be ineffective when
dealing with large amounts of data.
To address big data problems, traditional discretization approaches have been
parallelized on big data platforms: a distributed variant of the entropy minimization
discretizer based on the Minimum Description Length Principle improves both
efficiency and accuracy.
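The core idea of entropy-minimization discretization can be sketched on a single feature: choose the cut point that leaves the two resulting bins as class-pure as possible. This is a deliberately simplified, single-machine sketch; the recursive splitting and the Minimum Description Length stopping rule of the full method, as well as the distributed variant mentioned above, are not shown, and the example data are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def best_cut(values, labels):
    """Return (cut, score): the boundary minimising the weighted entropy of both bins."""
    pairs = sorted(zip(values, labels))
    best = (float("inf"), None)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary can fall between equal values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2  # midpoint between neighbours
        best = min(best, (score, cut))
    return best[1], best[0]

# Hypothetical numeric feature and class label.
ages = [18, 22, 25, 31, 40, 52, 60]
bought = ["no", "no", "no", "yes", "yes", "yes", "yes"]
cut, score = best_cut(ages, bought)
```

Here the chosen boundary falls between 25 and 31, where the labels change, so both bins are pure. Every candidate boundary can be scored independently of the others, which is one reason the method lends itself to parallel evaluation.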
DATA LABELLING
Active learning can be used as an optimization technique for labelling tasks
in crowd-sourced databases, reducing the number of questions posed to the
crowd and enabling crowd-sourced applications to scale.
Designing active learning algorithms for a crowd-sourced dataset, on the other
hand, presents a number of practical challenges, including generality, scalability,
and usability.
Another problem is that such a dataset cannot cover all user-specific contexts,
so the resulting performance is often inferior to that of user-centric training.
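A common active learning strategy, uncertainty sampling, shows how the number of questions sent to the crowd can be reduced: only the items the current model is least sure about are selected for labelling. The sketch below assumes a binary task, hypothetical item names and model scores, and a fixed labelling budget; the presentation does not specify which query strategy is used.

```python
def uncertainty(prob_positive):
    """1.0 when the model is undecided (p = 0.5), 0.0 when it is fully confident."""
    return 1.0 - abs(prob_positive - 0.5) * 2.0

def select_queries(pool, predict_proba, budget):
    """Pick the `budget` unlabelled items whose predictions are least certain."""
    ranked = sorted(pool, key=lambda item: uncertainty(predict_proba(item)), reverse=True)
    return ranked[:budget]

# Hypothetical model scores for unlabelled items (probability of the positive class).
scores = {"img_01": 0.97, "img_02": 0.52, "img_03": 0.10, "img_04": 0.45, "img_05": 0.80}

# Only these items are sent to the crowd for labelling.
to_crowd = select_queries(list(scores), scores.get, budget=2)
```

In a full loop the newly labelled items would be added to the training set, the model retrained, and the selection repeated until the budget is spent.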
IMBALANCED DATA
Traditional stratified random sampling approaches have been used to tackle the problem of
imbalanced data.
However, when repeated iterations of sub-sample generation and error-metric measurement
are needed, the process can take a long time.
Furthermore, conventional sampling methods cannot efficiently support data sampling
over a user-specified subset of the data, including value-based sampling.
Big data therefore calls for parallel data sampling.
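Stratified sampling parallelises naturally because each stratum can be sampled independently. The sketch below assumes a hypothetical imbalanced dataset and uses a thread pool purely to show the structure; in practice a process pool or a cluster framework would be the more likely choice, which is an assumption rather than something the presentation states.

```python
import random
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def sample_stratum(job):
    """Draw a fixed fraction from one stratum, using its own seeded generator."""
    label, rows, fraction, seed = job
    size = max(1, round(len(rows) * fraction))
    return label, random.Random(seed).sample(rows, size)

def stratified_sample(rows, label_of, fraction, seed=0):
    """Sample the same fraction from every class, one job per stratum."""
    strata = defaultdict(list)
    for row in rows:
        strata[label_of(row)].append(row)
    jobs = [(label, members, fraction, seed + i)
            for i, (label, members) in enumerate(sorted(strata.items()))]
    # Strata are independent, so the jobs can run concurrently.
    with ThreadPoolExecutor() as pool:
        return dict(pool.map(sample_stratum, jobs))

# Hypothetical imbalanced dataset: 90 negatives and 10 positives.
data = [(i, "neg") for i in range(90)] + [(i, "pos") for i in range(90, 100)]
sample = stratified_sample(data, label_of=lambda row: row[1], fraction=0.2)
```

Giving each stratum its own seed keeps the result reproducible regardless of the order in which the workers finish, and the minority class is guaranteed at least one sampled row.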
FUTURE RESEARCH
This paper summarises the opportunities and
challenges of machine learning on big data.
Big data creates new opportunities for inspiring transformative
and novel ML technologies that solve many associated
technical problems and generate real-world impact,
while also posing multiple challenges for conventional ML in
terms of scalability, adaptability, and usability.
These opportunities and challenges can be used to evaluate current research in
this field.
Following the components of the MLBiD framework, we also highlight some open
research issues in ML on big data, as shown in the table.
CONCLUSION
In conclusion, machine learning is needed to address the
challenges posed by big data and to discover hidden patterns,
information, and insights in big data, so that its
potential can be turned into real value for business decision-making and
scientific exploration.
The combination of machine learning and big data points to a
promising future at a new frontier.
Contact Us
UNITED KINGDOM
+44-1143520021
INDIA
+91-4448137070
EMAIL
info@phdassistance.com

Machine Learning On Big Data: Opportunities And Challenges- Future Research Direction For Phd Scholars - Phdassistance

  • 1. MACHINE LEARNING ON BIG DATA: OPPORTUNITIES AND CHALLENGES - FUTURE RESEARCH DIRECTION FOR PHD SCHOLARS An Academic presentation by Dr. Nancy Agnes, Head, Technical Operations, Phdassistance Group www.phdassistance.com Email: info@phdassistance.com
  • 2. In-brief Introduction Machine learning Big data Data preprocessing opportunities and challenges Evaluation opportunities and challenges Future research Conclusion Outline TODAY'SDISCUSSION
  • 3. Machine Learning (ML) is rapidly used in a variety of applications. It has risen to prominence in recent years, owing in part to the emergence of big data. When it comes to big data, ML algorithms have never been more promising. Big data allows machine learning algorithms to discover finer-grained patterns and make more timely and precise predictions than ever before; however, it also poses significant challenges to machine learning, such as model scalability and distributed computing. In-Brief
  • 4. In various fields as computer vision, speech recognition, natural language comprehension, neuroscience, fitness, and the Internet of Things, ML techniques have had enormous societal impacts. The emergence of the era of big data has stirred up interest in Machine Learning Big Data has never promised or questioned machine learning algorithms to gain new insights into a variety of business applications and human behaviours. Contd... INTRODUCTION
  • 5. On the one hand, big data provides ML algorithms with unparalleled amounts of data from which to derive underlying patterns and create predictive models; on the other hand, conventional ML algorithms face crucial challenges such as scalability in order to fully unlock the value of big data. With the ever-expanding world of big data, ML must develop and grow in order to turn big data into actionable intelligence. Contd...
  • 6. ML aims to answer the question of how to build a computer system that improves itself over time. The problem of learning from experience with respect to certain tasks and performance metrics is referred to as an ML problem. Users may use ML techniques to deduce underlying structure and make predictions from large datasets. Contd...
  • 7. ML thrives on strong computational environments, efficient learning techniques (algorithms), and rich and/or large data. As a result, ML has a lot of potential and is an essential part of big data analytics
  • 8. Fig. 1. A framework of machine learning on big data (MLBiD)
  • 9. MACHINE LEARNING: Data pre-processing, learning, and evaluation are the common stages of machine learning. Data pre-processing transforms raw data into the "right form" for the subsequent learning steps: through data cleaning, extraction, transformation, and fusion, it turns raw data into a form that can be used as input to learning.
  • 10. Using the pre-processed input data, the learning step selects learning algorithms and tunes model parameters to produce the desired outputs. Some learning methods, especially representational learning, can also be used for data pre-processing. The trained models are then evaluated to determine how well they perform. Machine learning can be characterized along three dimensions: the nature of the learning input, the goal of the learning task, and the timing of data availability.
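The three stages described above (pre-processing, learning, evaluation) can be illustrated with a very small sketch. It is not taken from the presentation: the function names `preprocess`, `learn`, and `evaluate`, the toy data, and the choice of a least-squares line as the "model" are all assumptions made for illustration.

```python
from statistics import mean

def preprocess(raw_rows):
    """Pre-processing: drop rows with missing values and convert strings to floats."""
    cleaned = []
    for x, y in raw_rows:
        if x is None or y is None:
            continue  # data cleaning: skip incomplete records
        cleaned.append((float(x), float(y)))
    return cleaned

def learn(rows):
    """Learning: fit y = a*x + b by ordinary least squares."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in rows)
    a = sxy / sxx
    b = y_bar - a * x_bar
    return a, b

def evaluate(model, rows):
    """Evaluation: mean squared error of the fitted line on the given rows."""
    a, b = model
    return mean((y - (a * x + b)) ** 2 for x, y in rows)

# Hypothetical raw data: string-typed values with one incomplete record.
raw = [("1", "3"), ("2", "5"), (None, "7"), ("3", "7"), ("4", "9")]
train = preprocess(raw)
model = learn(train)
mse = evaluate(model, train)  # evaluated on the training rows only, for brevity
```

In practice the evaluation step would use held-out data rather than the training rows; the sketch keeps a single dataset only to stay short.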
  • 11. Based on the nature of the input available to a learning system, ML can be divided into three major categories: supervised learning, unsupervised learning, and reinforcement learning. Based on whether the goal is to learn particular tasks from input features or to learn the features themselves, ML can be divided into two types: task learning and representational learning. Any given machine learning algorithm can therefore be classified along several dimensions.
  • 12. Fig. 2. A multi-dimensional taxonomy of machine learning
  • 13. BIG DATA: Volume, velocity, variety, veracity, and value are the five dimensions of big data. We organise the five dimensions into a stack of big, data, and value layers, starting from the bottom. The data layer is core to big data, and the value layer characterises the impact of big data on real-world applications.
  • 14. The lower layers are more reliant on technical advancements, while the higher layers are more focused on applications that leverage big data's strategic strength. Established machine learning paradigms and algorithms must be adapted to realise the potential of big data analytics and to process big data efficiently. This section identifies key opportunities and challenges, discussed separately for each of the three phases of machine learning: preprocessing, learning, and evaluation.
  • 15. Fig. 3. Big data stack
  • 16. DATA PREPROCESSING OPPORTUNITIES AND CHALLENGES. DATA REDUNDANCY: Duplication occurs when two or more data samples represent the same object. Data duplication or inconsistency can have a significant impact on machine learning. Although a variety of duplicate-detection techniques have been developed over the last 20 years, traditional methods such as pairwise similarity comparison are no longer feasible for big data.
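To see why pairwise comparison stops being feasible, note that n records yield n(n-1)/2 candidate pairs. The sketch below contrasts that count with a single-pass grouping on a normalised key. The grouping approach is a stand-in chosen for illustration, not a method named in the presentation, and it only catches records that become identical after normalisation; `normalize` and the sample records are hypothetical.

```python
from collections import defaultdict

def pair_count(n):
    """Number of unordered pairs a full pairwise comparison must inspect."""
    return n * (n - 1) // 2

def normalize(record):
    """Hypothetical normalisation key: lowercase and collapse whitespace."""
    return " ".join(record.lower().split())

def find_duplicates(records):
    """Group record indices by normalised key in one pass over the data."""
    groups = defaultdict(list)
    for idx, rec in enumerate(records):
        groups[normalize(rec)].append(idx)
    # Keep only keys shared by more than one record.
    return [idxs for idxs in groups.values() if len(idxs) > 1]

records = ["Alice Smith", "alice  smith", "Bob Jones", "ALICE SMITH", "bob jones "]
duplicates = find_duplicates(records)
pairs_for_a_million = pair_count(1_000_000)
```

For one million records the pairwise approach would inspect 499,999,500,000 pairs, whereas the grouping pass touches each record once.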
  • 18. Furthermore, the conventional assumption that duplicated pairs are rarer than non-duplicated pairs no longer holds. In this regard, Dynamic Time Warping can be much faster than current Euclidean distance algorithms. DATA HETEROGENEITY: Big data promises to offer multi-view data drawn from a variety of repositories, in a variety of formats, and from a variety of population samples, and is therefore highly heterogeneous.
  • 19. The value of these multi-view heterogeneous data to a learning task varies from view to view. As a result, combining all of the features and treating them as equally relevant is unlikely to produce optimal learning outcomes. Big data offers the possibility of learning from different views simultaneously and then combining the multiple results by learning the relevance of each feature view to the task. Such an approach is expected to be robust to data outliers and able to address optimization and convergence problems.
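One simple way to picture "learning the relevance of feature views" is accuracy-weighted late fusion: each view gets its own model, and the per-view outputs are combined with weights derived from how well each view performed on validation data. This is a stand-in chosen for illustration, not the method the presentation refers to; the view names, accuracies, and scores below are made up.

```python
def view_weights(val_accuracy):
    """Turn per-view validation accuracies into weights that sum to 1."""
    total = sum(val_accuracy.values())
    return {view: acc / total for view, acc in val_accuracy.items()}

def combine(scores, weights):
    """Weighted average of the per-view scores for one sample."""
    return sum(weights[view] * score for view, score in scores.items())

# Hypothetical validation accuracy of one model per view.
val_accuracy = {"text": 0.9, "image": 0.6, "sensor": 0.5}
weights = view_weights(val_accuracy)

# Hypothetical per-view scores for a single sample.
fused_score = combine({"text": 0.8, "image": 0.4, "sensor": 0.3}, weights)
```

The more reliable view dominates the fused score instead of all views being treated as equally relevant.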
  • 20. DATA DISCRETIZATION: Most current discretization methods would be ineffective when dealing with large amounts of data. To address big data problems, traditional discretization approaches have been parallelized on big data platforms; a distributed variant of the entropy minimization discretizer, based on the Minimum Description Length Principle, improves both efficiency and accuracy.
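The core step of an entropy minimization discretizer is choosing the cut point that minimises the class entropy of the two resulting intervals. The sketch below shows only that single step on one attribute; it deliberately omits the Minimum Description Length stopping criterion, the recursion into sub-intervals, and any distribution across machines. The function names and toy data are assumptions.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def best_cut(values, labels):
    """Return (cut_point, weighted_entropy) for the best single binary split."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_entropy, best_point = float("inf"), None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between identical attribute values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if weighted < best_entropy:
            best_entropy = weighted
            best_point = (pairs[i - 1][0] + pairs[i][0]) / 2  # midpoint cut
    return best_point, best_entropy

values = [1, 2, 3, 10, 11, 12]
labels = ["a", "a", "a", "b", "b", "b"]
cut, cut_entropy = best_cut(values, labels)
```

A full discretizer would apply this step recursively to each interval and stop when the MDL criterion rejects a further split.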
  • 21. DATA LABELLING: Active learning can be used as an optimization technique for labelling tasks in crowd-sourced databases, reducing the number of questions posed to the crowd and enabling crowd-sourced applications to scale. Designing active learning algorithms for a crowd-sourced dataset, however, presents a number of practical challenges, including generality, scalability, and usability. Another problem is that such a dataset cannot cover all user-specific contexts, so the resulting performance is often inferior to that of user-centric training.
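A common active learning strategy is uncertainty sampling: ask the crowd to label only the items the current model is least sure about. The sketch below assumes a binary classifier that outputs a positive-class probability per unlabelled item; `select_queries`, the probabilities, and the budget are hypothetical, and the presentation does not say which query strategy is used.

```python
def select_queries(probabilities, budget):
    """Return indices of the `budget` items whose probability is closest to 0.5."""
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),  # small distance = high uncertainty
    )
    return ranked[:budget]

# Hypothetical model outputs for five unlabelled items.
probs = [0.98, 0.52, 0.10, 0.45, 0.70]
to_label = select_queries(probs, budget=2)
```

Only the two most ambiguous items are sent to the crowd, which is how the number of questions is kept small.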
  • 22. IMBALANCED DATA: Traditional stratified random sampling approaches have been used to tackle the problem of imbalanced data. However, when repeated iterations of sub-sample generation and error-metric measurement are needed, the process can take a long time. Furthermore, conventional sampling methods cannot efficiently support sampling over a user-specified subset of the data, such as value-based sampling. Big data therefore calls for parallel data sampling.
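Stratified random sampling can be sketched as drawing a fixed number of rows from each class. The example below is a single-machine illustration with made-up data; `stratified_sample` and its parameters are assumptions. Because each stratum is sampled independently of the others, the per-class step is a natural candidate for running in parallel, although this sketch does not do so.

```python
import random
from collections import defaultdict

def stratified_sample(rows, label_of, per_class, seed=0):
    """Draw up to `per_class` rows at random from each class (stratum)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    strata = defaultdict(list)
    for row in rows:
        strata[label_of(row)].append(row)
    sample = []
    for members in strata.values():
        k = min(per_class, len(members))  # small classes contribute all their rows
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical imbalanced dataset: 95 majority rows, 5 minority rows.
rows = [("x%d" % i, "majority") for i in range(95)]
rows += [("y%d" % i, "minority") for i in range(5)]
balanced = stratified_sample(rows, label_of=lambda r: r[1], per_class=5)
```

The result contains five rows of each class, so the minority class is no longer swamped by the majority class.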
  • 24. FUTURE RESEARCH: This paper provides a summary of the opportunities and challenges of machine learning on big data. Big data opens new possibilities for inspiring revolutionary and novel ML technologies that solve many associated technological problems and generate real-world impact, while also posing multiple challenges for conventional ML in terms of scalability, adaptability, and usability.
  • 25. These opportunities and challenges can be used to evaluate current research in this field. We also highlight some open research issues in ML on big data, organised by the components of the MLBiD framework, as shown in Table.
  • 26. CONCLUSION: Machine learning is needed to address the challenges posed by big data and to discover hidden patterns, information, and insights in it, so that its potential can be turned into real value for business decision-making and scientific exploration. The combination of machine learning and big data points to a bright future on a new frontier.