Natural language processing and
machine learning
Nikola Milosevic
What is AI?
• Intelligence exhibited by a machine
• A flexible agent that interacts with its environment and
performs actions to maximize its chance of success at a certain goal
Popular AI
What is machine learning
• Subfield of computer science that explores
how machines can learn to perform a certain
task without being explicitly programmed
Data mining generally
Types of machine learning
• Supervised learning
• Semi-supervised learning
• Unsupervised learning
• Reinforcement learning
Machine learning problems
• Classification
• Clustering
• Regression
Testing the model
• Iteratively improve the model
• Test multiple algorithms – find the best one
• No free lunch theorem
• Feedback loop for feature selection
• Confusion matrix
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
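The accuracy formula can be sketched in a few lines of plain Python. This is a minimal illustration for a binary classifier, not a replacement for a library implementation; the helper names `confusion_counts` and `accuracy` and the toy labels are made up for this example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, FN for a binary problem (hypothetical helper)."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if truth == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that were correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


# Toy labels, invented for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
print(counts, accuracy(*counts))
```

Accuracy alone can be misleading when the classes are imbalanced, which is one reason to look at the whole confusion matrix.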
Examples of ML frameworks and
algorithms
• scikit-learn
– Python library
– Implementation of the most useful algorithms
– Naïve Bayes, SVM, Random forests, decision
trees…
• Keras
– Python library covering most things
related to neural networks
Text data
• About 80% of the data in organizations is in text
format
• Harder to analyse than structured data
• Huge amount of textual documents
– In biomedicine alone, 2,200 scientific papers are
published every day
• Growing exponentially
Main goals of text mining
• Make communication easier (e.g. translation)
• Automate some processes (e.g.
communication agents/chatbots)
• Do data mining on textual and unstructured
data
Process overview
Challenges
• A man saw a woman with a telescope.
– Who has the telescope?
• Multiple senses, synonyms,
homonyms, irony
• Grammar and context can help
• Acronyms
Approaches
• Rule based
– Human defined rules to extract information
– Needs expert humans who know how people express
certain things
– Is quite laborious
• Machine learning based
– The machine tries to learn what to extract, guided by
a human
– Needs annotated corpora (usually fairly large)
• This is expensive to create and quite laborious
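A rule-based extractor can be as small as one hand-written pattern. The sketch below is a hypothetical example, assuming the task is to pull dose mentions such as "5 mg" out of a sentence; the pattern, the unit list and the function name `extract_doses` are illustrative choices, not something taken from the slides.

```python
import re

# Hand-written rule: a number, optional whitespace, then one of a few units.
# The unit list is an assumption made for this example.
DOSE_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*(mg|g|ml)\b", re.IGNORECASE)


def extract_doses(text):
    """Return (amount, unit) pairs found by the rule."""
    return [(float(amount), unit.lower())
            for amount, unit in DOSE_PATTERN.findall(text)]


sentence = "Patients received 5 mg of drug A and 2.5 ml of saline."
print(extract_doses(sentence))
```

The rule only covers the phrasings its author anticipated ("five milligrams" would be missed), which is the laborious part of the rule-based approach; a machine learning approach would instead need annotated examples of such mentions.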
Levels of analysis
• Lexical
– Analysis of words
• Syntactic
– Analysis of organization of words
(phrases, sentences)
• Semantic
– Analysis of meaning
• Sometimes pragmatic
– Analysis of the pragmatics of using certain words or
phrases. Why did the author use them?
Steps
Lexical processing
• Part of speech tagging
• Parsing
– Constituency
– Dependency
Stanford parser
Semantic processing
• Text classification
– Sentiment analysis (positive/negative)
– Classification by topics (politics/sport/business)
– Authorship detection (Tolkien, Rowling, Shakespeare)
• Named entity recognition
• Topic modelling (unsupervised)
• Search
Sequence modelling
• Machine learning technique useful for named
entity recognition
• Conditional random fields (CRF) or recurrent
neural networks (often LSTM)
Feature engineering
• Selecting important features that help extract
information
• Can be:
– Words, PoS, word shapes, vocabulary features,
etc.
– May depend on task and methodology
– Iterative process of selecting and improving the
performance
– Some features may confuse the algorithm
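As a rough illustration of the feature types listed above, the following sketch builds a small feature dictionary for one token. The function names `word_shape` and `token_features`, the shape alphabet (`X`, `x`, `d`) and the chosen context window are assumptions for this example; real systems pick features per task.

```python
def word_shape(token):
    """Map characters to a coarse shape: X upper, x lower, d digit."""
    shape = []
    for ch in token:
        if ch.isupper():
            shape.append("X")
        elif ch.islower():
            shape.append("x")
        elif ch.isdigit():
            shape.append("d")
        else:
            shape.append(ch)  # keep punctuation as-is
    return "".join(shape)


def token_features(tokens, i):
    """Features for tokens[i], including its immediate neighbours."""
    token = tokens[i]
    return {
        "word": token.lower(),
        "shape": word_shape(token),
        "is_title": token.istitle(),
        "prev_word": tokens[i - 1].lower() if i > 0 else "<START>",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "<END>",
    }


tokens = ["Tolkien", "wrote", "books"]
print(token_features(tokens, 0))
print(word_shape("BRCA1"))
```

Dictionaries of this kind are a common input format for sequence models such as CRFs; part-of-speech tags or vocabulary lookups would be added as further keys.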
Search
• Finds documents that are the most relevant
for a given user query
• Usual techniques include TF-IDF term weighting
and cosine similarity
• May additionally use links pointing to the text,
positions of matched words and similar signals
to rank the documents found
• Apache Lucene, Solr (Java), there are also
Python libraries
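The ranking idea behind TF-IDF and cosine similarity can be sketched without any library. This is a simplified illustration, assuming whitespace tokenization and one particular IDF variant (`log(N / df) + 1`); search engines such as Lucene use more refined scoring, and the function names and sample documents here are invented.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer: lowercase and split on whitespace (an assumption).
    return text.lower().split()


def tf_idf_vectors(docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {term: math.log(n_docs / count) + 1.0 for term, count in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (c / len(toks)) * idf[t] for t, c in tf.items()})
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, docs):
    """Rank documents by similarity to the query, best first."""
    vectors, idf = tf_idf_vectors(docs)
    q_tf = Counter(tokenize(query))
    q_len = sum(q_tf.values())
    # Query terms never seen in the collection are ignored.
    q_vec = {t: (c / q_len) * idf[t] for t, c in q_tf.items() if t in idf}
    scores = [(cosine(q_vec, vec), i) for i, vec in enumerate(vectors)]
    return sorted(scores, reverse=True)


docs = [
    "machine learning for text",
    "text mining and search",
    "cooking pasta at home",
]
ranked = search("text search", docs)
for score, index in ranked:
    print(round(score, 3), docs[index])
```

The document containing both query terms ranks first, the one sharing only "text" second, and the unrelated one scores zero.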
Language models
• Used as features for classification and other
NLP tasks
• Capture some basic characteristics of language
• The most naïve (but also frequently used) is
called Bag of Words
• Neural networks use more advanced
models: word2vec, GloVe,
ELMo, BERT…
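A Bag of Words representation simply counts how often each vocabulary word occurs in a document and discards word order. A minimal sketch, assuming whitespace tokenization; the helper names and the two sample sentences are invented for illustration.

```python
from collections import Counter


def build_vocabulary(docs):
    """Sorted list of all distinct lowercase tokens in the collection."""
    return sorted({token for doc in docs for token in doc.lower().split()})


def bag_of_words(doc, vocabulary):
    """Count vector aligned with the vocabulary; word order is discarded."""
    counts = Counter(doc.lower().split())
    return [counts[term] for term in vocabulary]


docs = ["the cat sat", "the cat saw the dog"]
vocabulary = build_vocabulary(docs)
vectors = [bag_of_words(doc, vocabulary) for doc in docs]

print(vocabulary)
for vec in vectors:
    print(vec)
```

Because order is lost, "dog bites man" and "man bites dog" get identical vectors, which is one limitation that the embedding-based models aim to address.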
Useful tools and libraries
• Apache OpenNLP – Java
• Apache Lucene – Java, C#
• Stanford Core NLP – Java
• NLTK – Python
• GATE – GUI tool
• SharpNLP
• ...
• Weka – for machine learning (GUI)

Machine learning (ML) and natural language processing (NLP)

  • 1. Natural language processing and machine learning Nikola Milosevic
  • 2. What is AI? • Intelligence exhibited by a machine • Flexible agent that interacts with the environment and performs actions to maximize success towards a certain goal
  • 4. What is machine learning • Subfield of computer science that explores how machines can learn to perform a certain task without explicit programming
  • 6. Types of machine learning • Supervised learning • Semi-supervised learning • Unsupervised learning • Reinforcement learning
  • 7. Machine learning problems • Classification • Clustering • Regression
  • 8. Testing the model • Iteratively improve the model • Test multiple algorithms – find the best one • No free lunch theorem • Feedback loop for feature selection • Confusion matrix • $\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$
  • 9. Examples of ML frameworks and algorithms • scikit-learn – Python library – Implementation of the most useful algorithms – Naïve Bayes, SVM, random forests, decision trees… • Keras – Python library implementing about everything related to neural networks
  • 10. Text data • About 80% of data in organizations is in text format • Harder to analyse than structured data • Huge amount of textual documents – In biomedicine alone, 2200 scientific papers are published every day • Growing exponentially
  • 11. Main goals of text mining • Make communication easier (e.g. translation) • Automate some processes (e.g. communication agents/chatbots) • Do data mining on textual and unstructured data
  • 13. Challenges • Man saw a woman with the telescope. – Who has a telescope? • Multiple senses, synonyms, homonyms, irony • Grammar and context can help • Acronyms
  • 14. Approaches • Rule based – Human defined rules to extract information – Needs expert humans who know how people express certain things – Is quite laborious • Machine learning based – Machine tries to learn what to extract, guided by a human – Needs annotated corpora (usually fairly large) • This is expensive to create and quite laborious
  • 15. Levels of analysis • Lexical – Analysis of words • Syntactic – Analysis of organization of words (phrases, sentences) • Semantic – Analysis of meaning • Sometimes pragmatic – Analysis of the pragmatics of the use of certain words and phrases. Why did the author use that?
  • 16. Steps
  • 17. Lexical processing • Part of speech tagging • Parsing – Constituency – Dependency • Stanford parser
  • 18. Semantic processing • Text classification – Sentiment analysis (positive/negative) – Classification by topics (politics/sport/business) – Authorship detection (Tolkien, Rowling, Shakespeare) • Named entity recognition • Topic modelling (unsupervised) • Search
  • 19. Sequence modelling • Machine learning technique useful for named entity recognition • Conditional random fields (CRF) or recurrent neural networks (often LSTM)
  • 20. Feature engineering • Selecting important features that help extract information • Can be: – Words, PoS, word shapes, vocabulary features, etc. – May depend on task and methodology – Iterative process of selecting features and improving performance – Some features may confuse the algorithm
  • 21. Search • Finds documents that are the most relevant for a given user query • Usual techniques include TF-IDF weighting and cosine similarity • May additionally use links towards text, positions of matched words and similar signals to rank found documents • Apache Lucene, Solr (Java); there are also Python libraries
  • 22. Language models • Used as features for classification and other NLP tasks • Capture some basic characteristics of language • The most naïve (but also frequently used) is called Bag of Words • Neural networks use more advanced models: word2vec, GloVe, ELMo, BERT…
  • 23. Useful tools and libraries • Apache OpenNLP – Java • Apache Lucene – Java, C# • Stanford Core NLP – Java • NLTK – Python • GATE – GUI tool • SharpNLP • ... • Weka – for machine learning (GUI)
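
The accuracy formula on slide 8 can be sketched in a few lines of plain Python. This is only an illustration: the function names `confusion_counts` and `accuracy`, the choice of `1` as the positive class, and the toy labels are assumptions made for the example and are not taken from the slides.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, FN for a binary classification task."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1          # predicted positive, really positive
        elif pred != positive and truth != positive:
            tn += 1          # predicted negative, really negative
        elif pred == positive:
            fp += 1          # predicted positive, really negative
        else:
            fn += 1          # predicted negative, really positive
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


# Toy labels, invented for this example.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(tp, tn, fp, fn, accuracy(tp, tn, fp, fn))  # 6 of 8 correct, so 0.75
```

Accuracy alone can be misleading on imbalanced data, which is one reason to inspect the full confusion matrix rather than a single number.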
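
The feature types listed on slide 20 (words, word shapes, neighbouring words) might look like this for a single token. It is a minimal sketch: the helper names `word_shape` and `token_features`, the feature keys, and the `<START>`/`<END>` markers are hypothetical choices for illustration, and a real system would pass such dictionaries to a sequence model such as a CRF.

```python
import re


def word_shape(token):
    """Map characters to a coarse shape: upper -> X, lower -> x, digit -> d."""
    shape = re.sub(r"[A-Z]", "X", token)
    shape = re.sub(r"[a-z]", "x", shape)
    shape = re.sub(r"[0-9]", "d", shape)
    return shape


def token_features(tokens, i):
    """Build a feature dictionary for the token at position i."""
    tok = tokens[i]
    return {
        "word": tok.lower(),
        "shape": word_shape(tok),
        "is_title": tok.istitle(),
        "suffix3": tok[-3:].lower(),
        # Context features: neighbouring words, with boundary markers.
        "prev_word": tokens[i - 1].lower() if i > 0 else "<START>",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "<END>",
    }


sentence = ["Tolkien", "wrote", "The", "Hobbit", "in", "1937"]
features = token_features(sentence, 0)
print(features)
```

Which features help is task dependent, so in practice they are added or removed iteratively while watching the evaluation score.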
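
The search technique named on slide 21 (TF-IDF weighting plus cosine similarity) can be sketched without any external library. Several details here are assumptions made for the example: whitespace tokenization, the smoothed weighting `log(N / df) + 1` (one of several common TF-IDF variants), the function names, and the toy documents. Engines such as Lucene use more elaborate scoring.

```python
import math
from collections import Counter


def tokenize(text):
    """Very naive tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector; terms unknown to the index are dropped."""
    counts = Counter(tokens)
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def build_index(docs):
    """Compute IDF weights and one TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {term: math.log(n / count) + 1.0 for term, count in df.items()}
    vectors = [tfidf_vector(toks, idf) for toks in tokenized]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, docs, idf, vectors):
    """Return (document, score) pairs with non-zero score, best first."""
    q = tfidf_vector(tokenize(query), idf)
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(docs[i], score) for score, i in scored if score > 0.0]


# Toy corpus, invented for this example.
docs = [
    "the cat sat on the mat",
    "dogs chase the cat",
    "stock markets fell sharply today",
]
idf, vectors = build_index(docs)
results = search("cat on a mat", docs, idf, vectors)
for doc, score in results:
    print(round(score, 3), doc)
```

The first document shares three query terms and ranks above the second, which shares only "cat"; the third shares none and is not returned.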
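
The Bag of Words representation mentioned on slide 22 simply counts how often each vocabulary word occurs in a text and ignores word order. The sketch below assumes whitespace tokenization and an alphabetically sorted vocabulary; the function names and example texts are illustrative only.

```python
from collections import Counter


def build_vocabulary(docs):
    """Collect every distinct lowercase token, in sorted order."""
    return sorted({tok for doc in docs for tok in doc.lower().split()})


def bag_of_words(text, vocab):
    """Count vector aligned with vocab; word order is discarded."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocab]


# Toy texts, invented for this example.
docs = ["good movie", "bad movie", "good good plot"]
vocab = build_vocabulary(docs)
bow_vectors = [bag_of_words(doc, vocab) for doc in docs]

print(vocab)
for doc, vec in zip(docs, bow_vectors):
    print(vec, doc)
```

Vectors like these can be fed directly to a classifier such as Naïve Bayes. Embedding models such as word2vec replace the sparse counts with dense vectors that also capture similarity between words.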