Text Analytics
with Python
TD Workshop 2
Nhi Nguyen & Michelle Purnama
Pre-Workshop Checklist
⬡ 1. This is pretty obvious… but do you have your laptop with
you? If you don’t… perhaps go grab it?
⬡ 2. Did you download Anaconda?
⬡ 3. Do you have access to the TD WS 2 Shared Folder?
⬡ 4. If you said “no” to question 2 or 3 → go to the AIS website for
the instructions!
AIS Upcoming Events
⬡ No Speaker Series, Next Monday, 10/21
⬡ EY Office Visit - Next Thursday, 10/24, 9:00AM - 12:00PM
∙ Find the signup in the newsletter
⬡ PD Meeting: Friday, 10/25, 12:00 – 12:50
∙ Talking Tech with Ilya Rogov
Hello!
I am Michelle Purnama
I hope you’re all excited to learn
Python with us! Don’t be scared -
this Python won’t bite :)
1.
What is Python?
Python 101 starts now!
Python
⬡ Python is an interpreted, high-
level, general-purpose
programming language
⬡ It supports the use of modules
and packages
⬡ Code can be reused in a variety
of projects by importing and
exporting these modules
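As a small illustration of reusing code by importing modules, the sketch below pulls two modules from Python's standard library. The values used are made up for this example.

```python
# Import a whole module, then call one of its members.
import math

# Import a single name from a module.
from collections import Counter

# Reuse code that already exists instead of rewriting it.
radius = 2
area = math.pi * radius ** 2

# Counter tallies how often each item appears in a sequence.
letter_counts = Counter("python")

print(round(area, 2))
print(letter_counts["p"])
```

The same `import` statement works for third-party packages such as NLTK once they are installed.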
This Python?
Or this Python?
Python Packages
2. Anaconda &
Jupyter Notebook
What are they again?
Anaconda
⬡ Free and open-source
distribution of Python and R
programming languages that
aims to simplify package
management & deployment
⬡ In this workshop, we are using
Anaconda to install Python and
Jupyter Notebook
Jupyter Notebook
⬡ Open-source web application that
allows you to create & share
documents that contain live code,
equations, visualizations and narrative
text
⬡ Powerful way to iterate on our Python
code: write lines of code and
run them one at a time
Text Analytics -
Main Phases
Let’s start coding!
Text Analytics & NLP
⬡ Day-to-day texts generated are unstructured
⬡ NLP - Natural Language Processing
⬡ NLP enables computers to interact with humans in a
natural manner
⬡ Example: analyzing movie reviews
Text Analytics Operations using NLTK
⬡ NLTK - Natural Language Toolkit
⬡ Python package that provides a diverse set of
natural language algorithms
⬡ Free, open source, easy to use, well documented
⬡ Helps computers analyze, preprocess, and understand
written text
⬡ Phase 1: Tokenization
⬡ Phase 2: Stop words Removal
⬡ Phase 3: Lexicon Normalization
⬡ Phase 4: Understand POS Tag
⬡ Phase 5: Sentiment Analysis
Phase 1: Tokenization
Tokenization
⬡ First step in text analytics
⬡ The process of breaking down a text
paragraph into smaller chunks such as
words or sentences
⬡ Token - a single entity that is a building
block for a sentence or paragraph
⬡ nltk.tokenize - a module inside NLTK
package
Sentence Tokenization
⬡ Breaks text paragraph
into sentences
⬡ Import sent_tokenize
Sentence & Word Tokenization
Word Tokenization
⬡ Breaks text paragraph
into words
⬡ Import word_tokenize
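To make the two kinds of tokenization concrete, here is a minimal sketch that uses only Python's standard library, so it runs without downloading any NLTK data. The helper names `split_sentences` and `split_words` and the sample text are assumptions made for this illustration; they are not NLTK functions.

```python
import re

# Made-up sample text for the illustration.
text = "Hello everyone. Welcome to the workshop! Are you ready to code?"

def split_sentences(paragraph):
    """Split after '.', '!' or '?' when followed by whitespace (naive rule)."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p for p in parts if p]

def split_words(paragraph):
    """Return runs of word characters, plus each punctuation mark on its own."""
    return re.findall(r"\w+|[^\w\s]", paragraph)

sentences = split_sentences(text)
words = split_words(text)

print(sentences)
print(words)
```

With NLTK installed and its Punkt tokenizer data downloaded, `sent_tokenize(text)` and `word_tokenize(text)` from `nltk.tokenize` play these two roles and cope much better with abbreviations and other edge cases than this naive regex version.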
Frequency Distribution
⬡ Frequency of occurrence
of each word in a text
⬡ Import FreqDist from
nltk.probability module
⬡ Import matplotlib
package to plot the
chart
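A frequency distribution can be sketched with the standard library's `collections.Counter`. The sample sentence is made up, and the plotting step is left out so the example has no third-party dependencies.

```python
import re
from collections import Counter

# Made-up sample text for the illustration.
text = "The cat sat on the mat. The dog sat on the log."

# Lowercase first so "The" and "the" are counted together.
tokens = re.findall(r"\w+", text.lower())

# Count how many times each token occurs.
freq = Counter(tokens)

# The three most frequent tokens with their counts.
top_three = freq.most_common(3)
print(top_three)
```

NLTK's `FreqDist` (from `nltk.probability`) builds on the same `Counter` idea and adds a `plot` method that draws the chart with matplotlib.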
Do It
Yourself!
Choose any story from the
Funny Halloween Stories link
and plot a frequency
distribution using Python! Boo!
Phase 2: Stop
words Removal
Stopwords
⬡ Noise in the text
⬡ Examples: is, am, are, this, a, an, the
⬡ We need to create a list of stopwords and filter these
words out of our list of tokens
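The filtering step can be sketched as follows. The stopword set below is just the handful of examples from the slide, hand-written for illustration; a real list would be much longer.

```python
import re

# Tiny hand-written stopword list (the examples from the slide), for illustration only.
STOPWORDS = {"is", "am", "are", "this", "a", "an", "the"}

# Made-up sample text.
text = "This is an example: the ghost is hiding in a haunted house."

# Tokenize into lowercase words.
tokens = re.findall(r"\w+", text.lower())

# Keep only the tokens that are not stopwords.
filtered = [t for t in tokens if t not in STOPWORDS]
print(filtered)
```

NLTK ships a ready-made English list as `stopwords.words("english")` in `nltk.corpus`, available once the `stopwords` corpus has been downloaded.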
Wow,
that’s a
mouthful
Do It
Yourself!
Use the same story you picked
in Phase 1 and remove the
stopwords from that text. Let’s
do it!
Phase 3: Lexicon
Normalization
Lexicon Normalization
⬡ Reduces derivationally related forms of a word to a
common root word
⬡ For example, connection, connected, and connecting
all reduce to the common word “connect”
Stemming
⬡ Reduces words to their
root word / chops off the
derivational affixes
⬡ Works on a single word
without knowledge of its
context
Stemming & Lemmatization
Lemmatization
⬡ More sophisticated
⬡ Reduces words to their
base word - linguistically
correct lemmas
⬡ Considers context of the
word
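The contrast between the two can be sketched with toy versions. Both helpers below are deliberately simplistic assumptions for illustration: `toy_stem` is a naive suffix chopper (not the Porter algorithm), and `toy_lemmatize` looks words up in a hypothetical three-entry dictionary.

```python
# Suffixes the toy stemmer knows how to chop, checked in this order.
SUFFIXES = ("ing", "ion", "ed", "s")

def toy_stem(word):
    """Chop the first matching suffix, keeping at least 3 letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# Hypothetical mini-dictionary standing in for a real lexical database.
LEMMAS = {"better": "good", "mice": "mouse", "ran": "run"}

def toy_lemmatize(word):
    """Return the dictionary lemma if known, else the word unchanged."""
    return LEMMAS.get(word, word)

stems = [toy_stem(w) for w in ["connection", "connected", "connecting"]]
print(stems)

# Suffix chopping cannot map "better" to "good"; a dictionary look-up can.
print(toy_stem("better"), toy_lemmatize("better"))
```

In NLTK, `PorterStemmer` and `WordNetLemmatizer` from `nltk.stem` are the full-strength counterparts; the lemmatizer needs the `wordnet` corpus to be downloaded first.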
Phase 4: POS Tag
POS Tagging
⬡ Part-of-Speech (POS) tagging looks to identify the
grammatical group of a given word based on the
context
⬡ For example, noun, pronoun, adjective, verb, adverb,
etc.
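To show what "based on the context" can mean, here is a toy tagger that uses the previous word's tag to decide between noun and verb. The lexicon, the tag names, and the rules are all assumptions invented for this sketch; a real tagger is trained on annotated text.

```python
# Toy lexicon: words that keep the same tag everywhere (illustrative only).
LEXICON = {
    "the": "DET", "a": "DET",
    "i": "PRON", "we": "PRON",
    "is": "VERB",
    "scary": "ADJ", "quickly": "ADV",
}

def toy_pos_tag(tokens):
    """Tag each token, using the previous tag to settle unknown words."""
    tagged = []
    prev_tag = None
    for token in tokens:
        word = token.lower()
        if word in LEXICON:
            tag = LEXICON[word]
        elif prev_tag in ("DET", "ADJ"):
            tag = "NOUN"  # e.g. "the book", "scary book"
        elif prev_tag == "PRON":
            tag = "VERB"  # e.g. "we book"
        else:
            tag = "NOUN"  # fallback guess
        tagged.append((token, tag))
        prev_tag = tag
    return tagged

# "book" is tagged as a verb after a pronoun and as a noun after a determiner.
print(toy_pos_tag(["We", "book", "a", "flight"]))
print(toy_pos_tag(["The", "book", "is", "scary"]))
```

NLTK's `nltk.pos_tag(tokens)` returns the same kind of (word, tag) pairs, using Penn Treebank tags and a pretrained tagger model that has to be downloaded first.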
Do It
Yourself!
Choose a sentence from the
Halloween Story and apply
POS tags to the tokenized
sentence!
Phase 5: Sentiment
Analysis
Text Classification
⬡ Important task in text mining
⬡ Identifying category/class of given text such as blog,
book, web page, tweets
⬡ Various applications: spam detection, classifying
website content for a search engine, sentiment of
customer feedback, etc.
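As one possible sketch of text classification, the code below trains a tiny bag-of-words Naive Bayes classifier for spam detection using only the standard library. The slides do not say which model the workshop notebook uses, so Naive Bayes is an assumption here, and the four training messages are made up.

```python
import math
from collections import Counter, defaultdict

# Made-up labeled messages, for illustration only.
TRAIN = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting moved to friday", "ham"),
    ("lunch with the team on friday", "ham"),
]

def train(examples):
    """Count words per class and documents per class."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab

def classify(text, word_counts, class_counts, vocab):
    """Pick the class with the highest log-probability (add-one smoothing)."""
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        # Start from the log prior of the class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            # Add one so unseen words do not zero out the probability.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAIN)
print(classify("free prize friday now", *model))
print(classify("team meeting on friday", *model))
```

With only four training messages this is a demonstration of the mechanics, not a usable spam filter.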
Text Classification
Sentiment Analysis
⬡ Quantifying user content, idea, belief, opinion
⬡ Combination of words, tone, and writing
style
⬡ Analyzes user messages and classifies
underlying sentiment as positive, negative,
or neutral
⬡ Two approaches:
∙ Lexicon-based
∙ Machine learning-based approach
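The lexicon-based approach can be sketched in a few lines: count positive and negative words and let the larger count decide. The two word lists are tiny made-up lexicons, and `lexicon_sentiment` is a hypothetical helper name.

```python
import re

# Tiny made-up sentiment lexicons, for illustration only.
POSITIVE = {"good", "great", "fun", "love", "excellent"}
NEGATIVE = {"bad", "boring", "hate", "terrible", "awful"}

def lexicon_sentiment(text):
    """Classify text by comparing counts of positive and negative words."""
    tokens = re.findall(r"\w+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

print(lexicon_sentiment("I love this movie, the cast is great"))
print(lexicon_sentiment("Boring plot and terrible acting"))
print(lexicon_sentiment("It was released in October"))
```

A plain word count like this ignores negation ("not good") and tone, which is one reason the machine learning-based approach is often preferred.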
Dataset - sentimentanalysis.tsv
What We’ve Learned Today..
⬡ Break down paragraphs into smaller chunks
⬡ Remove punctuation and stopwords to eliminate noise
⬡ Use Stemming & Lemmatization to reduce words to their
base words
⬡ Understand Part-of-Speech tagging
⬡ Create simple graphs in Python
⬡ Scratch a bit of the surface of Sentiment Text Analysis!
5.
Extra Resources
More Python?
Additional Learning Resources
⬡ To read more about Text Analysis
∙ https://monkeylearn.com/text-analysis/
⬡ More advanced Text Analysis tutorial
∙ https://www.dataquest.io/blog/tutorial-text-analysis-python-test-hypothesis
⬡ Bootcamp course on Python
∙ https://www.udemy.com/course/complete-python-bootcamp/
Thanks for coming!
http://bit.ly/TD-SAT2
Suitable Code Exit Code: Nychella

Technical Development Workshop - Text Analytics with Python

Editor's Notes

  1. Nhi
  2. Michelle
  3. M
  4. Python is an interpreted, high-level, general-purpose programming language. Python supports the use of modules and packages, which means that programs can be designed in a modular style and code can be reused across a variety of projects. Once you've developed a module or package you need, it can be scaled for use in other projects, and it's easy to import or export these modules.
  5. What is Anaconda? Anaconda is a free and open-source distribution of the Python and R programming languages for scientific computing that aims to simplify package management and deployment. In this workshop, we will use Anaconda to install Python and Jupyter Notebook, as Anaconda also includes other commonly used packages for scientific computing and data science (and in this case, for text analytics!)
  6. M What is Jupyter Notebook? The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Jupyter Notebooks are a powerful way to write and iterate on your Python code for data analysis. Rather than writing and re-writing an entire program, you can write lines of code and run them one at a time.
  7. N
  8. NLP enables the computer to interact with humans in a natural manner. It helps the computer to understand human language and derive meaning from it. Analyzing movie reviews is one of the classic examples used to demonstrate a simple NLP bag-of-words model.
  9. N NLTK is a powerful Python package that provides a diverse set of natural language algorithms. It is free, open source, easy to use, well documented, and has a large community. NLTK helps the computer to analyze, preprocess, and understand written text. Going back to the phase slide, NLTK includes the most common algorithms such as tokenizing, part-of-speech tagging, stemming, sentiment analysis, topic segmentation, and named entity recognition.
  10. M
  11. Talk about package —> module —> class (draw venn diagram on white board maybe?)
  12. # Frequency Distribution Plot
      import matplotlib.pyplot as plt
      fdist.plot(30, cumulative=False)  # fdist is assumed to be the FreqDist created earlier
      plt.show()
      https://matplotlib.org/tutorials/introductory/pyplot.html#pyplot-tutorial
  13. M
  14. N
  15. Lexicon normalization addresses another type of noise in the text. It reduces derivationally related forms of a word to a common root word. For example, connection, connected, and connecting all reduce to the common word "connect".
  16. Stemming is a linguistic normalization process that reduces words to their root word, or chops off the derivational affixes. Lemmatization is more sophisticated than stemming: it reduces words to their base word, which is a linguistically correct lemma. A lemma is a word that stands at the head of a definition in a dictionary; all the head words in a dictionary are lemmas. Technically, it is "a base word and its inflections". A stemmer works on an individual word without knowledge of the context. For example, the word "better" has "good" as its lemma. Stemming misses this because it requires a dictionary look-up.
  17. N
  18. The primary target of Part-of-Speech (POS) tagging is to identify the grammatical group of a given word (noun, pronoun, adjective, verb, adverb, etc.) based on the context. POS tagging looks for relationships within the sentence and assigns a corresponding tag to each word. List of POS tags: https://medium.com/@gianpaul.r/tokenization-and-parts-of-speech-pos-tagging-in-pythons-nltk-library-2d30f70af13b
  19. Text classification is one of the important tasks of text mining: identifying the category or class of a given text such as a blog, book, web page, news article, or tweet. It has various applications in today's computer world, such as spam detection, task categorization in CRM services, categorizing products on e-retailer websites, classifying the content of websites for a search engine, and analyzing the sentiment of customer feedback.
  20. What do users and the general public think about the latest feature? You can quantify such information with reasonable accuracy using sentiment analysis. Quantifying users' content, ideas, beliefs, and opinions is known as sentiment analysis. Human communication is not limited to words; it is more than words. Sentiments are a combination of words, tone, and writing style. There are two approaches. Lexicon-based: count the number of positive and negative words in a given text; the larger count determines the sentiment of the text. Machine learning-based: develop a classification model, trained on a pre-labeled dataset of positive, negative, and neutral texts. In this tutorial, you will use the second approach (machine learning-based). This is how you learn sentiment and text classification with a single example.
  21. Break down paragraphs into smaller chunks like sentences or words. Remove punctuation and stopwords to increase the accuracy of our analysis. Use Stemming or Lemmatization to reduce words to their base words. Understand Part-of-Speech tagging. Create simple graphs in Python. Scratch a bit of the surface of Sentiment Text Analysis!