.NET Fest 2019. Sergiy Korzh. Natural Language Processing in .NET

Natural language processing tasks now come up in practically every project. Unfortunately, until recently the .NET platform was not well suited to solving them. With the release of ML.NET the situation has started to improve, but it is still far from ideal.
In this talk I will cover the main tasks solved with Natural Language Processing methods and the ways of solving them on the .NET platform (services, libraries, frameworks).


  1. Natural Language Processing with .NET. KYIV 2019, .NET CONFERENCE #1 IN UKRAINE
  2. About me: Sergiy Korzh
     • 25+ years in software development
     • 20 years running my own business
     • .NET developer since 2004
     • iForum.ua (technology section)
     • Projects: EasyQuery (https://korzh.com/easyquery), Easy.Report (http://easy.report), Aistant (https://aistant.com/)
     • Twitter: @korzhs
     • LinkedIn: https://www.linkedin.com/in/korzh/
  3. Agenda
     1. Introduction to NLP (main tasks and basic concepts)
     2. NLP tools for .NET (and beyond)
     3. Demos
     4. Useful materials and conclusions
  4. Why NLP on .NET?
  5. Why NLP on .NET?
     • Because we love .NET, right?
     • Quick and easy (for simple NLP tasks)
     • No “glue” code
  6. Remarks
     • “Light” NLP tasks only!
     • No Deep Learning
     • Beginner-level topics
  7. NLP Tasks
     1. Linguistic
     2. Analysis
     3. Transformation
     4. Generation
  8. NLP Tasks: 1. Linguistic
     • Segmentation
     • Part-of-speech tagging
     • Named-entity recognition
     • Relation extraction
     • Syntactic parsing
     • Coreference resolution
     • Semantic parsing
  9. NLP Tasks’ Examples: 2. Analysis
     • Spam filter
     • Sentiment analysis
     • Text similarity
     • Information extraction
  10. NLP Tasks’ Examples: 3. Transformation
      • Machine translation
      • Speech to text / text to speech
      • Grammar correction
      • Text summarization
  11. NLP Tasks’ Examples: 4. Generation
      • Question answering
      • Chat bots
      • Story generation
  12. NLP Pipeline: TEXT → Text Featurizing (Numeric representation) → ML Algorithm → RESULT
  13. NLP Pipeline: Classic (diagram from the AYLIEN blog)
  14. NLP Pipeline: Deep Learning (diagram from the AYLIEN blog)
  15. NLP concepts: Bag of words
      A way to represent your text for ML algorithms.
      Encoding approaches:
      • Word frequency
      • One-hot encoding
      • TF-IDF
      • Other metrics
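
The word-frequency encoding from the slide can be sketched in a few lines. This is a minimal, language-agnostic illustration in plain Python rather than the talk's .NET code; the whitespace tokenizer, the function names and the two sample sentences are assumptions made for the example.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bag_of_words(documents):
    """Return (vocabulary, count vectors), one vector per document."""
    tokenized = [tokenize(doc) for doc in documents]
    # The vocabulary is the sorted set of all words seen in any document.
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Counter returns 0 for words that do not occur in this document.
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


# Hypothetical sample documents.
docs = ["I live in Kyiv", "I love Kyiv"]
vocab, vectors = bag_of_words(docs)
print(vocab)
print(vectors)
```

Replacing each count with `min(count, 1)` would give a binary presence/absence variant of the same representation.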
  16. NLP concepts: TF-IDF
      For a word-document pair, TF-IDF shows the importance of the word in the document.
      Used in all kinds of information retrieval tasks:
      • Search
      • Text mining
      • Stop-words filtering
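
The slide gives no formula, so the sketch below assumes one common variant: term frequency (the word's share of the document) multiplied by the natural log of the inverse document frequency. It is plain Python for illustration only; the function name and the three sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {word: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each word appears.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            word: (count / len(tokens)) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return weights


# Hypothetical sample corpus.
corpus = ["the cat sat", "the dog barked", "the cat purred"]
for doc, scores in zip(corpus, tf_idf(corpus)):
    print(doc, scores)
```

A word that occurs in every document (here "the") gets a weight of zero, which is why TF-IDF is also handy for spotting stop-word candidates.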
  17. NLP concepts: N-grams
      An n-gram is a contiguous sequence of n items from a given sample of text.
      Word N-grams: “I live in Kyiv” word bi-grams
      1. # I
      2. I live
      3. live in
      4. in Kyiv
      5. Kyiv #
      Character N-grams: “I live in Kyiv” character bi-grams
      1. #_
      2. _I
      3. I_
      4. _l
      5. li
      6. iv
      7. ve
      8. . . .
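
Both n-gram listings from the slide can be reproduced with a short sketch. This is plain Python for illustration; the function names are hypothetical, and the padding conventions ("#" as the boundary marker, "_" standing in for a space) are read off the slide's example.

```python
def word_ngrams(text, n=2, pad="#"):
    """Word n-grams with n-1 boundary markers added on each side."""
    tokens = [pad] * (n - 1) + text.split() + [pad] * (n - 1)
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n=2, pad="#", space="_"):
    """Character n-grams over the text wrapped in one boundary marker per side.

    Spaces are rendered as `space` so that they stay visible in the output.
    """
    padded = f"{pad} {text} {pad}".replace(" ", space)
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


print(word_ngrams("I live in Kyiv"))
print(char_ngrams("I live in Kyiv")[:7])
```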
  18. NLP concepts: Word Embeddings
      A set of techniques for mapping words (or phrases) to numeric vectors. Words with similar meanings have “close” vectors.
        word    vector
        man     [0.23, 0.56, …]
        king    [0.34, 0.16, …]
        woman   [0.41, 0.73, …]
        queen   [0.09, 0.62, …]
      [king] - [man] + [woman] ≈ [queen]
      Popular embedding algorithms:
      • Word2Vec
      • fastText
      • GloVe
      • . . .
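
To make "close vectors" and the king/queen arithmetic concrete, here is a small sketch using cosine similarity. The four-dimensional vectors are invented toy values (roughly "person", "royal", "female", "fruit"), not the truncated vectors from the slide and not the output of any real embedding model; the function names are hypothetical as well.

```python
import math


def cosine(a, b):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def analogy(a, b, c, vectors):
    """Return the word closest to vectors[a] - vectors[b] + vectors[c].

    The three input words are excluded from the candidates, as is customary.
    """
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy embeddings, invented for this example only.
embeddings = {
    "man":   [1.0, 0.0, 0.0, 0.0],
    "woman": [1.0, 0.0, 1.0, 0.0],
    "king":  [1.0, 1.0, 0.0, 0.0],
    "queen": [1.0, 1.0, 1.0, 0.0],
    "apple": [0.0, 0.0, 0.0, 1.0],
}

print(analogy("king", "man", "woman", embeddings))
```

Real embeddings have hundreds of dimensions learned from a corpus, so the analogy holds only approximately there.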
  19. NLP concepts: Language Model
      A language model lets you compute the probability of a word in a sequence.
      Please, give me a … [ pen: 0.002, example: 0.0001, hand: 0.08, … ]
      Where is it used? (spoiler: almost everywhere!)
      • Machine translation
      • Error correction
      • Speech recognition
      • Text generation
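
One of the simplest language models is a bigram model, which estimates the probability of the next word from counts of adjacent word pairs. The sketch below is plain Python for illustration, assuming maximum-likelihood estimates with no smoothing; the function names and the three-sentence training corpus are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count, for every word, which words follow it ("#" marks boundaries)."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["#"] + sentence.lower().split() + ["#"]
        for prev, word in zip(tokens, tokens[1:]):
            counts[prev][word] += 1
    return counts


def next_word_probabilities(model, prev):
    """Return P(word | prev) for every word seen after `prev`."""
    following = model.get(prev.lower(), Counter())
    total = sum(following.values())
    return {word: count / total for word, count in following.items()}


# Hypothetical training corpus.
corpus = ["give me a hand", "give me a pen", "give me a hand please"]
model = train_bigram_model(corpus)
print(next_word_probabilities(model, "a"))
```

With this tiny corpus, "hand" comes out as twice as likely as "pen" after "a", mirroring the ranked candidate list on the slide.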
  20. NLP Tools
      1. Online services: Azure Cognitive Services, IBM Watson, Amazon AI Services
      2. Python libraries: NLTK, spaCy, scikit-learn, gensim, Pattern
      3. .NET libraries: ML.NET, Microsoft.Speech, Microsoft.Recognizers, Catalyst
  21. .NET libs: ML.NET
      https://dotnet.microsoft.com/apps/machinelearning-ai/ml-dotnet
      Pros:
      • Native for .NET (Core)
      • Backed by Microsoft
      • Super performant (at least MS says so)
      • Extended with TensorFlow & more
      NLP features:
      • Text normalization
      • Tokenizing
      • N-grams
      • Word embeddings
      • Stop words removal
      Cons:
      • Poor NLP features
      • English-only (mostly)
      • Not convenient to use separately from the ML pipeline
  22. .NET libs: Catalyst
      https://github.com/curiosity-ai/catalyst
      NLP features:
      • Text normalization
      • Tokenizing
      • POS tagging
      • Word embeddings
      • Stop words removal
      Pros:
      • Native for .NET (Core)
      • Inspired by the spaCy library
      • Fast tokenizer
      • Has pretrained models
      • Lets you train your own models (based on the Universal Dependencies project)
      Cons:
      • Early beta (or even alpha). Version 0.0.2795
      • English-only (mostly)
  23. .NET libs: Microsoft.Recognizers
      https://github.com/Microsoft/Recognizers-Text/
      • Rule-based
      • Recognizes numbers, units, date/time, etc.
      • Supports about 10 different languages
      • Not only .NET (JavaScript, Python, Java)
      • No support for Russian or Ukrainian
  24. Other useful libraries: DEMO 1
      Text summarization (extraction based) using home-brewed NLP
      TEXT → Detect language → Break into sentences → Tokenize and get stems → Bag of words → Similarity matrix → Page rank algorithm → Summary (top-rated sentences)
      Bag of words:
                sentence1  sentence2  sentence3
        stem1   1          3          5
        stem2   0          2          4
        stem3   3          4          0
        stem4   2          0          2
      Similarity matrix:
              S1     S2     S3
        S1    0      1.21   0.2
        S2    1.21   0      3.56
        S3    0.2    3.56   0
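
The Demo 1 pipeline can be approximated in a short sketch. This is not the talk's code (that lives in the Korzh.NLP repository listed under useful resources) but a simplified plain-Python illustration: language detection and stemming are skipped, the sentence splitter is a naive regular expression, and cosine similarity of word counts is an assumed similarity measure, since the slide does not say which one was used. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def cosine(a, b):
    """Cosine similarity of two bags of words (Counter objects)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def page_rank(matrix, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration over a similarity matrix."""
    n = len(matrix)
    scores = [1.0 / n] * n
    row_sums = [sum(row) for row in matrix]
    for _ in range(iterations):
        scores = [
            (1 - damping) / n + damping * sum(
                scores[j] * matrix[j][i] / row_sums[j]
                for j in range(n) if row_sums[j]
            )
            for i in range(n)
        ]
    return scores


def summarize(text, top_n=1):
    """Return the top_n highest-ranked sentences in their original order."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    bags = [Counter(re.findall(r"\w+", s.lower())) for s in sentences]
    n = len(sentences)
    similarity = [
        [0.0 if i == j else cosine(bags[i], bags[j]) for j in range(n)]
        for i in range(n)
    ]
    scores = page_rank(similarity)
    best = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_n]
    return [sentences[i] for i in sorted(best)]


# Hypothetical input text.
text = ("Kyiv is the capital of Ukraine. Kyiv is a large city in Ukraine. "
        "Bananas are yellow.")
print(summarize(text, top_n=2))
```

Sentences that share vocabulary with many other sentences collect more rank, so the unrelated sentence about bananas is the one left out of the summary.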
  25. Other useful libraries
  26. Other useful libraries
  27. Other useful libraries
  28. Other useful libraries: DEMO 2
      Text summarization using ML.NET
  29. Other useful libraries
  30. Other useful libraries
  31. Other useful libraries: DEMO 3
      Document tagging (with TF-IDF and Catalyst POS tagging)
  32. Other useful libraries
  33. Other useful libraries
  34. Other useful libraries
  35. Useful resources
      • Universal Dependencies: https://universaldependencies.org/
      • Lang-uk: http://lang.org.ua/uk/
      • All source code of this talk: https://github.com/korzh/Korzh.NLP
      • Math.NET, numerical computation algorithms for .NET: https://www.mathdotnet.com/
      • List of .NET libraries with some NLP features: http://tiny.cc/dotnet-nlp-libs
  36. Conclusions
      • Catalyst library: looks promising but still has a way to go. Contribute!
      • We can do NLP on .NET (for the basic tasks at least)
      • ML.NET library: good and reliable, but limited NLP features
  37. Other useful libraries: Thank you!
      Sergiy Korzh
      • Twitter: @korzhs
      • LinkedIn: https://www.linkedin.com/in/korzh/
      • Facebook: https://www.facebook.com/sergiy.korzh
      • Email: sergiy@korzh.com
