#TOSMAC
Toronto SMAC Meetup – Welcome!
An Intro to Text Analytics on Big Data with a use case

An introduction to performing text analytics on input from Twitter, using the "Emmys" as the example use case.

Published in: Technology

Transcript of "An Intro to Text Analytics on Big Data with a use case"

  1. #TOSMAC Toronto SMAC Meetup – Welcome! An Intro to Text Analytics on Big Data with a use case
  2. Toronto SMAC Team: Lucas Silva, Felipe Mosquetta, Marcos de Mello | © 2014 IBM Corporation
  3. Twitter's numbers. As you know: 500 million Tweets are sent per day; Twitter supports 35+ languages; 255 million monthly active users. Huge amount of data!
  4. Overview: Section1 Section2 Section3 Section4 Section5
  5. Overview: Section1 Section2 Section3 Section4 Section5
  6. Overview: Section1 Section2 Section3 Section4 Section5
  7. Let’s get started!
  8. Input data
  9. Section2
  10. Demo
  11. Next section: Section1 Section2 Section3 Section4 Section5
  12. Next section. Extractor: used to extract structured information from unstructured and semi-structured data. AQL: Annotation Query Language, a rule language with familiar SQL-like syntax.
  13. Next section. Profiler: used for troubleshooting performance problems.
  14. Main concepts. Types of extraction specifications: dictionaries, regular expressions, part of speech.
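
The first two specification types on this slide, dictionaries and regular expressions, can be illustrated with a rough Python analogue. This is not AQL and not the code shown in the demo; the dictionary terms, the sample tweet, and the helper names `dictionary_matches` and `regex_matches` are all assumptions made for illustration. Part-of-speech extraction is left out because it needs a tagger.

```python
import re

# Hypothetical dictionary of award-related terms (not taken from the slides).
AWARD_TERMS = {"emmys", "emmy", "nominee", "winner"}


def dictionary_matches(text, terms):
    """Return (start, end, token) for each word token found in the dictionary."""
    return [
        (m.start(), m.end(), m.group())
        for m in re.finditer(r"\w+", text)
        if m.group().lower() in terms
    ]


def regex_matches(text, pattern):
    """Return (start, end, matched_text) for every match of the pattern."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(pattern, text)]


# Invented sample tweet, used only to exercise the two helpers.
tweet = "Watching the #Emmys tonight, rooting for my favorite nominee!"
dict_hits = dictionary_matches(tweet, AWARD_TERMS)
hashtag_hits = regex_matches(tweet, r"#\w+")
print(dict_hits)
print(hashtag_hits)
```

Both helpers return character offsets along with the matched text, which loosely mirrors the idea of an annotation as a span over the input document.
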
  15. Main concepts
  16. Main concepts
  17. (no slide text)
  18. Main concepts
  19. Main concepts
  20. Main concepts. Types of extraction specifications: dictionaries, regular expressions, part of speech. numbers: 7.5 4 13
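
The sample values on this slide (7.5, 4, 13) suggest a regular expression that matches integers and decimals. A minimal Python sketch of such a rule follows; the pattern and the function name `extract_numbers` are assumptions, not the expression used in the talk.

```python
import re

# Assumed pattern: one or more digits, optionally followed by a decimal part.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def extract_numbers(text):
    """Return every integer or decimal literal found in the text, as strings."""
    return [m.group() for m in NUMBER.finditer(text)]


numbers = extract_numbers("numbers: 7.5 4 13")
print(numbers)  # ['7.5', '4', '13']
```
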
  21. Demo
  22. Main concepts. Types of extraction specifications: dictionaries, regular expressions, part of speech.
  23. Main concepts
  24. (no slide text)
  25. AQL Guidelines. Basic feature AQL statements: develop the core building blocks of the extractor.
  26. AQL Guidelines. Candidate generation AQL statements: combine basic feature AQL statements.
  27. Candidate generation AQL statements: $7.5 million; $4 thousand; $ 7.5 million
  28. Candidate generation AQL statements: $7.5 million; $4 thousand; $ 7.5 million; $7.5 million
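
Candidate generation combines basic features into larger matches. The sketch below imitates that in Python by composing a number pattern and a small unit dictionary into one amount pattern that covers the three shapes shown on the slides. It is an analogue written with regular expressions, not AQL; the sample sentence and the names `AMOUNT` and `amount_candidates` are assumptions.

```python
import re

# Basic features (assumed): a number pattern and a unit dictionary.
NUMBER = r"\d+(?:\.\d+)?"
UNITS = ("million", "thousand")  # the two units that appear on the slides

# Candidate rule: dollar sign, optional space, number, whitespace, unit.
AMOUNT = re.compile(r"\$\s?(" + NUMBER + r")\s+(" + "|".join(UNITS) + r")\b")


def amount_candidates(text):
    """Return (start, end, matched_text) for each currency amount candidate."""
    return [(m.start(), m.end(), m.group()) for m in AMOUNT.finditer(text)]


# Invented sentence containing the three amount shapes from the slides.
sample = "Budget of $7.5 million, a prize of $4 thousand, a deal worth $ 7.5 million."
candidates = amount_candidates(sample)
print([text for _, _, text in candidates])
```

Building the amount pattern from the smaller pieces keeps each basic feature reusable, which is the point the guideline makes about layering statements.
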
  29. AQL Guidelines. Filter and consolidate AQL statements: refine results, remove invalid annotations, resolve overlap between annotations.
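
The filter and consolidate step can be sketched in Python as two small functions over annotation spans. The validity rule (an amount must contain a digit) and the overlap policy (drop any span fully contained in another) are assumptions chosen for illustration; the slides do not say which rules or consolidation policy the demo used.

```python
def filter_valid(spans):
    """Remove invalid annotations; here, any span whose text has no digit."""
    return [s for s in spans if any(ch.isdigit() for ch in s[2])]


def consolidate(spans):
    """Resolve overlap by dropping spans fully contained in another span."""
    # Sort by start ascending, then end descending, so containers come first.
    ordered = sorted(set(spans), key=lambda s: (s[0], -s[1]))
    kept = []
    for start, end, text in ordered:
        if any(ks <= start and end <= ke for ks, ke, _ in kept):
            continue  # covered by a span that was already kept
        kept.append((start, end, text))
    return kept


text = "$ 7.5 million"


def span(start, end):
    """Build a (start, end, text) annotation over the sample text."""
    return (start, end, text[start:end])


# Overlapping candidates over the same text, including one invalid annotation.
raw = [span(0, 1), span(2, 5), span(2, 13), span(0, 13)]
refined = consolidate(filter_valid(raw))
print(refined)  # [(0, 13, '$ 7.5 million')]
```

Filtering first keeps the consolidation step from choosing between spans that would be discarded anyway.
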
  30. Demo
  31. Conclusion
  32. Check point
  33. What we have done: Section1 Section2 Section3
  34. What are we going to do? Section4 Section5
  35. Demo
  36. Also using R: 1.75 0.32
  37. What are we going to do?
  38. Demo
  39. So what?
  40. Companies
  41. Exporting to you
  42. Thank you! Let's network!