SEARCH YOUR TWEETS
SEARCH LIKE A PROFESSIONAL
Motivation
• Twitter represents a rich flow of information
• There is no effective way to query Twitter
• It is hard to monitor topics of interest in real time
Search Tweets Like a Professional
A real-time Twitter search engine that lets you search based on:
• Keywords
◦ Country
◦ Language
◦ Negative words
Demo (http://searchyourtweet.info:5000/input)
Keep an eye on your topics of interest
• Tell us what interests you, and we will keep you updated on the newest events
• Video (https://youtu.be/GdRmXNfukos)
Data pipeline
[Diagram. Components: Frontend, Logic Layer, Query Controller, Backend Database, Percolator, Pub/Sub. Flow labels: searching database, data backup, register query, matching query, publish, searching.]
Real Time Monitor on Twitter
◦ Implemented using the Elasticsearch Percolator
◦ Think of it as “search in reverse”
  ◦ Users register queries in the percolator
  ◦ The percolator matches incoming documents against the registered queries
◦ Challenges:
  ◦ How to design the percolator data pipeline?
  ◦ How to decouple the backend database from the frontend server?
  ◦ Use the publish/subscribe design pattern
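The “search in reverse” idea can be sketched in a few lines of plain Python. This is only a toy in-memory model, not the Elasticsearch Percolator API: the `ToyPercolator` class, its method names, and the rule that every keyword must appear in the text are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPercolator:
    """Hypothetical in-memory stand-in for a percolator (not the Elasticsearch API)."""

    # Maps a query id to the set of lowercase keywords that must all appear.
    queries: dict = field(default_factory=dict)

    def register(self, query_id, keywords):
        """Store a query ahead of time, before any document arrives."""
        self.queries[query_id] = {k.lower() for k in keywords}

    def percolate(self, text):
        """Return the ids of every stored query that the incoming text satisfies."""
        words = set(text.lower().split())
        return sorted(qid for qid, kws in self.queries.items() if kws <= words)


percolator = ToyPercolator()
percolator.register("q-coffee", ["coffee"])
percolator.register("q-sf-rain", ["rain", "sf"])

# A new tweet arrives and is checked against all registered queries.
matches = percolator.percolate("Heavy rain in SF this morning")
print(matches)
```

In a normal search the documents are stored and the query arrives later; here the queries are stored and each new document is the thing that arrives.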
Real Time Monitor Data Flow
[Diagram. Components: Twitter database, Query database, Percolator, Controller, Pub/Sub. Flow labels: new incoming tweets, publish, subscribe, open channel.]
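The publish/subscribe step of this flow can be sketched as follows. It is a minimal in-process illustration, assuming one channel per registered query; `ToyBroker`, `on_new_tweet`, the channel names, and the substring matching rule are all hypothetical and stand in for whatever broker and matching logic the real system uses.

```python
from collections import defaultdict


class ToyBroker:
    """Hypothetical in-process publish/subscribe broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        """Open a channel for a listener, e.g. a frontend connection."""
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        """Deliver a message to every listener on the channel; return how many got it."""
        listeners = self._subscribers.get(channel, [])
        for callback in listeners:
            callback(message)
        return len(listeners)


broker = ToyBroker()
received = []
broker.subscribe("q-coffee", received.append)

# Assumed registered queries: channel id -> keyword to look for.
registered = {"q-coffee": "coffee", "q-tea": "tea"}


def on_new_tweet(text):
    """Controller step: publish the tweet on the channel of each matching query."""
    lowered = text.lower()
    for query_id, keyword in registered.items():
        if keyword in lowered:
            broker.publish(query_id, text)


on_new_tweet("Best coffee in Toronto")
on_new_tweet("Green tea time")
print(received)
```

The publisher never holds a reference to the frontend; it only knows channel names, which is what decouples the backend from the frontend server.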
Challenge
How to build a high-throughput, real-time backend data pipeline?
• Use Logstash!
◦ Highly scalable
◦ Compatible with different sources and destinations
A scalable high-throughput pipeline
Current backend pipeline
Challenge
• Real-time updates on the frontend client:
• Instead of using the JavaScript “setInterval()” function, I use “socketIO” to keep a socket open between the front-end client and the Flask server
• Constructing Elasticsearch queries
• Use the Python requests library to query Elasticsearch
• Fine-tuning Elasticsearch
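A search request combining keywords, country, language, and negative words might be assembled as an Elasticsearch bool query, roughly as sketched here. The function name and the field names `text`, `country`, and `lang` are assumptions, not taken from the project; the term filters also assume those fields are indexed as exact (non-analyzed) values.

```python
import json


def build_tweet_query(keywords, country=None, language=None, negative_words=None):
    """Build an Elasticsearch bool query body (field names are hypothetical)."""
    # Keywords are scored with a full-text match on the tweet text.
    bool_query = {"must": [{"match": {"text": keywords}}]}

    # Country and language are exact filters that do not affect scoring.
    filters = []
    if country:
        filters.append({"term": {"country": country}})
    if language:
        filters.append({"term": {"lang": language}})
    if filters:
        bool_query["filter"] = filters

    # Negative words exclude any tweet whose text matches them.
    if negative_words:
        bool_query["must_not"] = [{"match": {"text": w}} for w in negative_words]

    return {"query": {"bool": bool_query}}


body = build_tweet_query(
    "earthquake", country="JP", language="en", negative_words=["drill"]
)
print(json.dumps(body, indent=2))
```

With the requests library, a body like this could be sent as JSON to an index's `_search` endpoint, for example via `requests.post(url, json=body)`, where `url` depends on the deployment.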
About Me
M.Math, University of Waterloo
◦ Field: Statistics and Machine Learning
B.S., University of Toronto
◦ Field: Applied Mathematics
Data Scientist Intern, Neon Inc., San Francisco
Back-end Model Developer, MetricAid Inc., Toronto
Experience in Deep Learning:
◦ Convolutional Network, Recurrent Network
• OS/161 (a simplified POSIX OS)
Questions?
Thank you!
Parallelization of percolator
• Will consume a lot of hardware: O(mn)
• Another choice: Luwak + Samza

Jinchao demo v6