Jinchao demo v6

insight demo

  1. SEARCH YOUR TWEETS: SEARCH LIKE A PROFESSIONAL
  2. Motivation • Twitter represents a rich flow of information • There is no effective way to query Twitter • It is hard to monitor topics of interest in real time
  3. Search Tweets Like a Professional: a real-time Twitter search engine that allows you to search based on: • Keywords ◦ Country ◦ Language ◦ Negative words. Demo (http://searchyourtweet.info:5000/input)
  4. Keep an eye on your topics of interest • Express your interest and we will keep you updated on the newest events • Video (https://youtu.be/GdRmXNfukos)
  5. Data pipeline (diagram labels): Query; Controller; Backend Database; Percolator; Logic Layer; Frontend; Searching database; Data Backup; Pub/Sub; Publish; Matching query; Register query; Searching
  6. Real-Time Monitor on Twitter ◦ Implemented using the Elasticsearch Percolator ◦ Think of it as "search in reverse" ◦ Users register queries in the percolator ◦ The percolator matches incoming documents against the registered queries ◦ Challenges: ◦ How to design the percolator data pipeline? ◦ How to decouple the backend database from the frontend server? ◦ Use the publish/subscribe design pattern
  7. Real-Time Monitor Data Flow (diagram labels): Percolator; Query database; Twitter database; Controller; Pub/Sub; New incoming tweets; Publish; Subscribe; Open channel
  8. Challenge: how to build a high-throughput, real-time backend data pipeline? • Use Logstash! ◦ Highly scalable ◦ Compatible with different sources and destinations. Figure labels: "A scalable high-throughput pipeline"; "Current backend pipeline"
  9. Challenge • Real-time updates on the frontend client: • Instead of polling with the JavaScript "setInterval()" function, I use "socketIO" to keep a socket open between the front-end client and the Flask server • Construct the Elasticsearch query • Use the Python requests library to query Elasticsearch • Fine-tuning of Elasticsearch
  10. About Me: M.Math, University of Waterloo ◦ Field: Statistics and Machine Learning. B.S., University of Toronto ◦ Field: Applied Mathematics. Data Scientist Intern, Neon Inc., San Francisco. Back-end Model Developer, MetricAid Inc., Toronto. Experience in Deep Learning: ◦ Convolutional Networks, Recurrent Networks • OS/161 (a simplified POSIX OS)
  11. Questions? Thank you!
  12. Parallelization of the percolator • Consumes a lot of hardware: O(mn) • Another choice: Luwak + Samza
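
The "search in reverse" idea from slides 6 and 7 can be illustrated with a small in-memory sketch. This is not the Elasticsearch Percolator API and not the author's pipeline; it is a stand-in in plain Python, and every name in it (`RegisteredQuery`, `PubSub`, `Percolator`, and the `text`, `lang` and `country` tweet fields) is a hypothetical chosen for illustration. It shows queries being registered up front, each incoming tweet being matched against all of them, and matches being published on a per-query channel so that a frontend could subscribe without touching the backend store.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Tweet = Dict[str, str]


@dataclass
class RegisteredQuery:
    """A stored search. The fields mirror the search options listed on slide 3."""
    keywords: List[str]
    negative_words: List[str] = field(default_factory=list)
    language: Optional[str] = None
    country: Optional[str] = None

    def matches(self, tweet: Tweet) -> bool:
        # Naive whitespace tokenisation; a real engine would use an analyzer.
        words = set(tweet["text"].lower().split())
        if not all(k.lower() in words for k in self.keywords):
            return False
        if any(n.lower() in words for n in self.negative_words):
            return False
        if self.language and tweet.get("lang") != self.language:
            return False
        if self.country and tweet.get("country") != self.country:
            return False
        return True


class PubSub:
    """Minimal in-process publish/subscribe bus, one channel per query id."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[Tweet], None]]] = {}

    def subscribe(self, channel: str, callback: Callable[[Tweet], None]) -> None:
        self._subscribers.setdefault(channel, []).append(callback)

    def publish(self, channel: str, message: Tweet) -> None:
        for callback in self._subscribers.get(channel, []):
            callback(message)


class Percolator:
    """Matches each incoming tweet against every registered query."""

    def __init__(self, bus: PubSub) -> None:
        self._queries: Dict[str, RegisteredQuery] = {}
        self._bus = bus

    def register(self, query_id: str, query: RegisteredQuery) -> None:
        self._queries[query_id] = query

    def percolate(self, tweet: Tweet) -> List[str]:
        # Each tweet is checked against all stored queries, so n tweets
        # against m queries cost O(mn) match operations in this naive form.
        matched = [qid for qid, q in self._queries.items() if q.matches(tweet)]
        for qid in matched:
            self._bus.publish(qid, tweet)
        return matched
```

A caller would register a query such as `RegisteredQuery(keywords=["python"], negative_words=["snake"], language="en")` under some id, subscribe a callback to that id on the bus, and then feed each new tweet to `percolate`. The linear scan over all queries is what makes the O(mn) cost on slide 12 visible; it is the part a distributed alternative would need to partition.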
