Effective Use of the
Twitter Search API
Eric Jensen
Twitter Search

Submit your questions via
http://bit.ly/chirpsearch
or hashtag #chirpsearch
Agenda
•   Mission of the Twitter Search API

•   History

•   Most recently: ranking the top results

•   What’s next
Search API Mission

Connect users with what's most
important and interesting to
them in the here and now

(return the best stuff for a query)
Search Stats
•   Over 600 million queries per day

•   Typically less than 200 milliseconds per query

•   Typically less than 20 seconds indexing
    latency

•   Index of hundreds of millions of tweets
Search API Use Cases
•   Search interfaces: collecta, oneriot, crowdeye, ...

•   Dashboard clients: tweetdeck, seesmic, ...

•   Widgets: twitter, tweetgrid, monitter, ...

•   Location search: trendsmap, foursquare, ...

•   Visualizations: radian6, crimsonhexagon, twistori, ...

•   Analytics: stocktwits, trendrr, tweetstats, ...

•   Recommenders: mrtweet, ...

•   Thousands not listed here + not invented yet
Search vs. Streaming
•   Do use the search API for your app when:

    •   The user can input a query

    •   You need immediate results, not tracking

•   Don’t use the search API for your app when:

    •   Your user experience requires comprehensive
        results (all the tweets, not just the best ones)

    •   You only need tweets from/to/at particular users
Refreshing Results
Client -> API:   search.json?q=twitter
API -> Client:   "refresh_url":"?since_id=9290798834&q=twitter"

(~20 seconds later)

Client -> API:   search.json?since_id=9290798834&q=twitter
API -> Client:   "refresh_url":"?since_id=9290800152&q=twitter"
Why is this OK?
[Diagram, two columns]
Left:  search.json?q=twitter, then Timeline Cache, then Search Index
Right: search.json?since_id=9290798834&q=twitter, then Timeline Cache (entry q=twitter holding tweets 1 2 3 4), then Tweets
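
To make the idea behind the diagram concrete, here is a toy model of a cache that treats each query as a standing query, so that a refresh with since_id only has to scan a small cached list. It is a sketch for illustration only: the class name `TimelineCache`, its methods, and the substring matching are all assumptions, not a description of how Twitter's timeline cache is built.

```javascript
// Toy model only; names and matching logic are hypothetical.
class TimelineCache {
  constructor() {
    // query string -> array of tweets, oldest first
    this.timelines = new Map();
  }

  // First request for a query: register it and seed it with results
  // that would come from the search index.
  register(query, seedTweets) {
    if (!this.timelines.has(query)) {
      this.timelines.set(query, [...seedTweets]);
    }
  }

  // Each incoming tweet is appended to every registered timeline it matches.
  ingest(tweet) {
    for (const [query, timeline] of this.timelines) {
      if (tweet.text.toLowerCase().includes(query.toLowerCase())) {
        timeline.push(tweet);
      }
    }
  }

  // A refresh with since_id reads only the cached timeline.
  read(query, sinceId = 0) {
    const timeline = this.timelines.get(query) || [];
    return timeline.filter((tweet) => tweet.id > sinceId);
  }
}
```

In this model the first request pays for the index lookup once, and every later poll with since_id is answered from the cached list.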
Search API History

•   Apr 4, 2008: Summize Launches Twitter Search

•   Jul 14, 2008: Summize Acquired by Twitter

•   Apr 1, 2009: Search on Twitter.com

•   Nov 5, 2009: Quality Filtering on Trends

•   Jan 6, 2010: Local Trends

•   Apr 1, 2010: Top Results Include Popular

•   Apr 15, 2010: Chirp!
Ranking Top Results
             • Best stuff for a query

             • Many factors

             • First step

             • Available from API
Top Results API
•   New parameter: result_type

    •   mixed: Eventually this will become the
        default value. Include both popular and real
        time results in the response.

    •   recent: The current default value. Return
        only the most recent results in the response.

    •   popular: Return only the most popular
        results in the response.
Top Results Metadata
{"results":[
     {"text":"@twitterapi  http://tinyurl.com/ctrefg",
     "from_user":"jkoum",
     "metadata":
     {
      "result_type":"popular",
      "recent_retweets": 100
     },
     "id":1478555574,
Top Results API Example
        • Initial load includes top results

        • Metadata annotates them

        • Refreshes recent results on top
Include Top Results
url =
  'http://search.twitter.com/search.' +
  format +
  '?q=' + query +
  '&result_type=mixed'
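
The concatenation above passes the query through unescaped and hard-codes result_type. A minimal sketch of a helper that escapes the query and accepts only the three result_type values listed on the Top Results API slide might look like this. The names `buildSearchUrl` and `RESULT_TYPES` are hypothetical; the endpoint and parameter names are the ones shown on the slides.

```javascript
// The three result_type values named on the "Top Results API" slide.
const RESULT_TYPES = ['mixed', 'recent', 'popular'];

// Build a search URL; "recent" is described on the slides as the current default.
function buildSearchUrl(format, query, resultType = 'recent') {
  if (!RESULT_TYPES.includes(resultType)) {
    throw new Error('Unknown result_type: ' + resultType);
  }
  return 'http://search.twitter.com/search.' + format +
    '?q=' + encodeURIComponent(query) +  // escape #, spaces, & and so on
    '&result_type=' + resultType;
}
```

Escaping matters for queries such as hashtags, because an unescaped `#` would otherwise be read as the start of a URL fragment.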
Annotate w/ Metadata
if (tweet.metadata.result_type === 'popular') {
  return '<div class="twtr-popular">' +
    tweet.metadata.recent_retweets +
    ' recent retweets</div>';
}
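
The check above assumes every tweet carries a metadata object with a recent_retweets count. A slightly more defensive sketch is shown here. The function name `popularBadge` and the plain "popular" fallback label are assumptions made for illustration; the field names and the CSS class are taken from the slides.

```javascript
// Return badge markup for a popular tweet, or '' for any other tweet.
function popularBadge(tweet) {
  const metadata = tweet.metadata || {};
  if (metadata.result_type !== 'popular') {
    return '';
  }
  // Hypothetical fallback: show a plain label when no count is present.
  if (typeof metadata.recent_retweets !== 'number') {
    return '<div class="twtr-popular">popular</div>';
  }
  return '<div class="twtr-popular">' +
    metadata.recent_retweets +
    ' recent retweets</div>';
}
```

A renderer can then append the returned string unconditionally, since non-popular tweets contribute nothing.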
Refresh Recent Results
refresh_url = response.refresh_url


...


url =
  'http://search.twitter.com/search.' +
  format +
  refresh_url
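
Putting the refresh steps together, a client could keep the last response, build the next URL from its refresh_url, and place new results above those already shown. The sketch below assumes each tweet has a numeric `id` as in the metadata slide; the function names `nextPollUrl` and `mergeOnTop` are hypothetical, and no network call is made here.

```javascript
// Endpoint prefix as shown on the slides.
const SEARCH_PREFIX = 'http://search.twitter.com/search.';

// Use the cursor from the previous response when there is one,
// otherwise start with a fresh query.
function nextPollUrl(format, query, previousResponse) {
  if (previousResponse && previousResponse.refresh_url) {
    // refresh_url already begins with "?" and carries since_id and q.
    return SEARCH_PREFIX + format + previousResponse.refresh_url;
  }
  return SEARCH_PREFIX + format + '?q=' + encodeURIComponent(query);
}

// Put refreshed tweets on top of the displayed list, skipping any
// tweet whose id is already displayed.
function mergeOnTop(displayed, refreshed) {
  const seen = new Set(displayed.map((tweet) => tweet.id));
  const fresh = refreshed.filter((tweet) => !seen.has(tweet.id));
  return [...fresh, ...displayed];
}
```

Polling with the returned cursor roughly every 20 seconds, as on the Refreshing Results slide, means each request asks only for tweets newer than the last one seen.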
The Near Future
•   Remove duplicates (retweets)

•   Deeper index

•   Hit highlighting in the API

•   More consistency (with the REST API)

•   Better rate limiting
The Future (cont)
•   More relevance

•   More metadata

•   More stuff

•   More operators

    •   places, @anywhere, annotations
Open Source in Search
•   http://twitter.com/about/opensource

    •   mysql, hadoop, kestrel, twitter-text, etc.

•   lucene

•   commons-pipeline

•   varnish

•   jmeter

•   nutch language identifier

•   mecab
We’re Hiring
•   http://twitter.com/jobs

•   Data Analyst - Search

•   Product Manager - Search

•   Software Engineer - Search

•   Software Engineer - Search Front-End

•   Software Engineer - Search Relevance
Questions?

http://bit.ly/chirpsearch
or hashtag #chirpsearch

Also join us at the Real-Time
Search Birds of a Feather @
1:30 in The Coop

Editor's Notes

  1. i will talk about: - start by giving some of our thinking about why we have a search api and what differentiates it from the other apis twitter offers - i'll get into some technical implications of these differences with respect to polling on search versus tracking keywords on the streaming api - next, i'll talk briefly about how the search api has changed over time - and then we'll dig into the most recent change where we began ranking the top results beyond recency order. i'll show you how i've modified one of our own search api clients to take advantage of that change
  2. simple definition: user provides a query by engaging with an api application, we provide the best stuff (currently tweets and trends) for that query. Obviously the "best" stuff for twitter has a lot to do with how recent it is, so our primary focus is on the "here and now"
  3. Just to give you an idea of the parameters search operates under: - as ev told you yesterday we are doing more than 600M queries per day, seen up to 750M on a day recently - while realtime is our main focus, our index does contain hundreds of millions of tweets and we've roughly doubled its size in the last six months. - of course, the number of tweets has grown even faster than we've increased that index size, so this only covers about a week of them right now, but that is something we're currently working on expanding
  4. So obviously we're operating at a large scale, but what's really interesting to me about the search API is the variety of applications you as developers have found for it. I've listed just a few here to illustrate what people are currently doing with the API.
  5. So that's what people are doing with the search api, but the streaming api also supports tracking keywords and some location and language filtering. So, if you're developing a new app, how do you decide which to use?
  6. The biggest difference between the search API and the track API is how you get new results matching your standing query. On the streaming API the push model makes this obvious: new results are sent to you as they come in. Since the focus of the search API is on apps that let the user manipulate the query (whether explicitly or implicitly), registering a standing query for every request makes less sense. Instead, the search API uses a polling model with a cursor. --- make sure you explain this diagram by pointing at it (or at least describing it). It took me a minute to get the visual presentation
  7. One question that comes up frequently is why we encourage apps to use this cursor to poll and how that helps us to support refreshes more efficiently, so here's a diagram of what happens under the covers. A lot like the streaming API, when you make any query to search we actually do register that as a standing query, but only in one of our caching layers, which we call the timeline cache.
  8. Next I'd like to take a step back and talk briefly about the history of the search API and how our thinking about it has developed. twitter search and the API have been around for about two years now, and we made a lot of changes early on like supporting location search, but after that we had to shift our focus to scaling the system to support the growth in tweets and queries. It's really just in the last six months that we've made enough progress with scaling and grown the search team enough to be able to focus more on relevance and figuring out what that means for twitter search.
  9. Our mission: ---- Under "many factors" you should note that it's not always the popular users that show up here -- that seems to be an early misconception. Our algorithm looks to find things that are interesting from any user - things that "resonate," to use a word that Dick talked about yesterday (good to tie it in to other things being said at Chirp). Rather than "not final" (which seems to imply there is a "final" step when we won't be improving this) I'd say something like "First step of a long road of relevance improvements" (implying that we've got lots of ideas and we'll be delivering cool stuff for a long way).
  10. right now at the top
  11. explain that this uses since_id
  12. we want to hear from you