TitleWave
Building better Stack Overflow
titles using data science
Tennesse Joyce
In 2019, 5,633 questions
per day were asked on
Stack Overflow, but only
70% were answered.
TitleWave
A Chrome extension that helps people choose
better titles for their Stack Overflow questions.
DEMO
Data Source → Pre-processing → “Evaluate title” (‘Good’ Titles / ‘Bad’ Titles)
Data Source → Pre-processing → “Suggest a title”
Result: ⅓ of suggested titles
are better than the original.
Iterating on a title
Original (50.4%): Gmail API Pull out plain text email body
Suggested (63.9%): How can I extract plain text from an email sent to
me from a specific source, without forwarding the email to myself?
Edited (74.0%): How can I extract plain text from an email sent to me?
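
The score changes across the three versions of the title can be checked with a few lines of Python. This is only an arithmetic sketch: the percentages are the ones shown above, and the helper name `gains` is made up for illustration.

```python
# Predicted answer probabilities (in percent) for each version of the title,
# copied from the slide above.
versions = [
    ("Original", 50.4),
    ("Suggested", 63.9),
    ("Edited", 74.0),
]


def gains(scored):
    """Return (label, gain over the previous version) in percentage points."""
    return [
        (label, round(score - prev_score, 1))
        for (_, prev_score), (label, score) in zip(scored, scored[1:])
    ]


for label, gain in gains(versions):
    print(f"{label}: +{gain} percentage points")
```

The gains are differences in percentage points, not relative changes in the probability.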
About me
Physics PhD, University
of Colorado Boulder

Tennesse Joyce Insight demo TitleWave

Editor's Notes

  1. Hi, I’m Tennesse Joyce, and I built a Chrome extension called TitleWave that will improve the titles of your Stack Overflow questions.
  2. Stack Overflow is ubiquitous in the programming world as a place where people can ask questions of a community of other programmers. In 2019, over 5,000 questions a day were asked, but only 70% of those were answered. To increase your chance of getting an answer, it’s really important to have a compelling title so that people actually click on your question. But this can be tough, especially for new users who aren’t familiar with the conventions on the website.
  3. To solve this problem, I built TitleWave, a Chrome extension that integrates directly into the Stack Overflow website and helps improve your title. Let’s see how it works. So this is the webpage on Stack Overflow where you can submit a new question, and I’ve just copied one that someone asked last week as an example. My Chrome extension adds the two buttons right here. When I press ‘Evaluate title’, it tells me the probability that my question will get answered, just by looking at the title. And when I press ‘Suggest a title’, it reads through the text down here and summarizes it into what it thinks the title should be. That takes about a minute on my laptop, so I’m just going to paste in the output. When I press ‘Evaluate title’ again, we see that the suggested title’s chance of getting answered is about 14 percentage points higher.
  4. So how does this work on the backend? First, I needed to collect a bunch of previous questions, which are available on the Stack Exchange Data Explorer, and you can just download that as a big XML file.
  5. I process that into a Pandas dataframe, and then I do some data cleaning with Regex to remove HTML tags and code.
  6. Now, since this is such a huge dataset of almost 20 million questions, we can actually train a deep neural network like Google’s BERT to do feature extraction for us. This is a good choice because it doesn’t just look for keywords; it also considers the phrasing of the title and how the different words fit together. I then put those features into a logistic regression to predict whether each question gets answered or not, and lastly I backpropagate the error using PyTorch to fine-tune the neural network.
  7. Let’s see how that classifier performs on a test set. This plot shows the distribution of predicted probabilities for the two classes, answered and unanswered questions, and there are actually two main clusters. The good titles have about 80% chance of getting answered, whereas the bad titles are only 50-50. Also the proportion of answered questions increases from left to right, so that tells us the model is working. If you can move your title from bad to good with the help of this tool, that gives you a significant boost in your chance of getting an answer.
  8. For the “Suggest a title” button, I can’t use BERT because it doesn’t output text; it just encodes text into numbers. T5 is another model by Google that has both an encoder and a decoder, and the decoder turns those numbers back into text at the end, so it’s a good choice for this task. I also fine-tune T5, but this time only on questions that have an accepted answer.
  9. The result is that about a third of the time the suggested title scores better than the original title, measured according to the BERT model. That means maybe around a third of the users on Stack Overflow could benefit from this tool.
  10. Putting it all together, this is the example title I used before in the demo. It only has a 50% chance of getting answered, so not great.
  11. The title suggested by T5 is already a big improvement, and often you can actually make it even better by editing it yourself.
  12. For example, the second part feels extraneous to me, and if I take that out, it actually increases the probability by another 10 percentage points. So next time you find yourself asking a question on Stack Overflow, consider using this Chrome extension to take a more data-driven approach to choosing a title.
  13. My name is Tennesse Joyce, and I’m finishing my PhD in laser physics at CU Boulder. I do quantum simulations of light-matter interaction, and I’m very familiar with using Python for data analysis and visualization of those simulations. I’m looking to apply those skills to data science problems where I can have a potentially bigger impact.
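
The data-collection and cleaning steps described in notes 4 and 5 could be sketched roughly as follows. This is not the author's code: it assumes the dump follows the public Posts.xml layout (`row` elements with `PostTypeId`, `Title`, `Body`, and `AnswerCount` attributes), it uses a tiny inline sample instead of the real file, and the function names `clean_body` and `load_questions` are hypothetical. It uses only the standard library; the resulting list of dicts could be handed to `pandas.DataFrame` afterwards.

```python
import html
import io
import re
import xml.etree.ElementTree as ET

# Tiny stand-in for the real dump; attribute names are assumed to follow
# the public Stack Exchange Posts.xml layout.
SAMPLE_XML = """<posts>
  <row Id="1" PostTypeId="1" Title="How do I reverse a list?" AnswerCount="2"
       Body="&lt;p&gt;I tried &lt;code&gt;x[::-1]&lt;/code&gt; but it copies.&lt;/p&gt;&lt;pre&gt;&lt;code&gt;x = [1, 2]&lt;/code&gt;&lt;/pre&gt;" />
  <row Id="2" PostTypeId="2" Body="&lt;p&gt;An answer, not a question.&lt;/p&gt;" />
  <row Id="3" PostTypeId="1" Title="Why is my loop slow" AnswerCount="0"
       Body="&lt;p&gt;It takes &amp;gt; 10 s.&lt;/p&gt;" />
</posts>"""

CODE_BLOCK = re.compile(r"<pre\b[^>]*>.*?</pre>", re.DOTALL | re.IGNORECASE)
INLINE_CODE = re.compile(r"<code\b[^>]*>.*?</code>", re.DOTALL | re.IGNORECASE)
TAG = re.compile(r"<[^>]+>")
WHITESPACE = re.compile(r"\s+")


def clean_body(body_html: str) -> str:
    """Drop code, strip the remaining HTML tags, and normalise whitespace."""
    text = CODE_BLOCK.sub(" ", body_html)
    text = INLINE_CODE.sub(" ", text)
    text = TAG.sub(" ", text)
    # Unescape entities only after tags are gone, so an escaped "<" or ">"
    # in the prose is never mistaken for markup.
    text = html.unescape(text)
    return WHITESPACE.sub(" ", text).strip()


def load_questions(source) -> list:
    """Stream <row> elements and keep only questions (PostTypeId == "1")."""
    questions = []
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "row" and elem.get("PostTypeId") == "1":
            questions.append({
                "title": elem.get("Title", ""),
                "body": clean_body(elem.get("Body", "")),
                "answered": int(elem.get("AnswerCount", "0")) > 0,
            })
        elem.clear()  # release the element; matters for a multi-gigabyte dump
    return questions


questions = load_questions(io.BytesIO(SAMPLE_XML.encode("utf-8")))
for q in questions:
    print(q)
```

Streaming with `iterparse` avoids loading the whole file into memory at once, which seems the safer choice for a dump of this size. The `answered` flag here is simply "has at least one answer"; the talk does not say exactly how the label was defined.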