Tennesse Joyce Insight demo TitleWave

TennesseJoyce

Demo for my Insight project

Data & Analytics

1. TitleWave Building better Stack Overflow titles using data science Tennesse Joyce

2. In 2019, 5,633 questions per day were asked on Stack Overflow, but only 70% were answered.

3. TitleWave A Chrome extension that helps people choose better titles for their Stack Overflow questions. DEMO

4. Data Source

5. Data Source Pre-processing

6. Data Source Pre-processing “Evaluate title”

7. ‘Good’ Titles ‘Bad’ Titles

8. Data Source Pre-processing “Suggest a title” “Evaluate title”

9. Data Source Pre-processing “Suggest a title” “Evaluate title” Result: ⅓ of suggested titles are better than the original.

10. Iterating on a title Original (50.4%): Gmail API Pull out plain text email body

11. Iterating on a title Original (50.4%): Gmail API Pull out plain text email body Suggested (63.9%): How can I extract plain text from an email sent to me from a specific source, without forwarding the email to myself?

12. Iterating on a title Original (50.4%): Gmail API Pull out plain text email body Suggested (63.9%): How can I extract plain text from an email sent to me from a specific source, without forwarding the email to myself? Edited (74.0%): How can I extract plain text from an email sent to me?

13. About me Physics PhD, University of Colorado Boulder

Editor's Notes

Hi, I’m Tennesse Joyce, and I built a Chrome extension called TitleWave that will improve the titles of your Stack Overflow questions.
Stack Overflow is ubiquitous in the programming world as a place where people can ask questions of a community of other programmers. In 2019, over 5,000 questions a day were asked, but only 70% of those were answered. To increase your chance of getting an answer, it’s really important to have a compelling title so that people actually click on your question. But this can be tough, especially for new users who aren’t familiar with the conventions on the website.
To solve this problem, I built TitleWave, a Chrome extension that integrates directly into the Stack Overflow website and helps improve your title. Let’s see how it works. So this is the webpage on Stack Overflow where you can submit a new question, and I’ve just copied one that someone asked last week as an example. My Chrome extension adds the two buttons right here. When I press “Evaluate title”, it tells me the probability that my question will get answered, just by looking at the title. And when I press “Suggest a title”, it reads through the text down here and summarizes it into what it thinks the title should be. That takes about a minute on my laptop, so I’m just going to paste in the output. When I press “Evaluate title” again, we see that the suggested title has about a 14 percentage point higher chance of getting answered (50.4% to 63.9%).
So how does this work on the backend? First, I needed to collect a bunch of previous questions, which are available on the Stack Exchange Data Explorer, and you can just download that as a big XML file.
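As a rough illustration of this step, the sketch below streams question rows out of a posts XML file using only the standard library. The attribute names (`Id`, `PostTypeId`, `Title`, `Body`, `AnswerCount`, `AcceptedAnswerId`) follow the usual layout of a Stack Exchange `Posts.xml` file and are assumed here rather than taken from the talk; `SAMPLE_XML` and `iter_questions` are hypothetical names made up for the example.

```python
import io
import xml.etree.ElementTree as ET

# Tiny made-up stand-in for the real file, which is far too large to load whole.
SAMPLE_XML = b"""<?xml version="1.0" encoding="utf-8"?>
<posts>
  <row Id="1" PostTypeId="1" Title="How do I parse XML in Python?" Body="&lt;p&gt;Help&lt;/p&gt;" AnswerCount="2" AcceptedAnswerId="7" />
  <row Id="7" PostTypeId="2" Body="&lt;p&gt;Use ElementTree.&lt;/p&gt;" />
  <row Id="9" PostTypeId="1" Title="xml broken" Body="&lt;p&gt;???&lt;/p&gt;" AnswerCount="0" />
</posts>
"""


def iter_questions(source):
    """Yield one dict per question row, streaming so memory use stays flat."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        # PostTypeId "1" is assumed to mark questions ("2" marks answers).
        if elem.tag == "row" and elem.get("PostTypeId") == "1":
            yield {
                "id": int(elem.get("Id")),
                "title": elem.get("Title"),
                "body": elem.get("Body"),
                "answered": int(elem.get("AnswerCount", "0")) > 0,
                "has_accepted_answer": elem.get("AcceptedAnswerId") is not None,
            }
        elem.clear()  # drop the element's data once it has been read


questions = list(iter_questions(io.BytesIO(SAMPLE_XML)))
for question in questions:
    print(question["id"], question["answered"], question["title"])
```

For a real file, the `io.BytesIO(...)` argument would be replaced by the path to the downloaded XML.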
I process that into a pandas DataFrame, and then I do some data cleaning with regular expressions to remove HTML tags and code.
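The cleaning could look something like the sketch below. The exact patterns used in the project are not given in the talk, so these regular expressions and the function name `clean_body` are assumptions: code inside `<pre>` and `<code>` elements is dropped, remaining tags are stripped, and HTML entities are decoded.

```python
import html
import re

# Patterns are illustrative assumptions, not the project's actual rules.
CODE_BLOCK = re.compile(r"<pre\b.*?</pre>", re.DOTALL | re.IGNORECASE)
INLINE_CODE = re.compile(r"<code\b.*?</code>", re.DOTALL | re.IGNORECASE)
TAG = re.compile(r"<[^>]+>")
WHITESPACE = re.compile(r"\s+")


def clean_body(raw_html):
    """Return the plain-text prose of a question body, without code."""
    text = CODE_BLOCK.sub(" ", raw_html)   # remove multi-line code blocks
    text = INLINE_CODE.sub(" ", text)      # remove inline code snippets
    text = TAG.sub(" ", text)              # strip any remaining HTML tags
    text = html.unescape(text)             # turn &amp; back into &
    return WHITESPACE.sub(" ", text).strip()


example = (
    "<p>My loop raises <code>IndexError</code> on the last item.</p>\n"
    "<pre><code>for i in range(n + 1): x[i]</code></pre>\n"
    "<p>Why &amp; how do I fix it?</p>"
)
cleaned = clean_body(example)
print(cleaned)
```

With the bodies in a DataFrame column, the same function could be applied row by row, for instance through `Series.map`.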
Now since this is such a huge dataset of almost 20 million questions, we can actually train a deep neural network like Google’s BERT to do feature extraction for us. This is a good choice because it doesn’t just look for keywords, it also considers the phrasing of the title and how the different words fit together. I then put those features into a logistic regression to predict whether each question gets answered or not, and lastly I backpropagate the error using PyTorch to fine-tune the neural network.
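To make the classification step concrete, here is a deliberately small sketch of a logistic-regression head trained by gradient descent on fixed feature vectors. It is not the BERT and PyTorch setup from the talk: the two-number "features" are made-up stand-ins for BERT's output, the feature extractor itself is not fine-tuned, and all names and hyperparameters are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, features):
    """Probability that a question with these features gets answered."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train_logistic_head(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on cross-entropy loss."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in zip(X, y):
            # For cross-entropy loss, d(loss)/d(logit) = prediction - label.
            error = predict_proba(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias


# Made-up feature vectors standing in for title embeddings.
X = [[1.0, 0.2], [0.9, 0.1], [0.1, 0.8], [0.2, 1.0]]
y = [1, 1, 0, 0]  # 1 = answered, 0 = unanswered

weights, bias = train_logistic_head(X, y)
print(predict_proba(weights, bias, [1.0, 0.2]))
print(predict_proba(weights, bias, [0.1, 0.8]))
```

In the fine-tuned version described above, the same error signal would also be propagated back through the network that produces the features, rather than stopping at the weights of this final layer.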
Let’s see how that classifier performs on a test set. This plot shows the distribution of predicted probabilities for the two classes, answered and unanswered questions, and there are actually two main clusters. The good titles have about 80% chance of getting answered, whereas the bad titles are only 50-50. Also the proportion of answered questions increases from left to right, so that tells us the model is working. If you can move your title from bad to good with the help of this tool, that gives you a significant boost in your chance of getting an answer.
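The "proportion of answered questions increases from left to right" check can be sketched as a simple binning exercise: group test questions by predicted probability and compute the fraction that were actually answered in each bin. The numbers below are invented purely to show the mechanics, and `answered_fraction_by_bin` is a hypothetical helper, not code from the project.

```python
def answered_fraction_by_bin(probabilities, answered, n_bins=4):
    """Fraction of answered questions per equal-width probability bin.

    Empty bins are reported as None.
    """
    totals = [0] * n_bins
    hits = [0] * n_bins
    for p, was_answered in zip(probabilities, answered):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        totals[idx] += 1
        hits[idx] += was_answered
    return [hits[i] / totals[i] if totals[i] else None for i in range(n_bins)]


# Invented predictions and outcomes (1 = answered), for illustration only.
predicted = [0.30, 0.40, 0.45, 0.35, 0.55, 0.60, 0.70, 0.65, 0.80, 0.85, 0.90, 0.78]
outcome = [0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0]

fractions = answered_fraction_by_bin(predicted, outcome)
print(fractions)  # [None, 0.25, 0.5, 0.75]
```

If the fractions rise from the low-probability bins to the high-probability ones, the predicted probabilities are at least ordering questions sensibly.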
For the “Suggest a title” button, I can’t use BERT because it doesn’t output text, it just encodes text into numbers. T5 is another model by Google that has both an encoder and a decoder, and the decoder turns those numbers back into text at the end, so it’s a good choice for this task. I also fine-tune T5, but this time only on questions that have an accepted answer.
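One way to prepare the fine-tuning data for such a text-to-text model is sketched below: keep only questions with an accepted answer, and pair each cleaned body (the input) with its title (the target). The `"summarize: "` prefix is T5's conventional summarization prefix; whether the project used it is an assumption, as are the dictionary keys and the function name `build_t5_pairs`.

```python
def build_t5_pairs(questions, prefix="summarize: "):
    """Build (input, target) text pairs from questions with an accepted answer."""
    pairs = []
    for question in questions:
        if not question["has_accepted_answer"]:
            continue  # train only on questions that got an accepted answer
        pairs.append(
            {
                "input_text": prefix + question["body"],
                "target_text": question["title"],
            }
        )
    return pairs


# Made-up questions, for illustration only.
sample_questions = [
    {
        "title": "How do I reverse a list in Python?",
        "body": "I have a list and want its items in the opposite order.",
        "has_accepted_answer": True,
    },
    {
        "title": "help",
        "body": "My code does not work.",
        "has_accepted_answer": False,
    },
]

pairs = build_t5_pairs(sample_questions)
print(pairs)
```

The resulting pairs would then be tokenized and fed to whatever training loop is used for the sequence-to-sequence model.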
The result is that about a third of the time the suggested title scores better than the original title, measured according to the BERT model. That means maybe around a third of the users on Stack Overflow could benefit from this tool.
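The "about a third" figure is a comparison of two scores per question. A minimal sketch of that bookkeeping is shown below; the first pair uses the scores from the example title in the slides (50.4% and 63.9%), while the other two pairs are hypothetical values added only so the arithmetic has something to work on.

```python
def fraction_improved(score_pairs):
    """Share of (original, suggested) pairs where the suggestion scores higher."""
    improved = sum(1 for original, suggested in score_pairs if suggested > original)
    return improved / len(score_pairs)


score_pairs = [
    (0.504, 0.639),  # example title from the slides
    (0.81, 0.62),    # hypothetical: suggestion scores worse
    (0.77, 0.70),    # hypothetical: suggestion scores worse
]

share = fraction_improved(score_pairs)
print(f"{share:.0%} of suggested titles score higher than the original")
```

Note that both scores come from the same classifier, so this measures improvement as judged by the model, not the actual answer rate.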
Putting it all together, this is the example title I used before in the demo. It only has a 50% chance of getting answered, so not great.
The title suggested by T5 is already a big improvement, and often you can actually make it even better by editing it yourself.
For example, the second part feels extraneous to me, and if I take that out, it actually increases the probability by another 10 percentage points (63.9% to 74.0%). So next time you find yourself asking a question on Stack Overflow, consider using this Chrome extension to take a more data-driven approach to choosing a title.
My name is Tennesse Joyce, and I’m finishing my PhD in laser physics at CU Boulder. I do quantum simulations of light-matter interaction, and I’m very familiar with using Python for data analysis and visualization of those simulations. I’m looking to apply those skills to data science problems where I can have a potentially bigger impact.
