Push Data, Pull Data, Present Data
HELLO!
My name is Adrian
Software Engineer, Search & Discovery
Teachers Pay Teachers
Let’s look at data
(output is only a snippet of the top results)
Let’s explore more
What do those labels tell us?
NN: noun, common, singular or mass
[('web', 2), ('story', 2), ('lot', 2), ('site', 2),
 ('data', 2), ('cycle', 2), ('order', 2), ('load', 1),
 ('development', 1), ('code', 1), ('number', 1), ('hand-in-hand', 1),
 ('design', 1), ('morning', 1), ('click', 1), ('sum', 1),
 ('fast', 1), ('increase', 1), ('columnar', 1), ('store', 1),
 ('website', 1), ('product', 1), ('way', 1), ('business', 1),
 ('workflow', 1), ('framework', 1), ('task', 1), ('traffic', 1),
 ('bit', 1), ('day', 1), ('pull', 1), ('pipeline', 1),
 ('tracking', 1), ('anything', 1), ('database', 1), ('light', 1),
 ('analysis', 1), ('server', 1), ('thing', 1), ('amount', 1),
 ('architecture', 1), ('behavior', 1), ('team', 1), ('piece', 1)]
IN: preposition or conjunction, subordinating
[('of', 7), ('in', 6), ('on', 4), ('about', 4),
 ('that', 4), ('like', 3), ('for', 3), ('through', 3),
 ('if', 2), ('from', 1), ('whether', 1), ('into', 1),
 ('between', 1), ('per', 1), ('so', 1), ('with', 1),
 ('by', 1)]
NNS: noun, common, plural
[('data', 16), ('icons', 3), ('users', 2), ('dashboards', 2),
 ('needs', 1), ('referrals', 1), ('things', 1), ('downloads', 1),
 ('sessions', 1), ('decisions', 1), ('scenarios', 1), ('tasks', 1),
 ('examples', 1), ('events', 1), ('issues', 1), ('logs', 1)]
What about the verbs?
VB, VBG: verb, base form; verb, present participle or gerund
[('have', 4), ('presenting', 4), ('be', 3), ('pulling', 3),
 ('want', 2), ('know', 2), ('tracking', 2), ('seeing', 1),
 ('shed', 1), ('using', 1), ('need', 1), ('check', 1),
 ('visualize', 1), ('thinking', 1), ('fix', 1), ('build', 1),
 ('hopefully', 1), ('tell', 1), ('do', 1), ('processing', 1),
 ('deploy', 1), ('pushing', 1), ('coming', 1), ('aggregate', 1),
 ('changing', 1), ('present', 1), ('making', 1), ('make', 1)]
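Frequency lists like the ones above can be approximated with the standard library once the text has been tagged. This is a minimal sketch, assuming the tokens already carry Penn Treebank tags (for example from NLTK, which the references point to); the `tagged` sample and the `count_by_tags` helper are hypothetical and are not the talk's own code.

```python
from collections import Counter

# Hypothetical, hand-tagged sample of (word, tag) pairs; a real run would
# take these from a part-of-speech tagger instead.
tagged = [
    ("pulling", "VBG"), ("data", "NNS"), ("from", "IN"), ("dashboards", "NNS"),
    ("presenting", "VBG"), ("data", "NNS"), ("on", "IN"), ("the", "DT"),
    ("web", "NN"), ("have", "VB"), ("data", "NNS"), ("of", "IN"), ("users", "NNS"),
]


def count_by_tags(pairs, tags):
    """Count lowercased words whose tag is in `tags`, most frequent first."""
    wanted = set(tags)
    counts = Counter(word.lower() for word, tag in pairs if tag in wanted)
    return counts.most_common()


plural_nouns = count_by_tags(tagged, ["NNS"])
verbs = count_by_tags(tagged, ["VB", "VBG"])
print(plural_nouns)
print(verbs)
```

Grouping several tags in one call (as with VB and VBG here) mirrors how the verb slide combines two labels into one list.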
What do we know thus far?
NN and NNS are nouns. Taken together, the most common nouns suggest a text about something technical: 'web', 'story', 'lot', 'site', 'data', 'cycle', 'order', 'data', 'icons', 'users', 'dashboards'.
The verbs that we pulled (VB, VBG) sound ordinary by themselves but are still useful. Most common verbs: 'have', 'presenting', 'be', 'pulling', 'want', 'know', 'tracking'.
Let’s try and make phrases
verb + noun combinations
verb + noun phrases
'have data',
'have icons',
'have users',
'have dashboards',
'have needs',
'presenting data',
'presenting icons',
'presenting users',
'presenting dashboards',
'presenting needs',
'be data',
'be icons',
'be users',
'be dashboards',
'be needs',
'pulling data',
'pulling icons',
'pulling users',
'pulling dashboards',
'pulling needs',
'want data',
'want icons',
'want users',
'want dashboards',
'want needs'
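The phrase list above is every pairing of the five most common verbs with the five most common plural nouns. One way to generate it, as a sketch rather than the talk's actual code, is `itertools.product`; the variable names are illustrative.

```python
from itertools import product

# Top five verbs (VB, VBG) and top five plural nouns (NNS) from the slides.
top_verbs = ["have", "presenting", "be", "pulling", "want"]
top_nouns = ["data", "icons", "users", "dashboards", "needs"]

# Pair every verb with every noun, keeping the verb-major order of the slide.
phrases = [f"{verb} {noun}" for verb, noun in product(top_verbs, top_nouns)]

for phrase in phrases:
    print(phrase)
```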
So what is the text?
Push, Pull, Present
Push: the abstract text was pushed to PyGotham and stored. A more realistic example would be collecting click-tracking data from a web frontend.
Pull: the abstract text was pulled once, then repeatedly drawn on to manipulate, extract, and sift through. Typical jobs like data cleansing and sorting belong in the pull process.
Present: Matplotlib and standard Python library utilities were used to present. Whether you present slides or a Jupyter Notebook, the medium does not matter.
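The three stages can be sketched end to end with click-tracking data. This is an illustration only, assuming an in-memory SQLite database as the store and a plain-text bar chart as the presentation; the table name, columns, and events are all hypothetical.

```python
import sqlite3

# Push: store raw click events as they arrive (in-memory database for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clicks (user_id TEXT, element TEXT)")
events = [  # hypothetical click-tracking events
    ("u1", "preview_button"),
    ("u2", "file_icon"),
    ("u1", "file_icon"),
    ("u3", "file_icon"),
]
conn.executemany("INSERT INTO clicks VALUES (?, ?)", events)
conn.commit()

# Pull: query the stored events and aggregate them.
rows = conn.execute(
    "SELECT element, COUNT(*) FROM clicks GROUP BY element ORDER BY COUNT(*) DESC"
).fetchall()

# Present: a plain-text bar per element; a chart library would work equally well.
for element, count in rows:
    print(f"{element:<15} {'#' * count} ({count})")

conn.close()
```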
Before we continue
Some background
Teachers Pay Teachers
Open marketplace for a community of millions
of educators
TpT Search Notes
◉ Searches are all education-focused
◉ Sellers want to rank higher to sell better
◉ We can’t make search changes to please everyone, but we listen carefully to our community
◉ We A/B test a lot
Our data flow is simple...
(Diagram: event tracking, event data, dashboard, analysis)
An example of exclusion search
◉ “apples bananas -pumpkins”
◉ Searches are education-focused; suffixes are searched for (-ed, -ing, -ly, etc.)
◉ Looking through 519,512,359 historical searches, we found that only a small fraction included a dash: 0.0003905085153132998
Only 202,874 of the 519M historical searches had a dash
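The fraction follows directly from the two counts. The sketch below recomputes it and shows one possible way to flag a query with a dash term; `has_dash_term` and the sample queries are hypothetical, and treating only a leading dash as a match is an assumption, not a description of how TpT counted.

```python
def has_dash_term(query):
    """Return True if any whitespace-separated term starts with a dash."""
    return any(term.startswith("-") and len(term) > 1 for term in query.split())


# Hypothetical sample of search queries.
sample = ["apples bananas -pumpkins", "suffix -ed -ing", "fractions worksheet", "well-being"]
with_dash = [q for q in sample if has_dash_term(q)]
print(with_dash)

# The figures reported on the slides.
total_searches = 519_512_359
dash_searches = 202_874
fraction = dash_searches / total_searches
print(f"{fraction:.6f} of searches, or about {fraction:.3%}")
```

Note that a suffix search such as "-ed" is indistinguishable from an exclusion under this rule, which is the ambiguity the slide points at.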
A/B tests
Decisions driven by data
Problem
Users try to click the large file icon, expecting a link, but it is just an image and not a link.
Hypothesis
If we swap out the large file icon, users will click the Preview button more often instead and have a better user experience.
A/B Variants
A Variant: small icon
B Variant: no icon
Clicks per A/B variant
         Off       Small icon   No icon
clicks   430,656   26,790       0
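A tally like the one above can be produced by counting click events per variant. This is a minimal sketch, assuming the clicks counted are clicks on the file icon (which the zero for the no-icon variant suggests) and assuming each event records the variant the user saw; the event data and variant keys are hypothetical.

```python
from collections import Counter

VARIANTS = ["off", "small_icon", "no_icon"]  # control plus the two variants

# Hypothetical click events on the file icon: (user_id, variant).
icon_clicks = [
    ("u1", "off"), ("u2", "off"), ("u3", "small_icon"), ("u4", "off"),
]

counts = Counter(variant for _, variant in icon_clicks)

# Report every variant, including those with no clicks at all.
for variant in VARIANTS:
    print(f"{variant:<12} {counts[variant]:>8,}")
```

Iterating over the fixed variant list rather than the counter keeps a zero-click variant visible in the output.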
A/B tests in a nutshell
hypothesis → test → analyze
Simple Running Example
Strava
Fetch all weekday run activities that are less than 8km
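The filter itself is simple once the activities are in hand. The sketch below leaves out the Strava API call and works on plain records instead; the `Activity` dataclass, its field names, and the sample data are hypothetical stand-ins, not stravalib's own types.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Activity:
    """Hypothetical stand-in for an activity record fetched from Strava."""
    kind: str
    distance_m: float
    start: datetime


def short_weekday_runs(activities, max_km=8.0):
    """Keep runs that start Monday to Friday and are shorter than max_km."""
    return [
        a for a in activities
        if a.kind == "Run"
        and a.start.weekday() < 5  # Monday is 0, Saturday is 5
        and a.distance_m < max_km * 1000
    ]


activities = [
    Activity("Run", 5200.0, datetime(2016, 7, 13, 7, 30)),   # Wednesday
    Activity("Run", 12000.0, datetime(2016, 7, 14, 7, 30)),  # Thursday, too long
    Activity("Run", 6000.0, datetime(2016, 7, 16, 9, 0)),    # Saturday
    Activity("Ride", 7000.0, datetime(2016, 7, 15, 18, 0)),  # Friday, not a run
]
for a in short_weekday_runs(activities):
    print(a.kind, a.distance_m / 1000, a.start.date())
```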
References
◉ Bird, Steven, Edward Loper and Ewan Klein
(2009), Natural Language Processing with Python.
O’Reilly Media Inc.
◉ stravalib: Python Strava API client
https://github.com/hozn/stravalib
◉ Examples seen in this talk http://bit.ly/29zQDBb
(https://github.com/drincruz/PyGotham-2016)
We’re hiring!
https://www.teacherspayteachers.com/Careers
THANKS!
Any questions?
You can find me at
{github, twitter, irc, etc.} @drincruz
