SlideShare a Scribd company logo
1 of 83
Short URLs, Big Fun:
Understanding the World in Realtime


            Hilary Mason
         Chief Scientist, bitly

              @hmason
              h@bit.ly
http://www.pcworld.com/article/223409/move_over_dr_soong_
girls_can_build_android_apps_too.html




                        http://bit.ly/hOnbWg
[fireplace]
How do we change the world?
Can we understand the world, first?
Big Data
Data
10s of millions of URLs per day
100s of millions of clicks per day



10s of billions of URLs
encodes
{"g": "zalAU0",
 "i": "173.213.X.X",
"h": "zalAU0",
"l": "bitly",
"u": "http://www.amazon.com/Country-Life-Cal-Mag-
Potassium-Target-
Tablets/dp/B0001VUZ3A?SubscriptionId=AKIAJGA7
AAB6QE7WENSQ&tag=mycellrevi-
20&linkCode=sp1&camp=2025&creative=165953&cr
eativeASIN=B0001VUZ3A", "t": 1328266799,
"_id": "4f2bbe2f-0035d-063a1-3d1cf10a"}
decode
{"a": "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)",
"c": "US",
"nk": 1,
"tz": "America/New_York",
"gr": "NY",
"g": "xNaZ9h",
"i": "98.118.X.X",
"h": "wXxuKW",
"k": "4eefe4be-003e4-X-X",
"l": "moma",
"al": "en-US", "hh":
"bit.ly",
"r":
"http://www.facebook.com/l.php?u=http%3A%2F%2Fbit.ly%2FwXxuKW&h=rAQG_ZZ2G
AQGQth1IOej-9_KHmbpEvh6FlllZjsAqg6A7Rw",
"u": "http://www.brainpickings.org/index.php/2012/02/02/jackson-pollock-father-letter/",
"t": 1328272481,
"hc": 1328232072,
"cy": "East Amherst",
"ll": [43.044101715087891, -78.694900512695312]}
a link
•   URL
•   Content
•   Ref distribution
•   Geo distribution
•   Language
•   Key phrases
•   Topic
Data Science?




Analytics   Science
Data Science?




Things you can                   Things you
just count.                      can’t.
Data scientists?

     engineering
                                math

                     nerds


           nerds               nerds



                     nerds
comp sci
                             hacking




                   awesome nerds
bitly science team!
What can we learn from a lot of
 people talking to each other?
A few things that we can count...
How do people use different
        devices?
What happens on the internet when
       society isn’t stable?
Revolution.
(Silly Things on the Internet)
the cutest kitten
A few things that we can count...

            cleverly.
What spoken languages are in a
           page?
raw data
"es"
"en-us,en;q=0.5"
"pt-BR,pt;q=0.8,en-US;q=0.6,en;q=0.4"
"en-gb,en;q=0.5"
"en-US,en;q=0.5"
"es-es,es;q=0.8,en-us;q=0.5,en;q=0.3”
"de, en-gb;q=0.9, en;q=0.8"
entropy calculation
def ghash2lang(g, Ri, min_count=3, max_entropy=0.2):
  ""”
  returns the majority vote of a langauge for a given hash
  ""”
  lang = R.zrevrange(g,0,0)[0]
  # let's calculate the entropy!
  # possible languages
  x = R.zrange(g,0,-1)
  # distribution over those languages
  p = np.array([R.zscore(g,langi) for langi in x])
  p /= p.sum()
  # info content
  I = [pi*np.log(pi) for pi in p]
  # entropy: smaller the more certain we are! - i.e. the lower our surprise
  H = -sum(I)/len(I) #in nats!
  # note that this will give a perfect zero for a single count in one language
  # or for 5K counts in one language. So we also need the count..
               count = R.zscore(g,lang)
  if count < min_count and H > max_entropy:
      return lang, count
  else:
      return None, 1
http://4sq.com/96kc1O
What’s the context
 around a link?
[demo]
Things we have to think about.

           (science)
What’s a human?
normal click distributions
abnormal click distributions
Organic vs Inorganic?
AT SCALE
1. Research offline

2. Do fancy math – find the shortcuts

3. Design infrastructure

4. Re-design to run at scale and speed
Realtime Search
Realtime Search
Attributes calculated either at index time or
query time.

Rankings can vary by second.
[demo]
What are people paying
attention to right now?
actual rate of clicks on phrases
vs
expected rate of clicks on phrases
Dragoneye
We calculate clickrate with a sort of moving
average:




          where
Dragoneye
We represent as a sum of delta spikes.

This simplifies to:
Dragoneye
Choosing    is important.

It must be interpretable, and smooth (but not
too smooth).

We use a distribution for that is a function
that sums to 1. The function is 0 at the
origin.
[demo]
philosophy
simple math > fancy math
How do we know
when we’ve won?
How do we communicate what
  we’ve learned effectively?
Ask the crazy questions.
Thank you!




h@bit.ly
@hmason

More Related Content

Similar to Short URLs, Big Fun

Data visualization for development
Data visualization for developmentData visualization for development
Data visualization for developmentSara-Jayne Terp
 
OpenFest 2012 : Leveraging the public internet
OpenFest 2012 : Leveraging the public internetOpenFest 2012 : Leveraging the public internet
OpenFest 2012 : Leveraging the public internettkisason
 
Just the basics_strata_2013
Just the basics_strata_2013Just the basics_strata_2013
Just the basics_strata_2013Ken Mwai
 
Webinar: Modern Techniques for Better Search Relevance with Fusion
Webinar: Modern Techniques for Better Search Relevance with FusionWebinar: Modern Techniques for Better Search Relevance with Fusion
Webinar: Modern Techniques for Better Search Relevance with FusionLucidworks
 
The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?Frank van Harmelen
 
Understanding Artificial Intelligence
Understanding Artificial Intelligence Understanding Artificial Intelligence
Understanding Artificial Intelligence St. Petersburg College
 
October hug
October hugOctober hug
October hughuguk
 
Python 101 for Data Science to Absolute Beginners
Python 101 for Data Science to Absolute BeginnersPython 101 for Data Science to Absolute Beginners
Python 101 for Data Science to Absolute BeginnersSai Linn Thu
 
Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273Abutest
 
Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273Abutest
 
Artificial Assistants: How can I help you? by Christopher Currin
Artificial Assistants: How can I help you? by Christopher CurrinArtificial Assistants: How can I help you? by Christopher Currin
Artificial Assistants: How can I help you? by Christopher CurrinChristopher Currin
 
Big data and APIs for PHP developers - SXSW 2011
Big data and APIs for PHP developers - SXSW 2011Big data and APIs for PHP developers - SXSW 2011
Big data and APIs for PHP developers - SXSW 2011Eli White
 
Dark Data: A Data Scientists Exploration of the Unknown by Rob Witoff PyData ...
Dark Data: A Data Scientists Exploration of the Unknown by Rob Witoff PyData ...Dark Data: A Data Scientists Exploration of the Unknown by Rob Witoff PyData ...
Dark Data: A Data Scientists Exploration of the Unknown by Rob Witoff PyData ...PyData
 
Intro to Python for Data Science
Intro to Python for Data ScienceIntro to Python for Data Science
Intro to Python for Data ScienceTJ Stalcup
 
Web technology: Web search
Web technology: Web searchWeb technology: Web search
Web technology: Web searchVictor de Boer
 
Intro to Python for Data Science
Intro to Python for Data ScienceIntro to Python for Data Science
Intro to Python for Data ScienceTJ Stalcup
 
Get connected with python
Get connected with pythonGet connected with python
Get connected with pythonJan Kroon
 
Pandas, Data Wrangling & Data Science
Pandas, Data Wrangling & Data SciencePandas, Data Wrangling & Data Science
Pandas, Data Wrangling & Data ScienceKrishna Sankar
 

Similar to Short URLs, Big Fun (20)

Data visualization for development
Data visualization for developmentData visualization for development
Data visualization for development
 
OpenFest 2012 : Leveraging the public internet
OpenFest 2012 : Leveraging the public internetOpenFest 2012 : Leveraging the public internet
OpenFest 2012 : Leveraging the public internet
 
Just the basics_strata_2013
Just the basics_strata_2013Just the basics_strata_2013
Just the basics_strata_2013
 
Webinar: Modern Techniques for Better Search Relevance with Fusion
Webinar: Modern Techniques for Better Search Relevance with FusionWebinar: Modern Techniques for Better Search Relevance with Fusion
Webinar: Modern Techniques for Better Search Relevance with Fusion
 
Progressing and enhancing
Progressing and enhancingProgressing and enhancing
Progressing and enhancing
 
The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?
 
Understanding Artificial Intelligence
Understanding Artificial Intelligence Understanding Artificial Intelligence
Understanding Artificial Intelligence
 
October hug
October hugOctober hug
October hug
 
Python 101 for Data Science to Absolute Beginners
Python 101 for Data Science to Absolute BeginnersPython 101 for Data Science to Absolute Beginners
Python 101 for Data Science to Absolute Beginners
 
Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273A
 
Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273A
 
Artificial Assistants: How can I help you? by Christopher Currin
Artificial Assistants: How can I help you? by Christopher CurrinArtificial Assistants: How can I help you? by Christopher Currin
Artificial Assistants: How can I help you? by Christopher Currin
 
Big data and APIs for PHP developers - SXSW 2011
Big data and APIs for PHP developers - SXSW 2011Big data and APIs for PHP developers - SXSW 2011
Big data and APIs for PHP developers - SXSW 2011
 
Dark Data: A Data Scientists Exploration of the Unknown by Rob Witoff PyData ...
Dark Data: A Data Scientists Exploration of the Unknown by Rob Witoff PyData ...Dark Data: A Data Scientists Exploration of the Unknown by Rob Witoff PyData ...
Dark Data: A Data Scientists Exploration of the Unknown by Rob Witoff PyData ...
 
Intro to Python for Data Science
Intro to Python for Data ScienceIntro to Python for Data Science
Intro to Python for Data Science
 
Web technology: Web search
Web technology: Web searchWeb technology: Web search
Web technology: Web search
 
Intro to Python for Data Science
Intro to Python for Data ScienceIntro to Python for Data Science
Intro to Python for Data Science
 
Get connected with python
Get connected with pythonGet connected with python
Get connected with python
 
Machine Learning for dummies!
Machine Learning for dummies!Machine Learning for dummies!
Machine Learning for dummies!
 
Pandas, Data Wrangling & Data Science
Pandas, Data Wrangling & Data SciencePandas, Data Wrangling & Data Science
Pandas, Data Wrangling & Data Science
 

More from Hilary Mason

PyCon 2011 Keynote
PyCon 2011 KeynotePyCon 2011 Keynote
PyCon 2011 KeynoteHilary Mason
 
Machine Learning for Web Data
Machine Learning for Web DataMachine Learning for Web Data
Machine Learning for Web DataHilary Mason
 
A Data-driven Look at the Realtime Web
A Data-driven Look at the Realtime WebA Data-driven Look at the Realtime Web
A Data-driven Look at the Realtime WebHilary Mason
 
IgniteNYC: How to Replace Yourself With a Very Small Shell Script
IgniteNYC: How to Replace Yourself With a Very Small Shell ScriptIgniteNYC: How to Replace Yourself With a Very Small Shell Script
IgniteNYC: How to Replace Yourself With a Very Small Shell ScriptHilary Mason
 
Practical Data Analysis in Python
Practical Data Analysis in PythonPractical Data Analysis in Python
Practical Data Analysis in PythonHilary Mason
 
Have data? What now?!
Have data? What now?!Have data? What now?!
Have data? What now?!Hilary Mason
 
JWU Guest Talk: JavaScript and AJAX
JWU Guest Talk: JavaScript and AJAXJWU Guest Talk: JavaScript and AJAX
JWU Guest Talk: JavaScript and AJAXHilary Mason
 
Analytics for Virtual Worlds
Analytics for Virtual WorldsAnalytics for Virtual Worlds
Analytics for Virtual WorldsHilary Mason
 
Experiential Learning in Second Life
Experiential Learning in Second LifeExperiential Learning in Second Life
Experiential Learning in Second LifeHilary Mason
 
Virtual Worlds in Education
Virtual Worlds in EducationVirtual Worlds in Education
Virtual Worlds in EducationHilary Mason
 

More from Hilary Mason (10)

PyCon 2011 Keynote
PyCon 2011 KeynotePyCon 2011 Keynote
PyCon 2011 Keynote
 
Machine Learning for Web Data
Machine Learning for Web DataMachine Learning for Web Data
Machine Learning for Web Data
 
A Data-driven Look at the Realtime Web
A Data-driven Look at the Realtime WebA Data-driven Look at the Realtime Web
A Data-driven Look at the Realtime Web
 
IgniteNYC: How to Replace Yourself With a Very Small Shell Script
IgniteNYC: How to Replace Yourself With a Very Small Shell ScriptIgniteNYC: How to Replace Yourself With a Very Small Shell Script
IgniteNYC: How to Replace Yourself With a Very Small Shell Script
 
Practical Data Analysis in Python
Practical Data Analysis in PythonPractical Data Analysis in Python
Practical Data Analysis in Python
 
Have data? What now?!
Have data? What now?!Have data? What now?!
Have data? What now?!
 
JWU Guest Talk: JavaScript and AJAX
JWU Guest Talk: JavaScript and AJAXJWU Guest Talk: JavaScript and AJAX
JWU Guest Talk: JavaScript and AJAX
 
Analytics for Virtual Worlds
Analytics for Virtual WorldsAnalytics for Virtual Worlds
Analytics for Virtual Worlds
 
Experiential Learning in Second Life
Experiential Learning in Second LifeExperiential Learning in Second Life
Experiential Learning in Second Life
 
Virtual Worlds in Education
Virtual Worlds in EducationVirtual Worlds in Education
Virtual Worlds in Education
 

Recently uploaded

Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 

Recently uploaded (20)

Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 

Short URLs, Big Fun

Editor's Notes

  1. This is going to be a talk for people who love the internet.
  2. The true story of bitly, engineering, data science, loveHow to do data science at scaleBuilding teams and keeping people happyClever tricks
  3. Thomas Karaganis at MS Research 1% of new URLs per day
  4. Shortened links get shared on different platforms and methods
  5. Messages move across platforms in complicated ways
  6. We have a lot of growing up to do
  7. http://www.flickr.com/photos/wanderingnome/73328967/sizes/l/in/photostream/Philosphy, science, engineering, cool tricks
  8. Pow! Surprise! Here we are!
  9. …first, we understand it
  10. …first, we understand it
  11. Asking questions.
  12. http://www.flickr.com/photos/32443746@N07/4753829490/
  13. Egypt.
  14. Tunisia.
  15. …and we can do the same thing for geo data
  16. Studied offline, using hadoopBuild a supervised classifier over timeseriesBuild a random forest ensemble decision tree classifier
  17. The data system fortune cookie gameCreative commons: http://www.flickr.com/photos/mzn37/308048794/sizes/o/in/photostream/
  18. The simplification is important for three reasons:1) A continuous function of time that simplifies to \\phi2) it’s linear, so the sum of the click rates on each page with a phrase is the click rate per phrase3) IT’S FAST
  19. The 0 at the origin insures that we have seen sustained click rates on a phrase before we think it’s anything useful.