Scaling CREDIBLE Content
A Case Study
by Joe Griffin
Co-CEO, iAcquire
We do Scalable Content Marketing
Problem: Finding CREDIBLE Content
Producers at Scale is Tough
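Within search results, information tied to verified online profiles will be ranked
higher than content without such verification, which will result in most users
naturally clicking on the top (verified) results. The true cost of remaining
anonymous, then, might be irrelevance.
– The New Digital Age by Eric Schmidt, Chairman of Google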
Okay, How Do We Find These Authors?
Operation Find and Rank Authors
- Round 1 – Low Scale
We Stalked Big Publishers That
Use Google Authorship
Nerdy Data Crawls Millions of
Websites & You Can Search HTML
Example – 55,000 Websites Found Using
the Google Maps API
Blekko Published a WebGrep on
Rel=Author – Top 500 Free
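The Round 1 sources above all come down to searching page HTML for rel=author markup. As a rough illustration only (this is not Nerdy Data's or Blekko's code, and the sample page and URLs are made up), a small Python sketch using the standard-library HTML parser might look like this:

```python
from html.parser import HTMLParser


class RelAuthorParser(HTMLParser):
    """Collects href values from <a> and <link> tags whose rel includes "author"."""

    def __init__(self):
        super().__init__()
        self.author_links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        # rel is a space-separated token list, e.g. rel="author nofollow"
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        href = attr_map.get("href")
        if "author" in rel_tokens and href:
            self.author_links.append(href)


def find_author_links(html):
    """Return every authorship link found in one HTML document."""
    parser = RelAuthorParser()
    parser.feed(html)
    return parser.author_links


# Hypothetical sample page, for illustration only.
sample_html = """
<html><head><link rel="author" href="https://plus.google.com/1234567890"></head>
<body>
  <a href="/about">About</a>
  <a rel="author nofollow" href="https://example.com/authors/jane">Jane</a>
</body></html>
"""
print(find_author_links(sample_html))
```

Running the same function over a list of crawled pages is one plausible way to build an author list by hand at low scale.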
We Created Big Lists of Credible
Authors (like 100,000)
But, We Needed to Judge Their Relative
Proliferation…
Because Google Authorship is Becoming “People PageRank”
Necessity is the Mother of Invention
Operation Find and Rank Authors
- Round 2 – Big Scale
We Tapped into Big Data
& Built an Application
Big Data + Software Engineering = Scale
The Common Crawl Became a Seed Source
41 Million Domains, 4 Billion Pages and 210 Terabytes
GNIP Served as a Real-Time Engine
Firehoses from Twitter, Tumblr, WordPress, and more…
Example – You Can Pull Up to 2 Years
of Historical Tweets
Firehoses from Twitter, Tumblr, WordPress, and more…
We Used Amazon CloudSearch to Create
an Index of the Domains
Specifically, URLs that have authorship markup.
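The slides do not show how the index was structured, so the following is only an in-memory stand-in for the idea of a per-domain index of URLs carrying authorship markup. It makes no Amazon CloudSearch calls, and the function name and sample URLs are hypothetical:

```python
from collections import defaultdict
from urllib.parse import urlparse


def build_domain_index(urls_with_authorship):
    """Group URLs that carry authorship markup by their (lowercased) domain."""
    index = defaultdict(set)
    for url in urls_with_authorship:
        domain = urlparse(url).netloc.lower()
        if domain:  # skip strings that do not parse as absolute URLs
            index[domain].add(url)
    return index


# Hypothetical input: URLs already known to contain authorship markup.
urls = [
    "https://Example.com/posts/1",
    "https://example.com/posts/2",
    "https://blog.example.org/about",
    "not-a-url",
]
index = build_domain_index(urls)
for domain in sorted(index):
    print(domain, len(index[domain]))
```

A hosted search service would replace the dictionary here; the grouping step is the part the sketch is meant to show.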
We Used Giraph to Rank the Authors
Facebook Uses Giraph Too
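The deck does not say which algorithm ran on Giraph. Since it describes authorship as "People PageRank", a PageRank-style iteration over an author graph is one plausible reading. The sketch below is plain single-machine Python, not Giraph code, and the graph (an edge from one author to another whose content they link to) is an assumption made for illustration:

```python
def rank_authors(links, damping=0.85, iterations=50):
    """PageRank-style scores for an author graph.

    links maps each author to the list of authors they link to
    (a hypothetical edge definition; the slides do not specify one).
    """
    authors = set(links)
    for targets in links.values():
        authors.update(targets)
    if not authors:
        return {}
    n = len(authors)
    rank = {author: 1.0 / n for author in authors}
    for _ in range(iterations):
        # Authors with no outbound links spread their score evenly over everyone.
        dangling = sum(rank[a] for a in authors if not links.get(a))
        new_rank = {a: (1 - damping) / n + damping * dangling / n for a in authors}
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Made-up example graph.
links = {
    "alice": ["carol"],
    "bob": ["carol"],
    "carol": ["alice"],
    "dave": [],
}
scores = rank_authors(links)
for author in sorted(scores, key=scores.get, reverse=True):
    print(author, round(scores[author], 4))
```

A vertex-centric system such as Giraph would distribute the same per-author update across many machines, which is what makes a graph of this size tractable.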
The Result
VoiceGraph™ and VoiceRank™
Follow iAcquire™ and ClearVoice™ for
Tools and Resources
You Stay Classy Manhattan
Learn how iAcquire scaled identification of credible content producers - with credibility being based on authorship proliferation.

Published in: Technology, Education
Scaling Credible Content

  1. Scaling CREDIBLE Content A Case Study by Joe Griffin Co-CEO, iAcquire
  2. We do Scalable Content Marketing
  3. Problem: Finding CREDIBLE Content Producers at Scale is Tough
  4. Within search results, information tied to verified online profiles will be ranked higher than content without such verification, which will result in most users naturally clicking on the top (verified) results. The true cost of remaining anonymous, then, might be irrelevance. – The New Digital Age by Eric Schmidt, Chairman of Google
  5. Okay, How Do We Find These Authors?
  6. Operation Find and Rank Authors - Round 1 – Low Scale
  7. We Stalked Big Publishers That Use Google Authorship
  8. Nerdy Data Crawls Millions of Websites & You Can Search HTML
  9. Example – 55,000 Websites Found Using the Google Maps API
  10. Blekko Published a WebGrep on Rel=Author – Top 500 Free
  11. We Created Big Lists of Credible Authors (like 100,000)
  12. But, We Needed to Judge Their Relative Proliferation… Because Google Authorship is Becoming “People PageRank”
  13. Necessity is the Mother of Invention
  14. Operation Find and Rank Authors - Round 2 – Big Scale
  15. We Tapped into Big Data & Built an Application Big Data + Software Engineering = Scale
  16. The Common Crawl Became a Seed Source 41 Million Domains, 4 Billion Pages and 210 Terabytes
  17. GNIP Served as a Real-Time Engine Firehoses from Twitter, Tumblr, WordPress, and more…
  18. Example – You Can Pull Up to 2 Years of Historical Tweets Firehoses from Twitter, Tumblr, WordPress, and more…
  19. We Used Amazon CloudSearch to Create an Index of the Domains Specifically, URLs that have authorship markup.
  20. We Used Giraph to Rank the Authors Facebook Uses Giraph Too
  21. The Result
  22. VoiceGraph™ and VoiceRank™
  23. Follow iAcquire™ and ClearVoice™ for Tools and Resources
  24. You Stay Classy Manhattan