
Using TF-IDF for Lateral Keyword Research

The full version of the slide deck from my Pubcon 2015 presentation. Term frequency can be used to isolate significant keywords used by your competitors.

Published in: Marketing

  1. Lateral Keywords for Writers (When the Google Keyword Planner isn’t enough)
     Presented by: Ash Nallawalla, SEO Strategist, Suncorp Insurance
     #pubcon http://ash.nallawalla.com @ashnallawalla
  2. About Ash
     • SEO consultant, currently at Suncorp Insurance (eight brands)
     • Moderator at Webmasterworld forums
     • Previously in enterprise SEO roles, notably at NAB, ANZ Bank, Ubank, Optus and Yellow Pages
  3. KEYWORD RESEARCH BASICS
     No longer enough
  4. Everybody’s doing it
     • Most of us use the Google Keyword Planner to get a feel for the most searched terms.
     • Our competitors do that too.
  5. Search volume alone isn’t enough
     • But we need a starting point.
  6. Give researched keywords to writers
     • A keyword matrix ensures a good spread of keywords across the site and saves the writer from guessing keywords.
  7. Search intent is important
     • Intent can be Navigational, Informational, Commercial or Transactional.
     • Yes, check out some tools.
  8. CHECK OUT THE COMPETITION
     So who is winning in your niche?
  9. Start with a ranking check
     • Use your preferred rank-checking tool to see who is ranking for each keyword.
     • We want to check which company’s content consistently comes up on Page 1 for a number of similar keywords.
  10. Count ranking keywords
     • First get the count of keywords that rank.
  11. Derive the mean position
     • Get the “average” position for each company.
  12. Invert the mean position
     • “Inverting” means subtracting the mean position from 10, so that a higher number denotes a better ranking.
  13. Derive a “score”
     • Score (say Allianz) = (C3*C5)+(D3*D5)
     • Score = (6.1 × 21) + (4.0 × 3) ≈ 141
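
The counting, averaging, inverting and scoring steps above could be sketched in Python as follows. This is only an illustration, not the spreadsheet from the deck: the function names are hypothetical, and it assumes that each spreadsheet column corresponds to one group of keywords for which the company ranks, with the score being the sum over groups of (inverted mean position × number of ranking keywords).

```python
from statistics import mean


def inverted_mean(positions: list[int]) -> float:
    """Subtract the mean ranking position from 10, so higher is better."""
    return 10 - mean(positions)


def visibility_score(groups: list[list[int]]) -> float:
    """Sum, over keyword groups, of inverted mean position x keyword count.

    Each inner list holds the ranking positions a company achieved for the
    keywords in one group (a hypothetical stand-in for one spreadsheet column).
    Groups with no ranking keywords contribute nothing.
    """
    return sum(inverted_mean(p) * len(p) for p in groups if p)


if __name__ == "__main__":
    # Made-up positions for one company across two keyword groups.
    example = [[1, 3, 5], [6, 6]]
    print(visibility_score(example))
```

Computing the score per company and sorting the results gives one possible answer to the "which competitor is more visible?" question.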
  14. Ranking spreadsheet
     • “Visibility” is important, but how do you measure it?
     • Which competitor is more visible?
  15. The content writer’s dilemma
     • The spreadsheet shows the “winners”, not the “losers”. We can see who is using the most searched phrases.
     • Others are using the same tools.
     • So what content are they using that you are not?
     • (Note: Ranking involves many other factors, and this is also about selling!)
  16. DEEP DIVE – TERM FREQUENCY
     Looking for that lightbulb moment?
  17. Hat Tip to Eric Enge
     • See his articles on Moz:
       – Just Google “Eric Enge TF-IDF” for the URLs. (Click the image below if you have the PPT.)
  18. Inverse Document Frequency
     • Inverse Document Frequency: a measure of the “rareness” of a term, so we weight down the stop words and scale up the rare ones.
       $\mathrm{IDF}(t) = \log\left(\frac{\text{total number of documents}}{\text{number of documents containing term } t}\right)$
     • Refer to Eric’s second article for more details.
  19. TF-IDF example
     • Say a document with 100 words contains the term “cat” 3 times.
     • The TF is 3/100 × 0.5 + 0.5 = 0.515
     • Google has, say, 30 trillion pages and the word “cat” appears in 1.7 billion pages.
     • The IDF is the natural log of 30,000,000,000,000 / 1,700,000,000, i.e. ln(17,647) ≈ 9.778
     • The TF-IDF (or TF*IDF) weight is the product: 0.515 × 9.778 ≈ 5.036
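
The arithmetic in this example can be recomputed from its stated inputs in a few lines of Python. A minimal sketch, assuming a natural logarithm (which the base of 2.718281828 in the example implies) and the example's simplified TF, which divides by the document's total word count; the function and variable names are illustrative only.

```python
import math


def augmented_tf(term_count: int, denominator: int) -> float:
    """0.5 + 0.5 * (term count / denominator), as used in the example."""
    return 0.5 + 0.5 * term_count / denominator


def idf(total_documents: float, documents_with_term: float) -> float:
    """Natural log of (all documents / documents containing the term)."""
    return math.log(total_documents / documents_with_term)


# Inputs stated in the example: 3 occurrences of "cat" in 100 words,
# 30 trillion pages in total, 1.7 billion pages containing "cat".
tf_cat = augmented_tf(3, 100)
idf_cat = idf(30e12, 1.7e9)
tf_idf_cat = tf_cat * idf_cat

print(f"TF = {tf_cat:.3f}, IDF = {idf_cat:.3f}, TF-IDF = {tf_idf_cat:.3f}")
```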
  20. Term Frequency – Two ways to measure
     • Term Frequency: how frequently the term appears in a document (incl. stop words)
       $\mathrm{TF}(t) = 0.5 + 0.5 \times \frac{\text{raw term count}}{\text{count for most frequent term on page}}$
       Or
       If raw term count > 0, $\mathrm{TF} = 1 + \log_{10}(\text{raw term count})$
       If raw term count = 0, $\mathrm{TF} = 0$
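
Both ways of measuring term frequency translate directly into code. The sketch below uses hypothetical function names; returning 0.0 for a page with no terms at all is an assumption made here to avoid dividing by zero, not something the slide specifies.

```python
import math


def tf_augmented(raw_count: int, max_count_on_page: int) -> float:
    """0.5 + 0.5 * (raw count / count of the most frequent term on the page)."""
    if max_count_on_page <= 0:
        return 0.0  # assumption: an empty page scores zero
    return 0.5 + 0.5 * raw_count / max_count_on_page


def tf_log_scaled(raw_count: int) -> float:
    """1 + log10(raw count) when the term occurs, otherwise 0."""
    return 1 + math.log10(raw_count) if raw_count > 0 else 0.0
```

Note that the first variant never drops below 0.5 on a non-empty page, even for a term that does not occur, whereas the second gives absent terms a score of 0.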
  21. Getting back to Term Frequency…
     • Search for your keyword.
     • Visit the page(s) of the highest-ranking company and the next four top rankers.
     • Note their URLs.
     • Note the URL of your own page.
     • Do a Term Frequency analysis and, perhaps, an Inverse Document Frequency analysis (TF-IDF).
  22. Get n-grams
     • Use one of the old “keyword density” tools to get 1-word, 2-word and 3-word phrases from your page and the five competitors.
     • Collate and de-dupe the n-grams, then place them in Eric’s spreadsheet.
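
If no keyword density tool is at hand, the n-gram step can be approximated with a short script. This is a rough sketch rather than a replacement for any particular tool: it assumes the visible text of each page has already been extracted into a plain string, uses a deliberately simple tokenizer, and all names are hypothetical.

```python
import re
from collections import Counter


def ngrams(text: str, n: int) -> Counter:
    """Count the n-word phrases in a block of plain text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))


def collate(pages: dict[str, str], sizes: tuple[int, ...] = (1, 2, 3)) -> dict[str, Counter]:
    """Map each page label to the combined counts of its 1-, 2- and 3-grams."""
    result = {}
    for label, text in pages.items():
        counts = Counter()
        for n in sizes:
            counts.update(ngrams(text, n))
        result[label] = counts
    return result


def deduped_terms(counts_by_page: dict[str, Counter]) -> list[str]:
    """Sorted list of every distinct n-gram seen on any page."""
    return sorted(set().union(*counts_by_page.values()))
```

Calling `collate` on a dictionary holding your own page text and the five competitor texts, then `deduped_terms` on the result, yields the per-page counts and the de-duplicated term list that the spreadsheet step expects.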
  23. TF – one worksheet per keyword
     • Six sets of n-grams on the left and the de-duped list in the grey zone. (Eric’s spreadsheet)
  24. TF – close-up
     • Get a count of each word or phrase used by the top five pages and by yours.
  25. TF – close-up
     • Next, do the TF number crunching, i.e.
       $\mathrm{TF}(t) = 0.5 + 0.5 \times \frac{\text{raw term count}}{\text{count for most frequent term on page}}$
  26. TF – close-up
     • Use conditional formatting to pick a range of TF values and compare your TF column with the average TF of the competitors.
     • You will now see “significant” words to consider.
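
The comparison that conditional formatting highlights in the spreadsheet can also be expressed in code. The sketch below is one possible reading of the step, not the author's spreadsheet logic: it assumes per-page term counts are available as `Counter` objects, scores them with the 0.5 + 0.5 × count / max-count formula, and flags terms whose average competitor TF exceeds yours by at least `min_gap`, a hypothetical threshold chosen only for illustration.

```python
from collections import Counter
from statistics import mean


def tf_scores(counts: Counter, vocabulary: list[str]) -> dict[str, float]:
    """Augmented TF for every term in the vocabulary on one page."""
    top = max(counts.values(), default=0)
    if top == 0:
        return {term: 0.0 for term in vocabulary}  # assumption: empty page scores zero
    return {term: 0.5 + 0.5 * counts[term] / top for term in vocabulary}


def significant_terms(own: Counter, competitors: list[Counter],
                      min_gap: float = 0.1) -> dict[str, float]:
    """Terms where the competitors' average TF beats yours by min_gap or more.

    Expects at least one competitor. Returns term -> gap, largest gap first.
    """
    vocabulary = sorted(set(own).union(*competitors))
    own_tf = tf_scores(own, vocabulary)
    competitor_tf = [tf_scores(c, vocabulary) for c in competitors]
    gaps = {}
    for term in vocabulary:
        gap = mean(scores[term] for scores in competitor_tf) - own_tf[term]
        if gap >= min_gap:
            gaps[term] = round(gap, 3)
    return dict(sorted(gaps.items(), key=lambda item: item[1], reverse=True))
```

The terms returned are candidates for a writer to consider, in the same spirit as the highlighted cells; the threshold would need tuning against real pages.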
  27. The extracted words
     • The pages I was analysing did not contain some “obvious” words, which is the beauty of this technique.
  28. The future?
     • Working on a web version
     • Takes minutes, not hours.
     • Beta: http://www.lateralkeywords.com
  29. Summary
     • Keyword research requires more than the Google tool. Do lateral keyword research.
     • Do consider Term Frequency at least. Also look into Inverse Document Frequency.
     • Download full PPT from: http://www.trainsem.com/pubcon
     Ash Nallawalla
     • Twitter: @ashnallawalla
     • Email: ash@nallawalla.com
     • Web: http://ash.nallawalla.com
