5. Google Penguin aims to remove from the search results,
or devalue, sites that have been trying to game Google by
buying or otherwise “unnaturally acquiring” links to their website.
6. Rolled out on April 24, 2012.
The goal of Penguin is to reduce the trust that Google
has in sites that have cheated by creating unnatural
backlinks to gain an advantage in the Google search results.
Penguin's primary focus is unnatural links, and links
are known to be by far the most important thing to look at.
7. “Penguin is strictly algorithmic in
nature. It cannot be lifted by Google
manually, regardless of the reason why
those links might be pointing to a website.”
9. PENGUIN USES
ADVANCED STATISTICAL ANALYSIS.
PENGUIN IS NOT A TRUST RANK ALGORITHM.
PENGUIN DOES NOT USE MACHINE LEARNING.
PENGUIN WORKS BY STATISTICAL ANALYSIS, NOT MACHINE LEARNING.
10. Penguin identifies link spam using link-based
statistics.
Let’s look at some examples together to wrap
our heads around what it’s really all about.
11. 1. Percentage of inbound links containing an anchor text
Example: in a link to https://www.enablerspace.com that is displayed as
"Check out our Digital Services", the displayed text is the anchor text.
WEBSITE A: 100 LINKS TOTAL
- 20 LINKS WITH ANCHOR TEXT
- 80 LINKS WITHOUT ANCHOR TEXT
WEBSITE B: 100 LINKS TOTAL
- 60 LINKS WITH ANCHOR TEXT
- 40 LINKS WITHOUT ANCHOR TEXT
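The anchor-text statistic for the two example sites can be worked out with a few lines of Python. This is only an illustrative sketch: the function name `anchor_text_share` is made up for this example, and the slides do not say what percentage Penguin treats as suspicious.

```python
def anchor_text_share(links_with_anchor: int, total_links: int) -> float:
    """Return the percentage of inbound links that carry anchor text."""
    if total_links <= 0:
        raise ValueError("total_links must be positive")
    return 100.0 * links_with_anchor / total_links


# Figures taken from the slide: both sites have 100 inbound links.
site_a = anchor_text_share(20, 100)
site_b = anchor_text_share(60, 100)

print(f"Website A: {site_a:.0f}% of inbound links have anchor text")
print(f"Website B: {site_b:.0f}% of inbound links have anchor text")
```

The two sites have the same number of links, so only the share differs, which is the kind of signal a link-based statistic can compare across sites.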
12. 2. Ratio of inlinks to home page vs. inner pages
Home page: https://www.enablerspace.com
Inner page: https://www.enablerspace.com/contact-us
WEBSITE A: 100 LINKS TOTAL
- 90 LINKS TO HOME PAGE
- 5 LINKS TO INNER PAGE A
- 5 LINKS TO INNER PAGE B
WEBSITE B: 100 LINKS TOTAL
- 30 LINKS TO HOME PAGE
- 30 LINKS TO INNER PAGE A
- 40 LINKS TO INNER PAGE B
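The home-page versus inner-page ratio for the two example sites could be computed as below. This is a sketch under assumptions: the helper `home_to_inner_ratio` and the page keys (`"/"`, `"/inner-a"`, `"/inner-b"`) are hypothetical names, and the slides give no threshold for what counts as unnatural.

```python
def home_to_inner_ratio(links_by_page: dict[str, int], home: str = "/") -> float:
    """Ratio of inlinks pointing at the home page to inlinks pointing at inner pages."""
    home_links = links_by_page.get(home, 0)
    inner_links = sum(n for page, n in links_by_page.items() if page != home)
    if inner_links == 0:
        # Every link points at the home page; the ratio is unbounded.
        return float("inf")
    return home_links / inner_links


# Link counts taken from the slide.
website_a = {"/": 90, "/inner-a": 5, "/inner-b": 5}
website_b = {"/": 30, "/inner-a": 30, "/inner-b": 40}

ratio_a = home_to_inner_ratio(website_a)
ratio_b = home_to_inner_ratio(website_b)

print(f"Website A: {ratio_a:.2f} home-page links per inner-page link")
print(f"Website B: {ratio_b:.2f} home-page links per inner-page link")
```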
13. 3. Ratio of outlinks to inlinks
OUT-LINK: WWW.ENABLERSPACE.COM (YOUR WEBSITE) SAYS "CHECK OUR FRIENDS AT WWW.TEST-1.COM"
IN-LINK (BACKLINK): WWW.TEST-1.COM SAYS "CHECK OUR FRIENDS AT WWW.ENABLERSPACE.COM"
WEBSITE A:
- 100 BACKLINKS
- 0 OUTLINKS
WEBSITE B:
- 100 BACKLINKS
- 37 OUTLINKS
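The outlink-to-inlink ratio for the two example sites is simple arithmetic, sketched here in Python. The function name `out_to_in_ratio` is invented for illustration, and nothing in the slides says which value Penguin considers normal.

```python
def out_to_in_ratio(outlinks: int, backlinks: int) -> float:
    """Ratio of links a site gives out to links it receives."""
    if backlinks <= 0:
        raise ValueError("backlinks must be positive")
    return outlinks / backlinks


# Counts taken from the slide.
ratio_site_a = out_to_in_ratio(outlinks=0, backlinks=100)
ratio_site_b = out_to_in_ratio(outlinks=37, backlinks=100)

print(f"Website A: {ratio_site_a:.2f}")
print(f"Website B: {ratio_site_b:.2f}")
```

Website A receives 100 backlinks while linking to nobody, whereas Website B both receives and gives links.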
14. 4. Edge-reciprocity (high PageRank spam sites feature low
reciprocal link patterns)
[Figure: two link diagrams, labeled WEBSITE A and WEBSITE B, with nodes lettered A through O]
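One common way to quantify edge reciprocity is the fraction of directed links that are answered by a link in the opposite direction. The sketch below assumes that definition; the two small graphs are made-up examples, not a reconstruction of the slide's diagram, and the page name `"X"` is hypothetical.

```python
def edge_reciprocity(edges: set[tuple[str, str]]) -> float:
    """Fraction of directed edges (u, v) for which the reverse edge (v, u) also exists."""
    if not edges:
        return 0.0
    reciprocated = sum(1 for (u, v) in edges if (v, u) in edges)
    return reciprocated / len(edges)


# Hypothetical graph where every link is returned.
mutual_links = {("A", "B"), ("B", "A"), ("B", "C"), ("C", "B")}

# Hypothetical graph where page X collects links but returns only one of them.
one_sided_links = {("K", "X"), ("L", "X"), ("M", "X"), ("X", "K")}

print(edge_reciprocity(mutual_links))
print(edge_reciprocity(one_sided_links))
```

A page that receives many links but returns few of them pulls the score down, which matches the slide's point about low reciprocal link patterns.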
15. Penguin is a link ranking algorithm, not a web ranking
algorithm. From Google’s 2006 patent on it:
“…a system that ranks pages on the web based on
distances between the pages, wherein the pages are
interconnected with links to form a link-graph. More
specifically, a set of high-quality seed pages are chosen
as references for ranking the pages in the link-graph,
and shortest distances from the set of seed pages to
each given page in the link-graph are computed.”
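The idea in the quoted passage, computing the shortest distance from a set of seed pages to every other page, can be sketched with a multi-source breadth-first search. This is a simplification: it assumes every link has length 1 and counts hops, which may differ from how the patent measures distance. The page names and the function `distances_from_seeds` are hypothetical.

```python
from collections import deque


def distances_from_seeds(graph: dict[str, list[str]], seeds: list[str]) -> dict[str, int]:
    """Shortest hop count from the nearest seed page to each reachable page."""
    dist = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in graph.get(page, ()):
            if target not in dist:  # first visit is the shortest path in BFS
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist


# Hypothetical link graph: each key links out to the pages in its list.
link_graph = {
    "seed1": ["a"],
    "seed2": ["b"],
    "a": ["b", "c"],
    "b": ["d"],
    "c": [],
    "d": [],
    "spam": ["a"],  # links to a good page, but nothing links to it
}

dist = distances_from_seeds(link_graph, ["seed1", "seed2"])
for page in sorted(link_graph):
    print(page, dist.get(page, "unreachable"))
```

In this toy graph the page `spam` links out to a good page but is never reached from a seed, so it gets no distance at all.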
17. The closer you are to the origin of the link juice (the seed pages), the better.
20. GOOGLE’S PATENT SAYS:
“Generally, it is desirable to use large number of seed pages…
Unfortunately, this variation of PageRank requires solving the entire
system for each seed separately. Hence, as the number of seed
pages increases, the complexity of computation increases
linearly, thereby limiting the number of seeds that can be
practically used.
Hence, what is needed is a method… for producing a ranking for
pages on the web using a large number of diversified seed pages
without the problems of the above-described techniques.”
“…it would be desirable to have a largest possible set of seeds
that include as many different types of seeds as possible.”
22. If a website is not associated in some way with the seed set,
it has no chance of ranking.
Note: it must also not be heavily associated with spam cliques.
[Figure: seed pages vs. non-seed pages]
1. Good sites tend not to link to bad sites.
2. Bad sites tend to link to good sites.
24. How do I know which sites are the seed pages?
A GOOD RULE OF THUMB IS YMYL (YOUR MONEY OR YOUR LIFE).
25. There’s a map of the entire Internet, commonly known as
the link graph, and then there’s a smaller version of the
link graph from which spam pages have been filtered out.
This filtered version of the web is the
reduced link graph.
IMPORTANT!
source: searchenginejournal
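The relationship between the link graph and the reduced link graph can be illustrated with a small filter. This sketch assumes the set of spam pages is already known; how Google identifies them is not covered here. The page names and the function `reduce_link_graph` are hypothetical.

```python
def reduce_link_graph(
    graph: dict[str, list[str]], spam_pages: set[str]
) -> dict[str, list[str]]:
    """Drop spam pages and every link pointing at them from the link graph."""
    return {
        page: [target for target in targets if target not in spam_pages]
        for page, targets in graph.items()
        if page not in spam_pages
    }


# Hypothetical full link graph: each key links out to the pages in its list.
full_graph = {
    "news": ["shop"],
    "shop": ["news", "spam1"],
    "spam1": ["shop", "spam2"],
    "spam2": ["spam1"],
}

reduced_graph = reduce_link_graph(full_graph, spam_pages={"spam1", "spam2"})
print(reduced_graph)
```

Links from the spam pages to `shop` disappear along with the spam pages themselves, so they no longer exist as far as the reduced graph is concerned.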
27. TAKEAWAY 1: Sites that primarily have
inbound and outbound link relationships with
pages outside of the reduced link graph will
never get inside and consequently will be shut
out of the top ten ranking positions. Spam
links give no traction.
IMPORTANT!
source: searchenginejournal
[Figure: link tiers, Tier 2 through Tier 5]
28. TAKEAWAY 2: Because this algorithm stops
spam links from having any influence
(positive or negative), spam links have no
effect on high-quality sites. In this
algorithm, a link either helps a site rank or it
does not help a site rank.
IMPORTANT!
source: searchenginejournal
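Takeaway 2 can be read as a filter on which links count at all. The sketch below assumes a link counts only when its source page is inside the reduced link graph; the page names and the function `counted_inlinks` are hypothetical and are not taken from the patent.

```python
def counted_inlinks(inlink_sources: list[str], reduced_graph_pages: set[str]) -> list[str]:
    """Keep only the inlinks whose source page is inside the reduced link graph."""
    return [source for source in inlink_sources if source in reduced_graph_pages]


reduced_graph_pages = {"news", "blog", "shop"}

# The same site, before and after someone points spam links at it.
clean_profile = counted_inlinks(["news", "blog"], reduced_graph_pages)
spammed_profile = counted_inlinks(["news", "blog", "spam1", "spam2"], reduced_graph_pages)

print(len(clean_profile), len(spammed_profile))
```

Under this reading the spam links add nothing and subtract nothing: the counted links are identical in both cases.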
29. TAKEAWAY 3: The twin effects of identifying
spam sites and shutting them out are
inherent in the concept of the
reduced link graph.
IMPORTANT!
source: searchenginejournal
30. The point of Penguin is not to attach a spam
label to spam sites and a trusted label to
normal sites.
The point is to get to the reduced link graph.
The reduced link graph is the goal of Penguin
because it filters out the sites that are trying to
unfairly influence the algorithm.
IMPORTANT!
source: searchenginejournal
31. What’s interesting about the concept of a
reduced link graph is that it fits neatly into
what we know about Penguin.
Penguin excludes sites from ranking. With
the Penguin algorithm, you are either in the
game or you are out of the game and have
no chance of ranking.
source: searchenginejournal