82. Part 1: Search console
Part 2: Data Studio
Part 3: APIs
Part 4: Data warehousing
83. Data Studio for extracting data
● Add a Google Search Console data source.
● Create a table for it.
● Download the table.
You’ll get everything in the table.
84. Part 1: Search console
Part 2: Data Studio
Part 3: Python
Part 4: Data warehousing
85. Getting data from APIs
Pull down your analytics data.
● Daily_google_analytics_v3
● Getting search console data from the API
86. Getting data from APIs
Pull down your analytics data.
● Daily_google_analytics_v3
● Getting search console data from the API
Getting started with pandas:
● Pandas tutorial with ranking data
87. Getting data from APIs
Pull down your analytics data.
● Daily_google_analytics_v3
● Getting search console data from the API
Getting started with pandas:
● Pandas tutorial with ranking data
As a workflow I’d highly recommend Jupyter notebooks for getting started.
● Why use Jupyter notebooks?
● SearchLove Video (paid)
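To make the API step above concrete, here is a minimal sketch of the two pure pieces of that workflow: building a Search Console `searchanalytics.query` request body, and flattening the keyed rows the API returns into flat records you can hand to pandas. The function names (`build_sc_query`, `flatten_rows`) are my own illustration, not part of any library; the actual HTTP call would go through something like `google-api-python-client`.

```python
from datetime import date

def build_sc_query(start: date, end: date, dimensions, row_limit=25000):
    """Build a request body for the Search Console searchanalytics.query endpoint."""
    return {
        "startDate": start.isoformat(),
        "endDate": end.isoformat(),
        "dimensions": list(dimensions),
        "rowLimit": row_limit,
    }

def flatten_rows(response, dimensions):
    """Turn the API's keyed rows into flat dicts, ready for pd.DataFrame(rows)."""
    rows = []
    for row in response.get("rows", []):
        record = dict(zip(dimensions, row["keys"]))
        record.update(
            clicks=row["clicks"],
            impressions=row["impressions"],
            ctr=row["ctr"],
            position=row["position"],
        )
        rows.append(record)
    return rows
```

With credentials set up, the surrounding call would look roughly like `service.searchanalytics().query(siteUrl=site, body=build_sc_query(...)).execute()`, after which `pd.DataFrame(flatten_rows(...))` gets you into pandas territory.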
88. SEO Pythonistas
A memorial and soon-to-be collection of Hamlet’s excellent work.
SEO Pythonistas - In loving memory of Hamlet Batista
@DataChaz
89. Part 1: Search console
Part 2: Data Studio
Part 3: Python
Part 4: Data warehousing
114. Hi x
I’m {x} from {y}. We’ve been asked to do some log analysis to better understand how Google is behaving on the website, and I was hoping you could help with some questions about the log set-up (as well as with getting the logs!).
What time period do we want?
What we’d ideally like is 3-6 months of historical logs for the website. Our goal is to look at all the different pages search engines are crawling on our website, discover where they’re spending their time, the status code errors they’re finding, etc.
We can absolutely do the analysis with a month or so (we've even done it with just a week or two), but it means we lose historical context and we're more likely to miss things that only show up over a longer period.
There are also some things that are really helpful for us to know when getting logs.
Do the logs have any personal information in them?
We’re only concerned with the various search crawler bots like Google and Bing; we don’t need any logs from users, so any logs with emails, telephone numbers, etc. can be removed.
Can we get logs from as close to the edge as possible?
It's pretty likely you've got a couple of different layers of your network that might log. Ideally we want logs from as close to the edge as possible. This prevents a couple of issues:
● If you've got caching going on, like a CDN or Varnish, and we get logs from behind it, we won't see any of the requests it answers.
● If you've got a load balancer distributing to several servers, sometimes the external IP gets lost (perhaps X-Forwarded-For isn't working), which we need in order to verify Googlebot, or we accidentally only get logs from a couple of the servers.
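The Googlebot verification mentioned above is the reason the client IP matters: Google's documented check is a reverse DNS lookup on the IP, a suffix check against googlebot.com / google.com, then a forward lookup to confirm the name resolves back to the same IP. A minimal sketch (the function names are mine; `verify_googlebot` needs network access, so the suffix check is split out as a pure helper):

```python
import socket

GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")

def hostname_is_google(hostname: str) -> bool:
    """Check a reverse-DNS hostname against Google's documented crawler suffixes."""
    return hostname.rstrip(".").endswith(GOOGLE_SUFFIXES)

def verify_googlebot(ip: str) -> bool:
    """Reverse-DNS the IP, check the suffix, then forward-confirm the
    hostname resolves back to the same IP (reverse-then-forward check)."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)
    except socket.herror:
        return False
    if not hostname_is_google(hostname):
        return False
    try:
        return ip in socket.gethostbyname_ex(hostname)[2]
    except socket.gaierror:
        return False
```

If the edge strips the real client IP, none of this works, which is why getting logs before the load balancer (or getting X-Forwarded-For preserved) matters so much.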
Are there any sub-parts of your site which log to a different place?
Have you got anything like an embedded WordPress blog which logs to a different location? If so, we’ll need those logs as well. (Although of course if you're sending us CDN logs this won't matter.)
How do you log hostname and protocol?
It's very helpful for us to be able to see hostname & protocol. How do you distinguish those in the log files?
Do you log HTTP & HTTPS to separate files? Do you log hostname at all?
This is one of the problems that's often solved by getting logs closer to the edge: while many servers won't give you those by default, load balancers and CDNs often will.
Where would we like the logs?
In an ideal world, they would be files in an S3 bucket and we can pull them down from there. If possible, we'd also ask that multiple files aren't zipped together for upload, because that makes processing harder. (Compressed logs are no problem; it's zipping multiple log files into a single archive that causes issues.)
Is there anything else we should know?
Best,
{x}
119. ELK Stack
Pros
● Good for basic monitoring.
● Your developers might already have it.
Cons
● Quite hard to learn.
● Not great for analysis past the basics.
120. AWS Athena
Pros
● If your logs are being stored in AWS S3 it’s very easy to set up.
● Powerful analysis.
Cons
● Interface is clunky.
● SQL debugging isn’t good.
121. BigQuery
Pros
● Best analysis platform.
● Easy to use.
● Excellent debugging.
Cons
● Someone will have to actively load the data into it.
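Whichever platform ends up holding the logs, the core analyses are simple aggregations: status codes per bot, hits per URL, and so on. As an illustration of what those queries compute, here is a stdlib-only sketch that parses combined-format log lines and counts status codes for one crawler. The regex assumes the common "combined" log format, which real logs often deviate from, so treat it as illustrative:

```python
import re
from collections import Counter

# Minimal pattern for the common "combined" log format; real log layouts
# vary (extra fields, different quoting), so adjust to your own logs.
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def status_counts_for_bot(lines, bot_token="Googlebot"):
    """Count status codes for requests whose user agent mentions bot_token."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and bot_token in m.group("agent"):
            counts[m.group("status")] += 1
    return counts
```

In Athena or BigQuery the equivalent is a one-line `GROUP BY status` query, which is exactly why those platforms are worth the setup cost once log volumes get large.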
123. Sampling your crawl
● Limit your crawl percentage per template.
e.g.
● 20% to product pages
● 30% to category pages
124. Low memory crawler
Runs locally on your machine and allows you to crawl with a very low memory footprint. Doesn’t render JS or process data, however.
125. Run SF in the cloud
You can purchase a super-high-memory computer in the cloud, install SF on it and run it at maximum speed.
142. Element Equals
Title Big Brown Shoe - £12.99 - Example.com
Status Code 200
H1 Big Brown Shoe
Canonical <link rel="canonical" href="https://example.com/product/big-brown-shoe" />
CSS Selector: #review-counter Any number
CSS Selector: #product-data {
  "@context": "https://schema.org/",
  "@type": "Product",
  "name": "Big Brown Shoe",
  "description": "The biggest, brownest shoe you can find.",
  "sku": "0446310786",
  "mpn": "925872"
}
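An "element equals" check like the table above boils down to: fetch the page, extract each element, compare against the expected value. A real monitor would use a proper DOM library with CSS-selector support; purely as an illustration of the idea, here is a stdlib sketch that checks the title and H1 (class and function names are mine):

```python
from html.parser import HTMLParser

class ElementGrabber(HTMLParser):
    """Collect the text of <title> and <h1> elements as they're parsed."""
    def __init__(self):
        super().__init__()
        self._stack = []   # tags we're currently inside and care about
        self.found = {}    # tag name -> extracted text

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1"):
            self._stack.append(tag)

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()

    def handle_data(self, data):
        if self._stack:
            tag = self._stack[-1]
            # Accumulate, since the parser may deliver text in chunks.
            self.found[tag] = (self.found.get(tag, "") + data).strip()

def check_expectations(html, expected):
    """Return the names of the checks that failed (empty list = all pass)."""
    grabber = ElementGrabber()
    grabber.feed(html)
    return [tag for tag, value in expected.items()
            if grabber.found.get(tag) != value]
```

Status code, canonical, and CSS-selector checks follow the same pattern, just with the HTTP response object and a selector-capable parser instead.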
We’re almost at the release of Godzilla vs. Kong.
My fiancée is a huge fan of these movies. If you’ve not seen them, they’re giant silly monster movies, where cities are torn down and we all just live with it.
I think they have a very clear storyboarding process: at any point, anyone who asks “how would this work?” is fired and replaced with someone who says “but what if he lived in Atlantis?” or “but what if dragons?”
-------
“Godzilla fights his nemesis”
“What if his nemesis was a giant dragon”
“What if the giant dragon controlled all the other giant monsters”
“What if godzilla lived in atlantis”
“With a nuclear fountain”
They are just the masters of escalation. They unfreeze a super dragon, another ancient lizard wakes to fight it, and when it gets hurt it retreats to Atlantis, where it heals with nuclear warheads. It’s joyful how huge and silly everything is. You check your disbelief at the door, because in reality, when things get large, everything that was previously easy suddenly becomes hard.
I do think these movies are a good example of how things become hard.
Could we have a living version of King Kong? I think King Kong is a really good example of the problems you get when you scale something up. SO RUN WITH ME for a minute.
Give me 3 minutes
We do have Skull Island
1. They panicked and wasted half a day looking into this
2. Then they had to filter it out of future reports