“URLs” vs “Links” vs “Pages”

A URL (Uniform Resource Locator) is a type of URI (Uniform Resource Identifier), which contains instructions on how to get a “resource,” like a web page.
● Example URL: http://adnaan.com

A link is something that you can click on. It’s the HTML that will get you to a URL.
● Example Link HTML:
  ○ <a href="http://adnaan.com">Adnaan’s homepage</a>

A page is the content pulled up when you get whatever it is the URL has pointed you to (i.e., all text, images, scripts, etc.).

Different URLs, Same Content. What’s Up?

Any number of URLs could get you to the same content. Here are four URLs from POV’s Documentary Blog that will get you to the same place:
● http://www.pbs.org/pov/blog/2011/11/united-states-of-documentaries/
● http://pbs.org/pov/blog/2011/11/united-states-of-documentaries/
● http://www.pbs.org/pov/blog/2011/11/22/united-states-of-documentaries/
● http://www.pbs.org/pov/blog/?p=891

Sometimes the content only appears identical, and sometimes it really is identical. It’s hard to know, but the URL may offer clues.

Permalinks

The site will usually help you find the true URL, the “permalink” (also called the “canonical URL”). WordPress, for example, lets you determine your permalink structure and will also add hidden HTML code to help crawlers find the correct permalink.

Why use a permalink?
Google! Google may interpret multiple URLs as different pages and reduce their ranking in search results. Worse, Google might think the same content is being copied and could penalize pages so much that they effectively aren’t seen by Google searchers.

How to get a permalink
If you aren’t automatically redirected, clicking on the title will usually get you to the permalink (or you can right-click and copy the destination URL).
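
To make the “hidden HTML code” idea concrete, here is a minimal sketch that looks for a <link rel="canonical"> tag, which is the usual way a page announces its permalink to crawlers. The sample markup and the example.com URL are made up for illustration, and a real page may not include this tag at all.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower()
        if tag == "link" and rel == "canonical" and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical page source, not taken from any real site.
sample = (
    '<html><head>'
    '<link rel="canonical" href="http://example.com/blog/my-post/">'
    '</head><body>Hello</body></html>'
)
print(find_canonical(sample))  # http://example.com/blog/my-post/
```

If the function returns None, the page simply did not declare a canonical URL, and the title-click approach above is still the fallback.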

Does the "www." part matter?

Yes, to search engines it does matter. A URL with the www. and one without it might look like they are from two different websites to a search engine such as Google.

This is one reason why sites redirect you to a URL that either does or doesn’t lead with www. For Google, all that matters is that a site is consistent.
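
As a rough illustration, the sketch below checks whether two URLs differ only by a leading www. in the host name. The function name same_except_www is invented for this example. A True result only means the URLs look like www/non-www twins; it does not prove that the site serves the same content at both.

```python
from urllib.parse import urlsplit


def same_except_www(url_a, url_b):
    """True if the two URLs match once a leading 'www.' is ignored."""

    def key(url):
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
        if host.startswith("www."):
            host = host[len("www."):]
        return (parts.scheme, host, parts.port, parts.path, parts.query)

    return key(url_a) == key(url_b)


with_www = "http://www.pbs.org/pov/blog/2011/11/united-states-of-documentaries/"
without_www = "http://pbs.org/pov/blog/2011/11/united-states-of-documentaries/"
print(same_except_www(with_www, without_www))  # True
```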

The Structure of a URL

URL structure: scheme://domain:port/path?query#fragment_id
Query string structure: ?field=value&field=value&field=value
Example: http://www.google.com/?q=what+is+a+url&hl=en

This is a URL that you might get after searching for “what is a url”. The scheme is http, the domain is www.google.com, the first query field q has the value what+is+a+url, and the second query field hl has the value en. There is no port specified and no fragment_id in this URL.

The optional query string is sent as part of a “GET” request, traditionally used to "get" data from a web server (like a keyword search in a search engine or a search for a book by its ISBN). You often see queries in URLs after submitting a form.
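
The breakdown above can be reproduced with Python’s standard urllib.parse module. This is only a sketch of one way to pull a URL apart, using the Google example URL from this slide.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.google.com/?q=what+is+a+url&hl=en"
parts = urlsplit(url)

print(parts.scheme)    # http
print(parts.hostname)  # www.google.com
print(parts.port)      # None (no port in this URL)
print(parts.path)      # /
print(parts.query)     # q=what+is+a+url&hl=en
print(parts.fragment)  # empty string (no fragment_id in this URL)

# parse_qs splits the query into fields and decodes "+" back into spaces.
fields = parse_qs(parts.query)
print(fields)  # {'q': ['what is a url'], 'hl': ['en']}
```

Each field maps to a list because the same field name may legally appear more than once in a query.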

More on "GET" URLs

"GET" URLs are now being used for other purposes, like tracking whether a user is logged in, or tracking what pages a user visits.

Example:
http://www.shoppbs.org/category/index.jsp?categoryId=1412584&utm_source=facebook&utm_medium=post&utm_campaign=nova-dow

This URL contains a query section (everything after the ?) whose utm_ fields help Google Analytics track a marketing campaign (from Facebook, for a campaign related to the PBS program NOVA), in addition to returning a page containing a category of products (with the category ID 1412584).

Extra Junk: When Links Are Not Permalinks

Links may contain extra information in the #fragment_id and/or ?query (such as tracking code) that isn’t part of the permalink. This information may remain in your browser’s URL bar as you continue to navigate a site. I call this “extra junk.” It’s not (usually) a permalink.

Don’t email/share/post URLs with “extra junk” on them. It can be incorrect and embarrassing for the sender, the receiver, or both.

It’s often possible to strip out extra junk by removing the #fragment_id and much, if not all, of the ?query. You can usually strip out a part of the URL and test it. If you get a 404 page or an unexpected page, then you can usually hit the back button to go back to the working URL.

Many sites are now using #! in the middle of URLs. This makes it harder to determine the permalink, but you can usually strip it out, along with some parts before it, to find the permalink.

Identifying Extra Junk

● Look for # followed by a lot of random-looking characters. It’s probably tracking code.
● Look for query fields that contain utm_. This is tracking code.
● Look for search terms or text that looks like a URL in the query string. This is probably a way for the site to track what you did after you searched or navigated to this page. You can remove it.
● Look for referral identifiers in the query string such as ref=. This is a way for webmasters to watch flow between pages. You can probably remove it.
● Look for personal information in the query, such as your email or name.
● Look for short query strings that just don’t seem necessary. For example, ?hp at the end of a URL screams “you clicked on this link from the homepage”. Remove it and see that it has no impact.
● Look for query sections that contain sessid. This is authentication code and not necessarily bad. You may be able to strip this out, but it’s possible the person you send this link to will need to log in to the page you are sending them to.
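
A couple of the checks in this list can be automated. The sketch below drops the #fragment_id and any query field that starts with utm_ or is named ref. The function name and the two filter lists are assumptions made for this example, not a complete catalog of tracking codes. Dropping the whole fragment is also an assumption: it can break sites that use #! URLs, so test the result before sharing it.

```python
from urllib.parse import urlsplit, urlunsplit

# Illustrative filters only; real sites use many other tracking fields.
TRACKING_PREFIXES = ("utm_",)
TRACKING_FIELDS = {"ref"}


def strip_tracking(url):
    """Return the URL without its fragment and without known tracking fields."""
    parts = urlsplit(url)
    kept = []
    for piece in parts.query.split("&"):
        name = piece.split("=", 1)[0].lower()
        if not piece:
            continue  # skip empty pieces, e.g. when there is no query at all
        if name.startswith(TRACKING_PREFIXES) or name in TRACKING_FIELDS:
            continue  # drop tracking fields
        kept.append(piece)
    # The empty last element discards the #fragment_id.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "&".join(kept), ""))


tracked = (
    "http://www.shoppbs.org/category/index.jsp?categoryId=1412584"
    "&utm_source=facebook&utm_medium=post&utm_campaign=nova-dow"
)
print(strip_tracking(tracked))
# http://www.shoppbs.org/category/index.jsp?categoryId=1412584
```

Fields the filters do not recognize are left untouched, so a useful query such as ?pagewanted=print survives.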

Secretly Good Links

Not all extra stuff is extra junk. For example, the printable version of a page might be accessible only via the ?query string.

A permalink to an article:
http://movies.nytimes.com/2010/07/09/movies/09racing.html

A permalink to its printable version:
http://movies.nytimes.com/2010/07/09/movies/09racing.html?pagewanted=print

How to Fix Broken Links

● Remove punctuation at the end of links: , or . or )
● Remove “%20” (code for a space character). A space may have been added to the link. Sometimes a space gets added to the end of a URL in emails or while copying/pasting.
● Are there capitals in your URL? It might make a difference.
● Check for spelling errors. The person who sent you the link might have typed it incorrectly.
● Google the bad URL. The correct version might be the first hit.

Note: Sometimes a site will always add extra junk even when you’ve typed in (or clicked on) a permalink. This is also done for tracking purposes. Don’t worry; if you’ve done everything else, you probably have the permalink URL. Don’t propagate the URL
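
The first two fixes in this list are mechanical enough to script. Here is a minimal sketch that trims stray spaces, “%20”, and trailing , . ) from the end of a pasted link. The function name tidy_url is made up for this example. It assumes the trailing characters are accidental, which is not always true: some legitimate URLs end with ")".

```python
def tidy_url(url):
    """Strip whitespace, '%20', and trailing , . ) from the end of a URL."""
    url = url.strip()
    changed = True
    while changed:  # repeat until nothing more can be trimmed
        changed = False
        for junk in ("%20", ",", ".", ")"):
            if url.endswith(junk):
                url = url[: -len(junk)].rstrip()
                changed = True
    return url


print(tidy_url("http://adnaan.com). "))    # http://adnaan.com
print(tidy_url("http://adnaan.com/%20"))   # http://adnaan.com/
```

If the tidied URL still fails, move on to the manual checks: capitalization, spelling, and searching for the URL.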

URL Shorteners: Bitly

Log in at http://bit.ly to create and customize trackable URLs.

You can get stats on your Bitly link, or anyone’s custom Bitly link, by adding + to the end of the URL (e.g., for http://bit.ly/abc123, use http://bit.ly/abc123+). I go to the stats page before clicking on links to reveal the destination URL and help me decide whether I really want to go there before it potentially tries to cause harm to my computer. Also, visiting this info page will not generate a click or view on the Bitly link.
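
The “add a +” trick is simple enough to express in one line. The helper below is a hypothetical convenience that only builds the stats-page address described above; it does not contact Bitly.

```python
def bitly_info_url(short_url):
    """Return the stats-page address for a Bitly link by appending '+'."""
    # rstrip avoids doubling the '+' if the link already has one.
    return short_url.rstrip("+") + "+"


print(bitly_info_url("http://bit.ly/abc123"))  # http://bit.ly/abc123+
```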

Other Link Shorteners

Many link shorteners are actually powered by Bitly, such as to.pbs.org and nytim.es. These shortened links are more trustworthy because, for these examples, you know they will send you to pbs.org or nytimes.com.

There are many shorteners used online, including:
● t.co - Twitter
● goo.gl - Google
● fb.me - Facebook
● ow.ly - Hootsuite (a social media management tool)

Note: These could point to malicious URLs.

Links on Social Media

Twitter
Use shortlinks to make tweets shorter (and more retweetable). Twitter converts all links to its own t.co URL, which is why you might see t.co on retweets. It’s how Twitter tracks links and there’s nothing you can do about it. If you have a t.co URL to share, try to figure out what it expands to, shorten that, then share. Twitter will also truncate long URLs if you choose not to shorten.

Facebook
It’s OK to use a link shortener (e.g., Bitly) or include campaign code (e.g., from Google Analytics) in your query string. Facebook knows how to deal with them. But it’s a best practice to use full URLs, not shortened URLs, within the text of your posts.

Be Wary of Links in Your Email

● Links may not point to the URLs you expect them to.
● If you don’t want newsletters to track you, don’t load images (used to see whether you opened the email, and to measure open rates) and never click on a link (used to see what you clicked on, and to measure overall clickthrough rates). Pretty much every newsletter uses its own link shortener to track your clicks.
● “Phishing” scams work when you click on a link that can track you. But just opening a spam email and loading an image is enough to track you. This is why you shouldn’t open spam or click its links, and you definitely should use an email reader that doesn’t load images by default.
● You can often find the page online without clicking the link in your email. Search for the title of the resource in a search engine, or search the site you expect the resource to come from.
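
The first point, that a link may not go where it appears to, can be checked roughly in code. The sketch below scans HTML for links whose visible text looks like a URL but whose real href points to a different host. The class and function names and the sample markup (including the tracker.example host) are invented for this example. It only catches this one kind of mismatch, so a clean result does not mean an email is safe.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collects (href, visible text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def mismatched_links(html):
    """Return links whose text shows one host while the href goes to another."""
    collector = LinkCollector()
    collector.feed(html)
    flagged = []
    for href, text in collector.links:
        if text.startswith(("http://", "https://")):
            if urlsplit(text).hostname != urlsplit(href).hostname:
                flagged.append((href, text))
    return flagged


# Hypothetical email body for illustration.
email_html = (
    '<a href="http://tracker.example/c/123">http://www.pbs.org/pov/</a> '
    '<a href="http://adnaan.com">Adnaans homepage</a>'
)
print(mismatched_links(email_html))
# [('http://tracker.example/c/123', 'http://www.pbs.org/pov/')]
```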

Problems with Links in Emails

Long links might wrap onto the next line, leading to broken links as the emails are forwarded. So use a link shortener before you send a long link. This has the added benefit that you can track clicks without using a newsletter service, if you want to.

Watch out for spaces in URLs, or their encoded version, %20. Spaces may have been added that weren’t intended to be there. Often the space or %20 will be at the end of the URL. Many sites will treat these as different URLs, resulting in a “page not found” error. Try removing the spaces or the %20 if a URL otherwise looks good and is giving you an error when you try to load it.

Links & How Google Works

Google’s search engine is powered primarily by links. It’s a complicated algorithm, but generally, the more sites that have links that point to a URL, the higher up that URL will be in search results for certain keywords.

Links should contain useful text within them, because search engines also examine the whole link, not just the URL. Link text on someone else’s site that describes the content with keywords is better for you than “Click here!”.