DEEP WEB
WEB
• Surface Web – The surface web is the part of the internet that the average
person visits, such as Facebook, Google, Amazon, or YouTube. These sites can
be accessed with standard software, such as a web browser.
• Deep Web – The deep web is an area of the internet that is not indexable by
search engines and not linked to pages on the surface web.
• Dark Web – The dark web is a small portion of the deep web that has been
intentionally hidden and is inaccessible through standard web browsers.
THE DEEP WEB
• Other names for this area of the internet are deep net, hidden web, or invisible
web.
• The term "deep web" was coined by BrightPlanet in a 2001 white paper entitled
"The Deep Web: Surfacing Hidden Value".
• For most users, there are generally two related ways to access the deep web
and darknet:
• Use special search engines that are reachable from regular browsers such as
Internet Explorer, Firefox, Chrome, or Safari.
• Use special search engines that are reachable only from a TOR browser.
• Reasons that a web page is not crawlable:
• The web page is password protected, which prevents a web crawler from
accessing it.
• The web page can only be accessed a certain number of times, after which it
becomes unavailable.
• The site's robots.txt file explicitly tells crawlers not to crawl the page.
• The page is simply hidden, or no other page of the website links to it.
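The robots.txt case can be sketched with Python's standard urllib.robotparser module. This is only an illustration: the robots.txt content, the "ExampleBot" user agent, and the example.com URLs are assumptions, and a real crawler would download robots.txt from the site instead of using an inline string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (assumption for illustration only).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/index.html",
                "https://example.com/private/report.html"):
        print(url, "->", is_crawlable(ROBOTS_TXT, "ExampleBot", url))
```

A well-behaved crawler skips the /private/ page, so that page stays out of the search index even though anyone with the URL can still open it.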
SURFACE WEB VS. DEEP WEB
SURFACE WEB :
• Entries are statically generated
• Linked content (web crawled)
• Readily accessible through any
browser or search engine, unlike
deep web content, which may require
logins, query forms, special search
engines, browsers, or proxies to access.
DEEP WEB :
• Entries are dynamically generated
(returned in response to a submitted
query or accessed via a form).
• Unlinked content
• Contextual web
• Private web
• Scripted content
• Non-HTML content
• Limited access content (anti-robot
measures such as CAPTCHAs)
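The "linked content" versus "unlinked content" distinction can be sketched as a breadth-first crawl over a link graph. The site map below is hypothetical; the point is only that a crawler that follows links never discovers a page that nothing links to.

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to.
LINKS = {
    "/": ["/about", "/products"],
    "/about": ["/"],
    "/products": ["/products/1"],
    "/products/1": [],
    "/hidden-report": [],  # exists on the server, but no page links to it
}


def crawl(links, start="/"):
    """Return the set of pages reachable from start by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


if __name__ == "__main__":
    reachable = crawl(LINKS)
    print("crawled:", sorted(reachable))
    print("never discovered:", sorted(set(LINKS) - reachable))
```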
DEEP RESOURCES
• Dynamic web pages
• Returned in response to a submitted query or accessed only through a form.
• Unlinked content
• Pages without any backlinks.
• Private web
• Sites requiring registration and login (password-protected resources).
• Limited access web
• Sites with CAPTCHAs or no-cache Pragma HTTP headers.
• Scripted pages
• Pages produced by JavaScript, Flash, AJAX, etc.
• Non-HTML content
• Multimedia files, e.g. images or videos
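As a rough sketch, the categories above can be turned into a heuristic check on what a crawler observes for a URL. The Response class, its fields, and the rules are all assumptions made for illustration; real search engines use far more elaborate signals.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    """Hypothetical, simplified view of what a crawler saw for one URL."""
    status: int
    headers: dict = field(default_factory=dict)
    inbound_links: int = 0
    needs_form_submission: bool = False


def deep_web_reasons(resp: Response) -> list:
    """Return the deep web categories (from the list above) that apply."""
    reasons = []
    headers = {k.lower(): v.lower() for k, v in resp.headers.items()}
    if resp.needs_form_submission:
        reasons.append("dynamic page")
    if resp.inbound_links == 0:
        reasons.append("unlinked content")
    if resp.status in (401, 403):  # login or permission required
        reasons.append("private web")
    if ("no-cache" in headers.get("pragma", "")
            or "no-store" in headers.get("cache-control", "")):
        reasons.append("limited access")
    if not headers.get("content-type", "text/html").startswith("text/html"):
        reasons.append("non-HTML content")
    return reasons


if __name__ == "__main__":
    print(deep_web_reasons(Response(401, {"Pragma": "no-cache"})))
```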
MYTHS OF DEEP WEB
• The deep web is only for underground criminals.
• The deep web is also used by privacy activists and ordinary people who don't want their
online activity tracked by government agencies or commercial data aggregators. In fact,
according to the TOR Project, a branch of the U.S. Navy uses TOR and the dark web for
open source intelligence gathering, and law enforcement uses it for security during sting
operations.
• You can't get to the deep web from a Google search.
• Search engines are now much better at crawling sites that were once considered "deep
web." In fact, some deep web content is absent from search results only because
Google avoids crawling it.
• The deep web is 96% of the internet.
• Search engines continue to improve at crawling web pages that used to be considered
deep web content. So this statistic will permanently be in a state of flux.
CONTD.
• Deep web search is the same thing as the dark web.
• Forbes, Business Insider, and Daily Mail have confused the terms. Deep web search
terminology has almost nothing to do with content on the so-called dark web.
• You'll never use the deep web.
• On the contrary, you probably use it every day. When you want to pay your credit card bill,
you log into your online bank account. Your account information is considered deep web
content because Google won't index it.
ACCESSING DEEP WEB:
• DEEP WEB SEARCH ENGINES
• TOR
DEEP WEB SEARCH ENGINES
• Used to access content on the deep web from the surface web.
• The content you can access with these search engines is limited, compared to the
content you can access with TOR.
• Some search engines that we can use for this are :
• The WWW Virtual Library
• Surfwax
• Icerocket
• These search engines can talk to deep web hidden services via TOR and its
relays and can resolve .onion addresses.
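For readers curious about what a .onion address looks like, here is a minimal sketch of the version 3 address format as I understand it from the Tor rendezvous specification: 56 base32 characters that encode a 32-byte public key, a 2-byte checksum, and a version byte. The function names are hypothetical, and the all-zero key in the demo is a placeholder, not a real service.

```python
import base64
import hashlib

VERSION = b"\x03"  # version byte used by v3 onion addresses


def _checksum(pubkey: bytes, version: bytes) -> bytes:
    return hashlib.sha3_256(b".onion checksum" + pubkey + version).digest()[:2]


def onion_v3_address(pubkey: bytes) -> str:
    """Build a v3-style address from a 32-byte public key."""
    raw = pubkey + _checksum(pubkey, VERSION) + VERSION
    return base64.b32encode(raw).decode("ascii").lower() + ".onion"


def is_valid_onion_v3(address: str) -> bool:
    """Check length, base32 encoding, version byte, and checksum."""
    if not address.endswith(".onion"):
        return False
    label = address[: -len(".onion")]
    if len(label) != 56:
        return False
    try:
        raw = base64.b32decode(label.upper())
    except ValueError:
        return False
    pubkey, checksum, version = raw[:32], raw[32:34], raw[34:]
    return version == VERSION and checksum == _checksum(pubkey, version)


if __name__ == "__main__":
    demo = onion_v3_address(bytes(32))  # placeholder key, not a real service
    print(demo, is_valid_onion_v3(demo))
```

The check only validates the shape of an address; it says nothing about whether a hidden service actually exists behind it.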
TOR
• The Onion Router (TOR) is a free anonymity network. It is usually accessed
with the TOR browser, which is a variant of Firefox.
• It runs on all the common platforms, such as Windows, Mac OS X, and Linux.
• It provides a networking protocol that can keep the data being transmitted
across it anonymous.
• TOR creates an encrypted connection between your device and the TOR
network.
• The TOR network allows people to access content that may be blocked in their
country.
CONTD.
• When using TOR:
• Packets pass, encrypted, through several servers before reaching their destination.
• These servers are called TOR relays, and they function as routers. There are thousands
of these servers across the world.
• When your packets are sent across the TOR network, the information that could
identify where a packet is coming from or where it is going is hidden inside
layers of encryption.
• As your packets go from relay to relay, each relay decrypts just enough data to know
which TOR relay the packet came from and where the next hop is. It does not decrypt any
additional information.
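The relay-by-relay decryption described above can be sketched with a toy model of layered ("onion") encryption. Everything here is an assumption for illustration: the relay names, the keys, and especially the XOR keystream cipher, which is not secure and is not what TOR actually uses. The sketch only shows that each relay can remove one layer and learn the next hop, nothing more.

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher (NOT secure, NOT TOR's): XOR data with a SHA-256 keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap(message: bytes, destination: str, path) -> bytes:
    """Wrap message in one layer per relay; path is [(relay_name, key), ...]."""
    next_hops = [name for name, _ in path[1:]] + [destination]
    onion = message
    # Build from the innermost layer (last relay) outwards to the first relay.
    for (_, key), next_hop in reversed(list(zip(path, next_hops))):
        onion = keystream_xor(key, next_hop.encode() + b"|" + onion)
    return onion


def peel(key: bytes, onion: bytes):
    """Remove one layer: return (next_hop, still-encrypted inner onion)."""
    next_hop, _, inner = keystream_xor(key, onion).partition(b"|")
    return next_hop.decode(), inner


if __name__ == "__main__":
    path = [("guard", b"k1"), ("middle", b"k2"), ("exit", b"k3")]
    onion = wrap(b"GET /index.html", "example.com", path)
    for name, key in path:
        next_hop, onion = peel(key, onion)
        print(f"{name} forwards to {next_hop}")
    print("delivered:", onion)
```

In this sketch a relay learns its next hop from the layer it peels; it knows the previous hop only because that is where the packet arrived from.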
LEGAL ACTIVITY
• Journalists and dissidents use it to communicate with each other, and subjugated
persons use it to share their opinions without censorship.
• On some sites, survivors of abuse can discuss their experiences, name their
aggressors, or console their peers who would otherwise feel uncomfortable talking
about their experiences.
• Some countries subjugate their citizens on an arbitrary basis. The dark web offers
opportunities for people to form communities in a less extensively policed forum,
where they can share tips or plan to meet up in person.
• Access to books can also be restricted for a variety of reasons, and the dark web
offers plenty of opportunities to read books that may be either doctored or entirely
prohibited in the offline world.
ILLEGAL ACTIVITY
• It is a popular destination for criminals to buy and sell information.
• Information that can be sold on the deep web includes social security numbers, medical
records, credit card numbers, and other personally identifiable information (PII).
• The deep web is also used to:
• Sell drugs
• Distribute child pornography
• Trade weapons
• Hire hitmen
• Other illegal uses include forgery, gambling, abuse, hacking, and sharing leaked data.
DARK WEB
• The dark web is an area that resides within the deep web.
• Many people confuse the deep web and the dark web, thinking they are the same
thing.
• The dark web runs on the exact same infrastructure as the normal web – it is simply
explored in a different way, using different protocols.
• It is intentionally hidden and is inaccessible through standard web browsers.
• The dark web is accessed only through special browsers, and by far the most
popular is TOR, which stands for The Onion Router.
• There are many hidden services available:
• Wikileaks – news hidden service
• The Pirate Bay and Sci-Hub – search engine hidden services
• Bitcoin Fog or Bitblender – financial hidden services
FUTURE OF DEEP WEB
• It will continue to become more secure
• Technological developments related to the dark web are likely to improve the
stealthiness of darknets.
• Marketplaces will become stronger
• Trend Micro foresees “the rise of new, completely decentralized marketplaces” that rely
on Bitcoin’s blockchain technology.
• Bitcoins will become harder to track
• Cryptocurrencies go hand in hand with deep web marketplaces. We’ll see new, advanced
ways to make bitcoins even less traceable than they are now.
• More people will use it
• Increasing public awareness could mean “increased use of or interest in the dark web and
other similarly intended sites in the deep web”.
CONCLUSION
• Most people consider the deep web illegal, but most people do not know what
the deep web really is.
• The deep web is also used by organizations to keep their content private and
confidential.
• In the coming years, use of the deep web will increase.
• When we use the dark web, we must be vigilant about the hazards.
REFERENCES
1. DANIEL SUI, JAMES CAVERLEE, DAKOTA RUDESILL. “SCIENCE + TECHNOLOGY
INNOVATION PROGRAM”. WILSON CENTER (AUGUST 2015): STIP 03.
2. MARCUS P. ZILLMAN. “DEEP WEB RESEARCH AND DISCOVERY RESOURCES 2019”.
3. WWW.NYTIMES.COM/2009/02/23/TECHNOLOGY/INTERNET/23SEARCH.HTML?TH&EMC=TH
4. HTTPS://BRIGHTPLANET.COM/2014/03/CLEARING-CONFUSION-DEEP-WEB-VS-DARK-WEB/
5. HTTPS://WWW.SANS.ORG/READING-ROOM/WHITEPAPERS/COVERT/OCEAN-INTERNET-DEEP-WEB-37012
6. HTTPS://WWW.THEGLOBEANDMAIL.COM/TECHNOLOGY/TECH-NEWS/WHAT-IS-THE-DARK-WEB-AND-WHO-USES-IT/ARTICLE26026082/
THANK YOU


Editor's Notes

  • #4 The tip of an iceberg is seen at the surface of the ocean, but it is a very small part of the whole iceberg; likewise, the surface web is only a small part of the whole web.
  • #5 White paper: (in the UK) a government report giving information or proposals on an issue.
  • #13 National firewalls can be bypassed using TOR.
  • #20 There are many references; these are just a few.