RESEARCH IN THE DEEP WEB
Outline
■ The Surface Web vs the Deep Web
■ Deep Web vs Dark Web
■ How to get deeper
■ Dark Web access
■ Tools
Surface Web
Anything that can be indexed by a
typical search engine.
Deep Web
■ Quality content is 1,000 to 2,000 times greater than on the surface web
■ 95% of the Deep Web is accessible to the public (no fees or subscription required)
■ The deep web is the part of the internet that has not been indexed by commercial search engines such as
Google and Yahoo!. These search engines send out crawlers, or spiders, to index and catalog available web
sites, but the vast majority of the internet, about 500 times the size of the surface web, is not indexed
by them. That unindexed portion includes many of the best resources, such as:
– Unlinked pages: Spiders can’t find web pages that no other page links to.
– Private intranets and web pages: password protected and more.
– Non-HTML content: e.g., content-rich material in many different file types.
– Library & Government Databases -HowStuffWorks.com
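The point about unlinked pages can be illustrated with a minimal sketch. The page names and the link graph below are hypothetical, and a real crawler fetches and parses pages rather than reading a dictionary. The idea is the same, though: a spider only reaches pages it can follow a link to.

```python
from collections import deque

# Hypothetical miniature web: page -> pages it links to.
LINKS = {
    "home": ["news", "about"],
    "news": ["home", "archive"],
    "about": [],
    "archive": [],
    "unlinked-report": ["home"],  # links out, but no page links to it
}

def crawl(seed, links):
    """Breadth-first crawl: return every page reachable from the seed."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

indexed = crawl("home", LINKS)   # what the spider can find
deep = set(LINKS) - indexed      # what stays out of the index
print(sorted(indexed))
print(sorted(deep))
```

In this toy graph, "unlinked-report" exists but never enters the index, because the crawl starting at "home" has no link to follow to it.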
Types of deep web
■ Invisible Web: Material that could be, but is not, included in search engine results. E.g.,
new material that has been added but not yet picked up.
■ Private Web: Sites intentionally excluded from search engine results. E.g., password-protected
sites.
■ Subscription websites & databases: ESPN Insider; library databases such as ProQuest
and EBSCOhost. -HowStuffWorks.com
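One common way a site owner keeps pages out of search results is a robots.txt file. This mechanism is an illustration added here, not one named on the slide. The sketch below uses Python's standard urllib.robotparser; the rules, the bot name "ExampleBot" and the example.org URLs are all assumptions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask all crawlers to stay out of /members/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /members/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def may_index(url, agent="ExampleBot"):
    """Return True if the rules allow this crawler to fetch the URL."""
    return parser.can_fetch(agent, url)

print(may_index("https://example.org/articles/intro.html"))
print(may_index("https://example.org/members/report.html"))
```

A well-behaved crawler skips the /members/ pages, so they exist on the web but do not appear in search results. Password protection, the slide's own example, goes further by blocking access outright.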
How to get deeper
■ Directories
– http://www.dmoz.org/
■ Specialized Search Engines
– http://biznar.com/biznar/desktop/en/search.html
– http://www.wolframalpha.com/
– http://archive.org/web/web.php
■ Academic Databases
– JSTOR
– ScienceDirect
■ Specialized Databases
– https://www.archives.gov/
■ Government Websites
– https://www.data.gov/
– https://www.govinfo.gov/
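Some of these resources can also be queried programmatically. As a rough sketch, the helper below builds a lookup URL for the Internet Archive's Wayback Machine availability endpoint. The endpoint path and the url and timestamp parameter names should be treated as assumptions and checked against the Internet Archive's own documentation. The helper name is hypothetical, and no network request is made.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the Internet Archive's documentation.
WAYBACK_AVAILABILITY = "https://archive.org/wayback/available"

def wayback_query(url, timestamp=None):
    """Build a lookup URL asking for the archived copy closest to timestamp."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp  # format assumed to be YYYYMMDD
    return f"{WAYBACK_AVAILABILITY}?{urlencode(params)}"

print(wayback_query("dmoz.org", "20100101"))
```

Opening the resulting URL in a browser, or fetching it with any HTTP client, would be the next step.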
The Dark Web
■ The Dark Web: An area of the web in which information is hidden deeply and anonymously.
■ The Dark Web, then, is a small portion of the Deep Web that has been intentionally
hidden and is inaccessible through standard web browsers.
"The deep Web may be a shadow land of untapped potential, but with a bit of skill and
some luck, you can illuminate a lot of valuable information that many people worked to
archive. On the dark Web, where people purposely hide information, they'd prefer it if
you left the lights off.
The dark Web is a bit like the Web's id. It's private. It's anonymous. It's powerful. It
unleashes human nature in all its forms, both good and bad.
The bad stuff, as always, gets most of the headlines. You can find illegal goods and
activities of all kinds through the dark Web. That includes illicit drugs, child pornography,
stolen credit card numbers, human trafficking, weapons, exotic animals, copyrighted
media and anything else you can think of. Theoretically, you could even, say, hire a hit
man to kill someone you don't like.
But you won't find this information with a Google search. These kinds of Web sites
require you to use special software, such as The Onion Router, more commonly known
as Tor.
Tor is software that installs into your browser and sets up the specific connections you
need to access dark Web sites. Critically, Tor is an encrypted technology that helps
people maintain anonymity online. It does this in part by routing connections through
servers around the world, making them much harder to track." -HowStuffWorks.com
“Tor is a program you can run on your computer that helps keep you safe on
the Internet. It protects you by bouncing your communications around a
distributed network of relays run by volunteers all around the world: it
prevents somebody watching your Internet connection from learning what
sites you visit, and it prevents the sites you visit from learning your physical
location. This set of volunteer relays is called the Tor network” (Tor Project).
How Tor works
■ The Tor network is a group of volunteer-operated servers that individuals use to improve
their privacy and security.
■ Tor connects through masked virtual links instead of a direct connection.
■ Tor directs Internet traffic through a free, worldwide, volunteer network consisting of
more than seven thousand relays to conceal a user's location and usage from anyone
conducting network surveillance or traffic analysis (HowStuffWorks).
The Onion Router
“Onion routing is a technique for anonymous communication over a computer network. In
an onion network, messages are encapsulated in layers of encryption, analogous to layers
of an onion. The encrypted data is transmitted through a series of network nodes called
onion routers, each of which "peels" away a single layer, uncovering the data's next
destination. When the final layer is decrypted, the message arrives at its destination. The
sender remains anonymous because each intermediary knows only the location of the
immediately preceding and following nodes” (Tor (anonymity network)).
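The layering described in the quote can be sketched in a few lines. This is a toy model under loud assumptions: the relay names, keys and JSON layer format are hypothetical, and the XOR "cipher" is for illustration only and offers no real security. Tor itself negotiates keys with each relay and uses real cryptography.

```python
import hashlib
import json

def keystream_xor(key, data):
    """Toy cipher: XOR data with a SHA-256-derived keystream. NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(4, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def wrap(message, route, destination):
    """Build the onion from the inside out; route is [(relay_name, key), ...]."""
    next_hop = destination
    payload = message.encode()
    for name, key in reversed(route):
        layer = json.dumps({"next": next_hop, "data": payload.hex()}).encode()
        payload = keystream_xor(key, layer)  # only this relay's key opens it
        next_hop = name
    return payload

def peel(key, blob):
    """Remove one layer: learn the next hop and the still-wrapped inner data."""
    layer = json.loads(keystream_xor(key, blob))
    return layer["next"], bytes.fromhex(layer["data"])

# Hypothetical three-relay route with made-up keys.
route = [("entry", b"k1"), ("middle", b"k2"), ("exit", b"k3")]
blob = wrap("hello", route, "example.org")

hops = []
for name, key in route:
    next_hop, blob = peel(key, blob)
    hops.append(next_hop)

print(hops)
print(blob.decode())
```

Each relay learns only where to forward the data next. Only the last relay in the route sees the final destination, and only the first one sees who sent the message.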
Hidden Service
“Tor makes it possible for users to hide their locations while offering various kinds of
services, such as web publishing or an instant messaging server. Using Tor "rendezvous
points," other Tor users can connect to these hidden services, each without knowing the
other's network identity” (Tor Project).
Why
■ Privacy
– Protect from traffic analysis
■ Traffic analysis is the process of intercepting and examining messages in order to deduce
information from patterns in communication. It can be performed even when the messages
are encrypted and cannot be decrypted.
– Protect personal privacy from government abuse
– Protect against identity thieves
– Protect against unscrupulous marketers
– Protect communication from corporations
– Skirt censorship
■ Whistleblowing & news leaks
– Chinese journalists communicate on Tor
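Traffic analysis, as defined in the list above, works on metadata alone. The sketch below is a deliberately small illustration with hypothetical observations: the observer never reads a payload, yet still learns who talks to whom and how much.

```python
from collections import Counter

# Hypothetical log kept by an observer: (sender, receiver, size in bytes).
# The payloads are encrypted and are never inspected.
OBSERVED = [
    ("alice", "newsroom", 512),
    ("alice", "newsroom", 2048),
    ("bob", "shop", 300),
    ("alice", "newsroom", 4096),
]

def contact_profile(observations):
    """Count messages and total bytes for each sender/receiver pair."""
    counts = Counter()
    volume = Counter()
    for sender, receiver, size in observations:
        counts[(sender, receiver)] += 1
        volume[(sender, receiver)] += size
    return counts, volume

counts, volume = contact_profile(OBSERVED)
pair, n = counts.most_common(1)[0]
print(pair, n, volume[pair])
```

Routing traffic through several relays is meant to break exactly this kind of link between sender and receiver.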
Tips
■ 1.) Don’t download anything.
■ 2.) Don’t go to illegal sites (duh).
■ 3.) Never pay with a credit card.
■ 4.) If you are going to use it because of personal privacy concerns, be aware of
the dangers, e.g., illegal material, viruses, and slow websites!
References
■ How the Deep Web Works:
http://computer.howstuffworks.com/internet/basics/how-the-deep-web-works.htm
■ Who uses Tor:
https://www.torproject.org/about/torusers.html.en
■ Tor (anonymity network)
https://en.wikipedia.org/wiki/Tor_%28anonymity_network%29
