REYKJAVIK UNIVERSITY




The Future Of The Web
     The Semantic Web
         Bergur Páll Gylfason
          Bergurg08@ru....
BIBLIOGRAPHY
1. The History Of World Wide Web. Wikipedia. [Online] [Cited: March 2, 2010.]
http://en.wikipedia.org/wiki/His...
16. Internet World Statistics. Internet World Statistics. [Online] September 30, 2009. [Cited: March
18, 2010.] http://www...
34. Perez, Sarah. BBCs Semantic Music Project. Read Write Web. [Online] January 21, 2009. [Cited:
March 24, 2010.] http://...
REYKJAVIK UNIVERSITY
The Future Of The Web
The Semantic Web
Bergur Páll Gylfason
Bergurg08@ru.is
March 25, 2010

TABLE OF CONTENTS
Introduction
History of the World Wide Web
    History of the Semantic Web
The Concept Behind The Semantic Web
    What is the Semantic Web?
    How does the Semantic Web work?
    What is Semantic Web doing for you?
    Disruption
    What happens after Web 3.0?
The Technology behind the Semantic Web
    Linked Data
        The Linked Open Data Project
    URIs
        URI Reference - URIref
    XML
    Resource Description Framework RDF
    Vocabularies and Ontologies
        Friend of a Friend FOAF
    Queries SPARQL
Examples of Semantic Webs
    Linked Data Search Engine
    Medical – HealthBase
    DBPedia
    BBC Music
    eTourism
    EveryBlock – Mashup Site
Conclusion
Bibliography

INTRODUCTION
The World Wide Web turns 22 this year and it has gone through a lot of changes. At first it was just documents with text and hyperlinks; then came photos, videos, search engines and social networks. So what comes next? I think the next step is that the Web becomes much more semantic. The Semantic Web is all about connecting data and making it readable by machines. We have all this useful information in different databases all over the world (medical, pharmaceutical, government and personal information, and much more) and none of it is connected. What if we could connect all that information? How would we do it? We use Semantic technologies such as RDF, OWL and SPARQL to connect the data and make machines understand it. This is going to change how people interact with and use the World Wide Web: how they shop, how they make travel arrangements and how they search.

In this paper I will try to find out what the Semantic Web is, how it works, where it stands and what opportunities it will bring for people and organizations. First I will look at the history of the Web, then I will briefly cover what the Semantic Web is all about, how it works and what possibilities it brings, and then I will go into the technical details of the Semantic Web.

HISTORY OF THE WORLD WIDE WEB
If it had not been for Tim Berners-Lee, a scientist at CERN, the Web would not be where it is today. He saw that scientists had no way of sharing information easily.
So he had the idea of creating “a large hypertext database with typed links” (1), also known as the World Wide Web (WWW, referred to as the Web in this paper). People had little belief in the idea at the time, so it wasn't until 1990, when he had developed all the tools required for a working Web (HTTP, HTML, a Web browser, a Web server and the first Web pages), that people saw the huge potential the Web had. From then on internet use grew every day. But Berners-Lee's browser was very primitive, and in 1993 new browsers arrived, such as Cello for Microsoft Windows and Mosaic, which was the foundation for Netscape, the browser that would own the market for years to come. (1)

Picture 1: The first photograph on the Web (43)

In 1994 Berners-Lee founded the World Wide Web Consortium (W3C), which is the main international standards organization for the Web (2). That was absolutely necessary because, as happens with all new technologies under development, the people working on them have different ideas about how to implement the same thing. They come up with different formats, so there is often inconsistency in how people do things, e.g. VHS and Betamax, or Blu-ray and HD DVD. In the Web's case the inconsistency was in the HTML. At that time the Gopher protocol (3) ruled the market and was in a format war with the Web. Gopher lost because its owners got greedy: they started to make people pay for using Gopher, while Berners-Lee made his Web protocol free for everyone to use. As always when something is free and “good enough”, the users switched, in this case to HTTP. As we have seen over time, when things are open and free more people use them, and there is more variety and faster evolution of the technology.

In 1996 ordinary companies started to get involved with the Web. They saw the potential in advertising their products for free where everybody could see them. That was the start of the so-called dot-com boom (4), a stock market bubble which popped in 2001. The main cause of the dot-com bubble was that companies had daring business policies that put growth over profit. They hoped that if they built up their customer base, their profits would also rise. Investors pumped money into the internet market with false and hyped-up hopes of profit, mostly in e-commerce. When the bubble burst in 2001 there was a short downturn in the Web business and many companies went bankrupt. But not all of them, because this period also produced the biggest internet companies of today, such as Google, eBay and Amazon. It was also the start of the social network mania that would leave its mark on the next ten years. First came MySpace, which was the most popular social network in 2006 (5), and then came Facebook, which was the most used social network in the world in 2009 (6).

Picture 2: The Dot-Com Bubble (44)

Many people call the era after the dot-com boom Web 2.0 because the way content was distributed changed dramatically compared with Web 1.0 (the Web from its birth to the dot-com boom). Web 1.0 was very primitive and linear. A webmaster controlled the webpage, and he and a few experts generated its content. So there were millions of users, and their number was growing, but only thousands of webmasters and experts. All the users could do was browse what the webmasters generated. The Web was read-only for the users and therefore content was very limited.
Web 2.0 is all about gathering information. The main thing that changed in Web 2.0 is that users started to generate content themselves. Sites like Wikipedia trusted their users to put in accurate information, and Wikipedia has grown incredibly huge, with over 3,000,000 articles (7). Blogs also played an important role: people could easily set up their own blog with only three clicks and say whatever they wanted. And on YouTube users have uploaded more than 80 million videos; it would take you more than 600 years to watch them all (8).

Picture 3: Difference between Web 1.0 and Web 2.0 (45)

The huge amount of data being generated is causing us to stretch the limits of the Web. In 2007 the total volume of digital information created and replicated globally was 281 billion gigabytes (281 exabytes) (9), and it is expected to grow to an astonishing 667 exabytes by 2013 (10). That means that the data created in five years is more than all the data created since the beginning of the Web. It also means that it gets harder to find the information you need with keyword-based search engines and by browsing.

HISTORY OF THE SEMANTIC WEB
As most people know, Tim Berners-Lee invented the Web, but the Web as it is today is not his original vision. It didn't quite evolve the way Berners-Lee envisioned, because he thought of the Web as a Semantic Web from the beginning. Even though the Web didn't evolve as he had hoped, he continued to fight for his vision by publishing materials and making statements about the evolution of the Semantic Web. In 1998 he started defining a roadmap for the Semantic Web, an attempt to give a high-level plan of its architecture (11), and in 1999 he said:

“I have a dream for the Web in which computers become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A ‘Semantic Web’, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The ‘intelligent agents’ people have touted for ages will finally materialize.” (12)

THE CONCEPT BEHIND THE SEMANTIC WEB
WHAT IS THE SEMANTIC WEB?
The first thing to know about the Semantic Web is that it is a Web of data, where the data becomes part of the Web. This is in contrast to the Web as we know it today. Today it is full of information in the form of documents, and we use search engines to search these documents, but humans still have to read and interpret the documents to get the information out of them. So while computers present you with the information, they can't understand it.
For example, when you search for the band Oasis on Google, the first page shows the Oasis clothing store, OASIS (Organization for the Advancement of Structured Information Standards), an Oasis gift shop and Oasis Hong Kong Airlines. That is not at all what I was looking for. With a semantic search I would search for Oasis where Oasis is of type rock band, and I would only get results about Oasis the band. This would save a lot of time.

Let's take another example. You've decided to take your wife on a romantic date: a movie and a restaurant. You know she loves romantic comedies and a good steak. What you would do is go to Google to find out what movies are showing at theaters close to you, and then spend some time trying to find a good movie by reading descriptions and reviews. Then you check out the steakhouses in close proximity to the theaters. Of course you want to make sure the steakhouse is good, so you try to find reviews of them. This is going to take a lot of searches, a lot of visits to webpages and, worst of all, far too much time.
The Semantic Web will make tasks like these much faster and easier. Instead of running multiple searches you would pose one complex request to the Semantic Web, like: “I want to see a romantic comedy and then go to a steakhouse in close proximity.” The Semantic Web will analyze your request, search the Web for possible combinations, and then organize the results to suit you best. This will only take you a few minutes, as the Semantic Web will do all the heavy lifting for you. All of this becomes possible by linking data together, turning the Web into an open database and making it understandable by machines. And that is only going to happen if people release their raw data on the internet. This is what the Linking Open Data Project is working on (13).

HOW DOES THE SEMANTIC WEB WORK?
The W3C (World Wide Web Consortium) is an international standards community led by the inventor of the Web, Tim Berners-Lee. It develops standards for the Web to ensure the long-term growth of the Web, and it has made standards for HTML, HTTP, XML and much more. (2) It has already made some standards for Linked Data, ontologies and queries.

To be able to link data together the W3C made the RDF standard, which is the core of the Semantic Web. RDF makes it possible to create metadata (data about data) about resources on the Web so that machines can understand them: what a resource is and how it connects to other resources. For example, say you want to make an RDF description of yourself with the properties that you are of type Person, with the full name John Doe and the mailbox johndoe@ru.is. Now the machine knows that you are a Person with a specific name and a specific mailbox. The W3C has also made standards for ontologies, which define the concepts and relations used to describe and represent an area of concern.
They are used to classify the terms that can be used in a particular application, to characterize possible relationships, and to help data integration when terms in different data sets are ambiguous or when an extra bit of knowledge may lead to the discovery of new relationships. This is best described by an example: suppose you have the person John Doe from above, and then someone else describes a Person but uses the term email instead of mailbox. Then an extra definition should be added, stating that the relationship “email” is the same as “mailbox”. This extra piece of information is an extremely simple ontology. Ontologies can also be extremely complicated, with many ontologies connected together to describe, for example, the aquatic world. For this purpose the W3C has made standards like RDF and RDF Schema, SKOS (Simple Knowledge Organization System), OWL (Web Ontology Language) and RIF (Rule Interchange Format). (14) We will take a closer look at the most common one, OWL, later.

Picture 4: The Technology Stack for The Semantic Web (42)
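
The John Doe description and the “email is the same as mailbox” definition above can be sketched in a few lines of Python. This is only a minimal illustration and not real RDF or OWL syntax: the example.org URIs, the `ex:` property names, and the second person (Jane Roe and her address) are assumptions made up for the sketch.

```python
# Hypothetical identifiers: the example.org URIs and the "ex:" names are made
# up for illustration and do not come from any real vocabulary.
JOHN = "http://example.org/people/john-doe"
JANE = "http://example.org/people/jane-roe"

# Each statement is a (subject, predicate, object) triple.
triples = [
    (JOHN, "rdf:type", "ex:Person"),
    (JOHN, "ex:name", "John Doe"),
    (JOHN, "ex:mailbox", "johndoe@ru.is"),
    # A second data set describes a person with "email" instead of "mailbox".
    (JANE, "rdf:type", "ex:Person"),
    (JANE, "ex:email", "jane@example.org"),
]

# An extremely simple "ontology": one fact saying two terms mean the same.
SAME_PROPERTY = {"ex:email": "ex:mailbox"}


def normalize(statements, same_property):
    """Rewrite equivalent predicates to one canonical term."""
    return [(s, same_property.get(p, p), o) for s, p, o in statements]


def values(statements, predicate):
    """Return {subject: object} for every statement that uses the predicate."""
    return {s: o for s, p, o in statements if p == predicate}


merged = normalize(triples, SAME_PROPERTY)
mailboxes = values(merged, "ex:mailbox")
print(mailboxes)
```

Without the extra definition only John Doe's mailbox is found; once the two terms are declared equivalent, both people show up in the same lookup, which is the kind of data integration the ontology is meant to enable.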
Finally, the W3C has standardized SPARQL so that people can query the Web of Data. A query here means the technologies and protocols that can retrieve information from the Web of Data. The Web of Data is represented in RDF, so we need an RDF-specific query language. SPARQL provides this, and it makes it possible to get results through HTTP or SOAP. With SPARQL people can extract complex information, which is returned in a table format. This table can be incorporated into another Web page. Using this approach, SPARQL provides a powerful tool for building, for example, complex mash-up sites or search engines that include data from the Semantic Web. (15)

WHAT IS SEMANTIC WEB DOING FOR YOU?
The Semantic Web is going to change how you use the Web. It is going to make the Web much simpler to use and thereby make your life simpler. This matters because people's internet use is growing fast: it has grown 380% since the year 2000 (16), and people are starting to think of Web access as a fundamental right (17). The purpose of the computer is to do the repetitive and boring things for you, and doing the boring, complex and hard work is exactly what the Semantic Web is supposed to do. It is going to take fewer clicks to get to the data you are looking for. You can collect your interests and share them with others: with an application like Twine (18) you share your interests with other people and they share theirs with you. You can organize your travel plans better with applications like TripIt (19), which lets you combine bookings made on different web pages into a single site. Finally, you can pinpoint the exact news you want to see. (20 p. 22)

There are a lot of opportunities for companies and governments in getting involved in the Semantic Web. International banks could drill into accounts, transactions and financial histories without requiring many months of expensive IT projects to do so. Financial institutions could assess risk with greater accuracy because more relevant data would be available.
Pharmaceutical companies could lower the cost of drug development if they could easily combine open-source Web data with their own data, saving money by using information other people have found instead of producing it themselves. The Semantic Web could also make a huge impact on things like national security, disaster preparedness and military operations. The government collects enormous volumes of data every day, and by linking it together more effectively it can see national security threats forming before they become a reality. Disasters rarely happen when you've planned for them, so being able to access all data quickly and mash it up on the fly to get more out of the information could save a lot of lives when there is little time to get information. (20 p. 55)

DISRUPTION
The Semantic Web will no doubt bring some disruption to the market, making Web companies with bad business models go bankrupt. Companies that are doing nothing instead of investing in innovation might fall behind. Companies that have ill-suited resources, processes or values, as Western Union had, could also fall behind. But there is really no accurate way to tell if there will be any disruption, because I think the Semantic Web is still early in the Early Adopters stage of the Technology Life Cycle. It is still waiting for the killer app and for more people to join. There are so many possibilities we don't see yet that might cause disruption. I think only time will tell.

Picture 5: The Technology Life Cycle (41)

WHAT HAPPENS AFTER WEB 3.0?
If Web 2.0 is about web applications and social networking, and Web 3.0 is about incorporating the semantics of data interpreted by machines, what happens next? Nova Spivack, a technology visionary, entrepreneur in web development and developer of Twine, thinks that the Web develops in 10-year cycles. In his view 2010-2020 is Web 3.0, which is supposed to lay the groundwork for Web 4.0, scheduled for 2020-2030, just like Web 1.0 laid the groundwork for Web 2.0. He thinks that Web 4.0 will be something like a WebOS and will work like middleware, where the Web starts functioning like an operating system, or what he calls “The Intelligent Web”. This might work as your own personal assistant. (21)

Picture 6: The evolution of the web according to Nova Spivack (21)

Nova Spivack isn't the only one with a vision of the future. Raymond Kurzweil, an inventor and a pioneer in text-to-speech synthesis, speech recognition technology and more, also predicts that there will be a WebOS by 2029. But he thinks it will be parallel to the human brain, and he said:

“Intelligent machines will combine the subtle and supple skills that humans now excel in (essentially our powers of pattern recognition) with ways in which machines are already superior, such as remembering trillions of facts accurately, searching quickly through vast databases, and downloading skills and knowledge.” (21)

THE TECHNOLOGY BEHIND THE SEMANTIC WEB
LINKED DATA
We have already talked about the Semantic Web as a Web of Data: data of all possible types, from personal data to financial data to pharmaceutical data and just about every kind of data you can think of. Linked Data is about using the Web to create typed links between data from different sources with a collection of Semantic Web technologies, e.g. RDF and OWL, which we will examine later. Berners-Lee made a set of rules (22) known as the Linked Data principles. They are about publishing data on the Web in such a way that all published data becomes part of a single global data space. They provide a basic recipe for publishing and connecting data:
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those names
3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
4. Include links to other URIs, so that they can discover more things

To be able to make the Web of Data we need a huge amount of raw data openly available on the Web. The amount of raw data on the Web, from universities and corporations to governments, has been growing rapidly over the years. Berners-Lee gave a TED talk in March 2009 asking people to put raw data out on the Web. (23) People did what he asked and started to make more data freely available, and what happened? Other people reused the data and did some interesting things with it, as Berners-Lee showed in a TED talk in March 2010 (24). One example is a lawyer in Zanesville, Ohio, who took information from the government to check which houses had water and in which houses black people lived. He found out that only the houses of white people had water. The county had to pay $10.9 million in compensation.

Picture 7: Water available for white people (24)

Another example is from the earthquake in Haiti, when the Google map was not very good. So GeoEye, a satellite image company, put images of Haiti on the Web for free.
Then people started to update the Google map according to the GeoEye images. It even showed blocked roads, refugee camps, a hospital ship and other very important things. This became the best map to use if you were involved in relief work in Haiti, and it probably saved some lives. (24)

Picture 8: Haiti map after people on the web updated it (24)
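
The four Linked Data principles listed earlier in this section can be illustrated with a small Python sketch. It is only a toy model under stated assumptions: the "web of data" is an in-memory dictionary rather than real HTTP servers, and the example.org URIs, labels and function names are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical in-memory "web of data"; a real client would fetch these
# descriptions over HTTP instead of reading a dictionary.
WEB_OF_DATA = {
    "http://example.org/id/zanesville": {
        "label": "Zanesville",
        "links": ["http://example.org/id/ohio"],
    },
    "http://example.org/id/ohio": {
        "label": "Ohio",
        "links": ["http://example.org/id/usa"],
    },
    "http://example.org/id/usa": {"label": "United States", "links": []},
}


def is_http_uri(name):
    """Principles 1 and 2: the name is a URI and uses the HTTP(S) scheme."""
    parts = urlparse(name)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def look_up(uri):
    """Principle 3: looking up a URI returns useful information about it."""
    return WEB_OF_DATA.get(uri)


def discover(start):
    """Principle 4: follow the links in each description to find more things."""
    seen, queue = [], [start]
    while queue:
        uri = queue.pop(0)
        if uri in seen or not is_http_uri(uri):
            continue
        description = look_up(uri)
        if description is None:
            continue
        seen.append(uri)
        queue.extend(description["links"])
    return [WEB_OF_DATA[uri]["label"] for uri in seen]


print(discover("http://example.org/id/zanesville"))
```

Starting from a single URI, the sketch reaches every linked description, which is the sense in which linked data sets become part of one global data space.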

THE LINKED OPEN DATA PROJECT
The objective of the project is to identify and connect all the existing data sets that are available under open licenses, converting them to linked data with RDF and the Linked Data principles, and publishing them on the Web. When the project was founded in January 2007 it mostly involved researchers in university research labs and small companies, but since then it has grown fast. Several big organizations have become involved, like the US government, the BBC, Flickr and the New York Times. The reason it is growing so fast is that it is open: anyone can publish a dataset and connect it to other datasets. (25) Picture 9 shows the Linked Open Data Project in July 2009, where each cloud is a dataset and the lines show the connections between them; the darker the line, the more connections there are between the datasets.

Picture 9: The Linked Open Data Project (13)

URIS
When we surf the Web we use Uniform Resource Identifiers (URIs): the system of identifiers is uniform, each item on the Web is a resource, and the identifiers are what we use to identify those items. The URI is the foundation of the Web; it holds it together. Almost everyone knows one type of URI, although they may not know it is a URI: the URL (Uniform Resource Locator), which is an address that lets you visit a webpage. (26) A URL is a character string that identifies a Web resource by representing its network location. However, it is also important to be able to record information about many things that, unlike Web pages, do not have a network location or URL. That is where the URI comes in, because it is a more general form of identifier. All URIs share the property that different persons or organizations can independently create them and use them to identify things. A URI can be created to refer to anything that needs to be referred to in a statement, for example network-accessible things (e.g. a document, an image, a service or other resources), things that are not network-accessible, such as human beings and books in a library, and abstract concepts that do not physically exist, such as the concept of an “author” (27).

URI REFERENCE - URIREF
A URIref is another type of string that represents a URI, and represents the resource identified by that URI. It is a URI together with an optional fragment identifier at the end, separated by #. For example, the URI reference http://www.example.org/index.html#section2 consists of the URI http://www.example.org/index.html and the fragment identifier section2. URIrefs can contain Unicode characters, allowing many languages to be reflected in URIrefs (27).

XML
XML was designed to be a simple way to send documents across the Web. It allows anyone to design their own document format and then write a document in that format. These document formats can include markup to enhance the meaning of the document's content. This markup is “machine-readable”, that is, programs can read and understand it. That is the whole idea of the Semantic Web: make the Web readable by machines. Instead of only one application being able to use a document, many applications can use it, and each application interprets the markup in the way that suits it best. For example, if words were marked as “emphasized”, a normal web browser might display them in bold and a voice browser might use a higher volume. (26) XML is used in combination with other semantic technologies like RDF to connect data.

RESOURCE DESCRIPTION FRAMEWORK RDF
RDF, or Resource Description Framework, does exactly what the name indicates: it is a language that provides a simple way to describe resources on the Web. RDF is at the core of the Semantic Web because it makes statements about things. For example, suppose you make a statement in English that John Doe created a particular webpage:

“http://www.example.org/index.html has a creator whose value is John Doe”

This statement has three parts. First you have the thing the statement is about, called the subject; in this case it is the webpage. Next you have the part that identifies the property or characteristic of the subject, called the predicate; in this case it is creator. Last you have the value of the predicate, called the object, and that is the name John Doe. (27)

Picture 10: A Simple RDF Statement (18)

While English is good for communicating between humans, it is not good for communicating between machines. RDF is about making statements that a machine can process and understand. To do so we need two things. First, we need a system of machine-processable identifiers for identifying a subject, predicate or object in a statement without any possibility of confusion with a similar-looking identifier that might be used by someone else on the Web. Secondly, we need a machine-processable language for representing these statements and exchanging them between machines. Both are in place on the Web today.

Picture 11: A more Complicated RDF Statement (18)

Because URIs are so generic, RDF uses them to identify the subjects, predicates and objects in statements. RDF defines a resource as anything that is identifiable by a URI reference, so using URIrefs allows RDF to describe practically anything, and to state relationships between those things as well. The object of a statement (but not the subject or the predicate) can also be a string instead of a URIref; such values are called literals. To represent RDF statements in a machine-processable way RDF uses XML. RDF defines a specific XML markup language, referred to as RDF/XML, for representing RDF information and for exchanging it between machines. (27)

VOCABULARIES AND ONTOLOGIES
Computers don't understand our language; they don't see the connections between words and things as we do. For example, if you have an uncle, you know that he is the brother of one of your parents. But a machine doesn't know that you must have a parent who has a brother in order to have an uncle, and that is exactly what ontologies are for. An ontology makes the machine understand and know these things that are so natural for us humans.
For example, if we had the ontology in Picture 12 and someone asked a Star Wars semantic website “Who is the father of Leia Organa and where is she from?”, the website would know from its ontology and linked data that Leia is the daughter of Anakin Skywalker and that she is from Alderaan.

Picture 12: Small Ontology about Star Wars (37)

Vocabularies and ontologies are used to express extra constraints and logical relationships among resources. They help with data integration, for example when different words are used to describe the same thing in different data sets, or when a bit of extra knowledge may lead to the discovery of new relationships. (28)

The best way to dig deeper into ontologies is to take a look at one of the standards the W3C has been working on: the Web Ontology Language, or OWL. OWL is a language for expressing ontologies. It is a Semantic Web language designed to represent rich and complex knowledge about things, groups of things, and relations between things. OWL is a computational logic-based language, so knowledge expressed in OWL can be reasoned with by computer programs. OWL documents, known as ontologies, can be published on the Web and may refer to or be referred to from other OWL ontologies. (29)

An ontology is a set of precise descriptive statements about some part of the world, usually called the domain of interest. Precise descriptions prevent misunderstanding in human communication, and they ensure that software behaves in a uniform, predictable way and works well with other software. To precisely describe a domain of interest, you come up with a set of central terms, often called the vocabulary, and describe what they mean, both with a natural-language definition and by stating how each term is connected to other terms. This is called a terminology, and combined with the vocabulary it is an essential part of a typical OWL document. (29)

To understand how knowledge is represented in OWL, we must first look at some fundamental notions. Axioms are the basic statements that an OWL ontology expresses, for example “Star Wars is a Movie” or “Kenny is Spenny’s Uncle”. Entities are elements used to refer to real-world objects. Expressions are combinations of entities that form complex descriptions from basic ones. For example, the classes “female” and “professor” could be combined to describe the class of female professors.
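As a rough analogy for the female professor example, a class can be pictured as the set of its members, and combining two classes into one expression as a set intersection. The sketch below uses plain Python sets; the individuals are made up, and this is not OWL syntax (OWL 2 expresses the same idea with a class expression such as ObjectIntersectionOf in its functional syntax).

```python
# Hypothetical individuals, grouped by the classes they belong to.
female = {"Alice", "Carol"}
professor = {"Alice", "Bob"}

# The combined class "female professor" contains exactly the individuals
# that are members of both basic classes.
female_professor = female & professor

print(female_professor)
```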
(29) Each OWL ontology is a collection of basic statements like “Coca Cola is a Drink” or “It is cloudy”, and these statements are axioms. Such statements can be either true or false, which distinguishes them from entities and expressions. An important feature of OWL is that a statement can be a consequence of other statements: a set of statements A entails a statement b if, in any state of affairs in which all statements from A are true, b is also true. (29)

FRIEND OF A FRIEND FOAF

FOAF is a machine-readable ontology describing persons, their activities and their relations to other people and objects, and it is part of the Linked Open Data Project. It is considered to be the first Social Semantic Web application. Anyone can use FOAF to describe themselves by creating their own FOAF profile. FOAF allows groups of people to describe social networks without the need for a centralized database. Computers may use these FOAF profiles to, for example, list all the people that both you and a friend of yours know.
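The mutual-acquaintance use case can be sketched in a few lines. The example assumes the “knows” relations have already been read from two FOAF profiles into a Python dictionary (FOAF records them with the foaf:knows property); the names and the helper function are hypothetical, and no FOAF parsing is shown.

```python
# Hypothetical "knows" relations, as they might be collected from FOAF profiles.
knows = {
    "you": {"Anna", "Bjorn", "Clara"},
    "friend": {"Bjorn", "Clara", "David"},
}


def mutual_acquaintances(relations, a, b):
    """Return the people that both a and b are recorded as knowing."""
    return relations.get(a, set()) & relations.get(b, set())


mutual = sorted(mutual_acquaintances(knows, "you", "friend"))
print(mutual)
```

Because each profile lives with its owner, a program like this would first have to fetch and merge the profiles, which is where the lack of a central database shows.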
QUERIES SPARQL

We have this huge Web of data, which is growing very fast, and we need some way of getting information out of it. To do that we use queries, just as relational databases use SQL and XML uses XQuery. In the Semantic Web we use the RDF-specific query language SPARQL, which makes it possible to send queries and receive results through HTTP or SOAP. SPARQL queries are based on triple patterns that are matched against RDF triples. These triple patterns are similar to RDF triples, except that one or more of the constituent resource references are variables. A SPARQL engine returns the resources for all triples that match these patterns. (15)

SPARQL allows users to write queries that consist of triple patterns combined with conjunctions (and) and disjunctions (or). The query specifies a pattern in the data that should be matched in the result set. Given a particular triple pattern in a query, a SPARQL processor considers the sets of triples in the target RDF model that match the pattern. Let’s take an example:

    PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX owl:   <http://www.w3.org/2002/07/owl#>
    PREFIX books: <http://www.dummies.com/books#>
    # The dc prefix is used below but was not declared in the original query;
    # the Dublin Core elements namespace is assumed here.
    PREFIX dc:    <http://purl.org/dc/elements/1.1/>

    SELECT ?title
    WHERE {
      ?book rdf:type books:Books .                                 # the resource is a book
      ?book books:author <http://me.jtpollock.us/foaf.rdf#me> .    # its author is Jeff Pollock
      ?book dc:title ?title .                                      # fetch its title
    }
    ORDER BY ?title

This query looks for books with the author Jeff Pollock and returns their titles as an ordered list. In the WHERE clause we specify triple patterns. The first pattern matches all RDF instances that are of the rdf:type books:Books. The second pattern matches all those instances that have a books:author relationship to Jeff Pollock. The third pattern retrieves the title of each such book. Together these patterns form a conjunction (and). The “?” in front of a word such as book indicates a variable, the thing we are looking for. Finally, we order the results by title. This gives us the titles of all the books by this specific author. (20, pp. 230-232)
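To show what matching triple patterns means, here is a minimal sketch of the idea in plain Python. It is not a SPARQL engine and supports only a conjunction of patterns; the data, the short identifiers such as "jeff", and the function names are assumptions made for the illustration. A term starting with “?” is treated as a variable, and every pattern must match under the same variable bindings.

```python
def match_pattern(pattern, triple, bindings):
    """Extend bindings so that pattern matches triple, or return None."""
    new_bindings = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if term in new_bindings and new_bindings[term] != value:
                return None
            new_bindings[term] = value
        elif term != value:
            return None
    return new_bindings


def query(patterns, triples):
    """Return all bindings that satisfy every pattern (a conjunction)."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match_pattern(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Hypothetical data set: three books, two of them by the author "jeff".
triples = [
    ("book1", "rdf:type", "books:Books"),
    ("book1", "books:author", "jeff"),
    ("book1", "dc:title", "Title B"),
    ("book2", "rdf:type", "books:Books"),
    ("book2", "books:author", "someone_else"),
    ("book2", "dc:title", "Title C"),
    ("book3", "rdf:type", "books:Books"),
    ("book3", "books:author", "jeff"),
    ("book3", "dc:title", "Title A"),
]

patterns = [
    ("?book", "rdf:type", "books:Books"),
    ("?book", "books:author", "jeff"),
    ("?book", "dc:title", "?title"),
]

# Sorting the titles plays the role of ORDER BY ?title.
titles = sorted(solution["?title"] for solution in query(patterns, triples))
print(titles)
```

The nested loops are deliberately naive; a real SPARQL processor would use indexes and query planning, but the binding logic is the same in spirit.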
EXAMPLES OF SEMANTIC WEBS

LINKED DATA SEARCH ENGINE

Semantic search engines can be divided into two groups: human-oriented search engines and application-oriented search engines. The human-oriented search engines are similar to Google, with keyword-based search, but rather than simply providing links to pages, a semantic search engine provides a more detailed interface where the user can exploit the underlying structure of the data. It offers the option to search for objects, concepts and documents, each of which leads to different results. The application-oriented search engines were developed to serve the needs of applications built on top of linked data. They provide APIs through which linked data applications can discover RDF documents on the Web. (25)

MEDICAL – HEALTHBASE

HealthBase is like your own personal doctor. It is a new semantic web page that allows you to search for conditions, treatments and drugs, and it performs a semantic search on other health-related sites on the Web. That means it doesn’t just look at the titles of articles and give you the results; it reads the actual text to deliver useful information to you without all the distracting advertisements and junk that follow. The page is very simple and useful. The HealthBase database drills into over 10 million health documents from sources such as WebMD and Yahoo Health, so you get the same information as you would get in the other places, but categorized and without the ads, animations and all the extra garbage. Because HealthBase doesn’t build its own resources but reuses the resources of other sites, it is only as good as the sites it searches. (30)

Picture 13: HealthBase (30)

DBPEDIA

DBPedia is a project developed by OpenLink Software and the universities of Leipzig and Berlin. Its objective is to extract structured information from Wikipedia. The structured data is then made available on the Web.
It is one of the most important datasets in the Linked Open Data Project, as it describes more than 2.9 million things with more than 3.7 million interlinks to other datasets such as Freebase, GeoNames and the CIA World Fact Book. DBPedia includes at least 292,000 persons, 339,000 places, 8,000 music albums, 44,000 films, 15,000 video games, 119,000 organizations, 130,000 species and 4,400 diseases. (31) And it is growing as Wikipedia grows. With Wikipedia you only have keyword search, but with DBPedia you can run queries like “Give me all Italian musicians from the 18th century”. You can also browse the Wikipedia data through a faceted browser where you choose your search criteria. (32) For example, you can search for all films that Martin Scorsese has directed and that starred Leonardo DiCaprio and Alec Baldwin. That gives you the films The Departed and The Aviator. These are just two examples of what DBPedia can do; there is a lot more.

BBC MUSIC

The BBC has been developing a new project that semantically links web pages about the artists and singers whose songs are played on BBC radio stations. Within these pages, collections of data are enhanced and interconnected with semantic metadata, letting music fans explore connections between artists that they may not have known existed. The semantic technology adds additional context to data about the artists, which can include anything from previous bands and venues played to instruments played and more. Most of the information comes from MusicBrainz (33) and DBPedia (31). MusicBrainz is an open content metadatabase that lists information for over 400,000 artists. The BBC uses the information from MusicBrainz about the artists and then adds additional information from DBPedia, such as the biography. Reusing content that is already on the open web saves money, time and energy. (34)

Picture 14: BBC Music Covering David Bowie (40)

ETOURISM

Today the Web is a big showcase for cities that want to build and expand their tourism industry. With all the information available on the Web, people are starting to plan their trips online in advance, so cities are competing against each other to offer the best tourism information and services on their web sites. While most cities just have a few options like “A Weekend in Vestmannaeyjar” or “Sport Week in Manchester”, the City of Zaragoza in Spain has developed a web application called CRUZAR. It uses expert knowledge (in the form of rules and ontologies) and a comprehensive repository of relevant data, drawn from databases about events and places of interest in the city, and builds a unique route for each visitor based on his or her hobbies and interests.

Picture 15: Ontology for CRUZAR (35)
Let’s say you want to see a football game, have some spicy food, look at some museums and go to a nice, hot beach: CRUZAR will make the best route for you. (35)

EVERYBLOCK – MASHUP SITE

A mashup is a web page or application that uses or combines data or functionality from two or more external sources to create a new service. With semantic technology this is becoming much more common. A good example is EveryBlock. (36) It shows civic information like building permits, crimes and restaurant inspections, as well as news articles from major newspapers, TV and radio stations, and blogs. It even includes fun stuff like pictures from Flickr, user reviews of businesses and lost-and-found listings. It shows all this information on a map, where you can choose a town, district, ZIP code or even a street address to see what is going on in your neighborhood. It is pretty awesome to be able to see how much crime there is in your area, which restaurants didn’t pass health inspections, and even whether there is some Hollywood filming in your neighborhood. EveryBlock uses semantic search engines to crawl the Web and government data for information. (36)

CONCLUSION

It’s getting pretty clear that the Semantic Web is the next step in the evolution of the Web. Finally, Berners-Lee’s original idea of the Web is coming to life. It has taken over ten years to design and build good standards like RDF and OWL for the Semantic Web, but the work is not over yet. There are still no standards for Logic, Truth and Proof. I didn’t find much information about those aspects of the Semantic Web in my research, so I think it will still be some years before they are covered. In the meantime we just work on linking more data. The Linked Open Data Project is doing a good job connecting datasets, which is very good because it is a fundamental thing for the Semantic Web: “To have lots and lots and lots of Data”, as Berners-Lee said. The amount of connected data is growing each year, so people are catching on and putting their data out there. This is a sign that the Web is already changing into a more open, more accessible and better Web.
I found some very interesting things that people are doing with linked data, as I showed in the examples: search engines, a self-diagnosing medical site, a music site connecting information from different places, a tourism site making a special route for each person, and a site that can show how much crime there is in your neighborhood! Those are some pretty cool sites, and I think this is just the tip of the iceberg of what is going to happen; there are probably lots of other ideas in development out there in some basements or universities. It will be very interesting to follow this technology in the next few years because it is going to be huge. Just imagine having all this linked data out there, in one big database. The possibilities would be endless.
BIBLIOGRAPHY

1. The History Of World Wide Web. Wikipedia. [Online] [Cited: March 2, 2010.] http://en.wikipedia.org/wiki/History_of_the_World_Wide_Web.
2. W3. W3 All standards. W3. [Online] 2010. [Cited: March 18, 2010.] http://www.w3.org/TR/.
3. Gopher Protocol. Wikipedia. [Online] June 2009. [Cited: March 23, 2010.] http://en.wikipedia.org/wiki/Gopher_(protocol).
4. Dot.Com Bubble. Wikipedia. [Online] [Cited: March 23, 2010.] http://en.wikipedia.org/wiki/Dot-com_boom.
5. MySpace. Wikipedia. [Online] [Cited: March 2, 2010.] http://en.wikipedia.org/wiki/Myspace.
6. Facebook. Wikipedia. [Online] [Cited: March 2, 2010.] http://en.wikipedia.org/wiki/Facebook.
7. Wikipedia Size Comparison. Wikipedia. [Online] [Cited: March 2, 2010.] http://en.wikipedia.org/wiki/Wikipedia:Size_comparisons.
8. White, Phillip. How many videos are on youtube. associatedcontent. [Online] July 9, 2009. [Cited: March 2, 2010.] http://www.associatedcontent.com/article/1927414/how_many_videos_are_on_youtube.html http://logicerror.com/semanticWeb-long.
9. Paul, Ryan. Study: amount of digital info > global storage capacity. Ars Technica. [Online] March 12, 2008. [Cited: March 25, 2010.] http://arstechnica.com/old/content/2008/03/study-amount-of-digital-info-global-storage-capacity.ars.
10. Data, data everywhere. Economist. [Online] February 25, 2010. [Cited: March 25, 2010.] http://www.economist.com/specialreports/displaystory.cfm?story_id=15557443.
11. Berners-Lee, Tim. Semantic Web Road Map. w3. [Online] October 14, 1998. [Cited: March 25, 2010.] http://www.w3.org/DesignIssues/Semantic.html.
12. Questioning Semantic Web History. Zimbio. [Online] January 22, 2009. [Cited: March 25, 2010.] http://www.zimbio.com/Semantic+Web+-+Web+3.0/articles/3/Questioning+Semantic+Web+history.
13. Linking Open Data. W3. [Online] http://esw.w3.org/SweoIG/TaskForces/CommunityProjects/LinkingOpenData.
14. Ontologies. W3. [Online] 2010. [Cited: March 18, 2010.] http://www.w3.org/standards/semanticweb/ontology.
15. Query. W3. [Online] 2010. [Cited: March 18, 2010.] http://www.w3.org/standards/semanticweb/query.html.
16. Internet World Statistics. Internet World Statistics. [Online] September 30, 2009. [Cited: March 18, 2010.] http://www.internetworldstats.com/stats.htm.
17. Asay, Matt. Is Internet access a fundamental right. news.cnet.com. [Online] May 6, 2009. [Cited: March 18, 2010.] http://news.cnet.com/8301-13505_3-10234555-16.html.
18. Twine. Twine. [Online] [Cited: March 24, 2010.] http://www.twine.com.
19. TripIt. TripIt. [Online] [Cited: March 25, 2010.] http://www.tripit.com/.
20. Pollock, Jeffrey T. Semantic Web For Dummies. Indianapolis, Indiana : Wiley Publishing, Inc, 2009.
21. Callari, Ron. Web 4.0 Trip Down The Rabbit Hole Or Brave New World. zmogo. [Online] June 3, 2009. [Cited: March 25, 2010.] http://www.zmogo.com/web/web-40trip-down-the-rabbit-hole-or-brave-new-world/.
22. Linked Data. w3.org. [Online] June 18, 2009. [Cited: March 17, 2010.] http://www.w3.org/DesignIssues/LinkedData.html.
23. Berners-Lee, Tim. On The Web Next. Ted.com. [Online] March 2009. [Cited: March 17, 2010.] http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html.
24. Berners-Lee, Tim. The Year Open Data Went Worldwide. Ted.com. [Online] March 2010. http://www.ted.com/talks/tim_berners_lee_the_year_open_data_went_worldwide.html.
25. Christian Bizer, Tom Heath, Tim Berners-Lee. Linked Data - The Story So Far. s.l. : Special Issue on Linked Data, International Journal on Semantic Web and Information Systems, 2009.
26. Swartz, Aaron. The Semantic Web In Breadth. Logicerror. [Online] [Cited: March 2, 2010.] http://logicerror.com/semanticWeb-long.
27. RDF Primer. W3. [Online] February 10, 2004. [Cited: March 2, 2010.] http://www.w3.org/TR/rdf-primer/.
28. W3C Semantic Web Frequently Asked Questions. W3C. [Online] November 12, 2009. [Cited: March 22, 2010.] http://www.w3.org/2001/sw/SW-FAQ#whont.
29. OWL 2 Primer. W3. [Online] October 27, 2009. [Cited: March 22, 2010.] http://www.w3.org/TR/owl2-primer/.
30. Dannen, Chris. Self Diagnosing Web. FastCompany. [Online] FastCompany, September 2, 2009. [Cited: March 14, 2010.] http://www.fastcompany.com/blog/chris-dannen/techwatch/self-diagnosing-web.
31. DBPedia. Wikipedia. [Online] [Cited: March 24, 2010.] http://en.wikipedia.org/wiki/DBpedia.
32. DBpedia Faceted Browser. DBPedia. [Online] Open Link Software, November 16, 2009. [Cited: March 24, 2010.] http://dbpedia.neofonie.de/browse/.
33. musicbrainz. musicbrainz. [Online] [Cited: March 24, 2010.] http://musicbrainz.org/.
34. Perez, Sarah. BBCs Semantic Music Project. Read Write Web. [Online] January 21, 2009. [Cited: March 24, 2010.] http://www.readwriteweb.com/archives/bbcs_semantic_music_project.php.
35. Case Study: CRUZAR - An application of semantic matchmaking for eTourism in the city of Zaragoza. w3 case studies. [Online] August 2008. [Cited: March 24, 2010.] http://www.w3.org/2001/sw/sweo/public/UseCases/Zaragoza-2/.
36. EveryBlock. EveryBlock. [Online] [Cited: March 24, 2010.] http://www.everyblock.com/.
37. Wilson, Tracy V. How Semantic Web Works. How Stuff Works. [Online] [Cited: March 22, 2010.] http://computer.howstuffworks.com/semantic-web4.htm.
38. Data Set Sizes. w3.com. [Online] March 9, 2010. [Cited: March 24, 2010.] http://esw.w3.org/TaskForces/CommunityProjects/LinkingOpenData/DataSets/Statistics.
39. http://www.pharmasurveyor.com/. http://www.pharmasurveyor.com/. [Online] http://www.pharmasurveyor.com/.
40. BBC Music. BBC. [Online] [Cited: March 24, 2010.] www.bbc.co.uk/music.
41. Technology Life Cycle. Wikipedia. [Online] [Cited: March 24, 2010.] http://en.wikipedia.org/wiki/Technology_lifecycle.
42. Semantic Technology Primer. Semantic-Conference. [Online] 2006, 2007. [Cited: March 18, 2010.] http://www.semantic-conference.com/primer.html.
43. Les Horribles Cernettes. Wikipedia. [Online] [Cited: March 2, 2010.] http://en.wikipedia.org/wiki/Les_Horribles_Cernettes.
44. Dot Com Bubble. Wikipedia. [Online] [Cited: March 2, 2010.] http://en.wikipedia.org/wiki/Dot_Com_Bubble.
45. MsJosay. The Difference between Web 2.0 and Web 1.0. hubpages. [Online] [Cited: March 2, 2010.] http://hubpages.com/hub/The-Difference-between-Web-20-and-Web-10.
