Semantic Web Document


This is a report on Semantic Web technology - Abhijit Chandrasen Manepatil



CERTIFICATE

OCT-NOV 2012

This is to certify that the project report entitled "SEMANTIC WEB", submitted by Manepatil Abhijit Chandrasen, Roll No: SKN CE-19, is a bona fide work carried out by him under the supervision of Prof. P. R. Barapatre, and it is approved for the partial fulfillment of the requirement of the University of Pune for the award of the degree of Master of Computer Engineering. The seminar work has not been earlier submitted to any other institute or university for the award of a degree or diploma.

Prof. P. R. Barapatre (Seminar Guide, Comp. Engg. Dept.)
Prof. A. M. Kanthe (P.G. Co-ordinator)
Dr. J. S. Inamdar (Principal, SKN-SIT, Lonavala)

Place: Pune
Date:
ACKNOWLEDGEMENT

I express my sincere thanks to Prof. P. R. Barapatre, whose supervision, inspiration and valuable discussion have helped me tremendously to complete this seminar. His guidance proved to be the most valuable in overcoming all the hurdles in the fulfillment of this seminar. I am grateful to Prof. A. M. Kanthe, P.G. Co-ordinator of the Computer Department, for his co-operation and inspiration. I would also like to express my appreciation and thanks to all my friends who knowingly or unknowingly have assisted and encouraged me throughout my hard work.
INDEX

1. Introduction
2. History
   2.1 Web 1.0
   2.2 Web 2.0
3. Web 3.0 - A Basic Introduction
4. The Semantic Web Vision
5. A Layered Approach
6. Key Components
   6.1 URI
   6.2 RDF
   6.3 RDFS
   6.4 OWL
   6.5 Microformat
7. Practical Illustration
8. Difference between Web 1.0, Web 2.0 and Web 3.0
9. Challenges
10. Project Implementation
11. Conclusion
12. References
FIGURE INDEX

Figure 1: Web 1.0 Example
Figure 2: Web 2.0 Example
Figure 3: Layered Approach of the Semantic Web
Figure 4: RDF Example
Figure 5: Traditional Web Model
Figure 6: Semantic Web Model
ABSTRACT

The Semantic Web is an evolving development of the World Wide Web in which the meaning (semantics) of information and services on the web is defined, making it possible for the web to "understand" and satisfy the requests of people and machines to use the web content. At its core, the Semantic Web comprises a set of design principles, collaborative working groups, and a variety of enabling technologies.

Some elements of the Semantic Web are expressed as prospective future possibilities that are yet to be implemented or realized. Other elements are expressed in formal specifications. Some of these include the Resource Description Framework (RDF), a variety of data interchange formats (e.g. RDF/XML, N3, Turtle, N-Triples), and notations such as RDF Schema (RDFS) and the Web Ontology Language (OWL), all of which are intended to provide a formal description of concepts, terms, and relationships within a given knowledge domain.

The key components of Semantic Web technology are as follows:

1. OWL: The Web Ontology Language (OWL) is a family of knowledge representation languages for authoring ontologies, endorsed by the World Wide Web Consortium. They are characterized by formal semantics and RDF/XML-based serializations for the Semantic Web. OWL has attracted academic, medical, and commercial interest.

2. Resource Description Framework: RDF is a family of World Wide Web Consortium (W3C) specifications originally designed as a metadata data model. It has come to be used as a general method for conceptual description or modeling of information that is implemented in web resources, using a variety of syntax formats.

3. RDF Schema: RDF Schema (variously abbreviated as RDFS, RDF(S), RDF-S, or RDF/S) is an extensible knowledge representation language, providing basic elements for the description of ontologies, otherwise called Resource Description Framework (RDF) vocabularies, intended to structure RDF resources.

4.
Microformat: A microformat (sometimes abbreviated μF) is a web-based approach to semantic markup that seeks to re-use existing HTML/XHTML tags to convey metadata and other attributes in web pages and other contexts that support (X)HTML, such as RSS. This approach allows information intended for end-users (such as contact information, geographic coordinates, and calendar events) to also be automatically processed by software.
1. INTRODUCTION

Currently the focus of a W3C working group, the Semantic Web vision was conceived by Tim Berners-Lee, the inventor of the World Wide Web. The World Wide Web changed the way we communicate, the way we do business, the way we seek information and entertainment, the very way most of us live our daily lives. Calling it the next step in Web evolution, Berners-Lee defines the Semantic Web as "a web of data that can be processed directly and indirectly by machines."

In the Semantic Web, data itself becomes part of the Web and is able to be processed independently of application, platform, or domain. This is in contrast to the World Wide Web as we know it today, which contains virtually boundless information in the form of documents. We can use computers to search for these documents, but they still have to be read and interpreted by humans before any useful information can be extrapolated. Computers can present you with information but can't understand what the information is well enough to display the data that is most relevant in a given circumstance. The Semantic Web, on the other hand, is about having data as well as documents on the Web so that machines can process, transform, assemble, and even act on the data in useful ways.

Imagine this scenario. You're a software consultant and have just received a new project. You're to create a series of SOAP-based Web services for one of your biggest clients. First, you need to learn a bit about SOAP, so you search for the term using your favorite search engine. Unfortunately, the results you're presented with are hardly helpful. There are listings for dish detergents, facial soaps, and even soap operas mixed into the results. Only after sifting through multiple listings and reading through the linked pages are you able to find information about the W3C's SOAP specifications.
Because of the different semantic associations of the word "soap," the results you receive are varied in relevance and you still have to do a lot of work to find the information you're looking for. However, in a Semantic Web-enabled environment, you could use a Semantic Web agent to search the Web for "SOAP" where SOAP is a type of technology specification used in Web services. This time, the results of your search will be relevant. Your Semantic Web agent can also search your corporate network for the SOAP specification and discover if your colleagues have completed similar projects or have posted SOAP-related
research on the network. Based on the semantic information available for SOAP, your agent also presents you with a list of related technologies. Now you know that WSDL, XML, and URI are all technologies related to SOAP, and that you'll need to do some research on them, too, before beginning your project. Armed with the information returned by your Semantic Web agent, you read the related technology specifications and send emails to the colleagues who have made SOAP-related materials available on the network to ask for their input before starting your new project.
2. HISTORY

2.1 Web 1.0:

Web 1.0 (1991-2003) is a retronym which refers to the state of the World Wide Web, and any website design style used before the advent of the Web 2.0 phenomenon. Web 1.0 began with the release of the WWW to the public in 1991, and is the general term that has been created to describe the Web before the "bursting of the Dot-com bubble" in 2001, which is seen by many as a turning point for the Internet.

2.1.1 WEB 1.0 DESIGN ELEMENTS

Some typical design elements of a Web 1.0 site include:
• Static pages instead of dynamic user-generated content.
• The use of framesets.
• Proprietary HTML extensions such as the <blink> and <marquee> tags introduced during the first browser war.
• Online guestbooks.
• GIF buttons, typically 88x31 pixels in size, promoting web browsers and other products.
• HTML forms sent via email. A user would fill in a form, and upon clicking submit, their email client would attempt to send an email containing the form's details.
2.1.2 Web 1.0 Example:

Figure 1. Web 1.0 Example

Wikipedia is an example of Web 1.0 because the site allows the user to only view pages or search information at best; the user interaction is minimal and the site is basically static.
2.2 Web 2.0:

The term "Web 2.0" (2004-present) is commonly associated with web applications that facilitate interactive information sharing, interoperability, user-centered design and collaboration on the World Wide Web. Examples of Web 2.0 include web-based communities, hosted services, web applications, social-networking sites, video-sharing sites, wikis, blogs, mashups, and folksonomies. A Web 2.0 site allows its users to interact with other users or to change website content, in contrast to non-interactive websites where users are limited to the passive viewing of information that is provided to them.

Although the term suggests a new version of the World Wide Web, it does not refer to an update to any technical specifications, but rather to cumulative changes in the ways software developers and end-users use the Web.

2.2.1 Web 2.0 Characteristics:

Web 2.0 websites allow users to do more than just retrieve information. They can build on the interactive facilities of "Web 1.0" to provide "network as platform" computing, allowing users to run software applications entirely through a browser. Users can own the data on a Web 2.0 site and exercise control over that data. These sites may have an "architecture of participation" that encourages users to add value to the application as they use it.

The concept of Web-as-participation-platform captures many of these characteristics. Bart Decrem, a founder and former CEO of Flock, calls Web 2.0 the "participatory Web" and regards the Web-as-information-source as Web 1.0.

The impossibility of excluding group members who don't contribute to the provision of goods from sharing profits gives rise to the possibility that rational members will prefer to withhold their contribution of effort and free-ride on the contribution of others. This requires what is sometimes called Radical Trust by the management of the website.
According to Best, the characteristics of Web 2.0 are: rich user experience, user participation, dynamic content, metadata, web standards and scalability. Further characteristics, such as openness, freedom and collective intelligence by way of user participation, can also be viewed as essential attributes of Web 2.0.
2.2.2 Web 2.0 Example:

Figure 2. Web 2.0 Example

Facebook is a social networking site and a prominent example of Web 2.0. The site allows users to make friends, send them messages, chat with them, and upload and share photos.
3. Web 3.0 - A Basic Introduction:

The Semantic Web is a mesh of information linked up in such a way as to be easily processable by machines, on a global scale. You can think of it as being an efficient way of representing data on the World Wide Web, or as a globally linked database.

The Semantic Web was thought up by Tim Berners-Lee, inventor of the WWW, URIs, HTTP, and HTML. There is a dedicated team of people at the World Wide Web Consortium (W3C) working to improve, extend and standardize the system, and many languages, publications, tools and so on have already been developed. However, Semantic Web technologies are still very much in their infancy, and although the future of the project in general appears to be bright, there seems to be little consensus about the likely direction and characteristics of the early Semantic Web.

What's the rationale for such a system? Data that is generally hidden away in HTML files is often useful in some contexts, but not in others. The problem with the majority of data on the Web in this form at the moment is that it is difficult to use on a large scale, because there is no global system for publishing data in such a way that it can be easily processed by anyone. For example, just think of information about local sports events, weather information, plane times, Major League Baseball statistics, and television guides: all of this information is presented by numerous sites, but all in HTML. The problem is that, in some contexts, it is difficult to use this data in the ways that one might want to.

The Semantic Web is a web of data. There is lots of data we all use every day, and it is not part of the web. I can see my bank statements on the web, and my photographs, and I can see my appointments in a calendar. But can I see my photos in a calendar to see what I was doing when I took them? Can I see bank statement lines in a calendar? Why not? Because we don't have a web of data.
Because data is controlled by applications, and each application keeps it to itself.

The Semantic Web is about two things. It is about common formats for the integration and combination of data drawn from diverse sources, whereas the original Web mainly concentrated on the interchange of documents. It is also about language for recording how the data relates to real-world objects. That allows a person, or a machine, to start off in
one database, and then move through an unending set of databases which are connected not by wires but by being about the same thing.
4. The Semantic Web Vision

Today's Web: The World Wide Web has changed the way people communicate with each other and the way business is conducted. It lies at the heart of a revolution which is currently transforming the developed world towards a knowledge economy and, more broadly speaking, a knowledge society. This development has also changed the way we think of computers. Originally they were used for numerical calculations. Currently their predominant use is information processing, typical applications being databases, text processing, and games. At present there is a transition of focus towards the view of computers as entry points to the information highways.

Most of today's Web content is suitable for human consumption. Even Web content that is generated automatically from databases is usually presented without the original structural information found in those databases. Typical uses of the Web today involve humans seeking and consuming information, searching for and getting in touch with other humans, reviewing the catalogs of online stores and ordering products by filling out forms, and viewing adult material. These activities are not particularly well supported by software tools. Apart from the existence of links which establish connections between documents, the main valuable, indeed indispensable, kind of tools are search engines.

Keyword-based search engines, such as AltaVista, Yahoo and Google, are the main tool for using today's Web. It is clear that the Web would not have been the huge success it was, were it not for search engines. However, there are serious problems associated with their use. Here we list the main ones:

• High recall, low precision: Even if the main relevant pages are retrieved, they are of little use if another 28,758 mildly relevant or irrelevant documents were also retrieved.
Too much can easily become as bad as too little.

• Low or no recall: Often it happens that we don't get any answer for our request, or that important and relevant pages are not retrieved. Although low recall is a less frequent problem with current search engines, it does occur. This is often due to the third problem:

• Results highly sensitive to vocabulary: Often we have to use semantically similar keywords to get the results we wish; in these cases the relevant documents use different terminology from the original query. This behaviour is unsatisfactory, since semantically similar queries should return similar results.
• Results are single Web pages: If we need information that is spread over various documents, then we must initiate several queries to collect the relevant documents, and then we must manually extract the partial information and put it together.

Interestingly, despite obvious improvements in search engine technology, the difficulties remain essentially the same. It seems that the amount of Web content outgrows the technological progress. But even if a search is successful, it is the human who has to browse the selected retrieved documents to extract the information he is actually looking for. In other words, there is not much support for retrieving the information (for some limited exceptions see the next section), an activity that can be very time-consuming. Therefore the term information retrieval, used in association with search engines, is somewhat misleading; location finder might be a more appropriate term. Also, results of Web searches are not readily accessible by other software tools; search engines are often isolated applications.

The main obstacle to providing better support to Web users is that, at present, the meaning of Web content is not machine-accessible. Of course there are tools that can retrieve texts, split them into parts, check the spelling, decompose them, put them together in various ways, and count their words. But when it comes to interpreting sentences and extracting useful information for users, the capabilities of current software are still very limited.
5. A Layered Approach:

Fig. 3: A Layered Approach of the Semantic Web

The development of the Semantic Web proceeds in steps, each step building a layer on top of another. The pragmatic justification for this approach is that it is easier to achieve consensus on small steps, while it is much harder to get everyone on board if too much is attempted. Usually there are several research groups moving in different directions; this competition of ideas is a major driving force for scientific progress. However, from an engineering perspective there is a need to standardize. So if most researchers agree on certain issues and disagree on others, it makes sense to fix the points of agreement. This way, even if the more ambitious research efforts should fail, there will be at least partial positive outcomes. Once a standard has been established, many more groups and companies will adopt it, instead of waiting to see which of the alternative research lines will be successful in the end. The nature of the Semantic Web is such that companies and single users must build tools, add content and use that content. We cannot wait until the full Semantic Web vision materializes; it may take another 10 years for it to be realized to its full extent (as envisioned today, of
course!).

In building one layer of the Semantic Web on top of another, there are some principles that should be followed:

1. Downward compatibility: Agents fully aware of a layer should also be able to interpret and use information written at lower levels. For example, agents aware of the semantics of OWL can take full advantage of information written in RDF and RDF Schema.

2. Upward partial understanding: On the other hand, agents fully aware of a layer should take at least partial advantage of information at higher levels. For example, an agent aware only of the RDF and RDF Schema semantics can interpret knowledge written in OWL partly, by disregarding those elements that go beyond RDF and RDF Schema.

The figure shows the "layer cake" of the Semantic Web, which is due to Tim Berners-Lee and describes the main layers of the Semantic Web design and vision. At the bottom we find XML, a language that lets one write structured Web documents with a user-defined vocabulary. XML is particularly suitable for sending documents across the Web. RDF is a basic data model, like the entity-relationship model, for writing simple statements about Web objects (resources). The RDF data model does not rely on XML, but RDF has an XML-based syntax; therefore, in the figure, it is located on top of the XML layer.

RDF Schema provides modelling primitives for organizing Web objects into hierarchies. Key primitives are classes and properties, subclass and subproperty relationships, and domain and range restrictions. RDF Schema is based on RDF. RDF Schema can be viewed as a primitive language for writing ontologies. But there is a need for more powerful ontology languages that expand RDF Schema and allow the representation of more complex relationships between Web objects. The logic layer is used to enhance the ontology language further, and to allow the writing of application-specific declarative knowledge.
The proof layer involves the actual deductive process, as well as the representation of proofs in Web languages (from lower levels) and proof validation. Finally, trust will emerge through the use of digital signatures and other kinds of knowledge, based on recommendations by agents we trust, or rating and certification agencies and consumer bodies. Sometimes the phrase Web of Trust is used, to indicate that trust will be organised in the same distributed and chaotic way as the WWW itself. Being located at the top of the pyramid, trust is a high-level and crucial concept: the Web will only achieve its full potential when users have trust in its operations (security) and the quality of the information provided.
Description: The basic architecture of the Semantic Web contains identifiers (Uniform Resource Identifiers) and a character code, Unicode. Above this layer is the syntax layer, defining the syntactical relationships, with XML as its base. Above this is the data interchange layer, defined by RDF. Above it, querying is handled by SPARQL and taxonomies are expressed in RDFS. Ontologies are governed by OWL, and rules by RIF/SWRL. Above these are the unifying logic and the proof layer, and cryptography cuts across all the aforementioned layers. Above these is the trust layer.

A brief description of all the aforementioned layers and components is given in the upcoming sections of the report.
6. Key Components

The Semantic Web has five main components which help in accomplishing the required task and define the functioning of the web:

6.1 Uniform Resource Identifier:

A URI is simply a Web identifier, like the strings starting with "http:" or "ftp:" that you often find on the World Wide Web. Anyone can create a URI, and the ownership of URIs is clearly delegated, so they form an ideal base technology on top of which to build a global Web. In fact, the World Wide Web is such a thing: anything that has a URI is considered to be "on the Web".

A URI may be classified as a locator (URL), a name (URN), or both. A Uniform Resource Name (URN) functions like a person's name, while a Uniform Resource Locator (URL) resembles that person's street address. In other words, the URN defines an item's identity, while the URL provides a method for finding it.

The URI syntax consists of a URI scheme name followed by a colon character, and then by a scheme-specific part. The specifications that govern the schemes determine the syntax and semantics of the scheme-specific part, although the URI syntax does force all schemes to adhere to a certain generic syntax that, among other things, reserves certain characters for special purposes (without always identifying those purposes). The URI syntax also enforces restrictions on the scheme-specific part in order to, for example, provide for a degree of consistency when the part has a hierarchical structure. Percent encoding can add extra information to a URI.

A URI reference is another type of string that represents a URI, and (in turn) represents the resource identified by that URI. Informal usage does not often maintain the distinction between a URI and a URI reference, but protocol documents should not allow for ambiguity. A URI reference may take the form of a full URI, or just the scheme-specific portion of one, or even some trailing component thereof, even the empty string.
An optional fragment identifier, preceded by #, may be present at the end of a URI reference. The part of the
reference before the # indirectly identifies a resource, and the fragment identifier identifies some portion of that resource. In order to derive a URI from a URI reference, software converts the URI reference to absolute form by merging it with an absolute base URI according to a fixed algorithm. The system treats the URI reference as relative to the base URI, although in the case of an absolute reference, the base has no relevance. The base URI typically identifies the document containing the URI reference, although this can be overridden by declarations made within the document or as part of an external data transmission protocol. If the base URI includes a fragment identifier, it is ignored during the merging process. If a fragment identifier is present in the URI reference, it is preserved during the merging process. Web document markup languages frequently use URI references to point to other resources, such as external documents or specific portions of the same logical document.

6.2 RDF:

The Resource Description Framework (RDF) is a family of World Wide Web Consortium (W3C) specifications originally designed as a metadata data model. It has come to be used as a general method for conceptual description or modeling of information that is implemented in web resources, using a variety of syntax formats.

The RDF data model is similar to classic conceptual modeling approaches such as Entity-Relationship or Class diagrams, as it is based upon the idea of making statements about resources (in particular Web resources) in the form of subject-predicate-object expressions. These expressions are known as triples in RDF terminology. The subject denotes the resource, and the predicate denotes traits or aspects of the resource and expresses a relationship between the subject and the object.
For example, one way to represent the notion "The sky has the color blue" in RDF is as the triple: a subject denoting "the sky", a predicate denoting "has the color", and an object denoting "blue". RDF is an abstract model with several serialization formats (i.e., file formats), and so the particular way in which a resource or triple is encoded varies from format to format.

A collection of RDF statements intrinsically represents a labeled, directed multi-graph. As such, an RDF-based data model is more naturally suited to certain kinds of knowledge representation than the relational model and other ontological models. However, in
practice, RDF data is often persisted in relational databases or in native representations called triplestores (or quad stores, if context, i.e. the named graph, is also persisted for each RDF triple). As RDFS and OWL demonstrate, additional ontology languages can be built upon RDF.

The subject of an RDF statement is either a Uniform Resource Identifier (URI) or a blank node, both of which denote resources. Resources indicated by blank nodes are called anonymous resources; they are not directly identifiable from the RDF statement. The predicate is a URI which also indicates a resource, representing a relationship. The object is a URI, a blank node, or a Unicode string literal.

In Semantic Web applications, and in relatively popular applications of RDF like RSS and FOAF (Friend of a Friend), resources tend to be represented by URIs that intentionally denote, and can be used to access, actual data on the World Wide Web. But RDF, in general, is not limited to the description of Internet-based resources. In fact, the URI that names a resource does not have to be dereferenceable at all. For example, a URI that begins with "http:" and is used as the subject of an RDF statement does not necessarily have to represent a resource that is accessible via HTTP, nor does it need to represent a tangible, network-accessible resource; such a URI could represent absolutely anything. However, there is broad agreement that a bare URI (without a # symbol) which returns a 300-level coded response when used in an HTTP GET request should be treated as denoting the internet resource that it succeeds in accessing.

6.3 RDFS:

RDF Schema (variously abbreviated as RDFS, RDF(S), RDF-S, or RDF/S) is an extensible knowledge representation language, providing basic elements for the description of ontologies, otherwise called Resource Description Framework (RDF) vocabularies, intended to structure RDF resources.
The first version was published by the World Wide Web Consortium (W3C) in April 1998, and the final W3C recommendation was released in February 2004. Many RDFS components are included in the more expressive Web Ontology Language (OWL).

For example, rdfs:Class declares a resource as a class for other resources.
A typical example of an rdfs:Class is foaf:Person in the Friend of a Friend (FOAF) vocabulary. An instance of foaf:Person is a resource that is linked to the class using the rdf:type predicate, as in the following formal expression of the natural language sentence "John is a Person":

    John rdf:type foaf:Person

The definition of rdfs:Class is recursive: rdfs:Class is the rdfs:Class of any rdfs:Class.

rdfs:subClassOf allows hierarchies of classes to be declared. For example, the following declares that every Person is an Agent:

    foaf:Person rdfs:subClassOf foaf:Agent

Hierarchies of classes support inheritance of a property's domain and range from a class to its subclasses. The RDF Schema specification describes rdf:Property as the class of RDF properties; each member of the class is an RDF predicate.

The rdfs:domain of a predicate declares the class of the subject in a triple whose second component is that predicate. The rdfs:range of a predicate declares the class or datatype of the object in such a triple. For example, the following declarations express that the property ex:employer relates a subject, which is of type foaf:Person, to an object, which is of type foaf:Organization:

    ex:employer rdfs:domain foaf:Person
    ex:employer rdfs:range foaf:Organization

Given the previous two declarations, the following triple entails that ex:John is necessarily a foaf:Person, and ex:CompanyX is necessarily a foaf:Organization:

    ex:John ex:employer ex:CompanyX
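The domain/range entailment just described can be sketched in a few lines of Python. This is an illustrative sketch, not part of the original report; it treats triples as plain tuples rather than using a real RDF library, and the ex:/foaf: names follow the example above.

```python
# Triples mirroring the declarations in the text.
triples = {
    ("ex:employer", "rdfs:domain", "foaf:Person"),
    ("ex:employer", "rdfs:range", "foaf:Organization"),
    ("ex:John", "ex:employer", "ex:CompanyX"),
}

def infer_types(triples):
    """Apply the RDFS domain/range entailment rules:
    if p has domain C, the subject of any (s, p, o) is of type C;
    if p has range D, the object of any (s, p, o) is of type D."""
    domains = {s: o for s, p, o in triples if p == "rdfs:domain"}
    ranges = {s: o for s, p, o in triples if p == "rdfs:range"}
    inferred = set()
    for s, p, o in triples:
        if p in domains:
            inferred.add((s, "rdf:type", domains[p]))
        if p in ranges:
            inferred.add((o, "rdf:type", ranges[p]))
    return inferred

# Prints both inferred rdf:type triples: ex:John is a foaf:Person,
# and ex:CompanyX is a foaf:Organization.
print(infer_types(triples))
```

A real RDFS reasoner applies many more entailment rules, but this captures the mechanism the two declarations rely on.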
rdfs:subPropertyOf is an instance of rdf:Property that is used to state that all resources related by one property are also related by another.

Example statement: "Abhijit stays in Pune."

Figure 4. RDF Example

RDF triple: (Abhijit, stays in, Pune)

This can be mapped to a schema which contains the classes "Citizen" and "Country". If a citizen "abc" stays in a country "X", then "X" also involves "abc". The class Citizen has subclasses "Voting Citizen" and "Non-Voting Citizen", and the Country class has a subclass "State", which in turn has subclasses "City", "Town" and "Taluka", represented by the "subClassOf" property.
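The subclass hierarchy in this example can likewise be sketched programmatically. This is an illustrative sketch, not part of the original report; the class names are adapted from the figure description above, and each class is assumed to have a single parent.

```python
# rdfs:subClassOf links from the Citizen/Country schema described above,
# stored as child -> parent.
subclass_of = {
    "VotingCitizen": "Citizen",
    "NonVotingCitizen": "Citizen",
    "City": "State",
    "Town": "State",
    "Taluka": "State",
    "State": "Country",
}

def superclasses(cls):
    """Walk the subClassOf chain upward, collecting every superclass."""
    result = []
    while cls in subclass_of:
        cls = subclass_of[cls]
        result.append(cls)
    return result

print(superclasses("City"))  # ['State', 'Country']
```

Because rdfs:subClassOf is transitive, a resource typed as City is also a State and a Country, which is exactly what the walk computes.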
The rectangles represent properties; ellipses in the RDFS layer represent classes, while ellipses in the RDF layer represent instances. The domain and range enforce constraints on the subjects and objects of a property. So, the diagram above suggests that the subject (Abhijit Thatte) is a "type" of voting citizen, the object (Pune) is a "type" of city, and the relationship between them is "stays in" or "resides in".

6.4 OWL:

The Web Ontology Language (OWL) is a family of knowledge representation languages for authoring ontologies, endorsed by the World Wide Web Consortium. They are characterised by formal semantics and RDF/XML-based serializations for the Semantic Web. OWL has attracted academic, medical and commercial interest.

In October 2007, a new W3C working group was started to extend OWL with several new features as proposed in the OWL 1.1 member submission. This new version, called OWL 2, soon found its way into semantic editors such as Protégé and semantic reasoners such as Pellet, RacerPro and FaCT++. W3C announced the new version on 27 October 2009.

The OWL family contains many species, serializations, syntaxes and specifications with similar names. This may be confusing unless a consistent approach is adopted. OWL and OWL2 will be used to refer to the 2004 and 2009 specifications, respectively. Full species names will be used, including specification version (for example, OWL2 EL). When referring more generally, "OWL Family" will be used.

The data described by an ontology in the OWL family is interpreted as a set of "individuals" and a set of "property assertions" which relate these individuals to each other. An ontology consists of a set of axioms which place constraints on sets of individuals (called "classes") and the types of relationships permitted between them. These axioms provide semantics by allowing systems to infer additional information based on the data explicitly
provided. A full introduction to the expressive power of OWL is given in the W3C's OWL Guide.

Example: An ontology describing families might include axioms stating that a "hasMother" property is only present between two individuals when "hasParent" is also present, and that individuals of class "HasTypeOBlood" are never related via "hasParent" to members of the "HasTypeABBlood" class. If it is stated that the individual Harriet is related via "hasMother" to the individual Sue, and that Harriet is a member of the "HasTypeOBlood" class, then it can be inferred that Sue is not a member of "HasTypeABBlood".

6.5 Microformat:
A microformat (sometimes abbreviated μF) is a web-based approach to semantic markup that seeks to re-use existing HTML/XHTML tags to convey metadata and other attributes in web pages and other contexts that support (X)HTML, such as RSS. This approach allows information intended for end-users (such as contact information, geographic coordinates, calendar events, and the like) to also be processed automatically by software.

Although the content of web pages has technically been capable of "automated processing" since the inception of the web, such processing is difficult because the traditional markup tags used to display information on the web do not describe what the information means. Microformats are intended to bridge this gap by attaching semantics, and thereby obviate other, more complicated methods of automated processing, such as natural language processing or screen scraping. The use, adoption and processing of microformats enables data items to be indexed, searched for, saved and cross-referenced, so that information can be reused or combined.

Current microformats allow the encoding and extraction of events, contact information, social relationships and so on, and more are being developed. Version 3 of the Firefox
browser, as well as version 8 of Internet Explorer, is expected to include native support for microformats.

Microformats emerged as part of a grassroots movement to make recognizable data items (such as events, contact details or geographical locations) capable of automated processing by software, as well as directly readable by end-users. Link-based microformats emerged first. These include vote links that express opinions of the linked page, which can be tallied into instant polls by search engines.

As the microformats community grew, CommerceNet, a nonprofit organization that promotes electronic commerce on the Internet, helped sponsor and promote the technology and support the microformats community in various ways. CommerceNet also helped co-found the microformats.org community site. Neither CommerceNet nor microformats.org is a standards body; the microformats community is an open wiki, mailing list, and Internet Relay Chat (IRC) channel. Most of the existing microformats were created at the microformats.org wiki and its associated mailing list, by a process of gathering examples of web publishing behaviour and then codifying it. Some other microformats (such as rel=nofollow and unAPI) have been proposed, or developed, elsewhere.

Example:
In this example, the contact information is presented as follows:

 <div>
   <div>Joe Doe</div>
   <div>The Example Company</div>
   <div>604-555-1234</div>
   <a href="http://example.com/">http://example.com/</a>
 </div>

With hCard microformat markup, that becomes:

 <div class="vcard">
   <div class="fn">Joe Doe</div>
   <div class="org">The Example Company</div>
   <div class="tel">604-555-1234</div>
   <a class="url" href="http://example.com/">http://example.com/</a>
 </div>

Here, the formatted name (fn), organisation (org), telephone number (tel) and web address (url) have been identified using specific class names, and the whole thing is wrapped in class="vcard", which indicates that the other classes form an hCard (short for "HTML vCard") and are not merely coincidentally named. Other, optional hCard classes also exist. It is now possible for software, such as browser plug-ins, to extract the information and transfer it to other applications, such as an address book.
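The extraction step just described can be sketched in code. The following is a simplified parser built on Python's standard-library HTMLParser, not a real hCard library: it assumes the fields are not nested inside one another and that each field's text sits directly inside its element.

```python
from html.parser import HTMLParser

class HCardParser(HTMLParser):
    """Collect the text of elements whose class names are hCard fields.

    Simplified sketch: assumes fields are flat (not nested) and that
    each field's text is directly inside its element.
    """
    FIELDS = {"fn", "org", "tel"}

    def __init__(self):
        super().__init__()
        self.current = None   # field currently being read, if any
        self.card = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = attrs.get("class", "").split()
        for c in classes:
            if c in self.FIELDS:
                self.current = c
        if "url" in classes:  # the URL lives in the href attribute
            self.card["url"] = attrs.get("href", "")

    def handle_endtag(self, tag):
        self.current = None

    def handle_data(self, data):
        if self.current:
            self.card[self.current] = self.card.get(self.current, "") + data.strip()

html = """
<div class="vcard">
  <div class="fn">Joe Doe</div>
  <div class="org">The Example Company</div>
  <div class="tel">604-555-1234</div>
  <a class="url" href="http://example.com/">http://example.com/</a>
</div>
"""

parser = HCardParser()
parser.feed(html)
print(parser.card["fn"])   # Joe Doe
print(parser.card["tel"])  # 604-555-1234
```

This is the kind of logic a browser plug-in could use before handing the extracted card to an address book.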
7. Practical Illustration of a Semantic Web Application:
Suppose that a certain Professor Anjali Sharma wishes to make web pages for herself, encompassing a faculty page, a research page, a blog site and a staff listing page. Using traditional web modelling, the pages would look like so:

Figure 5. Traditional Web Model: "Prof. xyz" linked separately to a faculty page, a research page, a blog site and a staff listing page.

Now, if she decides to use the semantic web instead of the traditional web model, the interlinking and presentability of the web pages increase immensely. We can link Professor Sharma's faculty page to her research, then link the data in her blog to both of these, and link her profile data to her staff listing. Her staff listing could show some of the other academics she works with, and her research page could show her links with worldwide research collaborators, who also know one of her colleagues, who comments on Professor Sharma's blog regularly. With all this data able to be displayed simply, the semantic model provides a much richer user experience and offers information that previously might not have been exposed. The web pages would now look like:
Figure 6. Semantic Web Model
The straight lines show the relationships between the various web pages, researchers, staff and other web entities. The intertwined links illustrate the complex relations between the entities and the data that can be viewed.
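In RDF terms, the links in Figure 6 are simply triples. A sketch in Turtle notation using the FOAF vocabulary might look as follows; foaf:name, foaf:homepage, foaf:weblog and foaf:knows are real FOAF properties, but the ex: URIs are invented purely for illustration.

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex:   <http://example.org/> .   # invented namespace for the example

ex:AnjaliSharma a foaf:Person ;
    foaf:name     "Anjali Sharma" ;
    foaf:homepage ex:faculty-page ;
    foaf:weblog   ex:blog ;
    foaf:knows    ex:colleague1, ex:collaborator1 .

# A worldwide collaborator, known to the colleague who comments on the blog:
ex:colleague1    foaf:knows ex:collaborator1 .
ex:collaborator1 a foaf:Person .
```

Once the relationships are published this way, any semantic web agent can traverse them, which is exactly what the figure's intertwined links depict.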
8. Difference between Web 1.0, Web 2.0 and Web 3.0:

Web 1.0: The Internet before 1999; experts call it the Read-Only era. The average internet user's role was limited to reading the information presented to him. The best examples are the millions of static websites which mushroomed during the dot-com boom. There was no active communication or information flow from the consumer of the information to its producer.

Web 2.0: The lack of active interaction between the common user and the web led to the birth of Web 2.0. The year 1999 marked the beginning of a Read-Write-Publish era, with notable contributions from LiveJournal (launched in April 1999) and Blogger (launched in August 1999). Now even a non-technical user can actively interact with and contribute to the web using different blog platforms. This era empowered the common user with a few new concepts, viz. blogs, social media and video streaming. Publishing your content is only a few clicks away! A few remarkable developments of Web 2.0 are Twitter, YouTube, eZineArticles, Flickr and Facebook.

Web 3.0: It seems we have everything we wished for in Web 2.0, but the web is still far behind when it comes to intelligence. Perhaps a six-year-old child has better analytical abilities than the existing search technologies! The keyword-based search of Web 2.0 resulted in information overload. The following attributes are going to be a part of Web 3.0:
 - Contextual search
 - Tailor-made search
 - Personalized search
 - Evolution of the 3D web
 - Deductive reasoning

Though the web is yet to see something which can be termed fairly intelligent, the efforts to achieve this goal have already begun. Two weeks back, the Official Google Blog described how the Google search algorithm is getting more intelligent, as it can now identify many synonyms. For example, "pictures" and "photos" are now treated as similar in meaning. From now onwards, your search query "GM crop" will not lead you to the GM (General Motors) website. Why? Because, first,
by synonym identification, Google will understand that "GM" may mean General Motors or Genetically Modified. Then, by context, i.e. by the keyword "crop", it will deduce that the user wants information on genetically modified crops and not on General Motors. Similarly, "GM car" will not lead you to genetically modified crops. Try it out yourself to see how this newly added artificial intelligence works in Google. Also, there are many websites built on Web 3.0 which personalize your search. The web is indeed getting intelligent.
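The two-step behaviour just described — synonym expansion followed by contextual disambiguation — can be sketched in toy form. The synonym table and context keywords below are invented for illustration; Google's actual algorithm is, of course, not public and far more involved.

```python
# Toy sketch of synonym expansion + context-based disambiguation.
# The tables below are invented for illustration only.

SENSES = {
    "gm": {
        "General Motors":       {"car", "vehicle", "dealer"},
        "Genetically Modified": {"crop", "food", "gene"},
    }
}

def disambiguate(query):
    """Pick a sense for each ambiguous term using the other query words."""
    words = query.lower().split()
    resolved = []
    for w in words:
        senses = SENSES.get(w)
        if not senses:
            resolved.append(w)
            continue
        context = set(words) - {w}
        # Step 1: synonym identification gives the candidate senses;
        # Step 2: the sense sharing the most context keywords wins.
        best = max(senses, key=lambda s: len(senses[s] & context))
        resolved.append(best)
    return " ".join(resolved)

print(disambiguate("GM crop"))  # Genetically Modified crop
print(disambiguate("GM car"))   # General Motors car
```

The same query term resolves differently depending on its neighbours, which is precisely the "GM crop" versus "GM car" behaviour described above.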
9. Challenges:
1. Vastness: The World Wide Web contains at least 48 billion pages as of this writing (August 2, 2009). The SNOMED CT medical terminology ontology alone contains 370,000 class names, and existing technology has not yet been able to eliminate all semantically duplicated terms. Any automated reasoning system will have to deal with truly huge inputs.
2. Vagueness: These are imprecise concepts like "young" or "tall". Vagueness arises in user queries, in the concepts represented by content providers, in matching query terms to provider terms, and in trying to combine different knowledge bases with overlapping but subtly different concepts. Fuzzy logic is the most common technique for dealing with vagueness.
3. Uncertainty: These are precise concepts with uncertain values. For example, a patient might present a set of symptoms which correspond to a number of distinct diagnoses, each with a different probability. Probabilistic reasoning techniques are generally employed to address uncertainty.
4. Inconsistency: These are logical contradictions which will inevitably arise during the development of large ontologies. Deductive reasoning fails catastrophically when faced with inconsistency, because "anything follows from a contradiction".
5. Deceit: This is when the producer of the information is intentionally misleading the consumer of the information. Cryptographic techniques are currently utilized to alleviate this threat.
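For challenge 2, fuzzy logic replaces the yes/no membership of a vague concept with a degree between 0 and 1. A minimal sketch follows; the 160 cm and 190 cm thresholds are arbitrary choices for the example, not part of any standard.

```python
def tall(height_cm):
    """Fuzzy membership degree of "tall" for a height in centimetres.

    Piecewise linear: 0 below 160 cm, 1 above 190 cm, linear in between.
    The thresholds are arbitrary example values.
    """
    if height_cm <= 160:
        return 0.0
    if height_cm >= 190:
        return 1.0
    return (height_cm - 160) / 30

print(tall(150))  # 0.0  -> definitely not tall
print(tall(175))  # 0.5  -> somewhat tall
print(tall(195))  # 1.0  -> definitely tall
```

A fuzzy query engine would rank results by such degrees instead of filtering on a hard cut-off, which is how vague query terms can still return useful answers.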
10. Project Implementation:
This section provides some example projects and tools, but is necessarily incomplete. The choice of projects is somewhat arbitrary but serves illustrative purposes. It is also remarkable that, even at this early stage of the development of semantic web technology, it is already possible to compile a list of hundreds of components that in one way or another can be used in building or extending semantic webs.

A). DBPEDIA
DBpedia is an effort to publish structured data extracted from Wikipedia: the data is published in RDF and made available on the Web for use under the GNU Free Documentation License, thus allowing Semantic Web agents to provide inferencing and advanced querying over the Wikipedia-derived dataset and facilitating its interlinking, re-use and extension in other data sources.

B). FOAF
A popular application of the semantic web is Friend of a Friend (FOAF), which uses RDF to describe the relationships people have with other people and the "things" around them. FOAF permits intelligent agents to make sense of the thousands of connections people have with each other, their jobs and the items important to their lives — connections that may or may not be enumerated in searches using traditional web search engines. Because the connections are so vast in number, human interpretation of the information may not be the best way of analyzing them. FOAF is an example of how the Semantic Web attempts to make use of relationships within a social context.

C). GOODRELATIONS FOR E-COMMERCE
A huge potential for Semantic Web technologies lies in adding data structure and typed links to the vast amount of offer data, product model features, and tendering / request-for-quotation data.
The GoodRelations ontology is a popular vocabulary for expressing product information, prices, payment options, etc. It also allows expressing demand in a straightforward fashion. GoodRelations has been adopted by BestBuy, Yahoo, OpenLink Software, O'Reilly Media, the Book Mashup, and many others.

D). SIOC
The SIOC Project (Semantically-Interlinked Online Communities) provides a vocabulary of terms and relationships that model web data spaces. Examples of such data spaces include, among others: discussion forums, weblogs, blogrolls / feed subscriptions, mailing lists, shared bookmarks and image galleries.

E). SIMILE
SIMILE (Semantic Interoperability of Metadata and Information in unLike Environments) is a joint project, conducted by the MIT Libraries and MIT CSAIL, which seeks to enhance interoperability among digital assets, schemata/vocabularies/ontologies, metadata, and services.

F). NEXTBIO
NextBio is a database consolidating high-throughput life-sciences experimental data, tagged and connected via biomedical ontologies, and accessible via a search-engine interface. Researchers can contribute their findings for incorporation into the database. The database currently supports gene- and protein-expression data and is steadily expanding to support other biological data types.

G). LINKING OPEN DATA
(Figure: datasets in the Linking Open Data project as of September 2008, and class linkages within the Linking Open Data datasets.)
The Linking Open Data project is a W3C-led effort to create openly accessible, interlinked RDF data on the Web. The data in question takes the form of RDF data sets
drawn from a broad collection of data sources. There is a focus on the Linked Data style of publishing RDF on the Web.

H). OPENPSI
OpenPSI (the OpenPSI project) is a community effort to create a UK government linked-data service that supports research. It is a collaboration between the University of Southampton and the UK government, led by OPSI at the National Archives, and is supported by JISC funding.

I). ERFGOEDPLUS.BE
Erfgoedplus.be (heritage-plus) is a Belgian project aimed at disclosing all types of heritage from the provinces of Limburg and Flemish Brabant and the city of Leuven to the public by applying semantic web technology. It uses RDF/XML, OWL and SKOS to describe relationships to heritage types, concepts, objects, people, place and time. Data are normalized and enriched by means of thesauri (AAT) and an ontology (CIDOC CRM), available for input and conversion. Erfgoedplus.be is a regional aggregator for EuropeanaLocal (Europeana) and an example of how semantic web technology is applied within the heterogeneous context of heritage.
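Linked datasets such as DBpedia (project A above) are typically queried with SPARQL. As a hypothetical sketch, a query for English abstracts of the resource labelled "Semantic Web" might look like the following; the dbo:abstract property follows DBpedia ontology conventions, but the exact names would need to be checked against the live endpoint.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:  <http://dbpedia.org/ontology/>

SELECT ?page ?abstract WHERE {
  ?page rdfs:label "Semantic Web"@en ;
        dbo:abstract ?abstract .
  FILTER (lang(?abstract) = "en")
}
LIMIT 5
```

Queries of this shape are what let Semantic Web agents perform the "inferencing and advanced querying" over the Wikipedia-derived dataset mentioned earlier.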
11. CONCLUSION
The Semantic Web is the future of the Internet. It is expected to rewrite the Internet as we know it and change the way we search for information on the net. Searches will become personalized, and the results will be more accurate and more relevant. The use of the Resource Description Framework and microformats will help in the advent of this technology. Although there are many challenges that must be overcome, the possibility of this technology replacing the traditional web model currently seems bright. The traditional model of the Internet does not allow for intelligent searches and wastes a lot of time because irrelevant results are displayed as well. The Semantic Web can overcome these problems to provide a better and richer user experience to consumers all over the globe. The next generation of the web will better connect people and will further advance the information technology revolution.
12. REFERENCES
1. S. Decker and S. Melnik (Stanford University), "The Semantic Web: The Roles of XML and RDF", IEEE Internet Computing.
2. A. Gómez-Pérez and O. Corcho (Universidad Politécnica de Madrid), "Ontology Languages for the Semantic Web", IEEE Intelligent Systems.
3. A. Sheth (Kno.e.sis, Wright State University), "Semantics Scales Up: Beyond Search in Web 3.0", published by the IEEE Computer Society, November/December 2011.
4. T. Berners-Lee, "Semantic Web Road Map", W3C.