[DSBW Spring 2010] Unit 10: XML and Web And beyond

Transcript

  • 1.
    • XML
      • DTD, XMLSchema
      • XSL, XQuery
    • Web Services
      • SOAP, WSDL
      • RESTful Web Services
    • Semantic Web
      • Introduction
      • RDF, RDF Schema, OWL, SPARQL
    Unit 10: XML and Web and Beyond
  • 2.
    • “... is a simple, very flexible text format derived from SGML (ISO 8879). Originally designed to meet the challenges of large-scale electronic publishing, XML is also playing an increasingly important role in the exchange of a wide variety of data on the Web and elsewhere.”
    • W3 Consortium
    • XML …
      • is not a solution but a tool to build solutions
      • is not a language but a meta-language: interoperating applications that use it must adopt clear conventions on how to use it
      • is a standardized text format that is used to represent structured information
    eXtensible Markup Language
  • 3. SGML, XML and their applications
    • [Diagram] Meta-markup languages: SGML, XML
    • [Diagram] Markup languages (applications): HyTime, HTML, XHTML, SMIL, SOAP, WML
  • 4.
    • The document has exactly one root element
    • The root element can be preceded by an optional XML declaration
    • Non-empty elements are delimited by both a start-tag and an end-tag.
    • Empty elements are marked with an empty-element (self-closing) tag
    • Tags may be nested but must not overlap
    • All attribute values are quoted with either single (') or double (") quotes
          • <?xml version="1.0" encoding="UTF-8"?>
          • <address>
          • <street>
          • <line>123 Pine Rd.</line>
          • </street>
          • <city name="Lexington"/>
          • <state abbrev="SC"/>
          • <zip base="19072" plus4=""/>
          • </address>
    Well-Formed XML Documents
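The well-formedness rules above can be checked mechanically by any XML parser. The following is a minimal Python sketch, assuming the standard-library `xml.etree.ElementTree` parser; the helper name `is_well_formed` and the three broken sample strings are illustrative and not part of the original slides.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Shortened version of the address document from the slide.
good = ('<address><street><line>123 Pine Rd.</line></street>'
        '<city name="Lexington"/><state abbrev="SC"/></address>')

# Illustrative violations of individual rules (not from the slides).
overlapping = '<address><street><line>123 Pine Rd.</street></line></address>'
two_roots = '<address/><address/>'
unquoted = '<city name=Lexington/>'

for sample in (good, overlapping, two_roots, unquoted):
    print(is_well_formed(sample), sample)
```

Note that this only tests well-formedness; it says nothing about validity against a DTD or XML Schema.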
  • 5.
    • Are well-formed XML documents
    • Are documents that conform to the rules defined by a given schema
    • Schema: defines the legal building blocks of an XML document. It defines the document structure with a list of legal elements. Two ways to define a schema:
      • DTD: Document Type Definition
      • XML Schema
    Valid XML Documents
  • 6.
      • <?xml version="1.0" encoding="UTF-8"?>
      • <!DOCTYPE address [
      • <!ELEMENT address (street, city, state, zip)>
      • <!ELEMENT street (line+)>
      • <!ELEMENT line (#PCDATA)>
      • <!ELEMENT city (#PCDATA)>
      • <!ELEMENT state (#PCDATA)>
      • <!ELEMENT zip (#PCDATA)> ]>
      • <address> ... </address>
      • <?xml version="1.0" encoding="UTF-8"?>
      • <!DOCTYPE address SYSTEM "http://dtd.mycompany.com/address.dtd">
      • <address> ... </address>
    DTD Example: Embedded and External Definitions
  • 7.
    • DTD is not integrated with Namespace technology so users cannot import and reuse code
    • DTD does not support data types other than character data
    • DTD syntax is not XML compliant
    • DTD language constructs are not extensible
    DTD Limitations
  • 8.
    • <?xml version="1.0" encoding="UTF-8"?>
    • <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
    • <xsd:import namespace=" "/>
    • <xsd:element name="address">
    • <xsd:complexType>
    • <xsd:sequence>
    • <xsd:element name="street">
    • <xsd:complexType>
    • <xsd:sequence>
    • <xsd:element name="line" type="xsd:string" maxOccurs="unbounded"/>
    • </xsd:sequence>
    • </xsd:complexType>
    • </xsd:element>
    • <xsd:element name="city" type="xsd:string"/>
    • <xsd:element name="state" type="xsd:string"/>
    • <xsd:element name="zip" type="xsd:string"/>
    • </xsd:sequence>
    • </xsd:complexType>
    • </xsd:element>
    • </xsd:schema>
    XML Schema: Example
  • 9.
    • Using a programming language and the SAX API.
      • SAX is a lexical, event-driven interface in which a document is read serially and its contents are reported as "callbacks" to various methods on a handler object of the user's design
    • Using a programming language and the DOM API.
      • DOM allows for navigation of the entire document as if it were a tree of "Node" objects representing the document's contents.
    • Using a transformation engine and a filter
      • XSLT, XQuery, etc
    Processing XML Documents
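To contrast the first two processing styles, here is a small Python sketch using the standard-library `xml.sax` and `xml.dom.minidom` modules. The input is a shortened version of the address document from the well-formedness slide; the handler class name `ElementRecorder` is an assumption made for illustration.

```python
import xml.sax
from xml.dom import minidom

DOC = ('<address><street><line>123 Pine Rd.</line></street>'
       '<city name="Lexington"/><state abbrev="SC"/></address>')

class ElementRecorder(xml.sax.ContentHandler):
    """SAX: the parser calls back into this handler as it reads serially."""
    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        # Called once per start-tag, in document order.
        self.names.append(name)

handler = ElementRecorder()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_names = handler.names

# DOM: the whole document is loaded as a tree that can be navigated freely.
dom = minidom.parseString(DOC)
dom_city_name = dom.getElementsByTagName("city")[0].getAttribute("name")
dom_line_text = dom.getElementsByTagName("line")[0].firstChild.data

print(sax_names)
print(dom_city_name, dom_line_text)
```

SAX keeps no tree in memory, so it suits large documents read once; DOM trades memory for random access to any node.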
  • 10.
    • Alternative/complement to HTML
      • XML + CSS, XML + XSL, XHTML
    • Declarative application programming/configuration
      • Configuration files, descriptors, etc.
    • Data exchange among heterogeneous systems
      • B2B, e-commerce: ebXML
    • Data Integration from heterogeneous sources
      • Schema mediation
    • Data storage and processing
      • XML Databases, XQuery (XPath)
    • Protocol definition
      • SOAP, WAP, WML, etc.
    XML Uses
  • 11.
    • XSL serves the dual purpose of
      • transforming XML documents
      • exhibiting control over document rendering
    • XSL consists of two parts:
      • XSL Transformations (XSLT):
        • An XML language for transforming XML documents
        • It uses the XML Path Language (XPath) to search and traverse the element hierarchy of XML documents
      • XSL Formatting Objects (XSL-FO):
        • An XML language for specifying the visual formatting of an XML document.
        • Functionally, it is a superset of CSS, designed to support print layouts.
    eXtensible Stylesheet Language: XSL
  • 12.
    • <bib>
    • <book year="1994">
    • <title>TCP/IP Illustrated</title>
    • <author><last>Stevens</last><first>W.</first></author>
    • <publisher>Addison-Wesley</publisher>
    • <price>65.95</price>
    • </book>
    • <book year="1992">
    • <title>Advanced Programming in the Unix environment</title>
    • <author><last>Stevens</last><first>W.</first></author>
    • <publisher>Addison-Wesley</publisher>
    • <price>65.95</price>
    • </book>
    • <book year="2000">
    • <title>Data on the Web</title>
    • <author><last>Abiteboul</last><first>Serge</first></author>
    • <author><last>Suciu</last><first>Dan</first></author>
    • <publisher>Morgan Kaufmann Publishers</publisher>
    • <price>39.95</price>
    • </book>
    • </bib>
    XQuery (XML Query): Example (source)
  • 13.
    • <results>
    • { let $a := doc("http://bstore1.example.com/bib.xml")//author
    • for $last in distinct-values($a/last),
    • $first in distinct-values($a[last=$last]/first)
    • order by $last, $first
    • return
    • <author>
    • <name>
    • <last> { $last } </last><first> { $first } </first>
    • </name>
    • { for $b in doc("http://bstore1.example.com/bib.xml")/bib/book
    • where some $ba in $b/author
    • satisfies ($ba/last = $last and $ba/first=$first)
    • return $b/title }
    • </author> }
    • </results>
    XQuery (XML Query): Example (query). For each author, retrieve the last and first names and the titles of that author's books, ordered by last name, then first name
  • 14.
    • <results>
    • <author>
    • <name>
    • <last> Abiteboul </last><first> Serge </first>
    • </name>
    • <title>Data on the Web</title>
    • </author>
    • <author>
    • <name>
    • <last> Stevens </last><first> W. </first>
    • </name>
    • <title>TCP/IP Illustrated</title>
    • <title>Advanced Programming in the Unix environment</title>
    • </author>
    • <author>
    • <name>
    • <last> Suciu </last><first> Dan </first>
    • </name>
    • <title>Data on the Web</title>
    • </author>
    • </results>
    XQuery (XML Query): Example (result)
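For readers without an XQuery engine at hand, the same grouping can be approximated in Python. This is only a sketch of the query's logic using the standard-library `xml.etree.ElementTree`, not an XQuery implementation; it embeds the bibliography from the source slide as a string instead of fetching it from the URL in the query, and it omits the publisher and price elements, which the query does not use.

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
  <book year="1994">
    <title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author>
  </book>
  <book year="1992">
    <title>Advanced Programming in the Unix environment</title>
    <author><last>Stevens</last><first>W.</first></author>
  </book>
  <book year="2000">
    <title>Data on the Web</title>
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Suciu</last><first>Dan</first></author>
  </book>
</bib>"""

root = ET.fromstring(BIB)

# Group titles by (last, first), mirroring the nested "for" in the query.
titles_by_author = {}
for book in root.findall("book"):
    title = book.findtext("title")
    for author in book.findall("author"):
        key = (author.findtext("last"), author.findtext("first"))
        titles_by_author.setdefault(key, []).append(title)

# "order by $last, $first" corresponds to sorting on the key tuple.
result = sorted(titles_by_author.items(), key=lambda item: item[0])

for (last, first), titles in result:
    print(last, first, titles)
```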
  • 15.
    • People and communities have data stores and applications to share
    • Vision :
      • Expand the Web to include more machine-understandable resources
      • Enable global interoperability between resources you know should be interoperable as well as those you don't yet know should be interoperable
    • Key Web technologies:
    • Web Services: Web of Programs
      • Standards for interactions between programs, linked on the Web
      • Easier to Expose and Use services (and data they provide)
    • Semantic Web: Web of Data
      • Standards for things, relationships and descriptions, linked on the Web
      • Easier to Understand, Search for, Share, Re-Use, Aggregate, Extend information
    A Smarter Web Is Possible
  • 16. Web Services
    • “A Web service is a software system designed to support interoperable machine-to-machine interaction over a network. It has an interface described in a machine-processable format (specifically WSDL). Other systems interact with the Web service in a manner prescribed by its description using SOAP messages, typically conveyed using HTTP with an XML serialization in conjunction with other Web-related standards.” Web Services Glossary, W3C, http://www.w3.org/TR/ws-gloss/
    UDDI: Universal Description, Discovery and Integration
  • 17.
    • SOAP is a simple XML-based protocol that lets applications exchange information over HTTP.
    • A SOAP message is an XML document containing the following elements:
      • A required Envelope element that identifies the XML document as a SOAP message
      • An optional Header element that contains header information
      • A required Body element that contains call and response information
      • An optional Fault element that provides information about errors that occurred while processing the message
    Simple Object Access Protocol (SOAP)
  • 18.
      • POST /InStock HTTP/1.1
      • Host: www.stock.org
      • Content-Type: application/soap+xml; charset=utf-8
      • Content-Length: nnn
      • <?xml version="1.0"?>
      • <soap:Envelope
      • xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
      • soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
      • <soap:Body xmlns:m="http://www.stock.org/stock">
      • <m:GetStockPrice>
      • <m:StockName>IBM</m:StockName>
      • </m:GetStockPrice>
      • </soap:Body>
      • </soap:Envelope>
    SOAP Request: Example
  • 19.
      • HTTP/1.1 200 OK
      • Content-Type: application/soap+xml; charset=utf-8
      • Content-Length: nnn
      • <?xml version="1.0"?>
      • <soap:Envelope
      • xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
      • soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
      • <soap:Body xmlns:m="http://www.stock.org/stock">
      • <m:GetStockPriceResponse>
      • <m:Price>34.5</m:Price>
      • </m:GetStockPriceResponse>
      • </soap:Body>
      • </soap:Envelope>
    SOAP Response: Example
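The request and response bodies above are ordinary namespaced XML, so they can be built and read with a generic XML library. The sketch below uses Python's standard-library `xml.etree.ElementTree` and the namespace URIs shown on the slides; the function names `build_request` and `parse_price` are hypothetical, and no HTTP transport is included.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2001/12/soap-envelope"
STOCK_NS = "http://www.stock.org/stock"

def q(namespace: str, tag: str) -> str:
    """Build the '{namespace}tag' form that ElementTree uses for qualified names."""
    return f"{{{namespace}}}{tag}"

def build_request(stock_name: str) -> bytes:
    """Serialize an Envelope/Body/GetStockPrice message like the request slide."""
    envelope = ET.Element(q(SOAP_NS, "Envelope"))
    body = ET.SubElement(envelope, q(SOAP_NS, "Body"))
    call = ET.SubElement(body, q(STOCK_NS, "GetStockPrice"))
    ET.SubElement(call, q(STOCK_NS, "StockName")).text = stock_name
    return ET.tostring(envelope, encoding="utf-8")

def parse_price(response_xml: str) -> float:
    """Extract the Price value from a response shaped like the response slide."""
    root = ET.fromstring(response_xml)
    path = "/".join([q(SOAP_NS, "Body"),
                     q(STOCK_NS, "GetStockPriceResponse"),
                     q(STOCK_NS, "Price")])
    price = root.find(path)
    if price is None:
        raise ValueError("no Price element in SOAP body")
    return float(price.text)

RESPONSE = """<?xml version="1.0"?>
<soap:Envelope
    xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
    soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
  <soap:Body xmlns:m="http://www.stock.org/stock">
    <m:GetStockPriceResponse>
      <m:Price>34.5</m:Price>
    </m:GetStockPriceResponse>
  </soap:Body>
</soap:Envelope>"""

print(build_request("IBM").decode("utf-8"))
print(parse_price(RESPONSE))
```

The serializer picks its own namespace prefixes (for example `ns0`), which is equivalent to the `soap:` and `m:` prefixes on the slides because only the namespace URIs are significant.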
  • 20. Web Services Description Language (WSDL)
    • A WSDL document describes a web service using these major elements:
      • <portType>: The operations performed by the web service
      • <message>: The messages used by the web service
      • <types>: The data types used by the web service
      • <binding>: The communication protocols used by the web service
    • <definitions>
      • <types>
      • type definition ......
      • </types>
      • <message>
      • message definition ...
      • </message>
      • <portType>
      • port definition ....
      • </portType>
      • <binding>
      • binding definition ..
      • </binding>
    • </definitions>
  • 21.
        • <message name="getStockPriceRequest">
        • <part name="StockName" type="xs:string"/>
        • </message>
        • <message name="getStockPriceResponse">
        • <part name="Price" type="xs:float"/>
        • </message>
        • <portType name="StockMarket">
        • <operation name="getStockPrice">
        • <input message="getStockPriceRequest"/>
        • <output message="getStockPriceResponse"/>
        • </operation>
        • </portType>
    WSDL Document: Example (fragment)
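In a WSDL document, each operation's input and output must refer to a declared message. The following Python sketch checks that property on a simplified, namespace-free fragment like the one on this slide. The wrapping `<definitions>` element, the function name `unresolved_messages`, and the misspelled name `getStockPriceReply` are assumptions added for illustration; real WSDL uses XML namespaces and qualified message names, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

WSDL_TEMPLATE = """<definitions>
  <message name="getStockPriceRequest">
    <part name="StockName" type="xs:string"/>
  </message>
  <message name="getStockPriceResponse">
    <part name="Price" type="xs:float"/>
  </message>
  <portType name="StockMarket">
    <operation name="getStockPrice">
      <input message="getStockPriceRequest"/>
      <output message="{output}"/>
    </operation>
  </portType>
</definitions>"""

def unresolved_messages(wsdl_text: str) -> list:
    """List message names used by operations but never declared."""
    root = ET.fromstring(wsdl_text)
    declared = {message.get("name") for message in root.findall("message")}
    referenced = [child.get("message")
                  for operation in root.iter("operation")
                  for child in operation
                  if child.tag in ("input", "output")]
    return [name for name in referenced if name not in declared]

consistent = WSDL_TEMPLATE.format(output="getStockPriceResponse")
# Hypothetical typo in the output reference.
broken = WSDL_TEMPLATE.format(output="getStockPriceReply")

print(unresolved_messages(consistent))
print(unresolved_messages(broken))
```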
  • 22.
    • The overhead associated with SOAP can make it impractical in high-traffic scenarios
    • Representational State Transfer (REST): architectural style for networked systems based on the following principles:
      • Application state and functionality are abstracted into resources
      • Every resource is uniquely addressable by a URI
      • Client-Server: Clients pull resource representations
      • Stateless: each request from client to server must contain all needed information.
      • Uniform interface: all resources are accessed with a generic interface (HTTP-based)
      • Interconnected resource representations
      • Layered components - intermediaries, such as proxy servers, cache servers, to improve performance, security
    RESTful Web Services
  • 23.
    • A RESTful web service is a simple web service implemented using HTTP and the principles of REST.
    • A RESTful web service is a collection of resources. Its definition comprises:
      • The URI for the web service as a whole (<baseURI>)
      • A URI scheme to address individual resources, e.g. <baseURI>/<ID>
      • The MIME type of the data supported by the web service (JSON, XML)
      • The set of operations supported by the web service using HTTP methods:
        • POST: To create a resource on the server
        • GET: To retrieve the current state of the resource
        • PUT: To change the state of a resource or to update it
        • DELETE: To remove or delete a resource
    RESTful Web Services (cont.)
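The mapping between HTTP methods and resource operations can be sketched without any network code. The Python class below is an in-memory stand-in for a RESTful service, assuming JSON as the representation format; the class name `ResourceCollection`, the base URI `http://example.com/resources`, the ID scheme, and the chosen status codes are illustrative assumptions, not something prescribed by the slides.

```python
import json

class ResourceCollection:
    """In-memory stand-in for a RESTful service rooted at a base URI."""

    def __init__(self, base_uri):
        self.base_uri = base_uri.rstrip("/")
        self.items = {}      # resource ID -> current state
        self.next_id = 1

    def handle(self, method, uri, body=None):
        """Process one self-contained request; return (status, payload)."""
        if uri == self.base_uri:
            if method != "POST":
                return 405, None
            # POST on the collection creates a resource and returns its URI.
            resource_id = str(self.next_id)
            self.next_id += 1
            self.items[resource_id] = json.loads(body)
            return 201, f"{self.base_uri}/{resource_id}"

        prefix = self.base_uri + "/"
        resource_id = uri[len(prefix):] if uri.startswith(prefix) else None
        if resource_id not in self.items:
            return 404, None
        if method == "GET":        # retrieve the current state
            return 200, json.dumps(self.items[resource_id])
        if method == "PUT":        # replace the state
            self.items[resource_id] = json.loads(body)
            return 200, None
        if method == "DELETE":     # remove the resource
            del self.items[resource_id]
            return 204, None
        return 405, None

service = ResourceCollection("http://example.com/resources")
status, location = service.handle("POST", "http://example.com/resources", '{"name": "first"}')
print(status, location)
print(service.handle("GET", location))
```

Each call carries everything needed to process it (method, URI, body), which mirrors the stateless constraint; the only state kept is the resources themselves.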
  • 24. RESTful WS: Example (adapted from Wikipedia)
  • 25.
    • “The Web was designed as an information space, with the goal that it should be useful not only for human-human communication, but also that machines would be able to participate and help. One of the major obstacles to this has been the fact that most information on the Web is designed for human consumption, and even if it was derived from a database with well defined meanings (in at least some terms) for its columns, that the structure of the data is not evident to a robot browsing the web. Leaving aside the artificial intelligence problem of training machines to behave like people, the Semantic Web approach instead develops languages for expressing information in a machine processable form.”
    • “If HTML and the Web made all the online documents look like one huge book, RDF, schema, and inference languages will make all the data in the world look like one huge database”
    • Tim Berners-Lee
    Semantic Web = The Web of Data
  • 26. The Current Web (1/2)
    • Resources:
      • Identified by URIs
      • untyped
    • Links:
      • href, src, ...
      • limited, non-descriptive
    • Users:
      • A lot of information, but its meaning must be interpreted and deduced from the content, as has been done for millennia
    • Machines:
      • They don’t understand.
  • 27.
    • The Public Web
      • The web found when searching and browsing
      • At least 21 billion pages indexed by standard search engines
    • The Deep Web
      • Large data repositories that require their own internal searches.
      • About 6 trillion documents not indexed by standard search engines.
    • The Private Web
      • Password-protected sites and data: corporate intranets, private networks, subscription-based services, etc.
      • About 3 trillion documents not indexed by standard search engines.
    The Current Web (2/2)
  • 28. The Semantic Web
    • Resources:
      • Globally identified by URIs
      • or locally (Blank)
      • Extensible
      • Relational
    • Links:
      • Identified by URIs
      • Extensible
      • Relational
    • Users:
      • More and better information
    • Machines:
      • More processable information (Data Web)
  • 29.
    • Make web resources more accessible to automated processes
    • Extend existing rendering markup with semantic markup
      • Metadata (data about data) annotations that describe content/function of web accessible resources
    • Use Ontologies to provide vocabulary for annotations
      • “Formal specification” accessible to machines
    • A prerequisite is a standard web ontology language
      • Need to agree on a common syntax before we can share semantics
      • Syntactic web based on standards such as HTTP and HTML
    Semantic Web: How?
  • 30. Metadata annotations
  • 31.
    • Is it semantic?
      • Are the terms unambiguous and tagged in royalty-free format, governed by a nonprofit organization, that all software programs can understand?
    • Is it on the web?
      • Is it online using a common name space that makes it easily findable?
      • Is it shared among collaborators or companies?
      • Does it use the information already online to get smarter as more people use the system?
    The Semantic Web “Acid Test” (by D. Siegel)
  • 32. Semantic Web: W3C Standards and Tools
    • RDF (Resource Description Framework): simple data model to describe resources and their relationships
    • RDF Schema: a language for declaring the basic classes and types that describe the terms used in RDF; it allows defining class hierarchies
    • SPARQL: SPARQL Protocol and RDF Query Language
    • OWL: Web Ontology Language. Allows enriching the description of properties and classes, including, among others, class disjunction, association cardinality, richer data types, property features (e.g. symmetry), etc.
  • 33.
    • RDF is a graphical formalism (+ XML syntax + semantics)
      • for representing metadata
      • for describing the semantics of information in a machine-accessible way
    • RDF statements are <subject, predicate, object> triples that describe properties of resources:
    • <Carles, hasColleague, Ernest>
    • XML representation:
    • <Description about="some.uri/person/carles_farre">
    • <hasColleague resource="some.uri/person/ernest_teniente"/>
    • </Description>
    Resource Description Framework (RDF)
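To make the link between the XML representation and the triple explicit, the sketch below reads the simplified XML from this slide and emits <subject, predicate, object> tuples. It uses Python's standard-library `xml.etree.ElementTree` and handles only the simplified, namespace-free form shown here, not full RDF/XML; the function name `to_triples` is an assumption.

```python
import xml.etree.ElementTree as ET

RDF_XML = ('<Description about="some.uri/person/carles_farre">'
           '<hasColleague resource="some.uri/person/ernest_teniente"/>'
           '</Description>')

def to_triples(xml_text: str) -> list:
    """Turn one Description element into (subject, predicate, object) tuples."""
    description = ET.fromstring(xml_text)
    subject = description.get("about")
    # Each child element names a predicate; its "resource" attribute is the object.
    return [(subject, child.tag, child.get("resource")) for child in description]

triples = to_triples(RDF_XML)
print(triples)
```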
  • 34.
    • RDF Schema allows you to define vocabulary terms and the relations between those terms
      • it gives “extra meaning” to particular RDF predicates and resources
      • this “extra meaning”, or semantics, specifies how a term should be interpreted
    • Examples:
      • <Person, type, Class>
      • <hasColleague, type, Property>
      • <Professor, subClassOf, Person>
      • <Cristina, type, Professor>
      • <hasColleague, range, Person>
      • <hasColleague, domain, Person>
    RDF Schema
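The "extra meaning" of `subClassOf`, `domain`, and `range` can be illustrated as simple inference rules over a set of triples. The Python sketch below applies three such rules until nothing new is derived. It is a toy closure over the example statements from this slide plus the <Carles, hasColleague, Ernest> statement from the RDF slide, not a complete RDFS reasoner, and the function name `infer` is an assumption.

```python
TRIPLES = {
    ("Person", "type", "Class"),
    ("hasColleague", "type", "Property"),
    ("Professor", "subClassOf", "Person"),
    ("Cristina", "type", "Professor"),
    ("hasColleague", "range", "Person"),
    ("hasColleague", "domain", "Person"),
    ("Carles", "hasColleague", "Ernest"),   # statement from the RDF slide
}

def infer(triples):
    """Apply subClassOf, domain and range rules until a fixed point is reached."""
    known = set(triples)
    while True:
        new = set()
        for (s, p, o) in known:
            for (a, q, b) in known:
                # An instance of a class is also an instance of its superclass.
                if p == "type" and q == "subClassOf" and a == o:
                    new.add((s, "type", b))
                # The subject of a property belongs to the property's domain.
                if q == "domain" and a == p:
                    new.add((s, "type", b))
                # The object of a property belongs to the property's range.
                if q == "range" and a == p:
                    new.add((o, "type", b))
        if new <= known:
            return known
        known |= new

closure = infer(TRIPLES)
for triple in sorted(closure - TRIPLES):
    print(triple)
```

The derived statements are that Cristina, Carles, and Ernest are each of type Person, none of which was asserted directly.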
  • 35.
    • RDFS is too weak to describe resources in sufficient detail
      • No localized range and domain constraints
        • Can’t say that the range of hasChild is person when applied to persons and elephant when applied to elephants
      • No existence/cardinality constraints
        • Can’t say that all instances of person have a mother that is also a person, or that persons have exactly 2 parents
      • No transitive, inverse or symmetrical properties
        • Can’t say that isPartOf is a transitive property, that hasPart is the inverse of isPartOf or that touches is symmetrical
    • Difficult to provide reasoning support
      • No “native” reasoners for non-standard semantics
      • May be possible to reason via FO axiomatization
    Problems with RDFS
  • 36.
    • OWL is RDF(S), adding vocabulary to specify:
      • Relations between classes
      • Cardinality
      • Equality
      • More typing of and characteristics of properties
      • Enumerated classes
    • Three species of OWL
      • OWL Full is the union of OWL syntax and RDF
      • OWL DL is restricted to a FOL fragment (≅ SHIQ Description Logic)
      • OWL Lite is an “easier to implement” subset of OWL DL
    • OWL DL benefits from many years of DL research
      • Well defined semantics
      • Formal properties well understood (complexity, decidability)
      • Known reasoning algorithms
      • Implemented systems (highly optimised)
    Web Ontology Language (OWL)
  • 37.
    • Person ⊓ ∀ hasChild.(Doctor ⊔ ∃ hasChild.Doctor)
    • <owl:Class>
    • <owl:intersectionOf rdf:parseType="collection">
    • <owl:Class rdf:about="#Person"/>
    • <owl:Restriction>
    • <owl:onProperty rdf:resource="#hasChild"/>
    • <owl:toClass>
    • <owl:unionOf rdf:parseType="collection">
    • <owl:Class rdf:about="#Doctor"/>
    • <owl:Restriction>
    • <owl:onProperty rdf:resource="#hasChild"/>
    • <owl:hasClass rdf:resource="#Doctor"/>
    • </owl:Restriction>
    • </owl:unionOf>
    • </owl:toClass>
    • </owl:Restriction>
    • </owl:intersectionOf>
    • </owl:Class>
    OWL in RDF(S) notation: Example
  • 38.
    • Designed to query collections of triples…
    • … and to easily traverse relationships
    • Vaguely SQL-like syntax (SELECT, WHERE)
    • “Matches graph patterns”
    • SELECT ?sal WHERE { emps:e13954 HR:salary ?sal }
    SPARQL Protocol And RDF Query Language
  • 39. SQL vs SPARQL
    • SQL table (employees):
      • EMP_ID | NAME | HIRE_DATE | SALARY
      • 13954 | Joe | 2000-04-14 | 48000
      • 10335 | Mary | 1998-11-23 | 52000
      • … | … | … | …
      • 04182 | Bob | 2005-02-10 | 21750
    • SQL query:
      • SELECT hire_date
      • FROM employees
      • WHERE salary >= 21750
    • RDF triples:
      • emps:e13954 HR:name 'Joe'
      • emps:e13954 HR:hire-date 2000-04-14
      • emps:e13954 HR:salary 48000
      • emps:e10335 HR:name 'Mary'
      • emps:e10335 HR:hire-date 1998-11-23
      • emps:e10335 HR:salary 52000
      • …
    • SPARQL query:
      • SELECT ?hdate
      • WHERE { ?id HR:salary ?sal .
      • ?id HR:hire-date ?hdate .
      • FILTER (?sal >= 21750) }
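The SPARQL query on this slide joins two triple patterns on the shared variable `?id` and then filters on the salary. The Python sketch below imitates that evaluation over the two employees listed as triples; it is a hand-written illustration of graph-pattern matching, not a SPARQL engine. The function name is hypothetical, and the code assumes one salary and one hire date per employee.

```python
TRIPLES = [
    ("emps:e13954", "HR:name", "Joe"),
    ("emps:e13954", "HR:hire-date", "2000-04-14"),
    ("emps:e13954", "HR:salary", 48000),
    ("emps:e10335", "HR:name", "Mary"),
    ("emps:e10335", "HR:hire-date", "1998-11-23"),
    ("emps:e10335", "HR:salary", 52000),
]

def hire_dates_with_min_salary(triples, minimum):
    """Match both patterns on the same subject, then apply the FILTER."""
    # Pattern 1: ?id HR:salary ?sal
    salaries = {s: o for (s, p, o) in triples if p == "HR:salary"}
    # Pattern 2: ?id HR:hire-date ?hdate
    hire_dates = {s: o for (s, p, o) in triples if p == "HR:hire-date"}
    # Join on ?id and keep bindings where ?sal >= minimum.
    return sorted(hire_dates[emp_id]
                  for emp_id, salary in salaries.items()
                  if emp_id in hire_dates and salary >= minimum)

print(hire_dates_with_min_salary(TRIPLES, 21750))
print(hire_dates_with_min_salary(TRIPLES, 50000))
```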
  • 40. Semantic Web Services
    • The main aim is to enable highly flexible Web services architectures, where new services can be quickly discovered, orchestrated and composed into workflows by
      • creating a semantic markup of Web services that makes them machine-understandable and use-apparent
      • developing an agent technology that exploits this semantic markup to support automated Web service composition and interoperability
    • [Diagram] Static: WWW (URI, HTML, HTTP); Semantic Web (RDF, RDF(S), OWL)
    • [Diagram] Dynamic: Web Services (UDDI, WSDL, SOAP); Semantic Web Services
  • 41.
    • KAPPEL, Gerti et al. Web Engineering. John Wiley & Sons, 2006. Chapter 14.
    • SHKLAR, Leon and ROSEN, Rich. Web Application Architecture: Principles, Protocols and Practices, 2nd Edition. John Wiley & Sons, 2009. Chapters 5 and 13.
    • SIEGEL, David. Pull: The Power of the Semantic Web to Transform Your Business. Portfolio (Penguin Group), 2009.
    • RAY, Kate. Web 3.0 (video) http://vimeo.com/11529540
    • www.w3.org
    • www.w3schools.com
    References