Carrot: An appetizing hybrid of XQuery and XSLT
 


Balisage paper link: http://www.balisage.net/Proceedings/vol7/html/Lenz01/BalisageVol7-Lenz01.html


    • Carrot
      An appetizing hybrid of XQuery and XSLT
      Evan Lenz
      Software Developer, Community
      MarkLogic Corporation
    • On reinventing the wheel
      XSLT-UK 2001
      “XQuery: Reinventing the wheel?”
      http://xmlportfolio.com/xquery.html
    • Disclaimer
      My own personal project & opinions
    • Praises for XSLT
      Template rules are really elegant and powerful
      It’s mature in its set of features
      Powerful modularization features (<xsl:import>)
    • Praises for XQuery
      Concise syntax
      Highly composable syntax
      An element constructor is an expression
      So you can write things like: foo/<bar/>
    • Gripes about XSLT
      Two layers of syntax, which can’t be freely composed
      You can’t nest an instruction inside an expression
      E.g., you can’t apply templates inside an XPath expression
      Verbose syntax
      In general
      In particular, for function definitions and parameter-passing
    • Gripes about XQuery
      Conflation of modules and namespaces
      Don’t like being forced to use namespace URIs
      Distinction between main and library modules
      You can’t reuse a main module
      Reuse requires refactoring
      Lack of a distinction between:
      the default namespace for XPath, and
      the default namespace for element constructors
      No template rules!
    • A lot in common
      The same data model (XPath 2.0)
      Much of the same syntax (XPath 2.0)
    • Feeling boxed-in
      XSLT’s lack of composability
      XQuery’s lack of template rules
      Don’t like having to pick between two languages all the time
      Solution: add a third one. ;-)
      And I’m calling it…
    • YAASFXSLT
      “Yet Another Alternative Syntax For XSLT”
      Sam Wilmott’s RXSLT
      http://www.wilmott.ca/rxslt/rxslt.html
      Paul Tchistopolskii’s XSLScript
      http://markmail.org/message/niumiluelzho6bmt
      XSLTXT
      http://savannah.nongnu.org/projects/xsltxt
    • Actually: “Carrot”
      More than just an alternative syntax
      Carrot combines:
      the friendly syntax and composability of XQuery expressions
      the power and flexibility of template rules in XSLT
      A “host language” for XQuery expressions
    • Motivation & inspiration
      My boxed-in feelings
      OH: “I will never write code in XML.”
      James Clark’s element constructor proposal back in 2001
      http://www.jclark.com/xml/construct.html
      Haskell
    • Overall design approach
      95% of semantics defined by reference to XQuery and XSLT
      90% of syntax defined by reference to XQuery
    • Haskell similarities
      Haskell defines functions using equations:
      foo = "bar"
      Carrot defines variables, functions, and rules similarly:
      $foo := "bar";
      my:foo() := "bar";
      ^foo(*) := "bar";
      Everything on the RHS is an XQuery expression
      (plus a few extensions)
    • Intro by example
      A rule definition in XSLT:
      <xsl:template match="para">
        <p>
          <xsl:apply-templates/>
        </p>
      </xsl:template>
      A rule definition in Carrot:
      ^(para) := <p>{^()}</p>;
    • Intro by example
      This:
      ^()
      Is short for this:
      ^(node())
      Just as, in XSLT, this:
      <xsl:apply-templates/>
      Is short for this:
      <xsl:apply-templates select="node()"/>
    • Intro by example
      Another rule definition in Carrot:
      ^toc(section) := <li>{ ^toc() }</li>;
      The same rule definition in XSLT:
      <xsl:template match="section" mode="toc">
        <li>
          <xsl:apply-templates mode="toc"/>
        </li>
      </xsl:template>
    • The identity transform
      In Carrot:
      ^(@*|node()) := copy{ ^(@*|node()) };
      In XSLT:
      <xsl:template match="@* | node()">
        <xsl:copy>
          <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
      </xsl:template>
    • Note the asymmetry
      This definition is illegal (missing pattern):
      ^() := <foo/>;
      Just as this template rule is illegal:
      <xsl:template match=""><foo/></xsl:template>
      However, when invoking, you can omit the argument:
      ^()
      Just as in XSLT:
      <xsl:apply-templates/>
    • An XSLT example
      <xsl:transform version="2.0"
                     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <xsl:template match="/">
          <html>
            <head>
              <xsl:copy-of select="/doc/title"/>
            </head>
            <body>
              <xsl:apply-templates select="/doc/para"/>
            </body>
          </html>
        </xsl:template>
        <xsl:template match="para">
          <p><xsl:apply-templates/></p>
        </xsl:template>
      </xsl:transform>
    • The equivalent in Carrot
      ^(/) :=
        <html>
          <head>{ /doc/title }</head>
          <body>{ ^(/doc/para) }</body>
        </html>;
      ^(para) := <p>{ ^() }</p>;
    • Carrot expressions
      Same as an expression in XQuery, with these additions:
      ruleset invocations: ^mode(nodes)
      shallow copy{…} constructors
      text node literals: `my text node`
    • 1. Ruleset invocations
      In XSLT, each mode can be thought of as the name of a polymorphic function
      The syntax of Carrot makes this explicit
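The idea of a mode as a polymorphic function can be sketched in a few lines of Python. This is only a rough model, not Carrot or XSLT: the names `RULES`, `rule`, and `apply_rules` are hypothetical, patterns are reduced to plain predicates, and the fallback for unmatched nodes only descends into child elements (it ignores text, unlike XSLT's built-in rules).

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical registry: mode name -> list of (predicate, handler) pairs.
RULES = defaultdict(list)

def rule(mode, predicate):
    """Register a handler for nodes in `mode` that satisfy `predicate`."""
    def register(handler):
        RULES[mode].append((predicate, handler))
        return handler
    return register

def apply_rules(mode, nodes):
    """Rough analogue of ^mode(nodes): dispatch each node on its shape."""
    results = []
    for node in nodes:
        for predicate, handler in RULES[mode]:
            if predicate(node):
                results.extend(handler(node))
                break
        else:
            # No rule matched: descend into the child elements instead.
            results.extend(apply_rules(mode, list(node)))
    return results

# Roughly: ^toc(section) := <li>{ ^toc() }</li>;
@rule("toc", lambda n: n.tag == "section")
def toc_section(node):
    li = ET.Element("li")
    li.extend(apply_rules("toc", list(node)))
    return [li]

# A second mode over the same nodes, returning strings instead of elements.
@rule("ids", lambda n: n.tag == "section")
def ids_section(node):
    return [node.get("id")] + apply_rules("ids", list(node))

doc = ET.fromstring(
    '<doc><section id="a"><section id="b"/></section><section id="c"/></doc>'
)
toc = apply_rules("toc", [doc])
ids = apply_rules("ids", [doc])
```

Calling `apply_rules` with a different mode name selects a different family of handlers for the same nodes, which is the sense in which each mode behaves like one function with many node-specific cases.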
    • 1. Ruleset invocations
      “ruleset” == “mode”
      In Carrot:
      ^myMode(myExpr)
      In XSLT:
      <xsl:apply-templates mode="myMode" select="myExpr"/>
    • 2. Shallow copy constructors
      In Carrot: copy{…}
      In XSLT: <xsl:copy>…</xsl:copy>
      Why extend XQuery here?
      The lack of shallow copy constructors in XQuery makes modified identity transforms impractical
      (specifically, for preserving namespace nodes)
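To make "modified identity transform" concrete, here is a small Python model built on `xml.etree.ElementTree`. It is a sketch under stated assumptions: `shallow_copy`, `identity`, and `rename_b` are hypothetical names, ElementTree keeps text on the element rather than as child nodes, and it has no namespace nodes, so the namespace-preservation concern itself is not reproduced here.

```python
import xml.etree.ElementTree as ET

def shallow_copy(node, tag=None):
    """Copy the element itself (optionally renamed), without attributes or
    child elements. ElementTree stores text on the element, so the text and
    tail strings are carried over here as well."""
    new = ET.Element(tag or node.tag)
    new.text, new.tail = node.text, node.tail
    return new

def identity(node, overrides):
    """Copy everything unchanged unless an override exists for the tag."""
    handler = overrides.get(node.tag)
    if handler is not None:
        return handler(node, overrides)
    new = shallow_copy(node)
    new.attrib.update(node.attrib)
    new.extend([identity(child, overrides) for child in node])
    return new

def rename_b(node, overrides):
    """The one modification: <b> becomes <strong>, content is still recursed."""
    new = shallow_copy(node, tag="strong")
    new.extend([identity(child, overrides) for child in node])
    return new

source = ET.fromstring('<doc a="1">x<b>y</b>z</doc>')
unchanged = ET.tostring(identity(source, {}), encoding="unicode")
modified = ET.tostring(identity(source, {"b": rename_b}), encoding="unicode")
```

The default case does a shallow copy and then recurses, so a single override is enough to change one kind of element while everything else passes through untouched.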
    • 3. Text node literals
      In Carrot: `my text node`
      In XSLT (a literal text node): my text node
      In XQuery (dynamic text node constructor): text{ "my text node" }
      Why aren’t dynamic text constructors sufficient?
      After all, they take only six more characters (text{…})
    • 3. Text node literals
      Consider an example:
      <xsl:template match="/doc">
        <result>
          <xsl:apply-templates mode="file-name" select="."/>
          <xsl:apply-templates mode="file-ext" select="."/>
        </result>
      </xsl:template>
      <xsl:template mode="file-name" match="doc">doc</xsl:template>
      <xsl:template mode="file-ext" match="doc">.xml</xsl:template>
      And the result:
      <result>doc.xml</result>
    • 3. Text node literals
      Same example, rewritten “naturally” in Carrot:
      ^(/doc) := <result>{^file-name(.), ^file-ext(.)}</result>;
      ^file-name(doc) := "doc";
      ^file-ext(doc) := ".xml";
      And the result (uh-oh):
      <result>doc .xml</result>
    • 3. Text node literals
      Sequences of atomic values have spaces added upon conversion to a text node
      Various fixes, none satisfactory:
      Wrap the two invocations in text{} or concat()
      (high coupling – what if one of them returns an element?)
      Wrap the strings in text{}
      (burdensome fix—regular strings work 90% of the time)
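The spacing rule behind the "uh-oh" can be modelled in Python. This is a simplified sketch, not a full implementation of element content construction: the `Text` class and `element_content` function are hypothetical names, plain `str` values stand in for atomic strings, and only strings and text nodes are handled.

```python
from dataclasses import dataclass

@dataclass
class Text:
    """Stand-in for a text node; a plain str stands in for an atomic string."""
    value: str

def element_content(items):
    """Build the string content of an element from a sequence of items.

    Step 1: each run of adjacent atomic values becomes one text node,
            with a single space between the values.
    Step 2: adjacent text nodes are merged with no separator.
    """
    nodes, run = [], []
    for item in items:
        if isinstance(item, str):
            run.append(item)
            continue
        if run:
            nodes.append(Text(" ".join(run)))
            run = []
        nodes.append(item)
    if run:
        nodes.append(Text(" ".join(run)))
    return "".join(node.value for node in nodes)

from_strings = element_content(["doc", ".xml"])
from_text_nodes = element_content([Text("doc"), Text(".xml")])
mixed = element_content(["doc", Text(".xml")])
```

Only two adjacent atomic values pick up a space; as soon as either side is a text node the pieces are joined directly, which is why returning text nodes from the rules avoids the problem without the caller having to know what each rule returns.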
    • 3. Text node literals
      Note the imbalance:
      Returning a text node is more concise in XSLT than in XQuery (!)
      XSLT: Hello
      XQuery: text{"Hello"}
      Returning a string is more concise in XQuery than in XSLT
      XQuery: "Hello"
      XSLT: <xsl:sequence select="'Hello'"/>
      Text node literals in Carrot redress the imbalance
      Text nodes in Carrot: `Hello`
      Strings in Carrot: "Hello"
    • 3. Text node literals
      Rewritten properly in Carrot:
      ^(/doc) := <result>{^file-name(.), ^file-ext(.)}</result>;
      ^file-name(doc) := `doc`;
      ^file-ext(doc) := `.xml`;
      Simple guideline now:
      Use text node literals when you are constructing part of a result document
      Use string literals when you know you want to return a string
    • Expression semantics
      Same as XQuery!
      With this exception (remember the earlier gripe):
      Namespace attribute declarations on element constructors do not affect the default element namespace for XPath expressions.
      Okay, enough about namespaces 
    • Carrot definitions
    • A Carrot module
      Consists of a set of unordered definitions
      Three kinds of definitions:
      Global variables
      Functions
      Rules
      Unlike XQuery, there is no top-level expression—only definitions
      Carrot is like XSLT in this regard
    • Global variables
      In Carrot:
      $foo := "a string value";
      Equivalent to this XQuery:
      declare variable $foo := "a string value";
    • Functions
      In Carrot:
      my:foo() := "return value";
      my:bar($str as xs:string) as xs:string := upper-case($str);
      Equivalent to this XQuery:
      declare function my:foo() { "return value" };
      declare function my:bar($str as xs:string) as xs:string { upper-case($str) };
    • Rule definitions
      In Carrot:
      ^foo(*) := "return value";
      Equivalent to this XSLT:
      <xsl:template match="*" mode="foo">
        <xsl:sequence select="'return value'"/>
      </xsl:template>
    • Rule parameters
      In Carrot:
      ^foo(* ; $str as xs:string) := concat($str, .);
      Equivalent to this XSLT:
      <xsl:template match="*" mode="foo">
        <xsl:param name="str" as="xs:string"/>
        <xsl:sequence select="concat($str, .)"/>
      </xsl:template>
    • Rule parameters
      In Carrot:
      ^foo(* ; tunnel $str as xs:string) := concat($str, .);
      Equivalent to this XSLT:
      <xsl:template match="*" mode="foo">
        <xsl:param name="str" as="xs:string" tunnel="yes"/>
        <xsl:sequence select="concat($str, .)"/>
      </xsl:template>
    • Conflict resolution
      Same as XSLT
      Import precedence first
      Then priority
    • Explicit priorities
      In Carrot:
      ^author-listing( author[1] ) 1 := ^();
      ^author-listing( author ) := `, ` , ^();
      ^author-listing( author[last()] ) := ` and ` , ^();
      Equivalent to this XSLT:
      <xsl:template mode="author-listing" match="author[1]" priority="1">
      <xsl:apply-templates/>
      </xsl:template>
      <xsl:template mode="author-listing" match="author">
      <xsl:text>, </xsl:text>
      <xsl:apply-templates/>
      </xsl:template>
      <xsl:template mode="author-listing" match="author[last()]">
      <xsl:text> and </xsl:text>
      <xsl:apply-templates/>
      </xsl:template>
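How the three author rules interact can be traced with a small Python model of rule selection. It is a sketch with assumptions spelled out: the `Rule` class and `author_listing` function are hypothetical, all authors are taken to be siblings, all rules share one import precedence, and the default priorities are filled in by hand (0 for the plain `author` pattern, 0.5 for a pattern with a predicate, following XSLT's default priority rules).

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    matches: Callable[[int, int], bool]  # (position, last) -> does it match?
    priority: float
    separator: str
    precedence: int = 0  # import precedence; one module here, so always 0

RULES = [
    Rule(lambda pos, last: pos == 1, 1.0, ""),            # author[1], explicit 1
    Rule(lambda pos, last: True, 0.0, ", "),              # author, default 0
    Rule(lambda pos, last: pos == last, 0.5, " and "),    # author[last()], default 0.5
]

def author_listing(names):
    """Pick, for each author, the matching rule with the highest
    (import precedence, priority) and emit its separator plus the name."""
    last = len(names)
    parts = []
    for pos, name in enumerate(names, start=1):
        candidates = [r for r in RULES if r.matches(pos, last)]
        best = max(candidates, key=lambda r: (r.precedence, r.priority))
        parts.append(best.separator + name)
    return "".join(parts)

three = author_listing(["Ann", "Bob", "Cy"])
two = author_listing(["Ann", "Bob"])
one = author_listing(["Ann"])
```

The single-author case shows why the explicit priority matters: without it, `author[1]` and `author[last()]` would both match at the same default priority.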
    • Multiple modes
      In Carrot:
      ^foo|bar(*) := `result`;
      Equivalent to this XSLT:
      <xsl:template mode="foo bar" match="*">result</xsl:template>
    • Why I like composability
      A couple of examples
    • Pipelines are easier
      A typical pipeline approach in XSLT:
      <xsl:variable name="stage1-result">
        <xsl:apply-templates mode="stage1" select="."/>
      </xsl:variable>
      <xsl:variable name="stage2-result">
        <xsl:apply-templates mode="stage2" select="$stage1-result"/>
      </xsl:variable>
      <xsl:apply-templates mode="stage3" select="$stage2-result"/>
      In Carrot:
      ^stage1(.)/^stage2(.)/^stage3(.)
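The contrast between named intermediate variables and a single composed expression has a familiar analogue in ordinary function composition. The Python sketch below is only that analogy: `pipeline` and the three stage functions are hypothetical, they work on a list of words rather than on nodes, and the node-by-node behaviour of the path operator is not modelled.

```python
from functools import reduce

def pipeline(*stages):
    """Compose stages left to right, feeding each result into the next."""
    def run(doc):
        return reduce(lambda result, stage: stage(result), stages, doc)
    return run

# Hypothetical stages operating on a list of words.
def stage1(words):
    return [w.strip() for w in words]

def stage2(words):
    return [w for w in words if w]

def stage3(words):
    return [w.upper() for w in words]

transform = pipeline(stage1, stage2, stage3)
result = transform([" a ", "", "b"])
```

Because each stage is an expression that takes and returns a value, adding or reordering stages means editing one line rather than renaming intermediate variables.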
    • Fewer features needed
      XSLT 2.1/3.0 promises to add some convenience features, like:
      <xsl:copy select="foo">…</xsl:copy>
      With Carrot, just use foo/copy{…}
    • What about feature X?
    • Two instruction categories
      XSLT instructions that aren’t needed
      <xsl:for-each> - Use for expressions (or mapping operator!)
      <xsl:variable> - Use let instead
      <xsl:sort> - Use order by instead
      <xsl:choose>, <xsl:if> - Use if…then…else instead
      XSLT instructions whose Carrot syntax is TBD
      <xsl:for-each-group>
      <xsl:result-document>
      <xsl:analyze-string>
      Etc., etc.
      XQuery 3.0 may add grouping…
      You can always import XSLT 2.0 stylesheets into Carrot
    • Top-level elements
      Imports and includes
      import navigation.crt;
      include widgets.crt;
      Top-level parameters
      param $message as xs:string;
      Etc.
      Still in mock-up stage. See some examples:
      http://github.com/evanlenz/carrot/examples
    • Implementation strategy
    • Compile to XSLT 2.0
      1:1 module mapping
      Each Carrot module compiles to an XSLT 2.0 module
      Carrot can include and import other Carrot modules or XSLT modules.
      Carrot can also import XQuery modules, but since this is not supported directly in XSLT 2.0, the semantics depend on your target XSLT processor, e.g.:
      <saxon:import-query> in Saxon
      <xdmp:import-module> in MarkLogic Server
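As a toy illustration of the 1:1 mapping from a Carrot module to an XSLT 2.0 module, the Python sketch below translates only the simplest rule form, `^mode(pattern) := expression;`, into `<xsl:template>` elements. It is not the planned implementation: it swaps a regular expression in for a real parser, `compile_rule` and `compile_module` are hypothetical names, and it assumes the pattern has no parentheses and the body is a plain XPath expression with no semicolons, element constructors, or Carrot extensions.

```python
import re
from xml.sax.saxutils import quoteattr

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Simplest rule form only: ^mode(pattern) := body;
RULE = re.compile(
    r"\^(?P<mode>[\w-]*)"               # optional mode name after the caret
    r"\(\s*(?P<pattern>[^()]+?)\s*\)"   # match pattern, no nested parentheses
    r"\s*:=\s*"
    r"(?P<body>[^;]+?)\s*;"             # body runs up to the first semicolon
)

def compile_rule(match):
    """Turn one matched rule into an <xsl:template> string."""
    mode = match.group("mode")
    mode_attr = f" mode={quoteattr(mode)}" if mode else ""
    return (
        f"<xsl:template match={quoteattr(match.group('pattern'))}{mode_attr}>"
        f"<xsl:sequence select={quoteattr(match.group('body'))}/>"
        f"</xsl:template>"
    )

def compile_module(source):
    """Compile every recognised rule in `source` into one stylesheet string."""
    templates = "".join(compile_rule(m) for m in RULE.finditer(source))
    return (
        f'<xsl:stylesheet version="2.0" xmlns:xsl="{XSL_NS}">'
        f"{templates}</xsl:stylesheet>"
    )

stylesheet = compile_module('^foo(*) := "return value"; ^(para) := .;')
```

`quoteattr` picks the attribute delimiter so that a double-quoted XPath string literal survives intact; anything beyond this trivial rule form needs a real grammar, which is where the parser-generation steps come in.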
    • Steps to implementation
      Generate a parser
      Create a BNF grammar for Carrot
      Hand-convert the EBNF grammar for XQuery expressions to BNF
      Extend the resulting BNF to support Carrot definitions and expressions
      Use yapp-xslt to generate the Carrot parser from the Carrot BNF
      http://www.o-xml.org/yapp/
      Already running into trouble with this approach…
      Write a compiler (in XSLT, naturally)
    • Project goals
      At this point, just explore the raw ideas (and share them)
      Solicit feedback
      Placeholder for now: http://github.com/evanlenz/carrot
      Help me cook this carrot 
      You write the parser, I’ll write the compiler?
    • Future possibilities
    • XML-oriented browser scripting
      XQIB
      Saxon-CE
      Carrot, or something like it, could make XSLT more palatable to Web developers
    • W3C activity
      Provide seeds for ideas in the XSL/XQuery WGs?
      Carrot will grow with XPath/XQuery/XSLT
    • Mode merging
      Random XSLT feature idea: invoke more than one mode
      <xsl:apply-templates mode="foo bar"/>
      In Carrot:
      ^foo|bar()
      A static mode extension mechanism
      Handy for multi-stage transformations where each stage is similar to but not the same as the next
      Could be added to Carrot even if not supported directly in XSLT
      Carrot as a research playground for new language ideas
    • Questions?