An ABNF Primer

An introduction to W3C's ABNF grammar syntax.

Published in: Technology
Notes
  • Note that Nuance GSL supports mixed-mode (voice and DTMF) grammars. This is not possible with SRGS grammars.
  • Note that in GSL grammars, if no rule is declared public, all rules are assumed to be public.
  • Note that some engines will accept numbers (e.g., “100”). Relying on this is a bad idea, however, because we don’t know how the number will be rendered.
  • Note that square brackets also group expansions.
  • Note that some engines limit the number of repeats. For instance, Loquendo has a maximum of 30.
  • Note that the form “(E | $NULL)” is preferable when grammar weights are computed from data, since a weight can then be computed for each of the two branches.
  • Note that this may not be supported by all engines. Even where it is, it should be tested carefully. For instance, it does not work well with OSR 3.0 (behavior with Nuance 9 is unknown).
  • An ABNF Primer

    An ABNF Primer: Concepts, structure, syntax
    Introduction
      • W3C standard: http://www.w3.org/TR/speech-grammar
      • W3C Recommendation: March 16th, 2004
      • Defines both the XML form and the Augmented BNF (ABNF) form
          • The two forms are equivalent
      • Supported by a complete development environment: NuGram IDE
    Copyright © 2010 Nu Echo Inc.
    Structure of an ABNF Grammar
      Headers:
        #ABNF 1.0 ISO-8859-1;
        language en-US;
        mode voice;
        root $yesOrNo;
      Rules:
        private $yesOrNo =
            yes {out.value = 'yes'}
          | no {out.value = 'no'}
        ;
    Comments
      • C/C++/Java comments
          • // to end of line
          • /* ... */
      • Documentation comments
          • /** ... */
          • @example to document sample sentences
          • Can appear before rule definitions, the root header, and the language header
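To make the comment forms concrete, here is a minimal sketch of a grammar that uses all three. The rule name $yesOrNo and the sample sentences are assumptions chosen for illustration only.

```abnf
#ABNF 1.0 UTF-8;

/* Block comment: headers follow. */
language en-US;
mode voice;
root $yesOrNo;

/**
 * Accepts a simple yes/no answer (hypothetical rule).
 *
 * @example yes
 * @example no thanks
 */
public $yesOrNo = yes | no | no thanks;  // comment to end of line
```

The documentation comment sits immediately before the rule it documents, and each @example line gives one sentence the rule is expected to accept.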
    Grammar headers
    Grammar Headers: Self-identifying header
      • Mandatory
      • Encoding is optional
      • Can be preceded by a byte order mark (BOM)
      Ex:
        #ABNF 1.0;
        #ABNF 1.0 UTF-8;
    Grammar Headers: Language
      • Identifies the language of the document
      • Required for voice grammars
      • Language codes are defined by RFC 3066
      Ex:
        language en;
        language fr-CA;
    Grammar Headers: Mode
      • Indicates the type of input
      • Optional header
      • Either 'dtmf' or 'voice' (the default)
      Ex:
        mode voice;
        mode dtmf;
    Grammar Headers: Root
      • Defines the grammar's top-level rule
      • Optional
      • Root rule can be either public or private
      Ex:
        root $rootRule;
    Grammar Headers: Tag-format
      • Declares the content type of the semantic tags (actions)
      • Value is a URI
      • Recognition-engine specific
      Ex:
        tag-format <semantics/1.0>;        (SISR)
        tag-format <Nuance>;               (Nuance)
        tag-format <swi-semantics/1.0>;    (Nuance)
        tag-format <semantics-ms/1.0>;     (Microsoft)
        tag-format <semantics/1.0.2006>;   (LumenVox)
    Grammar Headers: Lexicon
      • Specifies a pronunciation lexicon to use (a URI)
      • Optional; may appear more than once
      • Format is engine-specific
      Ex:
        lexicon <../lex/names.pls>;
    Grammar Headers: Base URI
      • Base URI for all relative URIs in the document
      • Optional
      • Takes precedence over the meta "base" header
      Ex:
        base <http://localhost:8080/>;
    Grammar Headers: Meta and Http-Equiv
      • Metadata attached to the grammar
      • Properties take precedence over HTTP headers
      • Optional; may appear more than once
      Ex:
        meta "Author" is "J. Doe";
        http-equiv "Expires" is "0";
    Relative URI Resolving
      • Sources of the base URI, in order of precedence:
          • base header
          • meta "base" header
          • metadata from protocol interaction (e.g., an HTTP header)
          • base URI of the current document
      Ex:
        meta "base" is "http://example.com/grammars/date.abnf";
        meta "base" is "http://example.com/grammars/";
        base <http://example.com/grammars/date.abnf>;
    Relative URI Resolving: ASR engines behave differently
      • OSR / Nuance 9
          • Full conformance to RFC 2396 (now obsoleted by RFC 3986)
      • Loquendo ASR
          • Base URI must not contain the document’s name
      • Best practice: always end the base URI with a “/”
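The headers above can be combined as in the following sketch, which also applies the trailing-slash best practice. The host example.com, the file common/numbers.abnf, and the rule names $main and $oneToNine are assumptions made up for this example.

```abnf
#ABNF 1.0 UTF-8;
language en-US;
mode voice;
root $main;
base <http://example.com/grammars/>;
meta "Author" is "J. Doe";

// With the base URI above, this hypothetical relative reference resolves to
// http://example.com/grammars/common/numbers.abnf#oneToNine
public $main = please dial $<common/numbers.abnf#oneToNine>;
```

Because the base URI names a directory and ends with "/", a fully conformant resolver and an engine that does not expect a document name in the base URI should arrive at the same absolute URI.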
    Grammar rules
    Grammar Rules
      • A rule is an association between a name and an expansion
      • Rule names are unique within a grammar
      • Rules can be either public or private
          • A public rule is visible outside the scope of the grammar document
          • A rule is private by default
      • An expansion describes a set of word sequences (sentences)
    Grammar Rules: Syntax of a rule definition
        public $name = expansion;
      or
        private $name = expansion;
      or
        $name = expansion;
    70. 70. Rule names<br /><ul><li>A rule name must be a valid XML Name
    71. 71. http://www.w3.org/TR/2000/REC-xml-20001006#NT-Name
    72. 72. But:
    73. 73. Cannot be NULL, VOID, GARBAGE
    74. 74. Must not contain '.', ':', '-'
    75. 75. Rule names begin with '$'
    76. 76. Ex: $number $names $digits</li></ul>Copyright © 2010 Nu Echo Inc.<br />
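As a sketch of the rule syntax and naming constraints, the small grammar below declares one public rule, one explicitly private rule, and one rule that is private by default. All rule names here are hypothetical.

```abnf
#ABNF 1.0 UTF-8;
language en-US;
mode voice;
root $phoneNumber;

// Public: other grammar documents may reference this rule.
public $phoneNumber = $areaCode $digit $digit $digit $digit $digit $digit $digit;

// Explicitly private.
private $areaCode = $digit $digit $digit;

// No scope keyword, so this rule is private by default.
$digit = zero | one | two | three | four | five | six | seven | eight | nine;

// Names that would be rejected: $NULL (reserved), $area-code ('-'), $area.code ('.').
```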
    Basic Expansions
    Expansions: Words (aka tokens)
      • A string that the recognizer can convert to a phonetic representation
      • Typically a word in the specified language
      • Words are delimited by whitespace and by $, <, >, (, ), /, |, [, ], {, }, !
      • Can be enclosed in double quotes
    Expansions: Words (cont'd)
      • Examples:
          • hello
          • Montréal
          • trente-deux
          • “San Francisco”
    Expansions: Words (cont'd)
      • Tips:
          • Avoid acronyms like “IBM” (unless the word is in the dictionary). Separate each letter with a “.” or “_”.
              • Good: “I.B.M.”, “I triple E”
              • Bad: “IBM”, “IEEE”
          • Don't use abbreviations.
              • Replace “Dr.” with “Doctor”, “St.” with “Street”
          • Use spelled-out forms for numbers
              • Use “one hundred” instead of “100”
    Expansions: Sequences
      • Whitespace-separated expansions
      • Each expansion must match in turn
      Ex:
        I would like to
        $day $month $year
    Expansions: Choices
      • To match one of a number of choices
      • Expansions separated by '|'
      Ex:
        one | two | three | four
        $yes | $no {answer = 'no'}
    Expansions: Grouping
      • Parentheses group expansions (as do square brackets)
      • Enables encapsulation
      • Useful to ensure correct precedence when parsing a group of expansions
      Ex:
        no (it's | it is) not
        (oh | zero) {out = '0'}
    Expansions: Rule References
      • References to named expansions
      • Three types:
          • Local references
          • External references
          • Special rule names
    Expansions: Local rule references
      • References to rules declared in the same document
      Ex:
        $date $civicNumber $digit
    Expansions: External rule references
      • Implicit: $<documentURI>
          • Uses the root rule
          • Root rule can be either public or private
      • Explicit: $<documentURI#ruleName>
          • Rule must be declared public
      • Media type can be specified: $<documentURI#ruleName>~<mediatype>
    Expansions: External rule references (cont'd)
      Ex:
        $<../common/numbers.abnf>
        $<../common/numbers.abnf#oneToNine>
        $<http://localhost:8800/names.abnf?id=45>
        $<names.abnf?id=45>~<application/srgs+xml>
    Expansions: Special rule references
      • $NULL
          • matches automatically
          • equivalent to ()
      • $VOID
          • matches nothing
      • $GARBAGE
          • matches anything
          • implementation-specific behavior
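A short sketch may help show where the three special rules are typically used. The rule names and vocabulary below are assumptions for illustration, and the exact behavior of $GARBAGE depends on the engine.

```abnf
#ABNF 1.0 UTF-8;
language en-US;
mode voice;
root $request;

// $NULL matches without consuming any input, so "please" is optional here.
$politeness = please | $NULL;

// A sequence containing $VOID can never match: "decaf" is effectively disabled.
$drink = coffee | tea | decaf $VOID;

// $GARBAGE is meant to absorb filler speech before the request;
// how much it absorbs is engine-specific.
public $request = $GARBAGE $politeness $drink;
```

Using $VOID this way keeps a temporarily disabled alternative in the source without letting the recognizer match it.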
    Advanced expansions
    Expansions: Repeats
      • Force the repetition of an expansion
      • Bounded: <n-m> or <n>
      • Unbounded: <n->
      Ex:
        $digit <7> | $digit <10>
        $topping <1->
    Expansions: Repeats (cont'd)
      • Special case: [ E ]
          • Equivalent to E <0-1>
          • Or to ( E | $NULL )
      Ex:
        $hesitation = [euh] ok [euh];
        $date = $month [$year];
    Expansions: Repeats (cont'd)
      • The probability of the repetition can be specified
      • Syntax: <n-m /prob/>
      • prob is a value between 0.0 and 1.0
      Ex:
        $digit <2-4 /0.8/>
        $topping <1- /0.785/>
    Expansions: Choices (weighting)
      • Multiplying factor attached to a choice
      • Positively biased when weight > 1.0
      • Negatively biased when weight < 1.0
      • Default weight is 1.0
      • Syntax: /n./ or /n.n/ or /.n/ or /n/
      Ex:
        (/1.7/ New York | /0.4/ Newark)
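The optional forms and the weight syntax can be combined as in the sketch below. The rule names, vocabulary, and the weights 0.3 and 0.7 are made-up values for illustration, not measured data.

```abnf
#ABNF 1.0 UTF-8;
language en-US;
mode voice;
root $date;

$month = January | February | March;
$year = two thousand ten | two thousand eleven;

// Three equivalent ways to make the year optional.
$dateA = $month [$year];
$dateB = $month $year <0-1>;
$dateC = $month ($year | $NULL);

// Written as an explicit choice, each branch can carry its own weight
// (hypothetical values).
public $date = $month (/0.3/ $year | /0.7/ $NULL);
```

The explicit choice is the only one of the three forms that exposes both the "year spoken" and "year omitted" branches as alternatives, which is why it suits weights derived from data.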
    Expansions: Language
      • Overrides the default language
      • Applies to all words of the target expansion
      • Syntax: E!code
      Ex:
        yes | oui!fr-CA
        (Michel Tremblay | André Roy)!fr-CA
    Expansions: Tags
      • To attach “meaning” to a sentence
      • Arbitrary strings enclosed in {}
      • Content depends on the value of the tag-format header
      Ex:
        (two | second) {out.day = 2}
        $number {val = number.val}
    Expansions: Tags (cont'd)
      • Use {!{ ... }!} when the tag must contain '}'.
      • Can appear as a grammar header
          • Must be followed by ';'
          • Ideal for defining global functions or data structures
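As a sketch of a header tag, the grammar below defines a global helper function and calls it from a rule tag. It assumes the SISR tag format (semantics/1.0, ECMAScript content); the function toCents and the rule names are hypothetical, and engine support should be checked.

```abnf
#ABNF 1.0 UTF-8;
language en-US;
mode voice;
root $amount;
tag-format <semantics/1.0>;

// Header tag, terminated by ';'. The {!{ ... }!} delimiters are used
// because the function body contains '}'.
{!{ function toCents(dollars) { return dollars * 100; } }!};

$count = one {out = 1;} | two {out = 2;} | three {out = 3;};

// rules.count holds the value produced by the $count match.
public $amount = $count dollars {out.cents = toCents(rules.count);};
```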
    Precedence
