How to write_language_compiler

How to Write Language “Compiler”

Published in: Technology, Education
  • Option descriptions, from http://javacc.java.net/doc/javaccgrm.html#prod2
      • STATIC: This is a boolean option whose default value is true. If true, all methods and class variables are specified as static in the generated parser and token manager. This allows only one parser object to be present, but it improves the performance of the parser. To perform multiple parses during one run of your Java program, you will have to call the ReInit() method to reinitialize your parser if it is static. If the parser is non-static, you may use the "new" operator to construct as many parsers as you wish. These can all be used simultaneously from different threads.
      • DEBUG_PARSER: This is a boolean option whose default value is false. This option is used to obtain debugging information from the generated parser. Setting this option to true causes the parser to generate a trace of its actions. Tracing may be disabled by calling the method disable_tracing() in the generated parser class. Tracing may be subsequently enabled by calling the method enable_tracing() in the generated parser class.
      • JAVA_UNICODE_ESCAPE: This is a boolean option whose default value is false. When set to true, the generated parser uses an input stream object that processes Java Unicode escapes (\u...) before sending characters to the token manager. By default, Java Unicode escapes are not processed. This option is ignored if either of the options USER_TOKEN_MANAGER, USER_CHAR_STREAM is set to true.
      • UNICODE_INPUT: This is a boolean option whose default value is false. When set to true, the generated parser uses an input stream object that reads Unicode files. By default, ASCII files are assumed. This option is ignored if either of the options USER_TOKEN_MANAGER, USER_CHAR_STREAM is set to true.
      • IGNORE_CASE: This is a boolean option whose default value is false. Setting this option to true causes the generated token manager to ignore case in the token specifications and the input files. This is useful for writing grammars for languages such as HTML. It is also possible to localize the effect of IGNORE_CASE by using an alternate mechanism described later.
  • The token manager starts in the lexical state "DEFAULT", which is the default mode at the start of the program.
  • How to write_language_compiler

    1. How to Write Language “Compiler”. Philip Zhong. © 2011 Cisco and/or its affiliates. All rights reserved. Cisco Confidential.
    2.
        • Language Compilers
        • JAVACC
        • SQL Parser
    3.
        • ANTLR
        • YACC
        • JAVACC
    4. ANTLR
        • ANother Tool for Language Recognition
        • Java/C++/C/C#/Python/Ruby/Objective-C
        • BSD
    5. YACC
        • Yet Another Compiler-Compiler
        • C++/C for Unix
        • BSD
    6. JAVACC
        • Java Compiler Compiler
        • Java
        • BSD
    8.
        • "\n": newline
        • *: zero or more copies of the preceding expression
        • +: one or more copies of the preceding expression
        • ?: zero or one copy of the preceding expression
        • |: or
        • []: optional
        • ~[]: matches any single character (any character that is not in the empty set)
        • (): must appear
        • EOF: end of file
        • "a"-"z": any letter from a to z
        • "0"-"9": any digit
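For readers more familiar with java.util.regex than with JavaCC, the following is a small sketch of how a few of the operators above map onto Java regular expressions. The class name `NotationSketch` and the three patterns are illustrative assumptions, not part of the original slides, and the JavaCC forms appear only in comments.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate the notation.
class NotationSketch {
    // JavaCC: (["a"-"z"])+ (["0"-"9"])*   one or more letters, then zero or more digits
    static final Pattern NAME = Pattern.compile("[a-z]+[0-9]*");

    // JavaCC: ("+" | "-")? (["0"-"9"])+   optional sign, then one or more digits
    static final Pattern SIGNED_INT = Pattern.compile("(\\+|-)?[0-9]+");

    // JavaCC: ~["\n"]   any single character except newline
    static final Pattern NOT_NEWLINE = Pattern.compile("[^\\n]");

    // True when the whole string matches the pattern.
    static boolean matches(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches(NAME, "abc12"));
        System.out.println(matches(SIGNED_INT, "-42"));
        System.out.println(matches(NOT_NEWLINE, "\n"));
    }
}
```

The two dialects differ in detail (JavaCC quotes every literal, Java regex does not), so this is a reading aid rather than a translation rule.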
    9.
        • Options
        • Program header
        • Tokens
        • Production
    10.
        options
        {
            JDK_VERSION = "1.6";
            IGNORE_CASE = true;
            JAVA_UNICODE_ESCAPE = true;
            UNICODE_INPUT = true;
            DEBUG_PARSER = false;
            STATIC = false;
        }
    11.
        PARSER_BEGIN(SqlParser)
        package com.webex.wddl.engine.parser.sql;
        public class SqlParser implements Parser
        {
            final public void setStatement(String sqlStatement)
            {
                InputStream stream = new ByteArrayInputStream(sqlStatement.getBytes());
                ...
            public SqlParser()
            {
            }
        }
        PARSER_END(SqlParser)
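The setStatement method above turns the SQL string into an InputStream with getBytes(), which uses the platform default charset. As a minimal sketch of the same step with an explicit charset, the code below assumes UTF-8 is wanted; the class `StatementStreams` and its methods are hypothetical names, and `readAllBytes()` requires Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original SqlParser.
class StatementStreams {
    // Wraps a SQL string in an InputStream using an explicit charset (assumed UTF-8).
    static InputStream toStream(String sqlStatement) {
        return new ByteArrayInputStream(sqlStatement.getBytes(StandardCharsets.UTF_8));
    }

    // Reads the stream back into a String with the same charset.
    static String readBack(InputStream in) {
        try {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String sql = "SELECT name FROM caf\u00e9 WHERE id = 1";
        // The round trip preserves non-ASCII characters regardless of platform settings.
        System.out.println(readBack(toStream(sql)).equals(sql));
    }
}
```

Whichever charset is chosen, the reader that feeds the generated parser has to decode with the same one.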
    12.
        • TOKEN: The regular expressions in this regular expression production describe tokens in the grammar.
        • SPECIAL_TOKEN: The regular expressions in this regular expression production describe special tokens.
        • SKIP: Matches to regular expressions in this regular expression production are simply skipped (ignored) by the token manager.
        • MORE: Sometimes it is useful to gradually build up a token to be passed on to the parser. Matches to this kind of regular expression are stored in a buffer until the next TOKEN or SPECIAL_TOKEN match.
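The difference between TOKEN, SPECIAL_TOKEN and SKIP can be sketched with a small hand-written tokenizer. This is not what JavaCC generates; it is a simplified illustration under the assumption that whitespace is skipped, "--" line comments are special tokens, and everything else is a regular token. MORE is not modeled. The names `TokenKindsSketch` and `Tok` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hand-written illustration of the three token kinds; not JavaCC output.
class TokenKindsSketch {
    // A regular token plus the special tokens (comments) that preceded it.
    static final class Tok {
        final String image;
        final List<String> specials;

        Tok(String image, List<String> specials) {
            this.image = image;
            this.specials = specials;
        }
    }

    // Group 1: SKIP (whitespace). Group 2: SPECIAL_TOKEN (line comment). Group 3: TOKEN.
    private static final Pattern LEXEME =
        Pattern.compile("(\\s+)|(--[^\\r\\n]*)|([A-Za-z_][A-Za-z_0-9]*|[^\\s])");

    static List<Tok> tokenize(String input) {
        List<Tok> tokens = new ArrayList<>();
        List<String> pendingSpecials = new ArrayList<>();
        Matcher m = LEXEME.matcher(input);
        while (m.find()) {
            if (m.group(1) != null) {
                continue;                              // SKIP: discarded entirely
            } else if (m.group(2) != null) {
                pendingSpecials.add(m.group(2));       // SPECIAL_TOKEN: kept, but not parsed
            } else {
                tokens.add(new Tok(m.group(3), pendingSpecials)); // TOKEN: seen by the parser
                pendingSpecials = new ArrayList<>();
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Tok t : tokenize("SELECT a -- pick a\nFROM t")) {
            System.out.println(t.image + " " + t.specials);
        }
    }
}
```

In this sketch the comment ends up attached to the FROM token that follows it, so the parser never sees it but it can still be recovered later.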
    13.
        TOKEN:
        {
            <X_AND: "AND">
          | <X_FROM: "FROM">
          | <X_IN: "IN">
          | <X_LIKE: "LIKE">
          | <X_SELECT: "SELECT">
          | <X_WHERE: "WHERE">
            ...
        }
    14.
        SPECIAL_TOKEN:
        {
            <LINE_COMMENT: "--" (~["\r","\n"])*>
          | <MULTI_LINE_COMMENT: "/*" (~["*"])* "*" ("*" | (~["*","/"] (~["*"])* "*"))* "/">
        }
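To check what the two comment expressions accept, they can be transcribed into java.util.regex and tried on sample strings. The sketch below is such a transcription under the assumption that the slide's character lists originally contained "\r" and "\n"; the class `CommentPatterns` and its method names are hypothetical.

```java
import java.util.regex.Pattern;

// Java regex transcription of the two comment tokens, for experimentation only.
class CommentPatterns {
    // JavaCC: "--" (~["\r","\n"])*
    static final Pattern LINE_COMMENT = Pattern.compile("--[^\\r\\n]*");

    // JavaCC: "/*" (~["*"])* "*" ("*" | (~["*","/"] (~["*"])* "*"))* "/"
    static final Pattern MULTI_LINE_COMMENT =
        Pattern.compile("/\\*[^*]*\\*(\\*|[^*/][^*]*\\*)*/");

    // True when the whole string is one line comment.
    static boolean isLineComment(String s) {
        return LINE_COMMENT.matcher(s).matches();
    }

    // True when the whole string is exactly one block comment.
    static boolean isMultiLineComment(String s) {
        return MULTI_LINE_COMMENT.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isLineComment("-- a comment"));
        System.out.println(isMultiLineComment("/* a * b\n second line */"));
        System.out.println(isMultiLineComment("/* closed */ trailing */"));
    }
}
```

The multi-line expression is written so that a match ends at the first "*/", which is why the last sample is rejected as a single comment.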
    15.
        SKIP:
        {
            " "
          | "\t"
          | "\r"
          | "\n"
        }
    17.
        Statement parse(String SQL):
        ...
        {
            ...
            (
                statement = insert()
              | statement = merge()
                ...
              | statement = select()
            )
            (<EOF> | ";")
            ...
        }
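The production above chooses between alternatives such as insert(), merge() and select(), then expects either end of input or ";". A hand-written recursive-descent sketch of that shape is shown below. It is an assumption-laden simplification: tokens are split on whitespace, the statement bodies are not really parsed, and the class `StatementDispatchSketch` and its methods are hypothetical names rather than anything JavaCC generates.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Simplified illustration of choosing an alternative by its first token.
class StatementDispatchSketch {
    private final List<String> tokens;
    private int pos = 0;

    StatementDispatchSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : null;
    }

    // Stand-in for insert(), merge() or select(): consumes tokens up to ";" or end of input.
    private void consumeBody() {
        while (peek() != null && !peek().equals(";")) {
            pos++;
        }
    }

    // Mirrors: ( insert() | merge() | select() ) ( <EOF> | ";" )
    String parse() {
        String first = peek();
        if (first == null) {
            throw new IllegalArgumentException("empty statement");
        }
        String kind = first.toUpperCase(Locale.ROOT);
        switch (kind) {
            case "INSERT":
            case "MERGE":
            case "SELECT":
                consumeBody();
                break;
            default:
                throw new IllegalArgumentException("unexpected token: " + first);
        }
        if (peek() != null) {
            pos++; // the only possible token here is ";"
        }
        return kind;
    }

    // Convenience entry point: naive whitespace tokenization, with ";" as its own token.
    static String parseSql(String sql) {
        String spaced = sql.replace(";", " ; ").trim();
        List<String> toks = spaced.isEmpty()
            ? Collections.<String>emptyList()
            : Arrays.asList(spaced.split("\\s+"));
        return new StatementDispatchSketch(toks).parse();
    }

    public static void main(String[] args) {
        System.out.println(parseSql("select a from t;"));
        System.out.println(parseSql("MERGE INTO t"));
    }
}
```

Upper-casing the first token before the switch plays the role that IGNORE_CASE plays in the grammar's options.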
    19.
        • Define tokens
        • Define parser tree classes
        • Write parser logic
        • Create parser classes
    24. Thank you.
