This package contains CUP grammars for the Java programming language.

Copyright (C) 2002 C. Scott Ananian
This code is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2 of the License, or (at your
option) any later version. See the file COPYING for more details.
Parse/       contains the Java grammars.
java10.cup   contains a Java 1.0 grammar.
java11.cup   contains a Java 1.1 grammar.
java12.cup   contains a Java 1.2 grammar. [added 11-feb-1999]
java14.cup   contains a Java 1.4 grammar. [added 10-apr-2002]
java15.cup   contains a Java 1.5 grammar. [added 12-apr-2002]
             [last updated 28-jul-2003; Java 1.5 spec not yet final.]
Lexer.java   interface description for a lexer.

Lex/         contains a simple but complete Java lexer.
Sym.java     a copy of Parse/Sym.java containing symbolic constants.

Main.java    a simple testing skeleton for the parser/lexer.
The grammar in Parse/ should be processed by CUP into Grm.java.
There are much better ways to write lexers for Java, but the
implementation in Lex/ seemed to be the easiest at the time. The
lexer implemented here may not be as efficient as a table-driven lexer,
but it adheres to the Java specification exactly.
-- C. Scott Ananian <cananian@alumni.princeton.edu> 3-Apr-1998
UPDATE: Fixed a lexer bug: '5.2' is a double, not a float. Thanks to
Ben Walter <bwalter@mit.edu> for spotting this.
-- C. Scott Ananian <cananian@alumni.princeton.edu> 14-Jul-1998
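The literal rule behind this fix can be checked with a small sketch. The class and method names below are hypothetical and are not part of this package; overload resolution is used only to make the compiler reveal the static type of each literal.

```java
// Hypothetical demo class, not part of this package.
class LiteralTypeDemo {
    // The compiler picks the overload that matches the literal's static type.
    static String typeOf(double d) { return "double"; }
    static String typeOf(float f)  { return "float"; }

    public static void main(String[] args) {
        System.out.println(typeOf(5.2));   // no suffix: a double literal
        System.out.println(typeOf(5.2f));  // 'f' suffix: a float literal
    }
}
```

A lexer that follows the specification should therefore produce a double-valued token for '5.2' and a float-valued token only when the 'f' or 'F' suffix is present.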
UPDATE: Fixed a couple of minor bugs.
1) Lex.EscapedUnicodeReader wasn't actually parsing unicode escape sequences
   properly because we didn't implement read(char[], int, int).
2) Grammar fixes: int[].class, Object[].class and all array class
   literals were not being parsed. Also, the special void.class literal
   was inadvertently omitted from the grammar.
Both these problems have been fixed.
-- C. Scott Ananian <cananian@alumni.princeton.edu> 11-Feb-1999
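For reference, the class literals named in item 2 look like this in ordinary Java source. This is only an illustrative sketch with a hypothetical class name; it is written with modern generics for convenience, so it is not an input for the Java 1.1/1.2 grammars themselves.

```java
// Hypothetical demo class, not part of this package.
class ClassLiteralDemo {
    // The three kinds of class literal mentioned above.
    static Class<?>[] literals() {
        return new Class<?>[] { int[].class, Object[].class, void.class };
    }

    public static void main(String[] args) {
        for (Class<?> c : literals()) {
            System.out.println(c.getName() + " isArray=" + c.isArray());
        }
    }
}
```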
UPDATE: Fixed another lexer bug: Large integer constants such as
0xFFFF0000 were being incorrectly flagged as 'too large for an int'.
Also, by the Java Language Specification, "\477" is a valid string
literal (it is the same as "\0477": the character '\047' followed by
the character '7'). The lexer handles this case correctly now.

Created java12.cup with grammar updated to Java 1.2. Features
added include the 'strictfp' keyword and the various new inner class
features at http://java.sun.com/docs/books/jls/nested-class-clarify.html.

Also added slightly better position/error reporting to all parsers.
-- C. Scott Ananian <cananian@alumni.princeton.edu> 11-Feb-1999
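The two lexer cases above can be confirmed against a Java compiler with a short sketch. The class and method names are hypothetical; only the literals themselves come from the note above.

```java
// Hypothetical demo class, not part of this package.
class LexerEdgeCases {
    // A hex literal may use all 32 bits of an int; this one is negative.
    static int hexValue() { return 0xFFFF0000; }

    // An octal escape may start with 0-3 only if it has three digits,
    // so "\477" is the escape \47 followed by the character '7'.
    static String octalString() { return "\477"; }

    public static void main(String[] args) {
        System.out.println(hexValue());                   // -65536
        System.out.println(octalString().length());       // 2
        System.out.println((int) octalString().charAt(0)); // 39, i.e. octal 047
        System.out.println(octalString().charAt(1));      // 7
    }
}
```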
UPDATE: fixed some buglets with symbol/error position reporting.
-- C. Scott Ananian <cananian@alumni.princeton.edu> 13-Sep-1999
UPDATE: multi-line comments were causing incorrect character position
reporting. If you were using the character-position-to-line-number
code in Lexer, you would never have noticed this problem. Thanks to
William Young <youngwr@rose-hulman.edu> for pointing this out.
-- C. Scott Ananian <cananian@alumni.princeton.edu> 27-Oct-1999
UPDATE: extended grammar to handle the 'assert' statement added in
Java 1.4. Also fixed an oversight: a single SEMICOLON is a valid
ClassBodyDeclaration; this was added to allow trailing semicolons
uniformly on class declarations. This wasn't part of the original
JLS, but the JLS was later revised to conform with the actual behavior
of the javac compiler. I've added this to all the grammars from 1.0-1.4
to conform to javac behavior; let me know if you've got a good
reason why this production shouldn't be in early grammars.

Also futzed with the Makefile some to allow building a 'universal'
driver which will switch between Java 1.0/1.1/1.2/1.4 on demand.
This helps me test the separate grammars; maybe you'll find a
use for this behavior too.
-- C. Scott Ananian <cananian@alumni.princeton.edu> 10-Apr-2002
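As a rough illustration of the two constructs in this entry, the sketch below uses an 'assert' statement and stray semicolons in places javac accepts. The class and method names are hypothetical and are not test cases shipped with this package.

```java
// Hypothetical demo class, not part of this package.
class TrailingSemicolonDemo {
    int field = 1;
    ;   // a lone semicolon is accepted as a class body declaration

    static int checkedHalf(int n) {
        // 'assert' is only evaluated when assertions are enabled (java -ea).
        assert n % 2 == 0 : "n must be even";
        return n / 2;
    }

    public static void main(String[] args) {
        System.out.println(checkedHalf(4));
    }
};  // a trailing semicolon after the class declaration is also accepted
```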
NEW: added a grammar for the JSR-14 "Adding Generics to the Java
Programming Language" Java variant. Calling this java15.cup, since
this JSR currently seems to be destined for inclusion in Java 1.5.
This grammar is very very tricky! I need to use a lexer trick to
handle type casts to parameterized types, which otherwise do not
-- C. Scott Ananian <cananian@alumni.princeton.edu> 12-Apr-2002
UPDATE: various bug fixes to all grammars, in response to bugs reported
by Eric Blake <ebb9@email.byu.edu> and others.
a) TWEAK: added 'String' type to IDENTIFIER terminal to match Number types
   given to numeric literals. (all grammars)
b) BUG FIX: added SEMICOLON production to interface_member_declaration to
   allow optional trailing semicolons, in accordance with the JLS (for
   Java 1.2-and-later grammars) and Sun practice (for earlier grammars).
   The 10-Apr-2002 release did not address this problem completely, due
   to an oversight. (all grammars)
c) BUG FIX: '<primary>.this(...);' is not a legal production;
   '<name>.super(...);' and '<name>.new <identifier>(...' ought to be.
   In particular, plain identifiers ought to be able to qualify instance
   creation and explicit constructor invocation.
   (fix due to Eric Blake; Java 1.2 grammar and following)
d) BUG FIX: parenthesized variables on the left-hand side of an assignment
   ought to be legal, according to JLS2. For example, this code is legal:
     class Foo { void m(int i) { (i) = 1; } }
   (fix due to Eric Blake; Java 1.2 grammar and following)
e) BUG FIX: array access of anonymous arrays ought to be legal, according
   to JLS2. For example, this is legal code:
     class Foo { int i = new int[]{0}[0]; }
   (fix due to Eric Blake; Java 1.2 grammar and following)
f) BUG FIX: nested parameterized types ought to be legal, for example:
     class A<T> { class B<S> { } A<String>.B<String> c; }
   (bug found by Eric Blake; jsr-14 grammar only)
g) TWEAK: test cases were added for these issues.
In addition, it should be clarified that the 'java15.cup' grammar is
really only for Java 1.4 + jsr-14; recent developments at Sun indicate
that "Java 1.5" is likely to include several additional language changes
in addition to JSR-14 parameterized types; see JSR-201 for more details.
I will endeavor to add these features to 'java15.cup' as soon as their
syntax is nailed down.
-- C. Scott Ananian <cananian@alumni.princeton.edu> 13-Apr-2003
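The constructs from items c) through f) of this entry can be gathered into one compilable sketch. All class and method names other than A and B are hypothetical; these are not the test cases added to the package, only an illustration of the same syntax accepted by a current javac.

```java
// Hypothetical demo classes, not part of this package.
class A<T> {
    class B<S> { }
    A<String>.B<String> c;              // item f: nested parameterized type
}

class Outer {
    class Inner { int value = 7; }
}

class Sub extends Outer.Inner {
    Sub(Outer o) {
        o.super();                      // item c: <name>.super(...)
    }
}

class GrammarFixDemo {
    static int parenthesizedAssign(int i) {
        (i) = 1;                        // item d: parenthesized left-hand side
        return i;
    }

    static int anonymousArrayAccess() {
        return new int[]{42}[0];        // item e: indexing an anonymous array
    }

    static int qualifiedNew() {
        Outer o = new Outer();
        Outer.Inner in = o.new Inner(); // item c: <name>.new <identifier>(...)
        return in.value;
    }

    static int qualifiedSuper() {
        return new Sub(new Outer()).value;
    }

    public static void main(String[] args) {
        System.out.println(parenthesizedAssign(5));
        System.out.println(anonymousArrayAccess());
        System.out.println(qualifiedNew());
        System.out.println(qualifiedSuper());
    }
}
```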
UPDATE: Updated the 'java15.cup' grammar to match the latest specifications
(and their corrections) for Java 1.5. This grammar matches the 2.2
prototype of JSR-14 + JSR-201 capabilities, with some corrections for
mistakes in the released specification and expected future features of
Java 1.5 (in particular, arrays of parameterized types bounded by
wildcards). Reimplemented java15.cup to use a refactoring originally
due to Eric Blake which eliminates our previous need for "lexer lookahead"
(see release notes for 12-April-2002). Added new 'enum' and '...' tokens
to the lexer to accommodate Java 1.5. New test cases added for the
additional language features.
-- C. Scott Ananian <cananian@alumni.princeton.edu> 28-Jul-2003
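To give a feel for the features named in this entry, here is a short sketch using 'enum', '...' (varargs), wildcards, and an array of a wildcard-parameterized type. It is written against the syntax that eventually shipped in Java 5, which is an assumption on my part and may differ in detail from the 2.2 prototype this grammar targets; all names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of this package.
class Java15FeaturesDemo {
    enum Color { RED, GREEN, BLUE }         // the 'enum' token

    static int count(String... words) {     // the '...' token
        return words.length;
    }

    static double sum(List<? extends Number> xs) {   // bounded wildcard
        double total = 0;
        for (Number n : xs) {
            total += n.doubleValue();
        }
        return total;
    }

    static int totalSize(List<?>[] lists) { // array of a wildcard type
        int size = 0;
        for (List<?> l : lists) {
            size += l.size();
        }
        return size;
    }

    public static void main(String[] args) {
        List<?>[] lists = new List<?>[] { Arrays.asList(1, 2), Arrays.asList("a") };
        System.out.println(Color.values().length);
        System.out.println(count("x", "y"));
        System.out.println(sum(Arrays.asList(1, 2.5)));
        System.out.println(totalSize(lists));
    }
}
```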