-
1. Re: lexer won't tokenize identifiers with non us-ascii chara
hengels May 19, 2005 2:54 AM (in response to hengels)
Can anyone please comment on this? Is it possibly a bug? Could you please check in the grammar whether identifiers that are valid from the Java language perspective but contain non-US-ASCII characters are tokenized correctly?
Thanks,
Holger
-
2. FIX
hengels May 19, 2005 8:17 AM (in response to hengels)
The following patches fix the problem:
--- javassist-3.0/src/main/javassist/preproc/Compiler.java 2005-01-18 15:53:48.000000000 +0100
+++ javassist-3.0_patched/src/main/javassist/preproc/Compiler.java 2005-05-19 12:07:20.038916592 +0200
@@ -199,8 +199,7 @@
         throws IOException
     {
         int c = skipSpaces(reader, ' ');
-        while ('A' <= c && c <= 'Z' || 'a' <= c && c <= 'z'
-               || '0' <= c && c <= '9' || c == '.' || c == '_') {
+        while (Character.isJavaIdentifierPart((char)c)) {
             buf.append((char)c);
             c = reader.read();
         }
--- javassist-3.0/src/main/javassist/compiler/Lex.java 2005-01-18 15:53:48.000000000 +0100
+++ javassist-3.0_patched/src/main/javassist/compiler/Lex.java 2005-05-19 13:09:20.419332800 +0200
@@ -133,8 +133,7 @@
                 return readSeparator('.');
             }
         }
-        else if ('A' <= c && c <= 'Z' || 'a' <= c && c <= 'z' || c == '_'
-                 || c == '$')
+        else if (Character.isJavaIdentifierStart((char)c))
             return readIdentifier(c, token);
         else
             return readSeparator(c);
@@ -434,8 +433,7 @@
         do {
             tbuf.append((char)c);
             c = getc();
-        } while ('A' <= c && c <= 'Z' || 'a' <= c && c <= 'z' || c == '_'
-                 || c == '$' || '0' <= c && c <= '9');
+        } while (Character.isJavaIdentifierPart((char)c));
         ungetc(c);
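To illustrate what the patches change, here is a minimal standalone sketch. It is not Javassist code: the class IdentifierScanSketch and its methods are hypothetical names made up for this example. It applies the old ASCII-only range test from Lex.java and the Character-based test to the same word, "größe", written with Unicode escapes so the source file stays ASCII.

```java
// Hypothetical sketch, not part of Javassist: compares the ASCII-only range
// test removed from Lex.java with the Character-based test that replaces it.
final class IdentifierScanSketch {

    // Pre-patch rule for the first character: ASCII letters, '_' and '$'.
    static boolean asciiStart(int c) {
        return 'A' <= c && c <= 'Z' || 'a' <= c && c <= 'z' || c == '_' || c == '$';
    }

    // Pre-patch rule for following characters: the above plus ASCII digits.
    static boolean asciiPart(int c) {
        return asciiStart(c) || '0' <= c && c <= '9';
    }

    // Number of leading characters of s accepted by the ASCII-only rule.
    static int asciiPrefixLength(String s) {
        if (s.isEmpty() || !asciiStart(s.charAt(0))) {
            return 0;
        }
        int i = 1;
        while (i < s.length() && asciiPart(s.charAt(i))) {
            i++;
        }
        return i;
    }

    // Number of leading characters of s accepted by the Character-based rule.
    static int unicodePrefixLength(String s) {
        if (s.isEmpty() || !Character.isJavaIdentifierStart(s.charAt(0))) {
            return 0;
        }
        int i = 1;
        while (i < s.length() && Character.isJavaIdentifierPart(s.charAt(i))) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        String word = "gr\u00f6\u00dfe"; // a legal Java identifier with two non-ASCII letters
        System.out.println(asciiPrefixLength(word));   // prints 2: scanning stops at U+00F6
        System.out.println(unicodePrefixLength(word)); // prints 5: the whole word is accepted
    }
}
```

One difference that may be worth checking: the old loop in Compiler.java also accepted '.', while Character.isJavaIdentifierPart('.') returns false, and the new test accepts '$', which the old loop did not. Whether either matters depends on how the preprocessor uses the result, which I have not verified.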
-
3. Re: lexer won't tokenize identifiers with non us-ascii chara
kabirkhan May 20, 2005 5:18 AM (in response to hengels)
Hi,
I've added this to JIRA: http://jira.jboss.org/jira/browse/JASSIST-10
-
4. Re: lexer won't tokenize identifiers with non us-ascii chara
chiba Jun 7, 2005 3:01 PM (in response to hengels)
I applied your patch to both Javassist 3.0 and 3.1.