3 Replies Latest reply on Nov 17, 2009 8:51 AM by thez

    JBoss and the "em dash" character

    thez

      Hope this is the correct forum to post this.

      I have a JSP that prints out the contents of text files. Some files cause JBoss to truncate the output of the JSP page when they are opened, even though the page executes to the end. I debugged this and found the following.

      The files are saved as UTF-8 and contain the "em dash" character, Unicode U+2014 (decimal 8212), or & mdash; in HTML (without the space; it seems these forums can't handle the character either). The UTF-8 byte sequence is "E2 80 94".
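
      For anyone who wants to double-check the code point and byte sequence quoted above, here is a minimal Java sketch. The class name EmDashBytes and the helper utf8Hex are made up for illustration, and it assumes Java 7 or later for StandardCharsets.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
public class EmDashBytes {

    // Returns the UTF-8 encoding of s as space-separated uppercase hex bytes.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so negative byte values format as two hex digits.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String emDash = "\u2014";                    // the em dash, U+2014
        System.out.println((int) emDash.charAt(0));  // prints 8212
        System.out.println(utf8Hex(emDash));         // prints E2 80 94
    }
}
```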

      Now, when I read the file contents line by line with a BufferedReader (wrapping an InputStreamReader opened with UTF-8) and print the lines out, the output is truncated at the first line containing the character, and nothing after that is printed.
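
      The reading step described above looks roughly like the sketch below. This is not the actual JSP code; the class name Utf8LineCopy and the method copyLines are hypothetical, it assumes Java 7 or later, and it writes to a generic Writer standing in for the JSP's output. It only shows the read-and-print loop and does not by itself reproduce or explain the truncation.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the JSP's read-and-print loop.
public class Utf8LineCopy {

    // Decodes the stream as UTF-8, writes each line to out, and returns the line count.
    static int copyLines(InputStream in, Writer out) {
        int count = 0;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.write(line);
                out.write('\n');
                count++;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // In-memory sample data with an em dash (U+2014) on the second line.
        byte[] data = "first line\nsecond \u2014 line\nthird line\n"
                .getBytes(StandardCharsets.UTF_8);
        StringWriter out = new StringWriter();
        int lines = copyLines(new ByteArrayInputStream(data), out);
        System.out.println(lines + " lines copied");
        System.out.print(out);
    }
}
```

      With an in-memory Writer like this, all three lines come through, including the one with the em dash, so the decoding side on its own handles the character.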

      Is this a bug or a 'feature'? It seems like very strange behavior.