2 Replies Latest reply on Jan 10, 2011 12:11 PM by mhn

    XML Parsing Error: Invalid unicode character with NEKO

    mhn

      My page hangs if a user submits a query via Ajax that contains certain Unicode characters (e.g. Alt + 01).

       

      The Ajax console shows the following during rendering:

      error[14:57:16,523]: Error parsing XML

      error[14:57:16,523]: Parse Error: XML Parsing Error: Invalid unicode character.

       

      Environment:

      - richfaces 3.3.3

      - myfaces 1.2.9

      - facelets

      - neko parser 1.9.12

      - main template contains:

      -- <?xml version="1.0" encoding="UTF-8"?>

      -- <meta charset="UTF-8"/>

      - request and response character encoding is also UTF-8

       

      The Unicode character is correctly processed and displayed in non-Ajax requests.

      The Unicode character is correctly processed but not displayed in the response if the Tidy filter is used for Ajax requests.

       

      What can I do to avoid this parsing error in combination with NEKO?

        • 1. XML Parsing Error: Invalid unicode character with NEKO
          mhn

          After a lot of debugging I found an inconsistency.

          In my sample, the user enters the Unicode character 'START OF HEADING' (U+0001, typed on Windows as Alt + 01).

          http://www.fileformat.info/info/unicode/char/1/index.htm

           

          This character is rendered as \u0001 in outputText tags, which is fine.

           

          It causes the above mentioned parsing error if it is rendered as an attribute value, e.g. as value of an inputText field.

          This is caused by a wrong transformation in class org.ajax4jsf.xml.serializer.ToXHTMLStream.

          The method ToXHTMLStream.writeAttrString contains the following check:

           

          String outputStringForChar = m_charInfo.getOutputStringForChar(ch);
          if (null != outputStringForChar)
          {
              writer.write(outputStringForChar);
          }
          else if (escapingNotNeeded(ch))
          {
              writer.write(ch); // no escaping in this case
          }
          else
          {
              writer.write("&#");
              writer.write(Integer.toString(ch)); // THIS IS CALLED!
              writer.write(';');
          }

           

          The 'START OF HEADING' control character is transformed to the character reference &#1;, which results in a parsing error on the client side: XML 1.0 does not allow U+0001, even when written as a character reference.
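
The rejection of &#1; can be reproduced outside the browser. The following is a minimal sketch using the JDK's built-in DOM parser, not the client-side parser that RichFaces uses; the class name CharRefCheck and the helper parses are made up for illustration. It only shows that a conforming XML 1.0 parser refuses an attribute value containing &#1; while accepting a legal reference such as &#9;.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of RichFaces or ajax4jsf.
final class CharRefCheck {

    // Returns true if the given string is accepted as well-formed XML
    // by the JDK's default DOM parser, false if the parser rejects it.
    static boolean parses(String xml) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Well-formedness errors, including illegal character references.
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O problems are not expected here.
            throw new IllegalStateException(e);
        }
    }
}
```

With this helper, parses("<input value=\"&#1;\"/>") should come back false and parses("<input value=\"&#9;\"/>") should come back true, which matches the behaviour seen in the Ajax console.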

          • 2. XML Parsing Error: Invalid unicode character with NEKO
            mhn

            Is there any workaround, or should I create a JIRA issue?
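
One possible workaround, independent of any fix in the serializer, would be to drop characters that XML 1.0 cannot represent before the value is rendered back into an attribute. The sketch below is only an assumption about how that could look: the class XmlCharFilter and its methods are hypothetical names, and where to call it (for example in a JSF converter or in the backing bean setter) is left open. The allowed ranges follow the Char production of the XML 1.0 specification.

```java
// Hypothetical helper, not part of RichFaces, ajax4jsf or MyFaces.
final class XmlCharFilter {

    // True if the code point is allowed by the XML 1.0 "Char" production.
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns a copy of the input without code points that XML 1.0 forbids.
    // Unpaired surrogates are dropped as well, since they are not valid Chars.
    static String stripInvalidXmlChars(String input) {
        if (input == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            if (isXml10Char(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

Applied to the sample input, a query containing U+0001 would reach the inputText value without the control character, so the serializer would never emit &#1;. Note that this silently changes what the user typed, which may or may not be acceptable.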