4 Replies Latest reply on Jan 19, 2007 7:07 PM by rajanikanth

    URIencoding is not working... Not able to save the German ch

      Hi,

      I am using JBoss as the application server, Oracle as the database, and the Struts framework. When I submit the form with the GET method, the data is stored correctly in the database, but when I use the POST method the German characters get corrupted. When I debugged, I found that the ActionForm itself already contains the corrupted characters.

      I have set URIEncoding="UTF-8" in the server.xml file, but the problem persists.

      I don't think it is a bug in JBoss; it is probably just a configuration issue. Please let me know what configuration changes I need to make to store the data correctly.

      The same application works fine in the WAS environment. We are migrating the application from WAS to JBoss.

      Is this a known issue in JBoss, or am I doing something wrong? Please suggest how to resolve this issue.

      Thanks in advance for your valuable suggestions.

      Rajani Kanth

        • 1. Re: URIencoding is not working... Not able to save the Germa
          leonidsh

          URIEncoding only affects how the request URI (and so GET parameters) is decoded; it does not apply to the body of a POST.
          This is Tomcat behaviour.
          See: http://java.sun.com/developer/technicalArticles/Intl/HTTPCharset

          Did you find any solution for POST?
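
To illustrate why the decoding charset matters, here is a minimal sketch using only the JDK. It is not Tomcat's actual parameter-parsing code; the class name, the helper and the sample value "M%C3%BCller" are made up for the example. It percent-decodes the same form value once as UTF-8 and once as ISO-8859-1, which is the servlet default when no encoding is declared.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class, not part of JBoss, Tomcat or Struts.
class UrlDecodeDemo {

    // Percent-decodes a form value using the given charset name.
    static String decode(String raw, String charsetName) {
        try {
            return URLDecoder.decode(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // A browser submitting a UTF-8 form sends u-umlaut as the two bytes C3 BC.
        String raw = "M%C3%BCller";

        String asUtf8 = decode(raw, "UTF-8");
        String asLatin1 = decode(raw, "ISO-8859-1");

        // Decoded as UTF-8 the two bytes become one character;
        // decoded as ISO-8859-1 each byte becomes its own character.
        System.out.println("UTF-8 length:      " + asUtf8.length());
        System.out.println("ISO-8859-1 length: " + asLatin1.length());
    }
}
```

The corrupted German characters described in the first post are consistent with the second case: UTF-8 bytes decoded with the ISO-8859-1 default.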

          Thanks, Leon

          • 2. Re: URIencoding is not working... Not able to save the Germa
            stevehalsey

            Hi,

            You say using JBoss you managed to get GETs to process the UTF-8 correctly, but not the POSTs. I found your article because I had the opposite problem!

            I had read http://java.sun.com/developer/technicalArticles/Intl/HTTPCharset/ and managed to get POSTs decoded correctly by calling httpServletRequest.setCharacterEncoding("UTF-8") in the servlet's processRequest, doGet or doPost methods:


            protected void processRequest(HttpServletRequest httpServletRequest,
                    HttpServletResponse httpServletResponse)
                    throws ServletException, IOException {
                LOG.debug("ENTERING processRequest method.");
                // As this is the UTF-8 servlet, the assumption is that all params are UTF-8 encoded.
                // If we didn't do this, the servlet engine would assume ISO-8859-1 or use the
                // charset header if it was sent (most browsers don't seem to ever send this though).

                // If you uncomment the two test lines below, the setCharacterEncoding
                // call will have no effect and your parameters will be MANGLED.
                //LOG.debug("Testing calling getParameter before setCharacterEncoding, this should screw things up...");
                //String thisShouldScrewThingsUp = httpServletRequest.getParameter("thisShouldScrewThingsUp");

                // This servlet expects ALL parameters to be UTF-8 encoded, so the
                // following setCharacterEncoding MUST be called before ANY reading of parameters
                // from the HttpServletRequest object. Otherwise it is too late: the default of
                // ISO-8859-1 will already have been used to decode the parameters and they
                // will be mangled.
                LOG.debug("in TestUtfEightParamServlet.processRequest calling httpServletRequest.setCharacterEncoding(\"UTF-8\");");
                httpServletRequest.setCharacterEncoding("UTF-8");
                // ... rest of the request handling (not shown in the post) ...
            }

            I then POSTed the following char ? to the server and it was correctly recognised as Unicode character U+920D. If I uncomment the line that gets the parameter thisShouldScrewThingsUp, it does, as expected, screw up: the character is interpreted as three characters with code points U+00E9, U+0088 and U+008D (its UTF-8 bytes E9 88 8D, each decoded as ISO-8859-1).
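
The mangling described here can be reproduced outside a servlet container. The following is a small sketch with hypothetical class and method names: it encodes U+920D as UTF-8, decodes the bytes as ISO-8859-1 the way a container would by default, and then reverses the mistake.

```java
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

// Hypothetical demo class; it only mimics the wrong-charset decoding step.
class MojibakeDemo {

    // Encodes text as UTF-8 (what the browser sends), then decodes the
    // bytes as ISO-8859-1 (the default when no encoding has been set).
    static String misdecode(String text) {
        byte[] wire = text.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    // Reverses misdecode: recovers the original bytes and decodes them as UTF-8.
    static String repair(String mangled) {
        byte[] wire = mangled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(wire, StandardCharsets.UTF_8);
    }

    // Lists the code points of a string as space-separated hex values.
    static String codePoints(String s) {
        return s.codePoints()
                .mapToObj(Integer::toHexString)
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        String original = "\u920d";
        String mangled = misdecode(original);
        System.out.println(codePoints(original)); // prints "920d"
        System.out.println(codePoints(mangled));  // prints "e9 88 8d"
        System.out.println(codePoints(repair(mangled))); // prints "920d"
    }
}
```

The repair step only works because ISO-8859-1 maps every byte value to a character, so no information is lost. It is a diagnostic aid, not a substitute for setting the encoding before the parameters are read.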

            So with the above code, POST worked but GET still didn't.

            As it says in your post and at:-
            http://java.sun.com/developer/technicalArticles/Intl/HTTPCharset/
            you can fix this by setting URIEncoding="UTF-8" in the server.xml file, but that means every app on that server will have its URIs interpreted as UTF-8, which you may well not want (I don't, since one app on the server still needs to work with ISO-8859-1 encoded URIs). A way round this that I found is to set useBodyEncodingForURI="true" in server.xml, as follows:

            <!-- A HTTP/1.1 Connector on port 8080 -->
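
The Connector element itself appears to have been stripped from the post when the page was rendered. A minimal sketch of what such an entry might look like, assuming the stock HTTP connector on port 8080 and leaving out every other attribute a real server.xml would carry:

```xml
<!-- Hypothetical minimal entry: keep your existing attributes and add only useBodyEncodingForURI -->
<Connector port="8080" useBodyEncodingForURI="true" />
```

With this attribute set, Tomcat decodes the query string with the same encoding as the request body (the Content-Type charset or the value passed to setCharacterEncoding) instead of using URIEncoding.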


            After restarting JBoss, the above code works for both POSTs and GETs, and any programs that still rely on ISO-8859-1 as the default for decoding their GETs and POSTs should continue to work, I think.

            It's all very messy and hard to find in the documentation. It would be good to have a page in the JBoss documentation that explains this kind of thing; maybe there is one, but I can't find it.

            • 3. Re: URIencoding is not working... Not able to save the Germa
              stevehalsey

              In my last post I tried to display the following char:

              ?

              but it came up as a question mark. So if you want to see this character for testing, go to:

              http://demo.icu-project.org/icu-bin/convexp?conv=windows-950-2000&b=B6&s=ALL#layout

              It's the one in the 07 column and 70 row, marked ? 920D

              cheers

              steve.

              • 4. Re: URIencoding is not working... Not able to save the Germa

                I took the hard way and resolved the issue by writing a filter for our application that sets the character encoding. That worked, but I didn't think it should have to be done that way.

                From the start I thought there should be some property that lets us set the encoding.

                Anyway, thanks for giving me the answer I was expecting.
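
The filter itself was not posted. A minimal sketch of such a filter, assuming the javax.servlet API that JBoss bundles and a hypothetical class name, could look like this; it would still need a filter and filter-mapping entry in the application's web.xml, placed ahead of any filter that reads parameters.

```java
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

// Hypothetical filter; the poster's actual implementation is not shown in the thread.
public class Utf8EncodingFilter implements Filter {

    public void init(FilterConfig filterConfig) throws ServletException {
        // No configuration needed for this sketch.
    }

    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        // Only set the encoding when the client did not declare one, and do it
        // before anything reads a parameter, otherwise the call has no effect.
        if (request.getCharacterEncoding() == null) {
            request.setCharacterEncoding("UTF-8");
        }
        chain.doFilter(request, response);
    }

    public void destroy() {
        // Nothing to release.
    }
}
```

This does the same thing as calling setCharacterEncoding in each servlet, but in one place, which matters for Struts because the ActionForm is populated before any action code runs.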