2 Replies Latest reply on Jun 8, 2004 9:04 AM by innovate

    Encoding of Request/Response

    innovate Newbie

      Hello,

      I moved the topic about German umlauts from the user space to the development space because I think there is something wrong with how encoding is handled inside Nukes/JBoss:

      1. You do not set an HTTP header to tell the browser which encoding the stream uses. Compare the JSPWiki output:

      Content-Type text/html;charset=UTF-8

      and yours:

      Content-Type text/html


      2. If I enter some text with umlauts into this bb, the preview shows the umlauts correctly because they get escaped after processing. In the text area, however, you send a Unicode-encoded text stream to the browser, and I see the Unicode-encoded characters for the umlauts there. It seems that either the page sent to the browser mixes different encodings, or the escaped characters are what gives this impression.


      3. You set the encoding in the HTML page to ISO-8859-1, which is wrong because the page also includes Unicode-encoded characters.


      4. If I can get a UML diagram or something similar to get started, I will help with or completely fix this problem. I saw that you use some filters, processors and ...

      Do you actively escape characters?
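
      For readers unsure what "escaping" means here, the following is a minimal sketch of one common approach: replacing every non-ASCII character with a decimal numeric character reference. The class and method names are hypothetical and are not taken from the Nukes code base; whether Nukes does anything like this is exactly the open question above. The umlaut is written as \u00FC so the example does not depend on the encoding of the source file.

```java
final class HtmlEscaper {
    private HtmlEscaper() {}

    /**
     * Replaces every code point above U+007F with a decimal numeric
     * character reference such as &#252; and leaves ASCII untouched.
     * Hypothetical helper for illustration only.
     */
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        text.codePoints().forEach(cp -> {
            if (cp > 0x7F) {
                // append(int) writes the decimal value of the code point
                out.append("&#").append(cp).append(';');
            } else {
                out.append((char) cp);
            }
        });
        return out.toString();
    }
}
```

      With this helper, escapeNonAscii("\u00FCber") yields "&#252;ber". The escaped output consists of ASCII characters only, so it is transmitted the same way whether the page is declared as ISO-8859-1 or as UTF-8.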

      Regards,

      Cyrill

        • 1. Re: Encoding of Request/Response
          Viet Master

          Yes, the encoding is a little bit messy.

          We should switch it to UTF-8. Nukes always outputs chars; at the end, Tomcat does the conversion from java.lang.String to the right encoding.
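
          The conversion from chars to bytes described here can be sketched with plain java.io classes. This is only an illustration of the principle, not the code Tomcat actually runs; the class and method names are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;

final class WriterEncodingDemo {
    private WriterEncodingDemo() {}

    /**
     * Writes chars through a Writer and returns the bytes that reach the
     * underlying stream. The charset given to the writer decides the bytes.
     */
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(sink, charset)) {
            out.write(text);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }
}
```

          For the single character \u00FC (ü), encode(..., StandardCharsets.UTF_8) returns the two bytes 0xC3 0xBC, while encode(..., StandardCharsets.ISO_8859_1) returns the single byte 0xFC. The application hands over the same chars in both cases; only the charset chosen at the writer changes what the browser receives.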

          The encoding is given to the browser in two places: in the HTTP response headers and in the page via the meta tag. I don't really understand why both are needed.
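
          On the header side, the encoding travels as the charset parameter of the Content-Type value. Under HTML 4.01 the HTTP header takes priority over the meta tag, and the meta tag mainly helps when no header is available, for example when a page is opened from disk. A minimal sketch of reading that parameter, with hypothetical names and deliberately simple parsing:

```java
import java.util.Locale;

final class ContentTypes {
    private ContentTypes() {}

    /**
     * Returns the charset parameter of a Content-Type value,
     * or null if the value carries none. Simplified parsing for
     * illustration; it does not cover every corner of the HTTP grammar.
     */
    static String charsetOf(String contentType) {
        String[] parts = contentType.split(";");
        // parts[0] is the media type; the parameters follow it
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Drop optional surrounding double quotes
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value;
            }
        }
        return null;
    }
}
```

          Applied to the two values quoted in the first post, charsetOf("text/html;charset=UTF-8") returns "UTF-8" and charsetOf("text/html") returns null, in which case the browser has to fall back on the meta tag or guess.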

          • 2. Re: Encoding of Request/Response
            innovate Newbie

            Hello,

            I have spent some time investigating the whole problem. It seems that the problem is not Nukes but the integration of Tomcat into JBoss.
            As far as I have seen, the encoding is not as messy as you mentioned.

            I simply overrode the method doGet(..) in the class NukesServlet with the following content, which includes umlauts:

            protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException
            {
                // Declare the charset before calling getWriter() so the writer encodes as UTF-8.
                resp.setContentType("text/html; charset=UTF-8");
                Writer out = resp.getWriter();
                out.write("<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">");
                out.write("<HTML>");
                out.write(" <HEAD><TITLE>Encoding Servlet</TITLE></HEAD>");
                out.write(" <BODY>");
                out.write(" <H1>Hallo, - über meinem Haupt ist der Himmel</H1>");
                out.write(" </BODY>");
                out.write("</HTML>");
                out.flush();
                out.close();
            }

            In the web browser I see only question marks instead of the umlauts. If you insert the same code into Tomcat 4 or 5 on its own, you will see the umlauts.
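
            Question marks like these typically appear when the charset used to encode the bytes does not match the one used to decode them. The sketch below reproduces that effect in isolation; the names are hypothetical, and it does not show which of these cases actually occurs in the JBoss/Tomcat integration.

```java
import java.nio.charset.Charset;

final class MojibakeDemo {
    private MojibakeDemo() {}

    /**
     * Encodes text with one charset and decodes the resulting bytes with
     * another, the way a sender and a receiver that disagree would.
     */
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        return new String(text.getBytes(encodeWith), decodeWith);
    }
}
```

            For \u00FC (ü): encoding as ISO-8859-1 and decoding as UTF-8 yields the replacement character U+FFFD, which many browsers draw as a question mark. Encoding as US-ASCII yields a literal "?" because ü cannot be represented at all. Encoding as UTF-8 and decoding as ISO-8859-1 yields the two characters "\u00C3\u00BC" instead of a question mark.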

            So, am I right, and how can we change that?


            Regards,

            Cyrill