Wildfly Umlaut (UTF-8) issue
waterstorm Oct 11, 2018 6:11 AM
Hello,
I'm currently having a lot of issues (after upgrading Wildfly beyond version 10) with special characters, such as the German "Umlaut", when using AJP. I already created a Stackoverflow question a while ago, but I wanted to ask here in the forums one more time before posting this as a bug report.
I'll update my question according to my latest tests, but it will be very similar. So here we go:
As already stated this problem happened after upgrading Wildfly. Originally (when writing my Stackoverflow question) I had 10 installed and upgraded to 13, but the problem still exists with the most current version (14.0.1).
I'm having serious issues with encoding in Wildfly. I don't know if this is a Wildfly bug, so I'm asking for help here first. Maybe I just missed something.
I also tested this with multiple Java versions in the meantime, so I'm pretty sure that it's not related to my Java 8 to Java 10 upgrade (which I stated in the Stackoverflow question).
My setup is a bit more "complex":
Shibboleth SP -> Apache2 (with AJP) -> Wildfly -> Wicket Application
Before the upgrade, everything was working as expected. I'd get the attributes of the logged-in user via the HttpServletRequest in Java:
(HttpServletRequest)getRequest().getContainerRequest().attributes
For example if I wanted to get the display name I could do:
((HttpServletRequest)getRequest().getContainerRequest()).getAttribute("displayName")
However, this now returns, for example, "Ãberpruefung" instead of "Überpruefung". Before the update I did not have this issue.
I've come a long way to post here, so I'll briefly describe my (failed) steps to fix this issue.
Validating the problem
First I checked and validated what exactly the issue is and it turned out that my string was encoded somehow/somewhere in ISO-8859-1 (latin-1) because I could "fix" the issue by doing:
new String(attribute.getBytes("ISO-8859-1"), "UTF-8")
However, this seems to me to be nothing but a workaround. I'd rather have this fixed properly (and it did work before, after all...).
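To illustrate why that re-decoding "fixes" the string, here is a minimal, self-contained sketch. It assumes the attribute arrives as UTF-8 bytes that get decoded as ISO-8859-1 somewhere along the way, which is consistent with the symptom but not confirmed. The class and method names (MojibakeDemo, garble, repair) are made up for this example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the application described above.
class MojibakeDemo {

    // Simulates the suspected fault: UTF-8 bytes decoded as ISO-8859-1.
    static String garble(String correct) {
        byte[] utf8 = correct.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // The workaround from the post: recover the raw bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00DCberpruefung"; // "Überpruefung", escaped to avoid source-encoding issues
        String garbled = garble(original);

        // U+00DC is two bytes in UTF-8 (0xC3 0x9C), so the garbled string is one char longer.
        System.out.println("original length: " + original.length());
        System.out.println("garbled length:  " + garbled.length());
        System.out.println("round trip ok:   " + repair(garbled).equals(original));
    }
}
```

The second garbled character is U+009C, a control character that most terminals do not display, which would explain why the log shows "Ãberpruefung" with only the "Ã" visible.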
1 Wildfly
Obviously I thought Wildfly was the issue, so I set UTF-8 as the default. I've done this as suggested here for the server and the AJP listener:
<servlet-container name="default" default-encoding="UTF-8">
<ajp-listener name="ajp" socket-binding="ajp" url-charset="UTF-8"/>
This showed up in the Wildfly interface, so it was set, but it did not change anything on the issue.
2 Java
Second, I suspected the new Java 10, which I had also upgraded to around the same time and which could therefore also be the cause.
I tried setting all Java charsets to default to UTF-8 using the VM options:
-Dfile.encoding=UTF8 -Dfile.io.encoding=UTF8 -DjavaEncoding=UTF8
Just to be sure I checked this in Java using this code snippet
System.err.println("Default Charset: " + Charset.defaultCharset());
System.err.println("file.encoding: " + System.getProperty("file.encoding"));
System.err.println("Default Charset in use: " + getDefaultCharSet());
System.err.println("Request encoding: " + ((HttpServletRequest)getRequest().getContainerRequest()).getCharacterEncoding());
Which returned:
15:33:28,354 ERROR [stderr] (default task-1) Default Charset: UTF-8
15:33:28,355 ERROR [stderr] (default task-1) file.encoding: utf-8
15:33:28,356 ERROR [stderr] (default task-1) Default Charset in use: UTF8
15:33:28,356 ERROR [stderr] (default task-1) Request encoding: UTF-8
Shibboleth / Apache2 / Wicket
Next I checked every other step along the way to make sure UTF-8 is the default, even though it worked before in the same setup. I did not upgrade any of these components.
Shibboleth
Docs here and here only speak of UTF-8, and checking the attributes using the URL /Shibboleth.sso/Session did show all chars just fine.
Apache
As suggested somewhere on Stackoverflow, I've added AddDefaultCharset UTF-8 to /etc/apache2/apache2.conf.
But this did not change anything either. The string was still showing up in the "wrong" charset in Java.
Wicket
As suggested here, Wicket can be set to UTF-8 as well. But this was already set in my Application.java:
@Override
protected void init() {
    super.init();
    getMarkupSettings().setDefaultMarkupEncoding("UTF-8");
    getRequestCycleSettings().setResponseRequestEncoding("UTF-8");
}
So nothing new here either.
Filter
I've read some bug reports on the Wildfly JIRA about this issue and about custom filters for UTF-8 here, here and here (yes some of them are quite old and marked fixed, but I was kind of desperate). So I tried implementing a very basic filter as described in one of the reports and added it to the web.xml:
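import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class Utf8Filter implements Filter {
    @Override
    public void init(FilterConfig filterConfig) {
    }

    @Override
    public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse, FilterChain filterChain) throws IOException, ServletException {
        System.err.println("encoding is " + servletRequest.getCharacterEncoding());
        // Only set the encoding if the container has not already determined one.
        if (servletRequest.getCharacterEncoding() == null) {
            servletRequest.setCharacterEncoding("UTF-8");
        }
        // Still null after setting it: report where this happened.
        if (servletRequest.getCharacterEncoding() == null) {
            System.err.println("could not set encoding");
            Thread.dumpStack();
        }
        filterChain.doFilter(servletRequest, servletResponse);
    }

    @Override
    public void destroy() {
    }
}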
This prints "encoding is UTF-8" in the terminal, so the request encoding was obviously already set correctly. Nevertheless, the string still showed up as ISO-8859-1:
15:33:28,356 ERROR [stderr] (default task-1) Display Name: Ãberpruefung
15:33:28,357 ERROR [stderr] (default task-1) Detected Charset of Display Name: ISO-8859-1
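For anyone who needs the workaround in the meantime, a slightly more defensive variant might look like the sketch below. It only re-decodes a value when every character fits into ISO-8859-1 and the recovered bytes are strictly valid UTF-8, so a string that is already correct is left alone. This is a heuristic under those assumptions, not a real fix, and the names EncodingProbe and repairIfMisdecoded are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Wildfly, Wicket, or the servlet API.
class EncodingProbe {

    // Returns the re-decoded string if 'value' looks like UTF-8 that was
    // decoded as ISO-8859-1; otherwise returns 'value' unchanged.
    static String repairIfMisdecoded(String value) {
        boolean sawNonAscii = false;
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c > 0xFF) {
                return value; // cannot be the result of an ISO-8859-1 decode
            }
            if (c > 0x7F) {
                sawNonAscii = true;
            }
        }
        if (!sawNonAscii) {
            return value; // pure ASCII is identical in both charsets
        }

        byte[] raw = value.getBytes(StandardCharsets.ISO_8859_1);
        try {
            // Strict decoding: throw instead of silently substituting U+FFFD.
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(raw))
                    .toString();
        } catch (CharacterCodingException e) {
            return value; // bytes are not valid UTF-8, so keep the original
        }
    }
}
```

Calling it on the attribute value, for example repairIfMisdecoded((String) request.getAttribute("displayName")), would repair the garbled form and pass a correct "Überpruefung" through untouched, because a lone 0xDC byte is not valid UTF-8.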
Everything I can think of is now set manually to UTF-8. Clearly the string gets encoded somewhere in ISO-8859-1 but I just can't figure out where and how to prevent this.
Did I miss something? How can I get the attributes from the Servlet in UTF-8 so that the German "Umlaut" is shown correctly?
It clearly worked before, I just don't know why it does not work now. Any help is much appreciated. Thank you!
The link to the original Stackoverflow question: https://stackoverflow.com/questions/51542234/wildfly-13-utf-8-encoding-servlet-issues