I was recently made aware of a bug in the fileUpload component where the encoding of the filename gets corrupted during the upload. For example, if you take a file named štěně.png (that's Czech for puppy), after the upload the name will be Å¡tÄ›nÄ›.png. The bug doesn't occur on servers that use Servlet 2.5 (e.g. JBoss EAP 6.4).
The issue is caused by a difference in default encoding between two standards:
- HTTP/1.1 - the standard was published in 1999 so unfortunately the default encoding is iso-8859-1 (it is specified in chapter 3.4.1).
- XHR2 - the default encoding when using FormData is utf-8.
When a request is sent to the server the headers may look like this:
A fileUpload request will look like this:
Notice that the charset isn't specified; that's part of the XHR2 standard. When the request is parsed, the server falls back to iso-8859-1 and the name gets corrupted.
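To see the mismatch in isolation, here is a minimal Java sketch. It is not RichFaces code, and the class and method names are made up for illustration. It encodes the Czech filename as utf-8, the way the browser does, and then decodes the same bytes as iso-8859-1, the way the server does by default. The filename is written with Unicode escapes so the example doesn't depend on the encoding of the source file.

```java
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    // Hypothetical helper: simulates a server that decodes utf-8 bytes as iso-8859-1.
    static String decodeAsLatin1(String original) {
        byte[] sentByBrowser = original.getBytes(StandardCharsets.UTF_8);
        return new String(sentByBrowser, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The filename from the post, written with Unicode escapes.
        String original = "\u0161t\u011Bn\u011B.png";
        String corrupted = decodeAsLatin1(original);

        // Each of the three accented letters takes two bytes in utf-8, and
        // iso-8859-1 maps every byte to one character, so the name grows by three.
        System.out.println(original.length());   // 9
        System.out.println(corrupted.length());  // 12

        // The first letter (U+0161) became the two characters U+00C5 U+00A1.
        System.out.println(corrupted.startsWith("\u00C5\u00A1")); // true
    }
}
```

Since iso-8859-1 maps each byte to exactly one character, nothing is lost in the bad decoding, which is why the name can be recovered afterwards.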
Fortunately, the default for HTTP can be changed, although if you change it to anything other than utf-8 - i.e. the encoding of the request - it will not work (we're getting this fixed in RichFaces 4.5.8).
- on Tomcat, add this to web.xml:
<filter>
    <filter-name>SetCharacterEncoding</filter-name>
    <filter-class>org.apache.catalina.filters.SetCharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>SetCharacterEncoding</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
- on WildFly you can add <default-encoding>UTF-8</default-encoding> to jboss-web.xml
However, this solution doesn't currently work on WildFly due to a bug in Undertow. The bug has been reported and fixed, and the fix will be available in the upcoming version. In the meantime, if you encounter the issue, you can fix the filename simply with:
filename = new String(filename.getBytes("iso-8859-1"), "utf-8");
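The one-liner above re-decodes every name unconditionally, so it would damage a name that the server had already decoded correctly. A slightly more defensive variant could look like the sketch below. The class `FilenameRepair` and its `repair` method are hypothetical names, not part of RichFaces, and the sketch assumes the only possible corruption is utf-8 bytes read as iso-8859-1.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class FilenameRepair {

    // Hypothetical helper: undoes "utf-8 bytes decoded as iso-8859-1" when that
    // is plausibly what happened, and returns the name unchanged otherwise.
    static String repair(String filename) {
        // A string decoded as iso-8859-1 can only contain characters up to U+00FF;
        // anything above that means the name was decoded some other way.
        for (int i = 0; i < filename.length(); i++) {
            if (filename.charAt(i) > 0xFF) {
                return filename;
            }
        }
        // Recover the original bytes, then decode them strictly as utf-8.
        byte[] raw = filename.getBytes(StandardCharsets.ISO_8859_1);
        CharsetDecoder strictUtf8 = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return strictUtf8.decode(ByteBuffer.wrap(raw)).toString();
        } catch (CharacterCodingException e) {
            // Not valid utf-8, so the name was most likely genuine iso-8859-1 text.
            return filename;
        }
    }

    public static void main(String[] args) {
        String original = "\u0161t\u011Bn\u011B.png";
        String corrupted = new String(
                original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);

        System.out.println(repair(corrupted).equals(original)); // true
        System.out.println(repair(original).equals(original));  // true, left untouched
        System.out.println(repair("report.png"));               // report.png
    }
}
```

Using `StandardCharsets` also avoids the checked `UnsupportedEncodingException` that the string-based `getBytes` and `String` constructor declare. The check is a heuristic: a name that legitimately consists of iso-8859-1 characters which happen to form a valid utf-8 byte sequence would still be rewritten.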
If you still encounter an issue with the encoding after 4.5.8 is released (and you're not on WildFly), please report it.