The Gist

The following UTF-8 encoded line will not work in a properties file:

some.dutch.text = Één van de wijken van Curaçao heet Saliña.

It will be displayed incorrectly on screen and look something like this:

�én van de wijken van Curaçao heet Saliña.

In Java you then need to convert the file using the JDK's native2ascii tool, which results in:

some.dutch.text = \u00c9\u00e9n van de wijken van Cura\u00e7ao heet Sali\u00f1a.
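As a rough illustration of what that conversion does, the following Java sketch replaces every character outside 7-bit ASCII with its Unicode escape. The class and method names (UnicodeEscaper, escape) are made up for this example. It is not the native2ascii implementation, only an approximation of its effect on the line above.

```java
public class UnicodeEscaper {

    /** Replaces every char outside 7-bit ASCII with a backslash-u escape (4 lowercase hex digits). */
    public static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 128) {
                out.append(c);                                   // plain ASCII is copied as-is
            } else {
                out.append(String.format("\\u%04x", (int) c));   // everything else is escaped
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The literal uses escapes so this file compiles regardless of the source encoding.
        String line = "some.dutch.text = \u00c9\u00e9n van de wijken van Cura\u00e7ao heet Sali\u00f1a.";
        System.out.println(escape(line));
    }
}
```

The output is pure ASCII, so it survives the ISO-8859-1 decoding of the properties loader, at the cost of being unreadable for translators.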

To work with non-English languages, you therefore need to convert back and forth all the time. This is hard to maintain and increases the chance of human error.

The Details

The standard approach uses the ResourceBundle API in combination with properties files that contain the externalized text. The ResourceBundle API loads the proper text based on the current Locale, falling back to the default locale.

The problem is that, at its core, the ResourceBundle API uses the Properties#load(InputStream) method to load the properties files. Unfortunately, this method decodes its input as ISO-8859-1. This is explicitly mentioned in its javadoc as well:

The load(InputStream) / store(OutputStream, String) methods work the same way as the load(Reader)/store(Writer, String) pair, except the input/output stream is encoded in ISO 8859-1 character encoding. Characters that cannot be directly represented in this encoding can be written using Unicode escapes; only a single ‘u’ character is allowed in an escape sequence. The native2ascii tool can be used to convert property files to and from other character encodings.
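The difference between the two load variants described in the javadoc can be reproduced with a small sketch. The class and method names (PropertiesEncodingDemo, loadWithStream, loadWithReader) are illustrative, and an in-memory byte array stands in for a UTF-8 encoded properties file. The sketch assumes Java 7 or later for StandardCharsets.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class PropertiesEncodingDemo {

    // Loads via load(InputStream): the bytes are decoded as ISO 8859-1.
    static String loadWithStream(byte[] fileBytes, String key) throws IOException {
        Properties props = new Properties();
        props.load(new ByteArrayInputStream(fileBytes));
        return props.getProperty(key);
    }

    // Loads via load(Reader): the Reader decides the encoding, here UTF-8.
    static String loadWithReader(byte[] fileBytes, String key) throws IOException {
        Properties props = new Properties();
        props.load(new InputStreamReader(new ByteArrayInputStream(fileBytes), StandardCharsets.UTF_8));
        return props.getProperty(key);
    }

    public static void main(String[] args) throws IOException {
        // Written with escapes so the file compiles regardless of the source encoding.
        String original = "\u00c9\u00e9n van de wijken van Cura\u00e7ao heet Sali\u00f1a.";
        // Stand-in for the contents of a UTF-8 encoded properties file.
        byte[] fileBytes = ("some.dutch.text = " + original).getBytes(StandardCharsets.UTF_8);

        System.out.println(original.equals(loadWithStream(fileBytes, "some.dutch.text"))); // false
        System.out.println(original.equals(loadWithReader(fileBytes, "some.dutch.text"))); // true
    }
}
```

With the stream variant each UTF-8 byte of a multi-byte character becomes its own ISO 8859-1 character, which is what produces the garbled text on screen.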

Java 1.6 introduced new i18n enhancements. However, they still do not fully resolve the problem, because many other technologies, such as JSTL 1.1, still use the regular resource bundle.
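One way to use those enhancements, assuming they refer to ResourceBundle.Control and the PropertyResourceBundle(Reader) constructor (both added in Java 1.6), is a custom Control that reads properties files as UTF-8. The sketch below is a minimal version under that assumption: the class name Utf8Control is made up, the reload flag is ignored for brevity, and try-with-resources requires Java 7 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

public class Utf8Control extends ResourceBundle.Control {

    @Override
    public ResourceBundle newBundle(String baseName, Locale locale, String format,
                                    ClassLoader loader, boolean reload)
            throws IllegalAccessException, InstantiationException, IOException {
        // Only properties-based bundles are handled here; class-based ones use the default logic.
        if (!"java.properties".equals(format)) {
            return super.newBundle(baseName, locale, format, loader, reload);
        }
        String bundleName = toBundleName(baseName, locale);
        String resourceName = toResourceName(bundleName, "properties");
        InputStream stream = loader.getResourceAsStream(resourceName);
        if (stream == null) {
            return null; // no bundle for this locale; the lookup continues with the next candidate
        }
        // Decode the file as UTF-8 instead of ISO-8859-1.
        try (Reader reader = new InputStreamReader(stream, StandardCharsets.UTF_8)) {
            return new PropertyResourceBundle(reader);
        }
    }
}
```

A caller would pass it explicitly, for example ResourceBundle.getBundle("messages", locale, new Utf8Control()), where "messages" is a hypothetical bundle name. This only helps code that you control; technologies that call ResourceBundle.getBundle themselves without a Control, as the text notes for JSTL 1.1, still load the file the regular way.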

 

References

A good overview of the problem, and a possibly good solution for JSF: http://jdevelopment.nl/internationalization-jsf-utf8-encoded-properties-files/
