This package contains a number of utility classes dealing with generic encoding of {@link java.lang.String}s.
Although this might sound useless at first (as {@link java.lang.String}s already support encoding internally), this package deals with a very subtle problem encountered when combining Java {@link java.lang.String}s with old byte-based (non-internationalized) transports, such as Base 64 and URL encoding.
Let's consider, as an example, the URL-encoded {@link java.lang.String}
%C2%A3 100
which can easily be decomposed into a byte array using URL decoding techniques. We would end up with the following byte array:
0xC2 0xA3 0x20 0x31 0x30 0x30
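As a minimal sketch of this first step, the percent escapes can be turned into raw bytes with plain JDK classes. The class and method names below (`PercentDecodeSketch`, `percentDecode`) are hypothetical and are not part of this package; the sketch assumes well-formed input and performs no validation.

```java
import java.io.ByteArrayOutputStream;

public class PercentDecodeSketch {

    /**
     * Turns percent escapes into raw bytes. Every other character is
     * assumed to be plain ASCII and is copied through unchanged.
     */
    public static byte[] percentDecode(String encoded) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c == '%') {
                // The two characters after '%' are the hexadecimal value of one byte.
                out.write(Integer.parseInt(encoded.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                out.write((byte) c);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (byte b : percentDecode("%C2%A3 100")) {
            hex.append(String.format("0x%02X ", b & 0xFF));
        }
        // Prints: 0xC2 0xA3 0x20 0x31 0x30 0x30
        System.out.println(hex.toString().trim());
    }
}
```

Note that no charset is involved at this stage: the result is only a sequence of bytes.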
This byte array, though, tells us nothing about how to represent it as a readable and usable {@link java.lang.String} in Java. To convert it, we have to decode it again using a charset (or an encoding).
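This second decoding step can be sketched with the standard `String(byte[], Charset)` constructor. The class name `CharsetDecodeSketch` and its helper `codePoints` are hypothetical and only serve the illustration; the sketch prints code points rather than the characters themselves so that the output does not depend on the console encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDecodeSketch {

    // The bytes obtained by URL decoding "%C2%A3 100".
    public static final byte[] RAW = {(byte) 0xC2, (byte) 0xA3, 0x20, 0x31, 0x30, 0x30};

    /** Lists the code points of a string in U+XXXX notation. */
    public static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        Charset[] charsets = {StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8};
        for (Charset cs : charsets) {
            // Same bytes, different charset: the resulting strings differ.
            String decoded = new String(RAW, cs);
            System.out.println(cs.name() + ": " + decoded.length()
                    + " chars: " + codePoints(decoded));
        }
    }
}
```

The same six bytes yield six characters under ISO-8859-1 but only five under UTF-8, because UTF-8 combines the first two bytes into a single character.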
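So, for example, if we were to decode the above-mentioned byte array using the ISO-8859-1 encoding, we would obtain the string "Â£ 100", or in detail: one character per byte, namely U+00C2 (Â), U+00A3 (£), U+0020 (space), U+0031 (1), U+0030 (0) and U+0030 (0).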
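If we were to decode the same byte sequence using UTF-8, on the other hand, we would obtain the (quite different) string "£ 100", or in detail: the first two bytes combine into the single character U+00A3 (£), followed by U+0020 (space), U+0031 (1), U+0030 (0) and U+0030 (0).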
Therefore, when Java {@link java.lang.String}s are encoded using Base 64, URL encoding, or similar techniques, one always has to remember that encoding (or decoding) must be done twice: once between characters and bytes, using a charset, and once between bytes and the transport representation. This package provides a way to deal with this mechanism.
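The two-step mechanism can be illustrated with a short sketch based only on JDK classes (`java.util.Base64` and `java.nio.charset`), not on the classes of this package. The names `TwoStepCodecSketch`, `encode` and `decode` are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class TwoStepCodecSketch {

    /** Step 1: characters to bytes (charset). Step 2: bytes to transport text (Base 64). */
    public static String encode(String text, Charset charset) {
        byte[] raw = text.getBytes(charset);
        return Base64.getEncoder().encodeToString(raw);
    }

    /** The reverse: transport text to bytes, then bytes to characters. */
    public static String decode(String transport, Charset charset) {
        byte[] raw = Base64.getDecoder().decode(transport);
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "£ 100", written with an escape so it does not depend on the source file encoding.
        String original = "\u00A3 100";
        String wire = encode(original, StandardCharsets.UTF_8);
        System.out.println(wire); // wqMgMTAw

        // Decoding with the charset used for encoding restores the original string.
        System.out.println(decode(wire, StandardCharsets.UTF_8).equals(original)); // true

        // Decoding with a different charset silently produces a different string.
        System.out.println(decode(wire, StandardCharsets.ISO_8859_1).equals(original)); // false
    }
}
```

Both sides of the transport have to agree on the charset as well as on the byte-level encoding; a mismatch in the charset does not raise an error here, it just yields the wrong characters.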