lib/RT/Interface/Web/Handler.pm · 89a85683fe8cff0b6bdaa3b20629c28026888757 · best-practical / rt

Note that HTTP output still incorrectly relies on is_utf8

Alex Vandiver authored Aug 08, 2014

Currently, any string which has the "UTF-8" flag is encoded as UTF-8
before being sent to the browser. This requires that any output which
is binary, or has already been encoded to bytes, _not_ have the flag
accidentally set.
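The behavior described above can be sketched roughly as follows. This is
a minimal illustration, not RT's actual output code: prepare_output is a
hypothetical helper name, and the sample strings are made up.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not RT's real code): encode a string as UTF-8
# only when Perl's internal "UTF-8" flag is set on it.
sub prepare_output {
    my ($str) = @_;
    return utf8::is_utf8($str) ? Encode::encode('UTF-8', $str) : $str;
}

# A character string produced by decoding; the flag is set.
my $chars = Encode::decode('UTF-8', "caf\xC3\xA9");

# Binary data that was never decoded; the flag is not set.
my $bytes = "\x89PNG";

my $out_chars = prepare_output($chars);   # encoded to UTF-8 bytes
my $out_bytes = prepare_output($bytes);   # passed through unchanged
```

If the flag were accidentally set on $bytes, the same helper would encode
it a second time and corrupt the binary payload, which is the risk the
paragraph above describes.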

It also requires that all output character strings have the "UTF-8" flag
enabled; while the flag is necessary for codepoints > 255, it is not
strictly required for codepoints between 128 and 255. As RT now
consistently uses Encode::decode() to produce character strings, which
sets the "UTF-8" flag even for characters in that range, this is likely
safe.
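To illustrate the point about codepoints above the ASCII range but below
256, the following sketch compares a string written as a native byte
literal with the same characters obtained through Encode::decode(). The
variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode ();

# U+00E9 written as a plain literal: one byte internally, flag off.
my $native = "caf\xE9";

# The same characters obtained by decoding UTF-8 bytes: flag on.
my $decoded = Encode::decode('UTF-8', "caf\xC3\xA9");

# Perl compares them as equal character strings ...
my $same = ($native eq $decoded) ? 1 : 0;

# ... but only the decoded one carries the "UTF-8" flag.
my $native_flag  = utf8::is_utf8($native)  ? 1 : 0;
my $decoded_flag = utf8::is_utf8($decoded) ? 1 : 0;
```

A flag-based output check would encode $decoded but send $native as the
raw byte 0xE9, which is why character strings need to come from
Encode::decode() for the current approach to hold.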

The most correct fix would be to explicitly flag output that needs to be
encoded. However, doing so in a backwards compatible manner is
extremely difficult; as is_utf8 is unlikely to be incorrect in this
context, the small potential gain in correctness is not deemed worth
the cost of requiring all external modules to flag their binary (or
character) output as such.
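For comparison, explicit flagging might look something like the sketch
below. This is purely hypothetical: the text(), binary(), and serialize()
names and the hash-based chunk convention are assumptions made for
illustration, not an existing RT or Encode API.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical convention: every output chunk declares its own kind.
sub text   { return { kind => 'text',   data => $_[0] } }
sub binary { return { kind => 'binary', data => $_[0] } }

# Encode based on the declared kind, ignoring the internal flag.
sub serialize {
    my ($chunk) = @_;
    return $chunk->{kind} eq 'text'
        ? Encode::encode('UTF-8', $chunk->{data})
        : $chunk->{data};
}

# Encoded correctly even though the literal carries no "UTF-8" flag.
my $from_text = serialize( text("caf\xE9") );

# Left untouched because it is declared as binary.
my $from_binary = serialize( binary("\x89PNG") );
```

The cost noted above shows up here: every producer of output, including
external modules, would have to wrap its data in one of the two kinds.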
