lib/RT/I18N.pm · 17702cde3c9a7240a112bffac8749fc76b1194dd · best-practical / rt

Verify that MIME::Entity bodies are bytes, and remove _utf8_off call

Alex Vandiver authored Aug 08, 2014

Use the newly-added RT::Util::assert_bytes function to verify that the
body is indeed bytes, and not characters.

We also remove the _utf8_off call -- because, contrary to what the
comment implies, the presence or absence of the "UTF8" flag does _not_
determine if a string is "encoded as octets and not as characters"; it
merely states that the string is capable of holding codepoints > 255.
If it happens to not contain any codepoints > 127, the _utf8_off does
nothing. If it does, it effectively encodes all codepoints > 127 in
UTF-8.
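
The effect described above can be reproduced in isolation. The following
is a minimal sketch, not code from RT: the variable names are made up
for illustration, and it assumes only core Perl plus the Encode module.
It applies Encode::_utf8_off to flagged and unflagged strings and
records what each one becomes.

```perl
use strict;
use warnings;
use Encode ();

# Flagged, but ASCII-only: clearing the flag changes nothing observable.
my $ascii = "plain";
utf8::upgrade($ascii);          # turn the UTF8 flag on
Encode::_utf8_off($ascii);      # still "plain", length 5

# Flagged, with one codepoint above 127 (U+00E9).
my $latin = "caf\x{e9}";
utf8::upgrade($latin);          # stored internally as 63 61 66 C3 A9
Encode::_utf8_off($latin);      # now the 5 bytes "caf\xC3\xA9"

# Flagged, with a codepoint above 255 (U+263A).
my $wide = "\x{263A}";
Encode::_utf8_off($wide);       # now the 3 bytes "\xE2\x98\xBA"

# Unflagged bytes in a non-UTF-8 encoding (Latin-1 here): untouched.
my $bytes = "caf\xE9";
Encode::_utf8_off($bytes);      # still 4 bytes

printf "%-6s length %d\n", $_->[0], length $_->[1]
    for [ ascii => $ascii ], [ latin => $latin ],
        [ wide  => $wide  ], [ bytes => $bytes ];
```

In no case does clearing the flag recover the original bytes of a body
that was decoded from some other encoding.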

Given the premise that the string contains bytes in some (probably
non-UTF-8) encoding, re-encoding some bytes of it as UTF-8 cannot
possibly produce valid output. The flaw in this situation cannot be
fixed by a simple _utf8_off, but instead must be fixed by ensuring that
the body always contains bytes, not wide characters -- as it now does,
thanks to the prior commits. The call to RT::Util::assert_bytes serves
as an additional safeguard against backsliding on that assumption.
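
For illustration, a bytes assertion of this kind might look roughly like
the sketch below. This is an assumption, not the actual implementation
of RT::Util::assert_bytes; looks_like_bytes and assert_bytes_sketch are
hypothetical names. The check is only a heuristic: a flagged string
holding codepoints > 127 almost certainly contains decoded characters,
while an unflagged string cannot be told apart from bytes.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the string can plausibly be treated as bytes.
sub looks_like_bytes {
    my ($string) = @_;
    return 1 unless defined $string;
    # Unflagged strings hold only values 0..255, one per stored byte.
    return 1 unless utf8::is_utf8($string);
    # Flagged but pure ASCII: identical whether read as bytes or characters.
    return 1 unless $string =~ /[^\x00-\x7F]/;
    # Flagged with codepoints > 127: these are characters, not bytes.
    return 0;
}

# Hypothetical assertion: warn, with the caller's location, on failure.
sub assert_bytes_sketch {
    my ($string) = @_;
    return 1 if looks_like_bytes($string);
    my (undef, $file, $line) = caller;
    warn "String contains characters, not bytes, at $file line $line\n";
    return 0;
}

assert_bytes_sketch("caf\xC3\xA9");   # UTF-8 encoded bytes: silent
assert_bytes_sketch("caf\x{263A}");   # wide character: warns
```

Warning rather than dying keeps a safeguard like this from breaking
mail processing if the assumption is ever violated; that choice is part
of the sketch, not a statement about what RT does.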
