Commit 17702cde authored by Alex Vandiver

Verify that MIME::Entity bodies are bytes, and remove _utf8_off call

Use the newly-added RT::Util::assert_bytes function to verify that the
body is indeed bytes, and not characters.

We also remove the _utf8_off call -- because, contrary to what the
comment implies, the presence or absence of the "UTF8" flag does _not_
determine if a string is "encoded as octets and not as characters"; it
merely states that the string is capable of holding codepoints > 255.
If it happens to not contain any, the _utf8_off does nothing.  If it
does, it effectively encodes all codepoints > 127 in UTF-8.

Given the premise that the string contains bytes in some (probably
non-UTF-8) encoding, re-encoding some bytes of it as UTF-8 cannot
possibly produce valid output.  The flaw in this situation cannot be
fixed by a simple _utf8_off, but instead must be fixed by ensuring that
the body always contains bytes, not wide characters -- as it now does,
thanks to the prior commits.  The call to RT::Util::assert_bytes serves
as an additional safeguard against backsliding on that assumption.
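
The flag behaviour described above can be reproduced with a minimal
sketch. It uses only Encode::_utf8_off and the built-in utf8::upgrade;
the helper name lengths_around_utf8_off is hypothetical and not part of
RT. It is an illustration under those assumptions, not the code touched
by this commit.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: returns the string's length before and after
# Encode::_utf8_off, working on a copy so the caller's value is untouched.
sub lengths_around_utf8_off {
    my ($string) = @_;
    my $before = length $string;
    Encode::_utf8_off($string);
    return ( $before, length $string );
}

# Latin-1 bytes with the UTF8 flag off: the call has nothing to do.
my $bytes = "caf\xe9";

# The same four characters, but stored with the UTF8 flag on.
my $chars = "caf\xe9";
utf8::upgrade($chars);

printf "flag off: %d -> %d\n", lengths_around_utf8_off($bytes);
printf "flag on:  %d -> %d\n", lengths_around_utf8_off($chars);
```

With the flag off the length is unchanged. With the flag on, the
character above 127 is exposed as its two internal UTF-8 bytes, so the
length grows by one. For a body that was meant to hold bytes in some
other encoding, that is exactly the corruption the message describes.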
parent a21eb81c
@@ -291,13 +291,12 @@ sub SetMIMEEntityToEncoding {
if ( $body && ($enc ne $charset || $enc =~ /^utf-?8(?:-strict)?$/i) ) {
my $string = $body->as_string or return;
$RT::Logger->debug( "Converting '$charset' to '$enc' for "
. $head->mime_type . " - "
. ( Encode::decode("UTF-8",$head->get('subject')) || 'Subjectless message' ) );
# NOTE:: see the comments at the end of the sub.
my $orig_string = $string;
( my $success, $string ) = EncodeFromToWithCroak( $orig_string, $charset => $enc );
if ( !$success ) {
@@ -328,19 +327,6 @@ sub SetMIMEEntityToEncoding {
# NOTES: Why Encode::_utf8_off before Encode::from_to
# All the strings in RT are utf-8 now. Quotes from Encode POD:
# [$length =] from_to($octets, FROM_ENC, TO_ENC [, CHECK])
# ... The data in $octets must be encoded as octets and not as
# characters in Perl's internal format. ...
# Not turning off the UTF-8 flag in the string will prevent the string
# from conversion.
=head2 DecodeMIMEWordsToUTF8 $raw
An utility method which mimics MIME::Words::decode_mimewords, but only