    Standardize on the stricter Encode::encode("UTF-8", ...) everywhere · 1d18663b
    Alex Vandiver authored
    This is not only for code consistency, but also for consistency of
    output.  Encode::encode_utf8(...) is equivalent to
    Encode::encode("utf8", ...), which is the non-"strict" form of UTF-8.
    Strict UTF-8 encoding differs in that (from `perldoc Encode`):
        ...its range is much narrower (0 ..  0x10_FFFF to cover only 21 bits
        instead of 32 or 64 bits) and some sequences are not allowed, like
        those used in surrogate pairs, the 31 non-character code points
        0xFDD0 .. 0xFDEF, the last two code points in any plane (0xXX_FFFE
        and 0xXX_FFFF), all non-shortest encodings, etc.
    RT deals with interchange with databases, email, and other systems.  In
    dealing with encodings, it should ensure that it does not produce byte
    sequences that are invalid according to official Unicode standards.
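
As a rough illustration of the difference described above, the following sketch encodes two code points that strict UTF-8 does not allow (a lone surrogate and a value above 0x10_FFFF) with both encoding names. The helper `hex_bytes` and the variable names are made up for this example and are not part of RT; the sketch assumes the default CHECK behaviour of Encode, where unencodable characters are replaced with a substitution character.

```perl
use strict;
use warnings;
no warnings 'utf8';    # silence surrogate / non-Unicode warnings for this demo
use Encode ();

# Hypothetical helper: render a byte string as space-separated uppercase hex.
sub hex_bytes { join ' ', map { sprintf '%02X', ord } split //, shift }

my $surrogate = chr(0xD800);      # lone high surrogate
my $too_large = chr(0x110000);    # one past the top of the Unicode range

for my $char ($surrogate, $too_large) {
    my $lax    = Encode::encode('utf8',  $char);   # same as encode_utf8($char)
    my $strict = Encode::encode('UTF-8', $char);   # strict form
    printf "U+%04X  lax: %-12s strict: %s\n",
        ord($char), hex_bytes($lax), hex_bytes($strict);
}

# Ordinary text encodes identically under both names.
my $text = "caf\x{E9}";
print "same for ordinary text\n"
    if Encode::encode('utf8', $text) eq Encode::encode('UTF-8', $text);
```

With the lax "utf8" name, both code points should come out as raw byte sequences (ED A0 80 and F4 90 80 80) that a strict UTF-8 consumer would reject, whereas the strict "UTF-8" name should substitute U+FFFD (EF BF BD) for each. Passing Encode::FB_CROAK as the third argument is one way to make such input an error instead of substituting silently.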