    Add a utility method to check that an input is bytes · a21eb81c
    Alex Vandiver authored
    Note that it is impossible to verify that an input is characters; here,
    we can only validate if it _could_ be bytes.
    
    First, any string with the "UTF8" flag off cannot contain codepoints
    above 255, and as such is safe.  Additionally, if the "UTF8" flag is on,
    having no codepoints above 127 means the bytes are unambiguous.  Having
    codepoints above 255 is a sure sign that the input is not a byte
    string.
    
    This leaves only the case of a string with the "UTF8" flag on, and
    codepoints above 127 but no higher than 255.  The "UTF8" flag is a sign
    that they were _likely_ touched by character data at some point.  In
    such cases we warn, suggesting that the caller clear the "UTF8" flag
    with utf8::downgrade if the input is indeed bytes.
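
    The three cases above can be sketched roughly as follows.  This is an
    illustration only, not the method added by this commit: the name
    classify_bytes and its return labels are hypothetical, and it relies
    only on the core utf8::is_utf8 function and plain regex matches.

```perl
use strict;
use warnings;

# Hypothetical helper (name and return labels are assumptions, not the
# commit's API). Returns 'bytes', 'ambiguous', or 'not_bytes'.
sub classify_bytes {
    my ($input) = @_;

    # UTF8 flag off: no codepoint can exceed 255, so the input is safe.
    return 'bytes' unless utf8::is_utf8($input);

    # Any codepoint above 255 means this cannot be a byte string.
    return 'not_bytes' if $input =~ /[^\x00-\xFF]/;

    # UTF8 flag on, but nothing above 127: the bytes are unambiguous.
    return 'bytes' unless $input =~ /[^\x00-\x7F]/;

    # UTF8 flag on with codepoints in 128..255: likely touched by
    # character data, so the caller should warn.
    return 'ambiguous';
}

my $data = "caf\xE9";           # flag off, one codepoint at 233
utf8::upgrade($data);           # same codepoints, "UTF8" flag now on
warn "input may be characters; utf8::downgrade it if it is bytes\n"
    if classify_bytes($data) eq 'ambiguous';
utf8::downgrade($data);         # flag cleared; classified as 'bytes' again
```

    A caller that knows its input is bytes can silence the warning by
    calling utf8::downgrade first, as in the last line of the sketch.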