wycats/encodings.markdown

## encodings.markdown

      
Ruby Encoding Cheat Sheet


Only call force_encoding on BINARY Strings.
When receiving a BINARY String from the network or file system, make sure to call force_encoding on it to tag it with its correct encoding.

In general, the encoding information is provided in an out-of-band channel, such as the Content-Type header in HTTP.
If you don't know the encoding, the String is BINARY forever and should not be concatenated with non-BINARY Strings.
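
As a rough illustration of the two points above, the sketch below tags a raw HTTP body using the charset from its Content-Type header and leaves the String BINARY when no charset is available. The helper name `tag_body` and the charset-parsing regular expression are assumptions made for this example, not part of any library.

```ruby
# Hypothetical helper: tag a raw body with the charset named in Content-Type.
def tag_body(raw_body, content_type)
  # Only BINARY (ASCII-8BIT) Strings should be re-tagged.
  return raw_body unless raw_body.encoding == Encoding::BINARY

  # Pull "charset=..." out of the header value (simplified parsing).
  charset = content_type.to_s[/charset=["']?([^\s;"']+)/i, 1]
  return raw_body if charset.nil? # unknown encoding: stays BINARY

  begin
    raw_body.force_encoding(Encoding.find(charset))
  rescue ArgumentError
    raw_body # charset name Ruby does not recognize: stays BINARY
  end
end

body   = "caf\xC3\xA9".b # bytes as they might arrive from a socket
tagged = tag_body(body, "text/html; charset=utf-8")
opaque = tag_body("abc".b, "application/octet-stream")
```

Here `tagged` ends up as a UTF-8 String, while `opaque` keeps its BINARY tag because the header carried no charset.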


When calling force_encoding on a BINARY String, immediately call encode! afterwards. This will transcode the String
to the default_internal encoding. If Encoding.default_internal is nil, encode! without arguments leaves the String unchanged.
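
A minimal sketch of that two-step pattern follows. It assumes the incoming bytes are known to be ISO-8859-1, and it falls back to UTF-8 when Encoding.default_internal is not set; that fallback is a choice made for this example, not something Ruby does on its own.

```ruby
raw = "caf\xE9".b                        # BINARY bytes known to be ISO-8859-1
raw.force_encoding(Encoding::ISO_8859_1) # re-tag only; the bytes do not change

# encode! without arguments targets Encoding.default_internal.
# The UTF-8 fallback below is an assumption for this sketch.
target = Encoding.default_internal || Encoding::UTF_8
raw.encode!(target)                      # transcode; the bytes may change
```

After the last line, `raw` carries the `target` encoding and can be combined safely with other Strings in that encoding.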
When using a regular expression with /u, make sure that it will only ever be matched against Unicode (UTF-8) Strings.
When using a regular expression with /n, make sure that it will only ever be matched against BINARY Strings.
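
The following sketch shows each flag used against the kind of String it is meant for, plus what can happen when a /u expression meets a non-ASCII String in another encoding. The sample data (a UTF-8 phrase and the first bytes of a JPEG header) is made up for illustration.

```ruby
utf8_re   = /caf\u00e9/u  # regexp fixed to UTF-8
binary_re = /\xFF\xD8/n   # regexp over raw bytes

utf8_text = "un caf\u00e9"         # UTF-8 String
jpeg_head = "\xFF\xD8\xFF\xE0".b   # BINARY String

utf8_pos   = utf8_re =~ utf8_text   # character offset of the match
binary_pos = binary_re =~ jpeg_head # byte offset of the match

# A /u regexp against a non-ASCII String in a different encoding raises.
latin1 = "caf\xE9".b.force_encoding(Encoding::ISO_8859_1)
mismatch =
  begin
    utf8_re =~ latin1
    false
  rescue Encoding::CompatibilityError
    true
  end
```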
If you get an incompatible encoding error (Encoding::CompatibilityError) between BINARY (ASCII-8BIT) and another encoding, the correct debugging approach
is to identify where the BINARY String came from. Usually, this means that a library read in BINARY data from the
network and didn't give it an encoding.
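
For reference, this is roughly what the failure looks like and how the offending String can be spotted by inspecting encodings. The variable names are illustrative; in real code the BINARY String would come from a library call rather than a literal.

```ruby
text = "caf\u00e9"  # UTF-8 String from app code
raw  = "\xC3\xA9".b # BINARY String, as if read from a socket

message = nil
begin
  text + raw
rescue Encoding::CompatibilityError => e
  message = e.message # names the two encodings involved
end

# Inspect the encodings to find which operand is still BINARY.
culprits = { text: text.encoding, raw: raw.encoding }
           .select { |_, enc| enc == Encoding::BINARY }
           .keys
```

Once the BINARY operand is known, the next step is to trace it back to the code that read it in.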
In app code, never use force_encoding to convert BINARY data into a particular encoding. By the time you've reached
app code, you have lost the information about which encoding is being used. Instead, find where the String came into
Ruby, and fix that code to set the encoding based on the information available there.
In library code, only use force_encoding to convert BINARY data into an encoding if you have information about
what encoding is being used. This means that you have a header in network protocols or a magic comment in templates
(like ERB) or source files.
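
As one possible shape for such library code, the sketch below reads a magic comment from the first line of a template and tags the source accordingly, leaving it BINARY when no comment is found. The helper name `tag_template` and the simplified comment pattern are assumptions for this example, not ERB's actual implementation.

```ruby
# Hypothetical helper: tag template source using its magic comment, if any.
def tag_template(raw_source)
  first_line = raw_source.lines.first.to_s
  # Simplified pattern for "coding: NAME" or "encoding=NAME".
  name = first_line[/coding[:=]\s*([\w.-]+)/, 1]
  return raw_source if name.nil? # no information: stays BINARY

  raw_source.force_encoding(Encoding.find(name))
rescue ArgumentError
  raw_source # unrecognized encoding name: stays BINARY
end

source   = "<%# -*- coding: ISO-8859-1 -*- %>\nCaf\xE9\n".b
tagged   = tag_template(source)
untagged = tag_template("Hello\n".b)
```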
Only include the magic comment in source files that actually contain characters from that encoding.
To combine two Strings with known, but different encodings, use encode to transcode the Strings into the same
encoding, then combine them.
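
A short sketch of combining two Strings with known but different encodings; UTF-8 is picked as the common encoding here purely for illustration.

```ruby
latin1 = "caf\xE9".b.force_encoding(Encoding::ISO_8859_1) # known ISO-8859-1
utf8   = "na\u00efve "                                    # UTF-8

# Transcode to a shared encoding first, then concatenate.
combined = utf8 + latin1.encode(Encoding::UTF_8)
```

Note that encode returns a transcoded copy, so `latin1` itself keeps its original encoding.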