
@sivel
Last active October 2, 2019 13:54
Draft

Using to_bytes/to_native/to_text

errors

The default value for errors, although specified as None in the function signature, is surrogate_then_replace.

The most common and recommended values for compatibility between Python 2 and Python 3 are:

  • surrogate_then_replace
  • surrogate_or_strict

When to use which?

surrogate_then_replace should be used when the data is informational only, that is, when it is ultimately just headed to a log or displayed to the user.

surrogate_or_strict should be used when the data makes a difference to the computer's understanding of the world, such as with file paths or database keys.
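The trade-off is easier to see with the built-in Python 3 codec error handlers that these composed values are named after. The following is a minimal sketch using only the standard library. It does not call Ansible, and the variable names are made up for illustration.

```python
# Bytes that are not valid UTF-8 (a Latin-1 encoded "e acute"), as a file name might be.
raw_name = b"caf\xe9.txt"

# surrogateescape carries the undecodable byte through as a lone surrogate,
# so the original bytes can be recovered exactly. This matters for file paths.
text_name = raw_name.decode("utf-8", errors="surrogateescape")
round_trip = text_name.encode("utf-8", errors="surrogateescape")

# replace never fails, but it loses information. That is acceptable for
# output that only heads to a log or to the user's screen.
display_name = raw_name.decode("utf-8", errors="replace")

# strict refuses to guess when a lossless conversion is impossible, which is
# the safer failure mode for data the computer acts on.
try:
    u"caf\u00e9".encode("ascii", errors="strict")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```

Here round_trip equals raw_name again, while display_name contains a replacement character in place of the original byte.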

nonstring

This specifies the strategy to use if a non-string value is passed. The default is simplerepr, which returns a string representation using either str(obj) or repr(obj), preferring str(obj).

Other values are empty, which returns an empty string; passthru, which returns the original object; and strict, which raises a TypeError exception.

An example of using passthru is passing either a string or a file-like object through to_bytes for use in an HTTP POST request.
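A rough sketch of that passthru case follows. The helper name prepare_body is hypothetical, and the fallback definition is only a simplified Python 3 stand-in so the example runs without Ansible installed. Older Ansible releases expose the same functions from ansible.module_utils._text.

```python
import io

try:
    from ansible.module_utils.common.text.converters import to_bytes
except ImportError:
    # Assumption: simplified Python 3 stand-in, NOT Ansible's implementation.
    # It mimics only the simplerepr and passthru behaviour used below.
    def to_bytes(obj, errors=None, nonstring="simplerepr"):
        if isinstance(obj, bytes):
            return obj
        if isinstance(obj, str):
            return obj.encode("utf-8", "surrogateescape")
        if nonstring == "passthru":
            return obj
        return str(obj).encode("utf-8", "surrogateescape")


def prepare_body(data):
    """Hypothetical helper: return a request body as bytes or as a file-like object."""
    # Strings become bytes; anything else (such as an open file) is returned untouched.
    return to_bytes(data, errors="surrogate_or_strict", nonstring="passthru")


body_from_text = prepare_body(u"name=caf\u00e9")
stream = io.BytesIO(b"raw upload")
body_from_file = prepare_body(stream)

# With the default simplerepr strategy, a non-string is converted via str() first.
number_as_bytes = to_bytes(42)
```

With passthru, body_from_file is the very same object as stream, so the HTTP layer can read from it directly.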

to_native

"native" in this context means the default string type on Python 2 and Python 3, as produced by str.

On the controller

On the controller, to_native is used for a small set of functionality:

  1. When converting information for use in exceptions
  2. When the underlying python API expects a native string type

Typically speaking, native values should not be long lived; values should be converted to native at the borders where they are needed. If a variable must be assigned a native value, the variable should be prefixed with n_, such as n_output.
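As an illustration of the exception case and the n_ prefix, here is a small sketch. load_config is a hypothetical function, RuntimeError stands in for whatever exception class the calling code would really raise, and the fallback to_native is a simplified Python 3 stand-in so the example runs without Ansible installed.

```python
try:
    from ansible.module_utils.common.text.converters import to_native
except ImportError:
    # Assumption: simplified Python 3 stand-in, NOT Ansible's implementation.
    def to_native(obj, errors=None):
        if isinstance(obj, bytes):
            return obj.decode("utf-8", "surrogateescape")
        return str(obj)


def load_config(b_path):
    """Hypothetical helper: read a file whose path is given as bytes."""
    try:
        with open(b_path, "rb") as f:
            return f.read()
    except (IOError, OSError) as e:
        # Convert to native only here, at the border where the message is built.
        n_path = to_native(b_path)
        raise RuntimeError("Unable to read %s: %s" % (n_path, to_native(e)))
```

The native value n_path exists only long enough to build the message, and the default errors behaviour is sufficient because the text is informational.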

On the target

Typically, almost all strings on the target should use the native string type for the easiest integration with the underlying Python APIs. However, note the guidance in the errors section, which dictates which errors value to use for informational versus operational values.

to_bytes

"bytes" in this context refers to the data type produced by bytes on Python 2 and Python 3.

On Python 2 this is str and on Python 3 this is bytes.

Values converted to bytes should not be long lived. Typically, values should be converted to bytes at the borders where they are needed. If a variable must be assigned a bytes value, the variable should be prefixed with b_, such as b_path. This includes parameters in a function signature if the function accepts a bytes value.

Everywhere

Use to_bytes when dealing with byte-oriented APIs. This is common when dealing with file paths or with data being passed through HTTP requests.

to_text

"text" in this context means the type produced by the unicode function on Python 2 and by str on Python 3.

On the controller

  1. When data is ingested into Ansible, values should typically be cast to text for the lifetime of that data.
  2. All information sent to the Display class, such as through display.display or display.vvv, should be cast to text.

NOTE: Data should be converted to bytes or native only at the borders where it leaves Ansible.
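The overall flow of text on the inside and bytes at the border might look like the sketch below. write_marker and its file_name default are hypothetical, and the fallback functions are simplified Python 3 stand-ins so the example runs without Ansible installed.

```python
import os
import tempfile

try:
    from ansible.module_utils.common.text.converters import to_bytes, to_text
except ImportError:
    # Assumption: simplified Python 3 stand-ins, NOT Ansible's implementation.
    def to_text(obj, errors=None):
        if isinstance(obj, bytes):
            return obj.decode("utf-8", "surrogateescape")
        return str(obj)

    def to_bytes(obj, errors=None):
        if isinstance(obj, bytes):
            return obj
        return str(obj).encode("utf-8", "surrogateescape")


def write_marker(raw_dir, file_name=u"marker.txt"):
    """Hypothetical helper: create an empty file in a directory given as bytes or text."""
    # Ingest: cast to text once and keep the value as text while working with it.
    directory = to_text(raw_dir, errors="surrogate_or_strict")
    path = os.path.join(directory, file_name)

    # Border: convert to bytes only for the byte-oriented call, using the b_ prefix.
    b_path = to_bytes(path, errors="surrogate_or_strict")
    with open(b_path, "wb"):
        pass
    return path


# Usage: the directory arrives as bytes, as it might from an external source.
demo_dir = tempfile.mkdtemp()
marker = write_marker(demo_dir.encode("utf-8"))
```

Both conversions use surrogate_or_strict because a file path is operational data rather than informational data.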

On the target

You are unlikely to need to_text in many scenarios on the target. Use it only when the API you are dealing with specifically needs text types, as some MySQL libraries do.

@sivel
sivel commented Jun 27, 2019

Sure. I haven't documented anything around encoding yet, and after today, I at least have some useful things to add for when you would want to use it.
