CGuichard/python-str-formatting.md

Python: String formatting

In every language, text formatting is an important feature. Almost every language has a specific way to deal with it. So here's a little tutorial/reminder on how to do string formatting in Python.

Note: I'm using Python 3.8 in the examples.

Let's go

Addition & cast

Simple, intuitive. The first thing people reach for is the + operator. When coming from Java, that's understandable.
>>> a = "Hello"
>>> b = "world!"
>>> a + " " + b
'Hello world!'
It works fine between str, but what about str and int types?
>>> "I'm " + 23 + " years old"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate str (not "int") to str
We got a TypeError. Unlike Java, Python does not convert the int to a str automatically, so we need to do it ourselves:
>>> "I'm " + str(23) + " years old"
"I'm 23 years old"
With variables, you end up calling str(...) a lot.
This "solution", if we can call it that, gets ugly as soon as you do heavy formatting. You have to close the quotes and add a + around every value, so the expression takes a lot of space and quickly becomes difficult to read.
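To make the readability problem concrete, here is a small sketch with three values of different types. The names and values (name, age, height) are made up for the illustration.

```python
# Hypothetical values, chosen only for illustration.
name = "Alice"
age = 23
height = 1.68

# Every non-str value needs an explicit str() call, and every literal
# fragment needs its own pair of quotes with a + on each side.
sentence = name + " is " + str(age) + " years old and " + str(height) + " m tall"
print(sentence)
```

With only three values the line is already hard to scan, which is the motivation for the other methods.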
Modulo operator formatting (%-formatting)

This method is quite old-fashioned. It uses printf-style conversion specifiers like the ones found in C and C++, as well as in a lot of other languages.
>>> "Hello %s" % ("Batman")
'Hello Batman'
Or
>>> "I'm %d years old" % (23)
"I'm 23 years old"
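As we can see, %s is used for str values and %d for int values. Note that the parentheses around a single value are optional: ("Batman") is just a str, not a tuple. A tuple is only needed when there are several values to insert.
The particularity of this method is that you have to know the conversion type symbol, which is both a strength and a weakness.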
>>> "%.2f" % (2.55662)
'2.56'
With the right knowledge, you can format your strings pretty effectively. I have to admit I lied a little just above: I said that you need to know which type symbol to use, but %s works for pretty much everything, because it converts the value with str().
>>> "%s" % (2.55662)
'2.55662'
Although this works, it's better to use the real type symbol to indicate the type of the value. It also gives you access to more advanced features, like %.2f in my example, which displays a float with 2 digits after the decimal point.
You can also store it inside a variable to create template-like string variables:
>>> greeting = "Hello %s!"
>>> greeting % "world"
'Hello world!'
>>> greeting % "you"
'Hello you!'
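The examples so far insert a single value. As a rough sketch of the multi-value cases, the snippet below uses a tuple, a dict with named placeholders, and shows the classic pitfall when the value to format is itself a tuple. The sample values (such as the age 35) are invented for the illustration.

```python
# Several values must be passed as a tuple, in placeholder order.
intro = "%s is %d years old" % ("Batman", 35)

# Named placeholders take a dict (mapping) instead of a tuple.
named = "%(name)s is %(age)d years old" % {"name": "Batman", "age": 35}

# Pitfall: a tuple on the right-hand side is always unpacked as the list of
# arguments, so formatting a tuple *value* requires wrapping it in a 1-tuple.
point = (1, 2)
wrapped = "point = %s" % (point,)

try:
    "point = %s" % point  # two arguments for a single placeholder
except TypeError as exc:
    error = str(exc)

print(intro)
print(named)
print(wrapped)
print(error)
```

The trailing comma in (point,) is what makes it a tuple; parentheses alone do not.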
Format function

The str.format method, introduced in Python 2.6 and 3.0 (see PEP 3101), is a widely used way to format a string.
>>> "Hello {}, I am {} years old!".format("you", 23)
'Hello you, I am 23 years old!'
As you can see, it uses a curly-brace syntax {} to mark the positions of your values in the string. You must pass at least as many arguments to format as there are placeholders {}: a missing argument raises an IndexError, while extra arguments are silently ignored.
By default, the placeholders are numbered in increasing order from 0, like array indexes, but you can change the order by specifying the index explicitly.
>>> "{0} {1}".format("Hello", "you")
'Hello you'
>>> "{1} {0}".format("Hello", "you")
'you Hello'
You can still specify a format type, after a colon. It can also be combined with an explicit index.
>>> "{:.2f}".format(2.55662)
'2.56'
>>> "{0:.2f}".format(2.55662)
'2.56'
For better readability, you can also name the placeholders. Format types still work with named placeholders.
>>> "{value}".format(value=2.55662)
'2.55662'
>>> "{value:.2f}".format(value=2.55662)
'2.56'
Just like for the modulo operator, this method can be used to create template-like strings.
>>> greeting = "Hello {name}"
>>> greeting.format(name="Clément")
'Hello Clément'
>>> greeting.format(name="Thang")
'Hello Thang'
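When the values for a template already live in a dict, they don't have to be spelled out one keyword at a time. Here is a minimal sketch of two ways to do it; the template text and the data dict are invented for the example.

```python
template = "Hello {name}, you have {count} new messages"

# Hypothetical data, e.g. loaded from a config file or a database row.
data = {"name": "Clément", "count": 3}

# Option 1: unpack the dict into keyword arguments.
unpacked = template.format(**data)

# Option 2: pass the mapping directly with str.format_map.
mapped = template.format_map(data)

print(unpacked)
print(mapped)
```

Both calls produce the same string; format_map simply avoids building the intermediate keyword arguments.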
F-String

This format is quite recent, and really powerful because it allows literal string interpolation directly in the string. "f-string" is short for "formatted string literal". They have existed since Python 3.6, and you can read all about them in PEP 498. They are evaluated at runtime. Here's a short example:
>>> name = "Clément"
>>> f"Hello {name}"
'Hello Clément'
The f-string provides a way to embed an expression inside a string literal. And by expression I mean that it can interpolate more than a variable.
>>> age = 24
>>> incr = 5
>>> f"In {incr} years, I will be {age+incr} years old"
'In 5 years, I will be 29 years old'
Just like with the previous solutions, you can specify format types.
>>> value = 2.55662
>>> f"{value:.2f}"
'2.56'
>>> import datetime
>>> today = datetime.datetime.today()
>>> f"{today:%B %d, %Y}"
'June 26, 2022'
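The part after the colon can do more than pick a type. As a small sketch, here are width, alignment, fill character, thousands separator, and a nested field. The variable names and values are assumptions made for the illustration.

```python
price = 1234.5      # hypothetical values
label = "total"
width = 10

# Right-align in a 10-character field, thousands separator, 2 decimals.
right = f"{price:>10,.2f}"

# Left-align the text and pad the rest of the field with dots.
left = f"{label:.<10}"

# Parts of the format spec can themselves be expressions in nested braces.
nested = f"{price:{width}.1f}"

print(repr(right))   # '  1,234.50'
print(repr(left))    # 'total.....'
print(repr(nested))  # '    1234.5'
```

The same specs work unchanged with str.format, for example "{:>10,.2f}".format(price).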
By default, an f-string formats objects with their __format__ method, which falls back to __str__ unless the class overrides it, but you can make sure __repr__ is used with the conversion flag !r.
>>> class A:
...     def __str__(self):
...         return "A.__str__"
...     def __repr__(self):
...         return "A.__repr__"
...
>>> f"{A()}"
'A.__str__'
>>> f"{A()!r}"
'A.__repr__'
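In terms of speed, f-strings are generally faster than both %-formatting and str.format, although the exact difference depends on the Python version and on the expression being formatted.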
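If you want to check the speed claim on your own machine, a rough measurement with the standard timeit module could look like the sketch below. The sample values and the number of repetitions are arbitrary choices, and the timings will differ from one machine and Python version to another.

```python
import timeit

name, age = "Clément", 24  # hypothetical sample values

candidates = {
    "%-formatting": lambda: "%s is %d" % (name, age),
    "str.format": lambda: "{} is {}".format(name, age),
    "f-string": lambda: f"{name} is {age}",
}

# All three must produce the same text for the comparison to be fair.
results = {label: func() for label, func in candidates.items()}

# Time 100,000 calls of each candidate.
timings = {
    label: timeit.timeit(func, number=100_000)
    for label, func in candidates.items()
}

for label, seconds in timings.items():
    print(f"{label:<14}{seconds:.4f} s")
```

The lambda call adds the same overhead to every candidate, so only the relative order of the numbers is meaningful.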
Since Python 3.8 you can also use the = specifier, which prints the text of the expression followed by its value (shown with repr() by default). It is extremely useful for debugging.
>>> a = 3
>>> f"{a=}"
'a=3'
>>> b = 5
>>> f"{a+b=}"
'a+b=8'
The weakness of the f-string is that it can't be stored and reused as a "template", unlike %-formatting and str.format strings, precisely because it is evaluated in place, as soon as the line runs.
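Here is a short sketch of what "evaluated in place" means in practice, with two possible workarounds. The greet function and the template variable are hypothetical names used only for this example.

```python
# An f-string is evaluated immediately, so `name` must already exist here.
name = "Clément"
greeting = f"Hello {name}"

name = "Thang"
# `greeting` is a plain str by now; rebinding `name` does not change it.
assert greeting == "Hello Clément"


# Workaround 1: wrap the f-string in a function so it is evaluated per call.
def greet(name):
    return f"Hello {name}"


# Workaround 2: keep a str.format template and fill it in later.
template = "Hello {name}"

print(greet("Thang"))
print(template.format(name="Thang"))
```

Which workaround fits best depends on where the template text comes from: a function suits text written in the code, while str.format suits text loaded at runtime.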
Bonus

Classes

Small point on __repr__ and __str__. These methods are both used to get the string representation of an object, but their goals are different:

__repr__: unambiguous and verbose way of representing an object, used for debugging and development.
__str__: used for creating output for end user.


Note: print and the str() built-in use __str__, while repr() uses __repr__. If a class does not define __str__, __repr__ is used instead.
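As an illustration of the two goals, here is a minimal sketch with a made-up Point class that defines both methods, and an OnlyRepr class that defines only __repr__. Both classes are hypothetical.

```python
class Point:
    """Hypothetical class used only to illustrate the two methods."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Unambiguous form, aimed at developers.
        return f"Point(x={self.x!r}, y={self.y!r})"

    def __str__(self):
        # Friendly form, aimed at end users.
        return f"({self.x}, {self.y})"


class OnlyRepr:
    def __repr__(self):
        return "OnlyRepr()"


p = Point(1, 2)
shown = str(p)              # uses __str__
debug = repr(p)             # uses __repr__
in_list = str([p])          # containers display their items with repr()
fallback = str(OnlyRepr())  # no __str__ defined, so __repr__ is used

print(shown, debug, in_list, fallback, sep="\n")
```

A common rule of thumb is to always write __repr__ first, since it also covers str() until a dedicated __str__ is needed.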

Formatting types


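| Symbol | Description |
| --- | --- |
| `s` | String conversion via str() prior to formatting |
| `c` | Single character |
| `i` | Signed decimal integer |
| `d` | Signed decimal integer (base 10) |
| `u` | Unsigned decimal integer (obsolete, identical to `d`) |
| `o` | Octal integer |
| `f` | Fixed-point display of a floating point number |
| `b` | Binary integer |
| `x` | Hexadecimal with lowercase letters after 9 |
| `X` | Hexadecimal with uppercase letters after 9 |
| `e` | Exponent notation |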

For %-formatting: %s, %c, etc. All the symbols above are supported except b.
For str.format and f-strings: {:s}, {:c}, etc. All the symbols above are supported except i and u.
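To see the symbols in action, here is a quick sketch that formats the same values with both styles. The numbers 255, 65, and 1234.5678 are arbitrary sample values.

```python
n = 255
x = 1234.5678

# Symbols shared by %-formatting and by str.format / f-strings.
percent_style = ["%d" % n, "%o" % n, "%x" % n, "%X" % n, "%e" % x, "%c" % 65]
format_style = [f"{n:d}", f"{n:o}", f"{n:x}", f"{n:X}", f"{x:e}", f"{65:c}"]

# Binary is only available with str.format and f-strings.
binary = f"{n:b}"

# i and u are only accepted by %-formatting, where they behave like d.
legacy = "%i %u" % (n, n)

try:
    "{:i}".format(n)
except ValueError:
    i_in_format_fails = True

print(percent_style)
print(binary)
print(legacy)
```

Both lists contain the same six strings, which shows that the shared symbols behave the same way in the two styles.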

Escape sequences


| Sequence | Description |
| --- | --- |
| `\0` | Null character |
| `\a` | Makes a sound like a bell |
| `\b` | Backspace |
| `\n` | Breaks the string into a new line |
| `\t` | Adds a horizontal tab |
| `\v` | Adds a vertical tab |
| `\\` | Prints a backslash |
| `\'` | Prints a single quote |
| `\"` | Prints a double quote |
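A few of these sequences in a short sketch. The strings are made-up examples, and the raw string prefix r is shown only as a related convenience for text containing many backslashes.

```python
# \t and \n each count as a single character in the resulting string.
tabbed = "name\tage\nClément\t24"
newline_length = len("\n")

# A literal backslash must be doubled...
path_escaped = "C:\\temp\\new"
# ...unless the string is a raw string, where backslashes are kept as-is.
path_raw = r"C:\temp\new"

# Escaping the quote character that delimits the string.
quotes = 'It\'s "quoted"'

print(tabbed)
print(path_escaped)
print(quotes)
```

Without the doubled backslashes or the r prefix, the \t and \n in the path would be read as a tab and a newline.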