@johnberroa
Created July 26, 2018 20:00
Remove umlauts from text in Python
def remove_umlaut(string):
    """
    Removes umlauts from strings and replaces them with the letter+e convention
    :param string: string to remove umlauts from
    :return: unumlauted string
    """
    u = 'ü'.encode()
    U = 'Ü'.encode()
    a = 'ä'.encode()
    A = 'Ä'.encode()
    o = 'ö'.encode()
    O = 'Ö'.encode()
    ss = 'ß'.encode()
    string = string.encode()
    string = string.replace(u, b'ue')
    string = string.replace(U, b'Ue')
    string = string.replace(a, b'ae')
    string = string.replace(A, b'Ae')
    string = string.replace(o, b'oe')
    string = string.replace(O, b'Oe')
    string = string.replace(ss, b'ss')
    string = string.decode('utf-8')
    return string
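
In Python 3, `str.replace` works on Unicode strings directly, so the encode/decode round trip above is optional. A minimal sketch of the same replacements without bytes, assuming Python 3 (the names `REPLACEMENTS` and `remove_umlaut_str` are made up for illustration and are not part of the gist):

```python
# Hypothetical str-only variant of remove_umlaut (not the gist author's code).
REPLACEMENTS = (
    ('ü', 'ue'), ('Ü', 'Ue'),
    ('ä', 'ae'), ('Ä', 'Ae'),
    ('ö', 'oe'), ('Ö', 'Oe'),
    ('ß', 'ss'),
)


def remove_umlaut_str(string):
    """Apply the same letter+e replacements directly on a str."""
    for umlaut, replacement in REPLACEMENTS:
        string = string.replace(umlaut, replacement)
    return string


print(remove_umlaut_str("Äpfel, Öl und süße Grüße"))  # Aepfel, Oel und suesse Gruesse
```
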
@johnberroa
Author

You're welcome! There may be an even simpler way, but this is enough to get started.

@TrevisGordan

TrevisGordan commented May 20, 2022

hey guys,
this will do ;)

def replace_umlauts(text: str) -> str:
    """replace special German umlauts (vowel mutations) from text. 
    ä -> ae, Ä -> Ae...
    ü -> ue, Ü -> Ue...
    ö -> oe, Ö -> Oe...
    ß -> ss
    """
    vowel_char_map = {
        ord('ä'): 'ae', ord('ü'): 'ue', ord('ö'): 'oe', ord('ß'): 'ss',
        ord('Ä'): 'Ae', ord('Ü'): 'Ue', ord('Ö'): 'Oe'
    }
    return text.translate(vowel_char_map)

EDIT: Added uppercase umlauts (Ä -> Ae, ...).
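
For anyone new to `str.translate`: it expects a table keyed by Unicode code points, which is why the snippet above wraps each character in `ord()`. `str.maketrans` can build that table from a dict keyed by single characters instead. A minimal sketch of that variant (the names `UMLAUT_TABLE` and `replace_umlauts_maketrans` are hypothetical, chosen only for this example):

```python
# Hypothetical names; a sketch of the same idea using str.maketrans.
# maketrans converts the single-character keys to code points for translate().
UMLAUT_TABLE = str.maketrans({
    'ä': 'ae', 'ö': 'oe', 'ü': 'ue',
    'Ä': 'Ae', 'Ö': 'Oe', 'Ü': 'Ue',
    'ß': 'ss',
})


def replace_umlauts_maketrans(text: str) -> str:
    """Replace German umlauts and ß using a prebuilt translation table."""
    return text.translate(UMLAUT_TABLE)


print(replace_umlauts_maketrans("Übergrößenträger"))  # Uebergroessentraeger
```

Building the table once at module level avoids recreating the dict on every call, which may matter when processing many strings.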

@johnberroa
Author

A lot more compact! Great! Learned a new function (translate) 😄

@adinator97

hey guys, this will do ;)

def replace_umlauts(text:str) -> str:
    """replace special German umlauts (vowel mutations) from text. 
    ä -> ae...
    ü -> ue 
    """
    vowel_char_map = {ord('ä'):'ae', ord('ü'):'ue', ord('ö'):'oe', ord('ß'):'ss'}
    return text.translate(vowel_char_map)

The uppercase umlauts are missing.
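
A quick check with the lowercase-only map from the quoted snippet shows the gap (the sample sentence and the names `lowercase_only` and `result` are just for illustration):

```python
# Same map as in the quoted snippet: no entries for Ä, Ö, Ü.
lowercase_only = {ord('ä'): 'ae', ord('ü'): 'ue', ord('ö'): 'oe', ord('ß'): 'ss'}

# Characters without an entry in the map pass through translate() unchanged.
result = "Ärger über Öl".translate(lowercase_only)
print(result)  # Ärger ueber Öl
```
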

@TrevisGordan

@ad1c1337, you are absolutely right. I modified the original snippet to include uppercase Umlauts. Dankeschoen! ;)
