Python string multireplacement
import re


def multireplace(string, replacements, ignore_case=False):
    """
    Given a string and a replacement map, return the replaced string.

    :param str string: string to execute replacements on
    :param dict replacements: replacement dictionary {value to find: value to replace}
    :param bool ignore_case: whether the match should be case insensitive
    :rtype: str
    """
    # Nothing to replace: an empty pattern would match the empty string at every position
    if not replacements:
        return string

    # If case insensitive, we need to normalize the old string so that later a replacement
    # can be found. For instance with {"HEY": "lol"} we should match and find a replacement for "hey",
    # "HEY", "hEy", etc.
    if ignore_case:
        def normalize_old(s):
            return s.lower()

        re_mode = re.IGNORECASE
    else:
        def normalize_old(s):
            return s

        re_mode = 0

    replacements = {normalize_old(key): val for key, val in replacements.items()}

    # Place longer ones first to keep shorter substrings from matching where the longer ones should take place.
    # For instance, given the replacements {'ab': 'AB', 'abc': 'ABC'} against the string 'hey abc', it should produce
    # 'hey ABC' and not 'hey ABc'
    rep_sorted = sorted(replacements, key=len, reverse=True)
    rep_escaped = map(re.escape, rep_sorted)

    # Create a big OR regex that matches any of the substrings to replace
    pattern = re.compile("|".join(rep_escaped), re_mode)

    # For each match, look up the new string in the replacements, using the normalized matched text as the key
    return pattern.sub(lambda match: replacements[normalize_old(match.group(0))], string)
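To see why a single regex pass is used instead of calling `str.replace` once per pair, here is a minimal sketch comparing the two. The helper names `chained_replace` and `one_pass_replace` are hypothetical, and `one_pass_replace` is only a condensed, case-sensitive version of the function above.

```python
import re


def chained_replace(string, replacements):
    # Naive approach: one str.replace call per pair, in dict order.
    # Text produced by an earlier replacement can be replaced again later.
    for old, new in replacements.items():
        string = string.replace(old, new)
    return string


def one_pass_replace(string, replacements):
    # Condensed, case-sensitive version of the single-regex approach.
    keys = sorted(replacements, key=len, reverse=True)
    pattern = re.compile("|".join(map(re.escape, keys)))
    return pattern.sub(lambda m: replacements[m.group(0)], string)


swap = {"cat": "dog", "dog": "cat"}
text = "cat chases dog"

print(chained_replace(text, swap))   # cat chases cat
print(one_pass_replace(text, swap))  # dog chases cat
```

In the chained version the first replacement turns "cat" into "dog", and the second one turns both "dog"s back into "cat". The regex version never rescans text it has already replaced, so the swap works as intended.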
@derdav3

commented Apr 6, 2017

how would you modify this to have a case-insensitive match?

@HatScripts

commented Apr 27, 2017

@derdav3

def multi_replace(string, replacements, ignore_case=False):
    """
    Given a string and a dict, replaces occurrences of the dict keys found in the 
    string, with their corresponding values. The replacements will occur in "one pass", 
    i.e. there should be no clashes.
    :param str string: string to perform replacements on
    :param dict replacements: replacement dictionary {str_to_find: str_to_replace_with}
    :param bool ignore_case: whether to ignore case when looking for matches
    :rtype: str the replaced string
    """
    rep_sorted = sorted(replacements, key=lambda s: len(s[0]), reverse=True)
    rep_escaped = [re.escape(replacement) for replacement in rep_sorted]
    pattern = re.compile("|".join(rep_escaped), re.I if ignore_case else 0)
    return pattern.sub(lambda match: replacements[match.group(0)], string)
@bgusach

Owner Author

commented May 10, 2017

@derdav3, as @HatScripts suggested, just pass the ignore-case flag to re.compile.

@HatScripts, I haven't tested your proposal, but... aren't you sorting the strings by the length of the first character (i.e. always 1?)
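A quick way to check the sort-key concern is to compare both keys on the `{'ab': 'AB', 'abc': 'ABC'}` example from the gist. This is only a sketch; the `substitute` helper is hypothetical and exists just to run the pattern with a given key order.

```python
import re

replacements = {"ab": "AB", "abc": "ABC"}

# len(s[0]) measures only the first character, so every key sorts as 1
# and the original order is kept (Python's sort is stable).
by_first_char = sorted(replacements, key=lambda s: len(s[0]), reverse=True)

# len(s) measures the whole key, so longer keys really come first.
by_length = sorted(replacements, key=len, reverse=True)


def substitute(keys, string):
    # Build the alternation in the given key order and replace in one pass.
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: replacements[m.group(0)], string)


print(by_first_char)                         # ['ab', 'abc']
print(substitute(by_first_char, "hey abc"))  # hey ABc
print(by_length)                             # ['abc', 'ab']
print(substitute(by_length, "hey abc"))      # hey ABC
```

Because regex alternation tries the alternatives left to right, the shorter key `ab` wins whenever it is listed before `abc`.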

@sidscry

commented Feb 6, 2018

@HatScripts This case fails:

string = "original text is here"
replacements = {
    "original": "text",
    "text": "fake",
    "Is hEre": "was there",
}
ignore_case = True
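For reference, a minimal reproduction of this failure. The function body is copied from the April 27, 2017 comment and renamed `multi_replace_as_posted` (a name chosen here for illustration only).

```python
import re


def multi_replace_as_posted(string, replacements, ignore_case=False):
    # Body copied unchanged from the comment above
    rep_sorted = sorted(replacements, key=lambda s: len(s[0]), reverse=True)
    rep_escaped = [re.escape(replacement) for replacement in rep_sorted]
    pattern = re.compile("|".join(rep_escaped), re.I if ignore_case else 0)
    return pattern.sub(lambda match: replacements[match.group(0)], string)


string = "original text is here"
replacements = {
    "original": "text",
    "text": "fake",
    "Is hEre": "was there",
}

missing_key = None
try:
    multi_replace_as_posted(string, replacements, ignore_case=True)
except KeyError as exc:
    # The regex matched "is here" case-insensitively, but the dict
    # only contains the key "Is hEre", so the lookup fails.
    missing_key = exc.args[0]

print(missing_key)  # is here
```

The pattern matches regardless of case, but the dictionary lookup is still case sensitive, so the matched text has to be normalized the same way as the keys before it is looked up.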

@thorfi

commented Jul 11, 2018

@sidscry @HatScripts @bgusach:
Bugfixes for the above, replace: rep_sorted = ... with:

    if ignore_case:
        replacements = dict((pair[0].lower(), pair[1]) for pair in sorted(replacements.items()))
    rep_sorted = sorted(replacements, key=lambda s: (len(s), s), reverse=True)
    ...
    return pattern.sub(lambda match: replacements[match.group(0).lower() if ignore_case else match.group(0)], string)
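Put together, the patched function might look roughly like this. It is a sketch assembled from the fragments above, assuming Python 3 (so `items()` instead of `iteritems()`); the lines elided by `...` in the comment are filled in from the earlier `multi_replace` comment, not from this one.

```python
import re


def multi_replace(string, replacements, ignore_case=False):
    if ignore_case:
        # Lowercase the keys so they can be looked up by the lowercased match.
        # Sorting first makes the winner deterministic if two keys differ only by case.
        replacements = dict((key.lower(), val) for key, val in sorted(replacements.items()))
    # Longest keys first; ties are broken alphabetically for a stable pattern
    rep_sorted = sorted(replacements, key=lambda s: (len(s), s), reverse=True)
    rep_escaped = [re.escape(replacement) for replacement in rep_sorted]
    pattern = re.compile("|".join(rep_escaped), re.I if ignore_case else 0)
    return pattern.sub(
        lambda match: replacements[match.group(0).lower() if ignore_case else match.group(0)],
        string,
    )


replacements = {
    "original": "text",
    "text": "fake",
    "Is hEre": "was there",
}

print(multi_replace("original text is here", replacements, ignore_case=True))
# text fake was there
print(multi_replace("original text is here", replacements, ignore_case=False))
# text fake is here
```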
@Gutsu7

commented Sep 2, 2019

How is it possible to use a pandas DataFrame for replacement instead of using a dictionary?

@bgusach

Owner Author

commented Sep 3, 2019

> How is it possible to use a pandas DataFrame for replacement instead of using a dictionary?

I'm not familiar with pandas DataFrames, but I guess you can convert that data structure into a dictionary in a meaningful way and then use this function.
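One possible shape for that conversion, sketched without pandas so it runs anywhere: pair up a "find" column with a "replace" column. The helper name `replacements_from_columns` and the sample data are made up for illustration.

```python
def replacements_from_columns(find_values, replace_values):
    """Pair up two equal-length columns into a {find: replace} dictionary."""
    return dict(zip(find_values, replace_values))


# Plain lists stand in for two DataFrame columns here.
find_column = ["original", "text"]
replace_column = ["text", "fake"]

replacements = replacements_from_columns(find_column, replace_column)
print(replacements)  # {'original': 'text', 'text': 'fake'}
```

With pandas, assuming a DataFrame `df` with hypothetical columns named `find` and `replace`, `dict(zip(df["find"], df["replace"]))` should build the same mapping, which can then be passed to the function in this gist.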

@bgusach

Owner Author

commented Oct 2, 2019

Hi @thorfi and @HatScripts, I updated the gist with a solution inspired by yours.
