@cereniyim
Created April 29, 2020 11:51
Search for a given keyword in a feature
import re


def extract_features_from_description(df,
                                      column_name,
                                      new_feature_name,
                                      extract_words):
    # Flag rows whose column_name text contains any word
    # from extract_words as a whole word (case-sensitive).
    # Adds the result to df as a 0/1 uint8 column.
    # ASSUMPTION: There are no NA values
    # in the description feature
    check_regex = (r'\b(?:{})\b'
                   .format('|'
                           .join(
                               map(re.escape,
                                   extract_words))))
    df[new_feature_name] = (df[column_name]
                            .str
                            .contains(check_regex,
                                      regex=True)
                            .astype('uint8'))
    return df
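
The function above assumes the text column has no missing values. If that assumption does not hold, `.str.contains` leaves missing rows as NA and the `uint8` cast can fail. The sketch below is one possible variant, not part of the original gist: it passes `na=False` so that missing descriptions count as "no match". The function name `extract_features_allowing_na`, the `listings` frame, and its contents are hypothetical and made up for illustration.

```python
import re

import pandas as pd


def extract_features_allowing_na(df, column_name, new_feature_name, extract_words):
    # Hypothetical variant: same whole-word regex as the original function,
    # but na=False treats missing text as "keyword not found".
    check_regex = r'\b(?:{})\b'.format('|'.join(map(re.escape, extract_words)))
    df[new_feature_name] = (df[column_name]
                            .str
                            .contains(check_regex, regex=True, na=False)
                            .astype('uint8'))
    return df


# Made-up sample data, for illustration only
listings = pd.DataFrame({
    'description': ['bright flat with balcony',
                    'garden and garage included',
                    None,
                    'balconyless studio'],
})

listings = extract_features_allowing_na(listings,
                                        'description',
                                        'has_outdoor',
                                        ['balcony', 'garden'])
print(listings)
```

In this sample, the first two rows are flagged with 1. The missing row and 'balconyless studio' get 0, the latter because `\b` only matches the keyword as a whole word. Matching is case-sensitive, so lowercasing the column first may be worth considering if the text has mixed case.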