Created January 9, 2024 02:57
Function to recode ECOSUBCD to DIVCD
import pandas as pd


def recode_divcd(df: pd.DataFrame) -> pd.DataFrame:
    """
    Convert the ECOSUBCD column to DIVCD

    Parameters
    ----------
    df : pd.DataFrame
        The input DataFrame; must contain a string column named ECOSUBCD

    Returns
    -------
    pd.DataFrame
        The input DataFrame, modified in place, with a DIVCD column derived from ECOSUBCD
    """
    # Extract the first run of digits and convert to integer
    # (raw string avoids an invalid-escape warning; expand=False returns a Series)
    df["DIVCD"] = df["ECOSUBCD"].str.extract(r"(\d+)", expand=False).astype(int)
    # Round down to the nearest 10
    df["DIVCD"] = (df["DIVCD"] // 10) * 10
    # Add 'M' prefix if originally present; prefixed values become strings,
    # while unprefixed values stay integers
    df["DIVCD"] = df.apply(
        lambda row: "M" + str(row["DIVCD"])
        if row["ECOSUBCD"].startswith("M")
        else row["DIVCD"],
        axis=1,
    )
    return df
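
The function above applies its prefix step row by row and leaves DIVCD with mixed integer and string values. The following is a minimal sketch of a vectorized alternative, not part of the original gist. The name `recode_divcd_str`, the choice to always return strings, the copy of the input, and the sample ECOSUBCD values are all assumptions made for illustration.

```python
import pandas as pd


def recode_divcd_str(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical variant of recode_divcd that always yields string codes."""
    out = df.copy()  # leave the caller's DataFrame untouched
    # First run of digits in each code, e.g. "M331Aa" -> 331
    digits = out["ECOSUBCD"].str.extract(r"(\d+)", expand=False).astype(int)
    # Round down to the nearest 10, e.g. 331 -> "330"
    division = ((digits // 10) * 10).astype(str)
    # Re-attach the "M" prefix where the original code had one
    has_m = out["ECOSUBCD"].str.startswith("M")
    out["DIVCD"] = division.where(~has_m, "M" + division)
    return out


# Made-up sample codes, for illustration only
sample = pd.DataFrame({"ECOSUBCD": ["M331Aa", "231Ab", "M242Bc"]})
result = recode_divcd_str(sample)
print(result["DIVCD"].tolist())  # ['M330', '230', 'M240']
```

Keeping every DIVCD value a string may make later joins and group-bys more predictable, at the cost of differing from the original function's output for unprefixed codes. This sketch also assumes ECOSUBCD has no missing values and that every code contains at least one digit.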