@Archipelogic
Archipelogic / field_comparison.py
Last active July 1, 2025 19:59
Code to compare the same fields from two different sources in the same dataframe
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from itertools import combinations
import re


class MultiSourceFieldComparator:
    def __init__(self, df, fields_to_compare, config=None):
@Archipelogic
Archipelogic / address_standardizer.py
Last active July 1, 2025 18:57
US address standardizer for pandas DataFrame merging - handles various formats and normalizes ZIP codes to 5 digits
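The ZIP normalization mentioned in the description can be sketched as follows. This is an illustration only, not the gist's own code, which is cut off in this preview. The helper name `normalize_zip` is hypothetical, and padding short values with leading zeros is an assumption about how ZIPs that were read as numbers should be repaired.

```python
import re
from typing import Optional


def normalize_zip(value: object) -> Optional[str]:
    """Reduce a US ZIP code to its 5-digit form, or return None if unusable.

    Hypothetical helper for illustration; not taken from the gist.
    """
    if value is None:
        return None
    text = str(value).strip()
    # ZIPs read as floats (e.g. 2134.0) carry a trailing ".0".
    if re.fullmatch(r"\d+\.0", text):
        text = text[:-2]
    # Drop a ZIP+4 suffix written with a hyphen or a space.
    text = re.sub(r"[-\s]\d{4}$", "", text)
    if not text.isdigit():
        return None
    # An unseparated 9-digit value is treated as ZIP+4.
    if len(text) == 9:
        text = text[:5]
    if len(text) > 5:
        return None
    # Assumption: shorter values lost their leading zeros.
    return text.zfill(5)
```

Applied to a DataFrame column, this would look like `df["zip"].map(normalize_zip)`, where `df` and the column name `"zip"` are placeholders. Normalizing both sides before a merge avoids mismatches such as `"02134"` versus `2134`.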
"""
US Address Standardizer for DataFrame Merging (Enhanced Version)

Standardizes address columns in pandas DataFrames to enable consistent merging.
Handles various address formats, reduces ZIP codes to 5 digits, and includes fuzzy matching.

Requirements:
    pip install pandas usaddress fuzzywuzzy python-levenshtein
"""
import pandas as pd
import usaddress