.PHONY: all
all: push
# configuration setup
PROJECT_FILE_NAME=project.yml
CONFIG := python3 -c "import yaml; import json; from pathlib import Path; print(json.dumps(yaml.safe_load(Path(\"$(PROJECT_FILE_NAME)\").read_text())))"
OS := $(shell python3 -c "import platform; print(platform.system())")
# defaults
import pandas as pd
def sample_dataframe(df, date_column, N):
    """
    Samples a pandas DataFrame to achieve a balanced representation across different date groups.
    This function ensures that smaller date groups (those with counts less than an average size)
    are fully represented in the sample. Larger date groups are sampled uniformly to contribute
    towards a total sample size of N. If the initial sampling process results in a total sample size
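The preview of sample_dataframe is cut off before its body, so the following is only a minimal sketch of the idea its docstring describes, not the author's implementation. It assumes that "average size" means N divided by the number of distinct dates, and that large groups split the remaining budget evenly. The name balanced_date_sample and the seed parameter are hypothetical.

```python
import pandas as pd


def balanced_date_sample(df, date_column, n, seed=0):
    """Hypothetical sketch: keep small date groups whole, subsample large ones."""
    counts = df[date_column].value_counts()
    if counts.empty:
        return df.iloc[0:0]

    # Assumption: the "average size" is the per-group quota n / number_of_groups.
    quota = n / len(counts)
    small_dates = counts[counts < quota].index
    large_dates = counts[counts >= quota].index

    # Small groups are kept in full.
    parts = [df[df[date_column].isin(small_dates)]]

    # Assumption: large groups share what is left of the budget evenly.
    remaining = n - len(parts[0])
    if len(large_dates) > 0:
        per_group = max(remaining // len(large_dates), 0)
        for date in large_dates:
            group = df[df[date_column] == date]
            take = min(per_group, len(group))
            parts.append(group.sample(n=take, random_state=seed))

    return pd.concat(parts)
```

Because of the integer division, this sketch can return fewer than n rows. The original docstring appears to go on to address the resulting total, but that part is truncated in the preview and is not reconstructed here.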