@Juanmaria-rr
Juanmaria-rr / phasing_propagation.rg
Created October 31, 2025 08:43
phase propagation approaches
Approach 1. Assign the phase of each phased position to the positions that follow it, until the next phased position.
import pandas as pd
import numpy as np


def assign_phase_untilNext(df):
    df = df.sort_values(["CHROM", "POS"]).reset_index(drop=True)
    # Keep only phased SNPs as anchors
    phased_mask = df["GT"].isin(["0|1", "1|0"])
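
The function above is cut off after the anchor mask is built. As a rough illustration of the "assign until next" idea, and not the author's actual implementation, the sketch below forward-fills the genotype of the most recent phased anchor within each chromosome. The function name `propagate_phase_forward`, the output column `PHASE_ANCHOR`, and the example data are all hypothetical; only the `CHROM`, `POS`, and `GT` columns and the `0|1` / `1|0` anchor definition come from the original snippet.

```python
import pandas as pd


def propagate_phase_forward(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical sketch: carry each phased genotype forward to the
    following positions on the same chromosome, until the next anchor."""
    out = df.sort_values(["CHROM", "POS"]).reset_index(drop=True)

    # Phased heterozygous calls act as anchors (same mask as the original)
    phased_mask = out["GT"].isin(["0|1", "1|0"])

    # Keep GT only at anchor rows; every other row becomes missing
    anchors = out["GT"].where(phased_mask)

    # Forward-fill per chromosome so a phase never crosses chromosomes.
    # Rows before the first anchor on a chromosome stay missing.
    out["PHASE_ANCHOR"] = anchors.groupby(out["CHROM"]).ffill()
    return out


# Made-up example data, deliberately unsorted
example = pd.DataFrame(
    {
        "CHROM": ["chr1", "chr1", "chr1", "chr1", "chr2", "chr2"],
        "POS": [300, 100, 200, 400, 50, 60],
        "GT": ["0/1", "0/1", "0|1", "1|0", "0/1", "1|0"],
    }
)
result = propagate_phase_forward(example)
print(result)
```

In this sketch, positions upstream of the first anchor on a chromosome are left unassigned rather than guessed. Whether they should instead inherit the phase of the next anchor is a design choice the original snippet does not settle.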
### load datasets
target_path = "gs://open-targets-data-releases/24.03/output/etl/parquet/targets/"
target = spark.read.parquet(target_path)
go = spark.read.parquet("gs://open-targets-data-releases/24.03/output/etl/parquet/go")
diseases_all = spark.read.parquet(
    "gs://open-targets-data-releases/24.03/output/etl/parquet/diseases/"
)
#### Target Facets