A method for dropping highly correlated assets in the universe
# A method for dropping highly correlated assets in the universe
import numpy as np

# Threshold on absolute correlation for flagging highly correlated pairs
CORR_THRESHOLD = 0.5
# data: a pandas DataFrame of closing prices, one column per security.
# First, rank all tickers by momentum (here, the 250-period return).
# You can use other scoring methods here.
# This score decides which ticker of a highly correlated pair is kept and which is dropped.
ticker_orders = data.pct_change(250).iloc[-1].sort_values(ascending=False).index.to_list()
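# Per-period returns, with columns in rank order (best score first).
# fillna(0) already replaces every NaN, so the trailing dropna() was a no-op and is omitted.
ret = data[ticker_orders].pct_change().fillna(0)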
cor_matrix = ret.corr().abs()
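# Upper triangle of the correlation matrix, diagonal excluded (k=1).
# Because columns are in rank order, each column keeps only its correlations with higher-ranked tickers.
ut = cor_matrix.where(np.triu(np.ones(cor_matrix.shape), k=1).astype(bool))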
# to_drop: tickers whose absolute correlation with at least one higher-ranked ticker exceeds the threshold.
# remain_ticker: the tickers that are kept, still in rank order.
to_drop = [column for column in ut.columns if (ut[column] > CORR_THRESHOLD).any()]
remain_ticker = [item for item in ticker_orders if item not in to_drop]
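
To try the steps above without real market data, here is a minimal sketch that wraps them in a function and runs it on synthetic prices. The function name `drop_correlated`, its `threshold` and `lookback` parameters, the tickers `AAA`/`BBB`/`CCC`, and the random data are all assumptions made for illustration; they are not part of the original snippet.

```python
import numpy as np
import pandas as pd

CORR_THRESHOLD = 0.5


def drop_correlated(data, threshold=CORR_THRESHOLD, lookback=250):
    """Hypothetical wrapper around the steps above.

    Returns (remain, to_drop), both as lists of column names.
    """
    # Rank tickers by momentum over `lookback` rows, best first.
    order = (
        data.pct_change(lookback).iloc[-1].sort_values(ascending=False).index.to_list()
    )
    # Per-period returns with columns in rank order.
    ret = data[order].pct_change().fillna(0)
    cor_matrix = ret.corr().abs()
    # Keep only correlations with higher-ranked tickers (strict upper triangle).
    mask = np.triu(np.ones(cor_matrix.shape), k=1).astype(bool)
    ut = cor_matrix.where(mask)
    to_drop = [col for col in ut.columns if (ut[col] > threshold).any()]
    remain = [t for t in order if t not in to_drop]
    return remain, to_drop


# Synthetic data (assumption): BBB is a noisy copy of AAA, CCC is independent.
rng = np.random.default_rng(0)
n = 300
base = rng.normal(0.001, 0.01, n)
returns = pd.DataFrame(
    {
        "AAA": base,
        "BBB": base + rng.normal(0, 0.001, n),
        "CCC": rng.normal(0.0005, 0.01, n),
    }
)
prices = 100 * (1 + returns).cumprod()

remain, dropped = drop_correlated(prices)
print("kept:", remain)
print("dropped:", dropped)
```

In this setup one of `AAA` and `BBB` is dropped, namely whichever has the lower 250-row momentum, while `CCC` is kept. Note that a ticker is dropped when it correlates with any higher-ranked ticker, even one that is itself dropped.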