A method for dropping highly correlated assets in the universe
# A method for dropping highly correlated assets in the universe
import numpy as np

# Absolute correlation above which two assets count as a highly correlated pair
CORR_THRESHOLD = 0.5

# data: a pandas DataFrame of closing prices, one column per ticker.
# First rank all tickers by momentum (return over the last 250 rows), best first.
# Any other scoring method can be used here.
# The ranking decides which ticker of a highly correlated pair is kept and which is dropped.
ticker_orders = data.pct_change(250).iloc[-1].sort_values(ascending=False).index.to_list()
# Per-period returns with columns in ranking order; NaN returns (such as the first row) become 0
ret = data[ticker_orders].pct_change().fillna(0)
cor_matrix = ret.corr().abs()
# Upper triangle of the correlation matrix (diagonal excluded): each column
# holds only its correlations with higher-ranked tickers
ut = cor_matrix.where(np.triu(np.ones(cor_matrix.shape), k=1).astype(bool))
# Drop every ticker that is highly correlated with a higher-ranked ticker; keep the rest
to_drop = [column for column in ut.columns if (ut[column] > CORR_THRESHOLD).any()]
remain_ticker = [item for item in ticker_orders if item not in to_drop]
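
To try the snippet without real market data, the steps can be wrapped in a function and run on synthetic prices. This is only a sketch: the function name `drop_correlated`, its `threshold` and `lookback` parameters, and the tickers `A`, `B`, `C` are made up for illustration and are not part of the original gist. Here `B` is built to move almost exactly like `A`, while `C` is independent.

```python
import numpy as np
import pandas as pd


def drop_correlated(data, threshold=0.5, lookback=250):
    """Hypothetical wrapper around the steps above.

    Returns (remain, dropped): tickers kept, in momentum order, and tickers dropped.
    """
    # Rank tickers by momentum over `lookback` rows, best first
    order = data.pct_change(lookback).iloc[-1].sort_values(ascending=False).index.to_list()
    # Per-period returns, columns in ranking order
    ret = data[order].pct_change().fillna(0)
    corr = ret.corr().abs()
    # Keep only the strict upper triangle so each column sees higher-ranked tickers only
    upper = corr.where(np.triu(np.ones(corr.shape), k=1).astype(bool))
    dropped = [col for col in upper.columns if (upper[col] > threshold).any()]
    remain = [t for t in order if t not in dropped]
    return remain, dropped


# Synthetic data (assumption: 300 rows of made-up daily returns)
rng = np.random.default_rng(0)
n = 300
ret_a = rng.normal(0.0005, 0.01, n)
ret_b = ret_a + rng.normal(0.0, 0.001, n)   # nearly identical to A
ret_c = rng.normal(0.0005, 0.01, n)         # independent of A and B
prices = pd.DataFrame({
    "A": 100 * np.cumprod(1 + ret_a),
    "B": 100 * np.cumprod(1 + ret_b),
    "C": 100 * np.cumprod(1 + ret_c),
})

remain, dropped = drop_correlated(prices, threshold=0.5, lookback=250)
print("remain:", remain)
print("dropped:", dropped)
```

Of `A` and `B`, whichever has the weaker momentum should be dropped, and `C` should be kept. Note that, as in the original snippet, a ticker is dropped when it is highly correlated with any higher-ranked ticker, even one that was itself dropped.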