Created April 27, 2015 21:32
How to qcut with non unique bin edges
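# I've had a lot of problems creating unique bins for decile analysis,
# so I wrote this code, which avoids the "non unique bin" error that pandas qcut raises.
import pandas as pd

def calc_ranks(events, fields, result_field, cuts=10):
    # events: DataFrame; fields: columns to bin; result_field: column averaged per bin
    result = {}
    for i in fields:
        result[i] = {}
        # Percentile rank in (0, 1]; tied values share one rank, so no bin edges are needed
        events[i+'_rank'] = events[i].rank(pct=True, ascending=True)
        for j in range(cuts):
            # Bin j covers ranks in (j/cuts, (j+1)/cuts]; the original cuts / 100.0 width was only right for cuts=10
            lower, upper = j / float(cuts), (j + 1) / float(cuts)
            in_bin = (events[i+'_rank'] > lower) & (events[i+'_rank'] <= upper)
            result[i][j] = events[in_bin][result_field].mean()
    return pd.DataFrame(result)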
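A minimal usage sketch, assuming the parameters mean what the code suggests: `events` is a DataFrame, `fields` is a list of column names to bin, `result_field` is the column whose mean is reported per bin, and `cuts` is the number of bins. The data and the column names `score` and `ret` are hypothetical. `calc_ranks` is restated so the snippet runs on its own, with the bin width taken as `1/cuts`, which is an assumption about the intent.

```python
import pandas as pd

def calc_ranks(events, fields, result_field, cuts=10):
    result = {}
    for i in fields:
        result[i] = {}
        # Percentile rank in (0, 1]; ties share their average rank
        events[i + '_rank'] = events[i].rank(pct=True, ascending=True)
        for j in range(cuts):
            lower, upper = j / float(cuts), (j + 1) / float(cuts)
            in_bin = (events[i + '_rank'] > lower) & (events[i + '_rank'] <= upper)
            result[i][j] = events[in_bin][result_field].mean()
    return pd.DataFrame(result)

# Hypothetical data: 'score' is heavily tied at 0, so several quantiles coincide
events = pd.DataFrame({
    'score': [0, 0, 0, 0, 0, 0, 1, 2, 3, 4],
    'ret':   [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
})

# pd.qcut with default settings rejects duplicate bin edges
try:
    pd.qcut(events['score'], 4)
    qcut_failed = False
except ValueError:
    qcut_failed = True

# Mean of 'ret' per rank-based quartile of 'score'
table = calc_ranks(events, ['score'], 'ret', cuts=4)
print(qcut_failed)
print(table)
```

Because tied values share a single rank, they all land in the same bin, and some bins can end up empty (shown as NaN in the result). Note that the function also adds a `score_rank` column to `events` as a side effect.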
Could you give an example?
And some explanation of the parameters passed?
Hi Ashish,
Thanks for sharing this valuable code. May I know which parameters need to be passed?
Otherwise this code is not useful at all.
Regards,
Bhagwat