@findmyway
Last active August 29, 2015 14:14
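import pandas as pd

# `click` and `buy` are DataFrames defined elsewhere (not shown in this snippet).
total = pd.concat([click[['TimeStamp', 'SesID', 'ItemID']], buy[['TimeStamp', 'SesID', 'ItemID']]],
                  ignore_index=True)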
# Flag every event whose session contains at least one purchase.
buy_sessions = buy['SesID'].drop_duplicates()
total['IsBuy'] = total['SesID'].isin(buy_sessions)
total_sesID_group = total.groupby('SesID')
# Sessions without a purchase vs. sessions with at least one purchase.
click_sesID_group = total[~total['IsBuy']].groupby('SesID')
buy_sesID_group = total[total['IsBuy']].groupby('SesID')
# Distinct (ItemID, Category) pairs seen in the click log.
item2category = click[['ItemID', 'Category']].drop_duplicates()
print('length of item2category :', len(item2category))
print('total num of items: ', len(total['ItemID'].drop_duplicates()))
# length of item2category : 100110
# total num of items: 52739
# More (ItemID, Category) pairs than items, so some items appear under more than one category.
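# Rows per item (how many categories it appears under) and rows per category (how many items it holds).
items_group = item2category.groupby('ItemID')
category_group = item2category.groupby('Category')
items_group_count = items_group.count()
category_group_count = category_group.count()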
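
The snippet above depends on `click` and `buy` frames that are not shown. As a rough sketch of the same two steps (labelling buy sessions, then finding items listed under several categories), here is a version that runs on a tiny made-up dataset. The toy values and the names `cols`, `categories_per_item` and `multi_category_items` are illustrative assumptions, not part of the original data or code.

```python
import pandas as pd

# Hypothetical toy data standing in for the real click and buy logs.
click = pd.DataFrame({
    'TimeStamp': ['t1', 't2', 't3', 't4'],
    'SesID': [1, 1, 2, 3],
    'ItemID': [10, 11, 10, 12],
    'Category': ['0', '0', 'S', '3'],
})
buy = pd.DataFrame({
    'TimeStamp': ['t5'],
    'SesID': [1],
    'ItemID': [11],
})

cols = ['TimeStamp', 'SesID', 'ItemID']
total = pd.concat([click[cols], buy[cols]], ignore_index=True)

# An event belongs to a buy session if its SesID appears anywhere in buy.
total['IsBuy'] = total['SesID'].isin(buy['SesID'].unique())

# One row per distinct (ItemID, Category) pair seen in the clicks.
item2category = click[['ItemID', 'Category']].drop_duplicates()

# Number of distinct categories recorded for each item.
categories_per_item = item2category.groupby('ItemID')['Category'].nunique()
multi_category_items = categories_per_item[categories_per_item > 1]

print(total['IsBuy'].tolist())
print(multi_category_items.to_dict())
```

In this toy data, item 10 is clicked under two different categories, which is the same situation the counts above point to: more (ItemID, Category) pairs than items.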