@hernamesbarbara
Created September 20, 2014 23:44
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import pandas as pd
sample = """user_id,website 1 name,website 1 value,website 2 name,website 2 value,website 3 name,website 3 value
'000001,linkedin,https://linkedin.com/245563,github,https://github.com/926850,,
'000002,facebook,https://facebook.com/976099,,,,
'000003,twitter,https://twitter.com/612711,,,linkedin,https://linkedin.com/840609
'000004,github,https://github.com/220993,blog,https://blog.com/201188,linkedin,https://linkedin.com/479351
'000005,,,twitter,https://twitter.com/897816,,
'000006,,,twitter,https://twitter.com/937093,linkedin,https://linkedin.com/428950
'000007,blog,https://blog.com/273870,,,facebook,https://facebook.com/646576
'000008,,,linkedin,https://linkedin.com/248499,github,https://github.com/409893
'000009,,,,,facebook,https://facebook.com/822582
'0000010,blog,https://blog.com/195175,,,,
'0000011,,,linkedin,https://linkedin.com/159019,,
'0000012,linkedin,https://linkedin.com/418608,,,facebook,https://facebook.com/173588
'0000013,twitter,https://twitter.com/113959,linkedin,https://linkedin.com/877061,blog,https://blog.com/216711
'0000014,,,blog,https://blog.com/830860,github,https://github.com/886692
'0000015,,,linkedin,https://linkedin.com/415107,,
'0000016,facebook,https://facebook.com/638539,,,blog,https://blog.com/902939
'0000017,facebook,https://facebook.com/693167,,,,
'0000018,facebook,https://facebook.com/983373,,,,
'0000019,,,,,,
'0000020,twitter,https://twitter.com/259354,github,https://github.com/489686,facebook,https://facebook.com/708898
'0000021,github,https://github.com/461978,,,blog,https://blog.com/583090
'0000022,,,github,https://github.com/857316,linkedin,https://linkedin.com/634040
'0000023,,,twitter,https://twitter.com/136693,blog,https://blog.com/838401
'0000024,github,https://github.com/774564,facebook,https://facebook.com/783581,,
'0000025,linkedin,https://linkedin.com/335739,,,,
'0000026,linkedin,https://linkedin.com/171750,twitter,https://twitter.com/731644,blog,https://blog.com/920183
'0000027,linkedin,https://linkedin.com/774670,facebook,https://facebook.com/816667,blog,https://blog.com/612197
'0000028,,,,,,
'0000029,twitter,https://twitter.com/447780,,,facebook,https://facebook.com/680694
'0000030,,,,,linkedin,https://linkedin.com/257538
',,,,,,
"""
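# Use the standard library's StringIO rather than pandas' internal re-export.
from io import StringIO


def populate_site_columns(row):
    """Map each non-empty 'website N name' in a row to its 'website N value'."""
    sites, urls = [], []
    for field in row.index:
        if 'name' in field and row[field] != "":
            sites.append(row[field])
        elif 'value' in field and row[field] != "":
            urls.append(row[field])
    # Names and URLs are paired by position, so each name needs a matching value.
    return pd.Series(dict(zip(sites, urls)))


# Read the sample and blank out missing cells so the != "" checks above work.
df = pd.read_csv(StringIO(sample)).fillna('')
# One column per site name; users without that site get NaN.
new = df.apply(populate_site_columns, axis=1)
df2 = pd.merge(df, new, left_index=True, right_index=True)
has_fb_mask = df2.facebook.notnull() & (df2.facebook != "")
print(df2[has_fb_mask])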