@larssono
Created July 10, 2015 21:41
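import synapseclient
import pandas as pd

# Connect to Synapse and log in with stored credentials
syn = synapseclient.Synapse(skip_checks=True)
syn.login(silent=True)

# Load the tab-separated list of public records and keep the Parkinson study only
records = pd.read_csv('publicRecordIds', sep='\t')
records = records.query('studyId=="parkinson"')

# Fetch recordId/healthCode pairs for all voice data from the Synapse table
allVoiceData = syn.tableQuery('SELECT recordId, healthCode FROM syn4590865').asDataFrame()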
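# Keep only the public records whose schemaKey mentions 'Voice'
idx = ['Voice' in x for x in records.schemaKey]
voiceRecordsToKeep = records.loc[idx, 'recordId']
# Find the most common healthCodes among the kept voice records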
df = allVoiceData[allVoiceData.recordId.isin(voiceRecordsToKeep)]
df = df.groupby('healthCode')['healthCode'].count()
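# Sort ascending by record count, then print the 50 most frequent healthCodes joined by ','
df = df.sort_values()
print("','".join(df.iloc[-50:].index))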
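
The filter, count, and sort steps above can be tried without Synapse access. The following is a minimal sketch on made-up data: the toy `all_voice` frame, the `records_to_keep` series, and the helper `top_health_codes` are hypothetical stand-ins, not part of the original script.

```python
import pandas as pd

# Hypothetical toy data standing in for the Synapse table and the public record list
all_voice = pd.DataFrame({
    'recordId': ['r1', 'r2', 'r3', 'r4', 'r5', 'r6'],
    'healthCode': ['a', 'a', 'b', 'a', 'c', 'b'],
})
records_to_keep = pd.Series(['r1', 'r2', 'r3', 'r4', 'r6'])


def top_health_codes(voice, keep, n):
    """Return the n healthCodes with the most kept records, in ascending order of count."""
    kept = voice[voice.recordId.isin(keep)]
    counts = kept.groupby('healthCode')['healthCode'].count().sort_values()
    return list(counts.iloc[-n:].index)


top = top_health_codes(all_voice, records_to_keep, 2)
print("','".join(top))
```

Here `r5` is dropped by the `isin` filter, so healthCode `c` never reaches the count.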