Ana Trisovic atrisovic

import pandas as pd
from huggingface_hub import HfApi
import datetime
import json
df = pd.read_csv('../../data/llm_leaderboard.csv', index_col=1, header=1)
def get_model_info(model_id):
    api = HfApi()
atrisovic /
Last active February 20, 2023 15:40
Form to document new analytic data on FASSE

Step 1: Check analytic data

Is the data you need already on FASSE? Check out the catalog here:

If it is not, see step 2.

Step 2: Fill in the form below and add it in the comments here.

The form uses the following format:

 | |-zip.rds
 | |-zip2.rds
 | |-SA_COPD2.Rmd
 | |-review.R
 | |-SA_MI-New.Rmd
 | |-SA_CHF2.Rmd
 | |-SA_LungCancer2.Rmd
 | |-SA_LungCancer-New.Rmd
atrisovic /
Last active April 17, 2022 04:32
import pandas as pd
import numpy as np
import json
from simplejson import loads
def get_outcomes():
    """Get and return ICD codes."""
    f = open('icd_codes.json')
    outcomes_ = json.load(f)
"aki": {
"icd10": [
"icd9": [
"all_kidney": {
# Before running, activate env:
# export CONDA_ENVS_PATH=/nfs/projects/n/nsaph_common/conda/envs/
# export CONDA_PKGS_PATH=/nfs/projects/n/nsaph_common/conda/pkgs/
# source activate nsaph
## Code to ID hospitalizations
atrisovic /
Last active March 9, 2022 02:50
sample file summaries in R

Get data sample

To get the data sample, we take the last 25k rows and the first 25k rows of the 59-million-row file in bash:

>> tail -n25000 /2016/mbsf_abcd_summary_res000017155_req008183_2016.dat \
        > sample_mbsf_abcd_summary_res000017155_req008183_2016.dat
>> head -n25000 /2016/mbsf_abcd_summary_res000017155_req008183_2016.dat \
        >> sample_mbsf_abcd_summary_res000017155_req008183_2016.dat
# word count:
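The same head-and-tail sampling can be sketched in Python for cases where the shell commands are not convenient. This is only an illustration: the function name `sample_head_tail` and its parameters are hypothetical and do not come from the original gist. It mirrors the order of the bash commands above, writing the last `n` lines first and then the first `n` lines.

```python
from collections import deque


def sample_head_tail(src_path, dst_path, n=25_000):
    """Write the last n lines of src_path, then the first n lines, to dst_path.

    Hypothetical helper mirroring `tail -n N > out` followed by `head -n N >> out`.
    Returns the number of lines written.
    """
    head = []                 # first n lines of the file
    tail = deque(maxlen=n)    # rolling window holding the last n lines seen
    with open(src_path, "rb") as src:
        for line in src:      # single pass, so the full file is never held in memory
            if len(head) < n:
                head.append(line)
            tail.append(line)
    with open(dst_path, "wb") as dst:
        dst.writelines(tail)  # tail first, matching the `>` redirect
        dst.writelines(head)  # head second, matching the `>>` append
    return len(tail) + len(head)
```

Reading in binary mode copies the lines byte for byte, so the sketch does not need to know the encoding of the `.dat` file. As with the shell version, a file shorter than `2 * n` lines yields overlapping rows in the sample.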
atrisovic /
Last active March 3, 2022 15:35
Rewriting git history for data_requests

What happened

Beneficiary ID numbers were shared in a private GitHub repository, in the following directories:


Dataset stats from Dataverse

in the format DOI, release_year, mime_type

SQL DB query:

SELECT p.authority, p.identifier, f.contenttype, p.publicationdate 
FROM datafile f, dvobject o, dataset s, dvobject p 
WHERE = AND o.owner_id = AND = AND s.harvestingclient_id IS NULL
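As a rough sketch, one result row from the query above could be turned into the `DOI, release_year, mime_type` format as follows. The function name `format_row` is hypothetical, and two things are assumed rather than taken from the source: that the DOI is built as `authority/identifier`, and that the database driver returns `publicationdate` as a `date` or `datetime` object. The sample values are made up.

```python
from datetime import date


def format_row(authority, identifier, contenttype, publicationdate):
    """Return one 'DOI, release_year, mime_type' line for a query result row.

    Assumptions (not confirmed by the source): the DOI is authority/identifier,
    and publicationdate is a date or datetime object.
    """
    doi = f"{authority}/{identifier}"
    release_year = publicationdate.year
    return f"{doi}, {release_year}, {contenttype}"


# Made-up example row, in the column order of the SELECT clause
# (authority, identifier, contenttype, publicationdate).
print(format_row("10.1234", "ABC/XYZ123", "text/csv", date(2020, 5, 1)))
# prints: 10.1234/ABC/XYZ123, 2020, text/csv
```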
from flask import Flask, redirect, url_for
from celery import Celery
from celery import Task
from subprocess import PIPE, Popen
import logging, os
logger = logging.getLogger(__name__)
# Running locally: