Becky Sweger bsweger

## .README.md

      
bsweger / .README.md
Last active June 5, 2024 19:54

Accessing Cloud-Based Hubverse Data

Cloud-Based Hubs

The Hubverse is in the process of making hub data available via publicly accessible AWS S3 buckets.
Cloud-based hubs "mirror" the data stored in a hub's GitHub repository and provide a few advantages for data consumers:

* No need to clone a repository to access data
* Cloud-based model-output files are in parquet format, which is easier to work with and more performant

The examples here use the CDC's FluSight Forecast Hub, which is available in the following S3 bucket:

  
## make_samples.py
from itertools import product
import pandas as pd
import numpy as np

def make_sample(
    n_samples: int = 2,
    n_horizons: int = 3,
    n_variants: int = 3,
    n_locations: int = 2,
    samples_joint_across: list[str] = None

## .zshrc
# If you come from bash you might have to change your $PATH.
# export PATH=$HOME/bin:/usr/local/bin:$PATH

# Path to your oh-my-zsh installation.
export ZSH="/home/becky/.oh-my-zsh"

# Set name of the theme to load --- if set to "random", it will
# load a random theme each time oh-my-zsh is loaded, in which case,
# to know which specific one was loaded, run: echo $RANDOM_THEME
# See https://github.com/ohmyzsh/ohmyzsh/wiki/Themes

## mock_requests.py
import pytest
import requests

@pytest.fixture
def mock_request_json():
    """Return fake API data."""
    return {'spring': [
        {
            'name': 'birds',
            'status': 'chirpin',

## gist:24a47e4c253dc9b17869ef096700dffd
### Keybase proof

I hereby claim:

  * I am bsweger on github.
  * I am bendystraw (https://keybase.io/bendystraw) on keybase.
  * I have a public key ASCmomBSZgzxe3w1YQf-eZwTB3KNg7k4VGD29NN8hhmrDAo

To claim this, I am signing this object:

## seti_grants.py
# Get SETI Institute grants from USAspending API into a pandas dataframe
import json
import requests
import pandas as pd
from pandas import json_normalize  # pandas >= 1.0; pandas.io.json.json_normalize is deprecated

uri = 'https://api.usaspending.gov/api/v2/search/spending_by_transaction/'
headers = {'content-type': 'application/json'}
seti_json = []
next = 1

## iterm_profile.json
{
  "Ansi 3 Color" : {
    "Green Component" : 0.73333334922790527,
    "Blue Component" : 0,
    "Red Component" : 0.73333334922790527
  },
  "Tags" : [

  ],
  "Ansi 12 Color" : {

## financial_balances_agency.md

      
bsweger / financial_balances_agency.md
Last active June 7, 2017 01:03

Retrieve financial balances by agency and fiscal year

Route: /api/v2/financial_balances/agencies
Method: GET

This route retrieves financial balance information by funding agency and fiscal year.

Sample Request

/api/v2/financial_balances/agencies?funding_agency=775&fiscal_year=2017
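The sample request can be assembled programmatically. The sketch below is only an illustration: the host `https://api.usaspending.gov` is an assumption borrowed from the seti_grants.py gist, the helper names `build_balances_url` and `fetch_balances` are hypothetical, and the route may no longer be served in this form.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the host is taken from the seti_grants.py gist, not from this route's docs.
BASE_URL = "https://api.usaspending.gov"
ROUTE = "/api/v2/financial_balances/agencies"


def build_balances_url(funding_agency, fiscal_year, base_url=BASE_URL):
    """Build the GET URL for the financial balances route (hypothetical helper)."""
    query = urlencode({"funding_agency": funding_agency, "fiscal_year": fiscal_year})
    return f"{base_url}{ROUTE}?{query}"


def fetch_balances(funding_agency, fiscal_year):
    """Send the GET request and decode the JSON body (performs a network call)."""
    with urlopen(build_balances_url(funding_agency, fiscal_year), timeout=30) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Only builds and prints the URL; call fetch_balances() to hit the API.
    print(build_balances_url(775, 2017))
```

Keeping URL construction separate from the network call makes the query string easy to check without contacting the API.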

  
## pandas_pad_using_apply.py
# example of using a parameterized function as a converter when reading .csv in pandas

import pandas as pd

# a function that will be used to pad dataframe column values to a specified length
# (some incoming values are multiple spaces; those should convert to None)
padFunction = lambda field, padTo: str(field).strip().zfill(padTo) if len(str(field).strip()) else None

# read file w/o using converters and display list of unique alloc_id values
pa = pd.read_csv(
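
The preview above is cut off before the converter is applied. As a minimal sketch of the idea, assuming a column named `alloc_id` and made-up sample rows, a two-argument padding function can be bound to a fixed width with `functools.partial` and passed through the `converters` argument of `pd.read_csv`. The `pad_field` helper and the sample data are hypothetical stand-ins, not the gist's actual code.

```python
import io
from functools import partial

import pandas as pd


def pad_field(field, pad_to):
    """Strip whitespace and left-pad with zeros; blank values become None."""
    text = str(field).strip()
    return text.zfill(pad_to) if text else None


# Hypothetical sample data standing in for the real .csv file.
csv_text = "alloc_id,amount\n7,10\n   ,20\n42,30\n"

# read_csv calls each converter with a single cell value, so the extra
# pad_to argument is bound ahead of time with functools.partial.
df = pd.read_csv(
    io.StringIO(csv_text),
    converters={"alloc_id": partial(pad_field, pad_to=3)},
)

print(df["alloc_id"].tolist())
```

Binding the width up front lets the same function serve several columns with different target lengths.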

## markdown-table-to-html.py
# markdownrows.txt is a file that contains the piped markdown table rows
with open('markdownrows.txt') as f:
    content = f.readlines()

rows = []

for c in content:
    # drop the leading and trailing pipes so they don't produce empty cells
    cells = c.strip().strip('|').split('|')
    cells = ['<td>{}</td>'.format(cell.strip()) for cell in cells]
    rows.append(cells)
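
The preview ends once the cells are collected. One way the remaining step might look, offered as a sketch rather than the gist's actual continuation, is a function that wraps each row in `<tr>` tags. The name `markdown_rows_to_html`, the separator-row check, and the HTML escaping are all assumptions added for illustration.

```python
from html import escape


def markdown_rows_to_html(lines):
    """Convert piped markdown table rows into HTML <tr> strings."""
    html_rows = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        # Drop the outer pipes, then split the row into trimmed cells.
        cells = [cell.strip() for cell in stripped.strip("|").split("|")]
        # Skip the header separator row, e.g. |---|:---:|
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue
        tds = "".join("<td>{}</td>".format(escape(cell)) for cell in cells)
        html_rows.append("<tr>{}</tr>".format(tds))
    return html_rows


# Made-up rows for demonstration only.
sample = ["| name | status |", "|------|--------|", "| birds | chirpin |"]
print("\n".join(markdown_rows_to_html(sample)))
```

This sketch treats every kept row as body cells; emitting `<th>` for the header row would need one extra branch.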