Sean Davis seandavi

@seandavi
seandavi / file_metadata.json
Last active November 29, 2023 17:29
Proposal for an available-files metadata JSON for easier and more robust client parsing [note that data are fake]
{
  "accession": "GSE000123",
  "files": [
    {
      "filetype": "Series SOFT file",
      "name": "GSE227465_family.soft.gz",
      "size": 23413,
      "md5sum": "....",
      "created_at": "DATE",
      "updated_at": "DATE"
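
A client could consume the proposed layout roughly as follows. This is a minimal sketch, not part of the proposal: the helper names `index_by_filetype` and `matches_entry` are hypothetical, the payload is a placeholder, and the `size` and `md5sum` values are computed from that placeholder rather than taken from any real file.

```python
import hashlib
import json


def index_by_filetype(metadata):
    """Group the entries of metadata["files"] by their "filetype" value."""
    index = {}
    for entry in metadata["files"]:
        index.setdefault(entry["filetype"], []).append(entry)
    return index


def matches_entry(entry, payload):
    """Return True if the downloaded bytes agree with the listed size and md5sum."""
    return (
        len(payload) == entry["size"]
        and hashlib.md5(payload).hexdigest() == entry["md5sum"]
    )


# Placeholder payload and a metadata document derived from it (all values fake).
payload = b"fake SOFT file contents"
document = json.dumps({
    "accession": "GSE000123",
    "files": [{
        "filetype": "Series SOFT file",
        "name": "GSE227465_family.soft.gz",
        "size": len(payload),
        "md5sum": hashlib.md5(payload).hexdigest(),
    }],
})

metadata = json.loads(document)
index = index_by_filetype(metadata)
soft_entry = index["Series SOFT file"][0]
print(soft_entry["name"], matches_entry(soft_entry, payload))
```

Keying on `filetype` instead of parsing file names is the kind of robustness the proposal is aiming for; the checksum comparison is an optional extra step after download.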
seandavi / test.qmd
Created November 15, 2023 20:37
Quarto test mermaid document
---
format:
  html:
    mermaid-format: svg
---
```{mermaid}
%%| fig-width: 100%
autonumber
Participant C as Client[<font size=6>]
```
seandavi / ena_browser_api.md
Created November 11, 2023 11:02
Example queries from ENA browser API

The ENA browser API https://www.ebi.ac.uk/ena/portal/api/swagger-ui/

There is only a limit parameter, no offset. The API is designed to simply stream large result sets.

Search by SRA study ID

Output as TSV

SEARCH_QUERY='secondary_study_accession=SRP082656' && curl "https://www.ebi.ac.uk/ena/portal/api/search?query=${SEARCH_QUERY}&result=read_run&fields=experiment_accession%2Cexperiment_title%2Csecondary_study_accession%2Caligned%2Caltitude%2Cassembly_quality%2Cassembly_software%2Cbam_aspera%2Cbam_bytes%2Cbam_ftp%2Cbam_galaxy%2Cbam_md5%2Cbase_count%2Cbinning_software%2Cbio_material%2Cbisulfite_protocol%2Cbroad_scale_environmental_context%2Cbroker_name%2Ccage_protocol%2Ccell_line%2Ccell_type%2Ccenter_name%2Cchecklist%2Cchip_ab_provider%2Cchip_protocol%2Cchip_target%2Ccollected_by%2Ccollection_date%2Ccollection_date_end%2Ccollection_date_start%2Ccompleteness_score%2Ccontamination_score%2Ccontrol_experiment%2Ccountry%2Ccultivar%2Cculture_collection%2Cdatahub%2Cdepth%2Cdescription%2Cdev_stage%2Cdnase_protocol%2Cecotype%2
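
The same kind of query can be sketched in Python. The helper names below are hypothetical, and the `format` and `limit` parameter names reflect my reading of the API, so check them against the Swagger page linked above. Because there is no offset, the sketch reads the response as a stream instead of paging through it.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import urlopen

ENA_SEARCH = "https://www.ebi.ac.uk/ena/portal/api/search"


def build_search_url(query, fields, result="read_run", limit=None):
    """Build a search URL; `fields` is a list of column names to return."""
    params = {
        "query": query,
        "result": result,
        "fields": ",".join(fields),
        "format": "tsv",  # assumed parameter name for TSV output
    }
    if limit is not None:
        params["limit"] = limit
    return f"{ENA_SEARCH}?{urlencode(params)}"


def iter_tsv_rows(lines):
    """Yield one dict per TSV row from an iterable of text lines."""
    yield from csv.DictReader(lines, delimiter="\t")


def stream_search(url):
    """Stream rows from the API without loading the whole response in memory."""
    with urlopen(url) as response:
        text = io.TextIOWrapper(response, encoding="utf-8", newline="")
        yield from iter_tsv_rows(text)


url = build_search_url(
    "secondary_study_accession=SRP082656",
    ["experiment_accession", "experiment_title"],
    limit=10,
)
print(url)
# To actually hit the network:
# for row in stream_search(url):
#     print(row["experiment_accession"])
```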
seandavi / datasets.yaml
Last active October 25, 2023 00:33
YAML description of public health data resources
datasets:
  - name: brfss
    title: The brfss dataset
    description: |
      a very long description which can be
      in [markdown](https://markdown.org).
      - list item
      - list item 2
    processor: readr::read_csv
  - name: svi
"""This script scrapes the state cancer profiles website
and saves the results as csv files.
The website is a bit of a mess, so this script is a bit of a mess.
It requires scraping the select options from the website
and then iterating over all the possible combinations of
select options to get the data.
The script is designed to be run from the command line
with the following command:
seandavi / README.md
Last active August 2, 2023 14:41
Basic Google LLM API quickstart and example

See: https://cloud.google.com/vertex-ai/docs/generative-ai/chat/test-chat-prompts

You'll need:

  1. A project id
  2. An installed and working gcloud command-line tool
  3. A service account that has been granted access to Vertex AI
  4. A JSON key file associated with the service account
  5. A local environment set up to use the JSON key for authentication

Tasks 1, 3, and 4 can be performed using the console OR using gcloud by a user authorized to create resources. Note that the
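
Item 5 in the list above usually comes down to pointing the `GOOGLE_APPLICATION_CREDENTIALS` environment variable at the key file, which is where Google's client libraries look for Application Default Credentials. A minimal sketch, assuming a standard service account key file; the function name `use_service_account_key` and the path in the usage comment are hypothetical:

```python
import json
import os

# Fields a service account key file is expected to contain.
REQUIRED_KEYS = {"type", "project_id", "private_key", "client_email"}


def use_service_account_key(key_path):
    """Sanity-check the key file, export its path, and return the project id."""
    with open(key_path, encoding="utf-8") as handle:
        key = json.load(handle)

    missing = REQUIRED_KEYS - key.keys()
    if missing:
        raise ValueError(f"key file is missing fields: {sorted(missing)}")
    if key["type"] != "service_account":
        raise ValueError(f"unexpected credential type: {key['type']!r}")

    # Client libraries started from this process will pick the key up from here.
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = os.path.abspath(key_path)
    return key["project_id"]


# Usage (path is a placeholder):
# project_id = use_service_account_key("service-account-key.json")
```

Returning the project id is a convenience: it covers item 1 of the list without having to copy the id by hand.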

seandavi / Dockerfile
Last active August 1, 2023 23:00
Run a Google Cloud Batch job set up in Python
FROM python:3.11
RUN pip install --upgrade pip
RUN pip install omicidx
RUN pip install fsspec gcsfs s3fs
COPY abc.py .
seandavi / chatgpt.R
Created July 11, 2023 21:35
Small play function for interacting with Azure OpenAI playground (ChatGPT)
library(httr2)
ChatGPT <- function(prompt='Tell me about Bioconductor') {
  base_request = httr2::request('https://openai-playground-asdf.openai.azure.com/openai/deployments/gpt-4/chat/completions?api-version=2023-03-15-preview')
  res = base_request |>
    httr2::req_headers('api-key'=Sys.getenv('API_KEY')) |>
    httr2::req_body_json(list(
      messages=
        list(
          list(role="system",content="You are a helpful AI assistant"),
          list(role="user", content=prompt)
digraph ChatGPT {
  node [shape=box]
  subgraph cluster_0 {
    label="Microsoft Azure"
    style=dashed
    ChatGPT
  }
  subgraph cluster_1 {
    label="On Prem or Cloud"
    style="dashed"
seandavi / annotationhub_alabaster.R
Created June 7, 2023 16:30
Alabaster process all of annotationhub....
library(alabaster)
library(AnnotationHub)
library(jsonlite)
ah = AnnotationHub()
STAGE_DIR = '/tmp/ah_staging'
dir.create(STAGE_DIR, showWarnings = FALSE)
write_error = function(dirname, e) {
  jsonlite::write_json(list(message=jsonlite::unbox(e$message)), file.path(dirname,'error.json'))