Skip to content

Instantly share code, notes, and snippets.

View seandavi's full-sized avatar

Sean Davis seandavi

View GitHub Profile
@seandavi
seandavi / datasets.yaml
Last active October 25, 2023 00:33
yaml description of public health data resources
datasets:
- name: brfss
title: The brfss dataset
description: |
a very long description which can be
in [markdown](https://markdown.org).
- list item
- list item 2
processor: readr::read_csv
- name: svi
"""This script scrapes the state cancer profiles website
and saves the results as csv files.
The website is a bit of a mess, so this script is a bit of a mess.
It requires scraping the select options from the website
and then iterating over all the possible combinations of
select options to get the data.
The script is designed to be run from the command line
with the following command:
@seandavi
seandavi / README.md
Last active August 2, 2023 14:41
Basic google LLM API quickstart and example

See: https://cloud.google.com/vertex-ai/docs/generative-ai/chat/test-chat-prompts

You'll need:

  1. A project id
  2. An installed and working gcloud command-line tool
  3. A service account that has been granted access to vertex AI
  4. A json key file associated with the service account
  5. Set up local environment to use the json key for authentication

Tasks 1, 3, and 4 can be performed using the console OR using gcloud by a user authorized to create resources. Note that the

@seandavi
seandavi / Dockerfile
Last active August 1, 2023 23:00
Run a google cloud batch job set up in python
from python:3.11
RUN pip install --upgrade pip
RUN pip install omicidx
RUN pip install fsspec gcsfs s3fs
COPY abc.py .
@seandavi
seandavi / chatgpt.R
Created July 11, 2023 21:35
Small play function for interacting with Azure openai playground (chatgpt)
library(httr2)
ChatGPT <- function(prompt='Tell me about Bioconductor') {
base_request = httr2::request('https://openai-playground-asdf.openai.azure.com/openai/deployments/gpt-4/chat/completions?api-version=2023-03-15-preview')
res = base_request |>
httr2::req_headers('api-key'=Sys.getenv('API_KEY')) |>
httr2::req_body_json(list(
messages=
list(
list(role="system",content="You are a helpful AI assistant"),
list(role="user", content=prompt)
digraph ChatGPT {
node [shape=box]
subgraph cluster_0 {
label="Microsoft Asure"
style=dashed
ChatGPT
}
subgraph cluster_1 {
label="On Prem or Cloud"
style="dashed"
@seandavi
seandavi / annotationhub_alabaster.R
Created June 7, 2023 16:30
Alabaster process all of annotationhub....
library(alabaster)
library(AnnotationHub)
library(jsonlite)
ah = AnnotationHub()
STAGE_DIR = '/tmp/ah_staging'
dir.create(STAGE_DIR, showWarnings = FALSE)
write_error = function(dirname, e) {
jsonlite::write_json(list(message=jsonlite::unbox(e$message)), file.path(dirname,'error.json'))
@seandavi
seandavi / prompts.md
Created June 2, 2023 23:44
LLM prompts

Meetings

Summary from zoom transcript

You are an AI assistant that summarizes meeting notes generated by Zoom. For the meeting notes provided as input, provide these items if possible from the content:

  • meeting date, time, and title
  • meeting attendees as a bulleted list
  • executive summary (3-5 sentences)
@seandavi
seandavi / bioconductor_ehub_ahub_dataconductor.dot
Created May 31, 2023 13:00
Graphviz dot version of experimenthub and annotationhub replacement by dataconductor
# Place the cursor inside "graph" to get some refactoring options
digraph G {
fontname="Helvetica,Arial,sans-serif"
node [fontname="Helvetica,Arial,sans-serif"]
edge [fontname="Helvetica,Arial,sans-serif"]
edge[color="#00000050"]
subgraph cluster_0 {
@seandavi
seandavi / cytoband_table_prompt.txt
Last active May 10, 2023 21:22
Using GPT-4 to augment AnnotationHub resource metadata
I would like you to describe the UCSC genome browser table called "cytoBand". The first few lines of the table are here:
chrom chromStart chromEnd name gieStain
chr1 0 2300000 p36.33 gneg
chr1 2300000 5300000 p36.32 gpos25
chr1 5300000 7100000 p36.31 gneg
chr1 7100000 9100000 p36.23 gpos25
chr1 9100000 12500000 p36.22 gneg
chr1 12500000 15900000 p36.21 gpos50
chr1 15900000 20100000 p36.13 gneg