@dwinston
dwinston / nmdc-schema-v3.2.0-v6.0.3.json.diff
Last active August 12, 2022 14:30
<github.com:microbiomedata/nmdc-schema>: git diff v3.2.0 v6.0.3 -- jsonschema/nmdc.schema.json
diff --git a/jsonschema/nmdc.schema.json b/jsonschema/nmdc.schema.json
index 3a18b59..1efa788 100644
--- a/jsonschema/nmdc.schema.json
+++ b/jsonschema/nmdc.schema.json
@@ -53,6 +53,28 @@
"title": "Agent",
"type": "object"
},
+ "AnalysisTypeEnum": {
+ "description": "",
dwinston / nmdc-schema-v3.2.0-v6.0.3.src.diff
Created August 12, 2022 14:29
<github.com:microbiomedata/nmdc-schema>: git diff v3.2.0 v6.0.3 -- src/schema/*
diff --git a/src/schema/basic_slots.yaml b/src/schema/basic_slots.yaml
index 17c8bf3..1fa4bda 100644
--- a/src/schema/basic_slots.yaml
+++ b/src/schema/basic_slots.yaml
@@ -29,7 +29,7 @@ slots:
description: >-
A unique identifier for a thing.
Must be either a CURIE shorthand for a URI or a complete URI
- #required: false # for now we setting this to false until we develop an id template
+ #required: false # for now we are setting this to false until we develop an id template
id action attribute value
emsl:456424 update processing_institution Environmental Molecular Sciences Laboratory
dwinston / nmdc_envo_term_subterms.py
Created April 15, 2022 18:43
build a subsumption map scoped to ENVO terms in use by NMDC biosamples
"""
Build a subsumption map scoped to ENVO terms in use by NMDC biosamples
"""
from collections import defaultdict
import json
from rdflib import Graph
from rdflib.namespace import Namespace
from tqdm import tqdm
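The preview above stops after the imports, so the gist's actual logic is not visible here. As a rough illustration only, a subsumption map (each term mapped to all of its transitive subterms) could be sketched in plain Python as follows. The function name `build_subsumption_map`, the `EX:` identifiers, and the choice to key the map by in-use terms are all assumptions made for this example; the original script uses rdflib and may be organized quite differently.

```python
from collections import defaultdict


def build_subsumption_map(subclass_pairs, terms_in_use):
    """Map each in-use term to the set of all its transitive subterms.

    subclass_pairs: iterable of (child, parent) tuples (hypothetical input shape).
    terms_in_use:   iterable of term IDs that should appear as keys.
    """
    # Index direct children by parent so descendants can be walked top-down.
    children = defaultdict(set)
    for child, parent in subclass_pairs:
        children[parent].add(child)

    subsumption = {}
    for term in terms_in_use:
        seen = set()
        stack = [term]
        # Iterative depth-first walk; `seen` also guards against cycles.
        while stack:
            current = stack.pop()
            for child in children.get(current, ()):
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        subsumption[term] = seen
    return subsumption


# Made-up hierarchy, not real ENVO data.
pairs = [
    ("EX:B", "EX:A"),
    ("EX:C", "EX:B"),
    ("EX:D", "EX:A"),
    ("EX:E", "EX:Z"),
]
result = build_subsumption_map(pairs, terms_in_use=["EX:A", "EX:B"])
```

Here `result["EX:A"]` contains `EX:B`, `EX:C` and `EX:D`, while `EX:E` is ignored because its parent is not a term in use.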
{"biosample_set": [{
    "id": "fake3",
    "env_broad_scale" : {
        "term" : {"id": "ENVO:01000253"}
    },
    "env_local_scale" : {
        "term" : {"id": "ENVO:01000621"}
    },
    "env_medium" : {
        "term" : {"id": "ENVO:01000017"}
{"biosample_set": [{
    "id": "fake2",
    "env_broad_scale" : {
        "term" : {"id": "ENVO:01000253"}
    },
    "env_local_scale" : {
        "term" : {"id": "ENVO:01000621"}
    },
    "env_medium" : {
        "term" : {"id": "ENVO:01000017"}
{
    "name" : "FT ICR-MS analysis results",
    "description" : "FT ICR-MS-based metabolite assignment results table",
    "filter" : "{\"url\": {\"$regex\": \"nom\\\\/results\"}, \"description\": {\"$regex\": \"FT ICR-MS\"}}",
    "id" : "nmdc:sys045mx19"
}
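The `filter` value in the record above is itself a JSON document stored as an escaped string, and it uses the MongoDB `$regex` operator. The sketch below shows one way to decode it twice and evaluate it in plain Python against sample documents. The helper `matches` and the example URLs are hypothetical, and the sketch handles only `$regex` clauses; how NMDC actually applies these filters is not shown in the source.

```python
import json
import re

# The record as it appears in the preview (raw string keeps the backslashes intact).
RAW_RECORD = r'''{
    "name" : "FT ICR-MS analysis results",
    "description" : "FT ICR-MS-based metabolite assignment results table",
    "filter" : "{\"url\": {\"$regex\": \"nom\\\\/results\"}, \"description\": {\"$regex\": \"FT ICR-MS\"}}",
    "id" : "nmdc:sys045mx19"
}'''


def matches(filter_doc, candidate):
    """Return True if every {"field": {"$regex": pattern}} clause matches candidate."""
    for field, condition in filter_doc.items():
        value = candidate.get(field)
        if not isinstance(value, str):
            return False
        if re.search(condition["$regex"], value) is None:
            return False
    return True


record = json.loads(RAW_RECORD)            # first decode: the record itself
filter_doc = json.loads(record["filter"])  # second decode: the embedded query

# Hypothetical documents to test against.
hit = {"url": "https://example.org/nom/results/sample.csv",
       "description": "FT ICR-MS output"}
miss = {"url": "https://example.org/metabolomics/results/sample.csv",
        "description": "GC-MS output"}

print(matches(filter_doc, hit), matches(filter_doc, miss))
```

After both decodes the URL pattern is `nom\/results`, in which the backslash merely escapes the slash.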
{
    "name" : "GC-MS Metabolomics Results",
    "description" : "GC-MS-based metabolite assignment results table",
    "filter" : "{\"url\": {\"$regex\": \"metabolomics\\\\/results\"}}",
dwinston / sensor.py
Created June 23, 2021 14:45
dagster resource in a sensor via preset definition run config
from dagster import (
    ModeDefinition, PresetDefinition, resource, StringSource,
    build_init_resource_context, RunRequest, sensor,
)


class ApiClient:
    def __init__(self, base_url: str, site_id: str, client_id: str, client_secret: str):
        self.base_url = base_url
        self.site_id = site_id
        self.client_id = client_id
dwinston / all_your_zulip_are_belong_to_us.py
Last active February 26, 2021 15:57
get all Zulip messages sent by non-bot users to public streams
"""
A script developed to get all Zulip messages sent by non-bot users to public streams.
Requires pip install pymongo tqdm zulip and a running local MongoDB server.
You can also adapt the script to append to an in-memory Python list, which removes the need for MongoDB and pymongo.
I found that the total volume of data in my case (see in-script comments) was 700MB uncompressed.
Developed at the Recurse Center (https://www.recurse.com/) in order to apply PageRank to Zulip entities.
Licensed as <https://opensource.org/licenses/MIT>(year=2021, copyright_holder="Donny Winston").
"""