M. Kocher mpkocher

@mpkocher
mpkocher / HelloScalaAlmond.ipynb
Last active October 30, 2018 22:48
Almond Example Notebook
mpkocher / P_Filter.py
Created October 26, 2018 17:14
Legacy smrtpipe.py workflow examples from the RS
"""Filters sequences based on the pls.h5 file, removing apparently
doubly loaded wells and unloaded wells.
Note: The 'chunking' (i.e., chunkFunc) in the task decorator must be
consistent for each instance of SMRTFile. In other words, inputPlsFofn must be
scattered using the same function in *every* P_module. Therefore, the
chunkFunc must be set to context.numMovies.
"""
mpkocher / get-runs.py
Last active October 2, 2019 14:54
Example of getting Runs list from SL
#!/usr/bin/env python
"""Get a List of Runs from SMRT Link"""
# From:http://bitbucket.nanofluidics.com:7990/projects/SL/repos/pbcommand/browse/pbcommand/cli/examples/template_simple.py
import os
import sys
import logging
from pbcommand.validators import validate_file
from pbcommand.utils import setup_log
mpkocher / sl_clieint_template.py
Last active October 2, 2019 14:55
pbcommand SL-Client Template
import datetime
import json
import copy
import time
import functools
import itertools
import pickle
import requests
import iso8601
mpkocher / ExampleCatMonoid.sc
Last active October 2, 2019 14:58
Example Monoid with Cats
// Run with Ammonite: amm ExampleCatsMonoid.sc
// import $ivy.`org.typelevel::cats-core:1.0.1`
import cats.implicits._
import cats.instances.all._
import cats.Monoid
// For demonstration purposes, load a file (csv, json) and
// each item can be converted to a record of type R
case class R(name: String, age: Int, favoriteColor: String)
mpkocher / ArchOverview.md
Last active October 2, 2019 14:59
High Level Overview of the SMRT Link Secondary Analysis System

Arch Overview

Core Nouns of the PacBio System

  1. Run (often created/edited from SMRT Link RunDesign, stored as XML)
  2. CollectionMetadata: a Run has a list of Collections (Primary Analysis converts a CollectionMetadata to a SubreadSet)
  3. PacBio DataSets: SubreadSet, ReferenceSet, etc. These are thin-ish XML files that hold general metadata as well as pointers to 'external resources' (e.g., BAM, Fasta files) and their companion index files.
  4. SMRT Link Job: a general (async) unit of work that performs operations on PacBio DataSets
  5. DataStoreFile: a container for an output file from a SMRT Link Job, with metadata such as file type, size, and path. A list of DataStoreFiles is called a DataStore. This is the core output of a SMRT Link Job.
  6. Report: a general model to capture Report metrics (also referred to as 'Attributes'), Report Tables and Report Plot Groups. A Report is a specific type of DataStoreFile and is used to communicate details of
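
The relationship between a Job, its DataStore, and the DataStoreFiles described in the list above can be sketched as plain data classes. This is a minimal illustration, not the SMRT Link data model: the class layout, the field names (`file_type_id`, `size_bytes`, `state`), the file type ids, and the paths are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class DataStoreFile:
    """Metadata for one output file of a job (field names are assumptions)."""
    file_type_id: str
    path: str
    size_bytes: int


@dataclass
class DataStore:
    """A list of DataStoreFiles, i.e. the core output of a job."""
    files: List[DataStoreFile] = field(default_factory=list)

    def by_type(self, file_type_id: str) -> List[DataStoreFile]:
        """Return only the files whose type id matches."""
        return [f for f in self.files if f.file_type_id == file_type_id]


@dataclass
class Job:
    """An async unit of work over DataSets; the state value is illustrative."""
    job_id: int
    state: str
    datastore: DataStore = field(default_factory=DataStore)


# Hypothetical job with two output files; a Report is just one kind of file.
job = Job(job_id=1, state="SUCCESSFUL")
job.datastore.files.append(DataStoreFile("report", "/tmp/report.json", 1024))
job.datastore.files.append(DataStoreFile("subreadset", "/tmp/out.subreadset.xml", 2048))

reports = job.datastore.by_type("report")
print(len(reports))
```

Modelling a Report as a DataStoreFile with a particular type id, rather than as a separate container, mirrors the statement that a Report is a specific type of DataStoreFile.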
mpkocher / Example.scala
Last active October 2, 2019 15:00
akka-http Uri Example (Ammonite output)
@ import akka.http.scaladsl.model.Uri
import akka.http.scaladsl.model.Uri
@ val u = Uri("http://smrtlink-bihourly:8081")
u: Uri = Uri("http", Authority(NamedHost("smrtlink-bihourly"), 8081, ""), , None, None)
@ val p1 = Uri.Path("root") / "alpha" / "beta"
p1: Uri.Path = Segment("root", Slash(Segment("alpha", Slash(Segment("beta", )))))
@ val p2 = p1 / "gamma"
mpkocher / Example.md
Last active October 2, 2019 15:02
TechSupport Manual Submission of Failed Job Example

Manually creating and sending a TS (TechSupport) failed job request

  1. Get Failed Job By Id
  2. Use pbservice to create a TS failed job request
  3. Manually download the TGZ file from the Job UI page or with curl
  4. Manually send the TGZ file using `tech-support-uploader`

From SL beta.

mpkocher / slln.py
Last active December 27, 2017 23:43
Example for extracting SubreadSet Bam files from a SMRT Link Job
#!/usr/bin/env python
import logging
import sys
from pbcommand.services import ServiceAccessLayer as S
from pbcore.io.dataset import SubreadSet
log = logging.getLogger(__name__)
mpkocher / compute_core_hours.py
Last active December 28, 2017 17:36
Compute Core Hours
#!/usr/bin/env python
import json
import sys
import argparse
import os
import operator
def extract_table_values(dt, column_id):
    for column_d in dt['columns']:
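
The preview above is cut off after the loop header. The following is a minimal sketch of how such a column lookup might continue and feed a core-hours total, not the original script. Only `dt['columns']` appears in the source; the `'id'` and `'values'` keys, the column ids, the sample table, and the core-hours arithmetic are assumptions for illustration.

```python
def extract_table_values(dt, column_id):
    """Return the values of the column whose id matches column_id."""
    for column_d in dt['columns']:
        if column_d.get('id') == column_id:  # 'id' key is an assumption
            return column_d.get('values', [])  # 'values' key is an assumption
    raise KeyError("Unable to find column id {}".format(column_id))


# Hypothetical table: per-task run time (seconds) and number of cores used.
table = {
    'columns': [
        {'id': 'run_time_sec', 'values': [3600, 1800]},
        {'id': 'nproc', 'values': [8, 4]},
    ]
}

run_times = extract_table_values(table, 'run_time_sec')
nprocs = extract_table_values(table, 'nproc')

# Core hours = sum over tasks of (cores used * run time in hours).
core_hours = sum(n * (t / 3600.0) for n, t in zip(nprocs, run_times))
print(core_hours)
```

Raising `KeyError` for an unknown column id, instead of returning an empty list, keeps a misspelled id from silently producing a total of zero.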