disulfidebond disulfidebond

  • UWisconsin-Madison
  • Madison, WI
@disulfidebond
disulfidebond / BashQuickReference.md
Created May 24, 2017 16:50
Quick Reference and Guide for Bash

Bash Cheatsheet and Quick Reference

Introduction

This guide provides an overview of Bash and is divided into sections: Introduction, Overview, Advanced, Awk, Sed, and Grep. The sections overlap in places; for example, the Advanced section describes piping in Bash using grep. External references are given as links, and Bash commands are set in italics. Some commands can be copied and pasted, but use caution when copying from this guide, and extreme caution when copying Bash commands from an unknown or unverified source, because there is no "undo" once a command is executed in a Bash terminal window.

Key Concepts:

  • Comments in bash are denoted with a hash # symbol:

echo 'Hello World' # <- everything after here will be ignored by Bash
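As a small illustration of the comment rule above, the sketch below shows a full-line comment, a trailing comment, and a `#` inside quotes, which Bash treats as ordinary text. The variable names and strings are made up for this example.

```shell
# A full-line comment: Bash ignores this entire line.
greeting='Hello World'   # a trailing comment starts at the unquoted '#'

# Inside quotes, '#' is ordinary text rather than the start of a comment.
quoted='Issue #42 is still open'

echo "$greeting"
echo "$quoted"
```

The second `echo` prints the `#42` because the hash sits inside the quoted string.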

disulfidebond / dockersetup.sh
Last active February 22, 2018 05:45
Setup for Docker on Ubuntu Xenial Xerus
#!/bin/sh
sudo apt-get update
sleep 1
sudo apt-get install apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
disulfidebond / FileManipulationCookbook
Last active March 12, 2024 13:00
Excel File Manipulation Cookbook
Notes: A hash symbol (#) denotes a comment that is not part of any code or Excel formula entry
The following is a cookbook of solutions to common problems with data file manipulation.
When possible, solutions for fixes within Excel and outside of Excel are provided.
I. Problem: There are duplicates in a column, but you don't want to lose row ordering
Solution Within Excel: Pick a column with unique identifiers, or create a new column with unique identifiers in each row with the command:
=1 # add this to the first cell in the column
=[row]+1 # copy and paste this to each subsequent cell in the column
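For a fix outside of Excel, one possible approach (a sketch, not taken from the cookbook itself) is to export the sheet as CSV and let awk keep only the first row for each value in the duplicated column. Because awk reads the file top to bottom, the original row order is preserved. The sample data and the choice of column 1 are assumptions for illustration, and the sketch assumes no field contains a quoted comma.

```shell
# Hypothetical sample data: column 1 contains the duplicates.
# seen[$1]++ is 0 (false) the first time a key appears, so
# !seen[$1]++ is true only for the first row with each key.
deduped=$(printf '%s\n' 'id,value' 'a,1' 'b,2' 'a,3' 'c,4' 'b,5' |
  awk -F',' '!seen[$1]++')

echo "$deduped"
```

To run this on a real export, replace the `printf` pipeline with a file argument to awk and change `$1` to the column that holds the duplicates.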
disulfidebond / bash_profile_example
Created August 30, 2018 23:08
bash profile with comments
# standard bash profile
# segments borrowed from stephnell on Github
# PATH
# You need to have this line of code. It sets up your Bash $PATH, which is the list of directories that Bash searches to find the programs you run
# the syntax is:
# export -> this tells Bash to set up PATH with the following values:
# PATH= -> this tells Bash to assign whatever you type next to the PATH variable
# $PATH: -> this is *very* important--it tells Bash not to throw away what was in your PATH already,
# # but to append to it instead. If you leave this out, your Bash shell may not work!
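The comments above describe the pieces of a PATH line; a minimal sketch of how they fit together follows. The directory `$HOME/bin` and the variable names `extra_dir` and `old_path` are assumptions chosen for this example, not part of the original profile.

```shell
# Hypothetical directory to add; replace it with one that holds your programs.
extra_dir="$HOME/bin"

# Remember the current value so the change is easy to inspect afterwards.
old_path="$PATH"

# Keep the existing entries ($PATH) and append the new directory after a colon.
export PATH="$PATH:$extra_dir"

echo "$PATH"
```

Writing `export PATH="$extra_dir"` without `$PATH:` would replace the existing search list, which is the failure the comments above warn about.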
disulfidebond / labkey_setup_ubuntu.sh
Last active August 31, 2018 02:57
Script that automates labkey setup
#!/bin/sh
echo 'WARNING!! THIS SETUP IS NOT INTENDED'
echo 'FOR A PRODUCTION ENVIRONMENT IN ANY CAPACITY!!!'
sleep 3
STARTDIR=$(pwd)
mkdir installspacepg
cd ./installspacepg
disulfidebond / software_carpentry_2018_etherpad.txt
Created August 31, 2018 02:59
Etherpad from Software Carpentry 2018 Workshop
Welcome to Software Carpentry Etherpad!
This pad is synchronized as you type, so that everyone viewing this page sees the same text. This allows you to collaborate seamlessly on documents.
Use of this service is restricted to members of the Software Carpentry and Data Carpentry community; this is not for general purpose use (for that, try etherpad.wikimedia.org).
Users are expected to follow our code of conduct: https://docs.carpentries.org/topic_folders/policies/code-of-conduct.html
All content is publicly available under the Creative Commons Attribution License: https://creativecommons.org/licenses/by/4.0/
disulfidebond / parse_mhc_workflow_pt1.md
Last active September 2, 2018 20:43
parsing IPD data and formatting it

Description of Workflow

Step 1: Download the MHC dataset from IPD. It has the file extension ".dat", but it can be viewed and edited as a text file. Note that the file is very large, so opening it in Atom, BBEdit, or a similar text editor or IDE is strongly discouraged.

A preview of this dataset is available here:

                login$ head -n 100 MHC_dat.txt
                ID   NHP00001
                XX
                DT 15/07/2008 (Release)

disulfidebond / parse_mhc_workflow_pt2.md
Last active September 2, 2018 20:45
parsing IPD data and formatting it, continued

Description of Workflow

Step 2: Use the Python script to parse the filtered output from part 1 (most likely the file named 'parsed_mhc_output.txt'). The script uses the argparse module, so it can be called from the command line with one of the following options. Note that output goes to STDOUT, so it must be redirected to a file to be saved.

  • exonList -> a text file with a comma separated list of exons

  • fastaExonList -> a fasta formatted list of exons

  • mergedFastaList -> a fasta formatted list of cDNA sequences

              #!/usr/bin/python3
              import argparse
              import time

disulfidebond / parse_mhc_workflow_pt3.md
Last active October 7, 2018 19:00
parsing IPD data and formatting it, continued

Description of Workflow

Step 3: Create a fasta file of exons that extends the window for mapping reads by N base pairs in both directions, where N is the sequencing read length in base pairs. For Illumina, this is usually 75 or 150.

  • Required parameters:
    • ntype -> the type of input, possible options are:
      • rna -> indicates that the input sequenced read data is from rna/cDNA (mutually exclusive with dna)
      • dna -> indicates that the input sequenced read data is from dna (mutually exclusive with rna, do not enter dna for cDNA data)
    • filteredfile -> the file containing the input sequenced reads that will be used. Must be in MHC.dat format
  • Other parameters:
    • flength -> the length of padding for the reads, the default is 75
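The window extension described in Step 3 comes down to simple coordinate arithmetic, sketched below. The function name `pad_window`, the 1-based coordinates, and the clamping of the start position at 1 are assumptions for illustration; the original script may handle sequence boundaries differently.

```shell
# pad_window START END [FLENGTH]
# Prints the exon window extended by FLENGTH base pairs on each side.
# FLENGTH defaults to 75, matching the default of the flength parameter.
# The body runs in a subshell so its variables do not leak to the caller.
pad_window() (
  start=$1
  end=$2
  flength=${3:-75}

  padded_start=$(( start - flength ))
  padded_end=$(( end + flength ))

  # Assumption: coordinates are 1-based, so the start cannot go below 1.
  if [ "$padded_start" -lt 1 ]; then
    padded_start=1
  fi

  echo "$padded_start $padded_end"
)

pad_window 500 770      # prints "425 845"
pad_window 40 310 150   # prints "1 460"
```

The second call shows the clamping: 40 minus 150 would be negative, so the window starts at position 1 instead.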
disulfidebond / Mac_OSX_keychain_options_commandline.md
Created October 6, 2018 21:39
Mac OSX keychain options when using commandline

Unlock Keychain

  • Explanation: this can be used remotely (or within a Terminal session) to unlock the keychain
  • Requirements/Restrictions:
    • You must know the keychain password
    • This is confirmed to work on Mac OSX High Sierra, but may have different usage on other Mac OSX versions (Sierra, Mojave, etc)
  • Cautions/Warnings:
    • This will unlock your keychain, which is a potential security risk that you may not want to happen.

      # command to use in Bash, will prompt for the password
      

If no keychain is specified, then the default will be used
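The command itself is not shown in this preview. On macOS, the built-in `security` tool has an `unlock-keychain` subcommand that prompts for the password and falls back to the default keychain when none is named, which matches the description above; treating it as the intended command is an assumption. The wrapper below is a minimal sketch, and the function name `unlock_keychain` is hypothetical.

```shell
# unlock_keychain [KEYCHAIN_PATH]
# Thin wrapper around the macOS `security unlock-keychain` subcommand.
unlock_keychain() {
  # The security tool only exists on macOS; fail clearly elsewhere.
  if ! command -v security >/dev/null 2>&1; then
    echo "security not found; this sketch only applies to macOS" >&2
    return 1
  fi

  # With no argument, "$@" expands to nothing and the default keychain
  # is unlocked; the tool prompts for the keychain password.
  security unlock-keychain "$@"
}
```

Calling `unlock_keychain` with no argument targets the default keychain, while passing a path targets a specific keychain file.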