@selbyk
selbyk / compile-nginx.sh
Last active April 14, 2016 19:41
Compile nginx for HTTP/2, bash script for compressing static files, and a config
#!/bin/bash
./configure \
--sbin-path=/usr/sbin/nginx \
--conf-path=/etc/nginx/nginx.conf \
--pid-path=/var/run/nginx.pid \
--with-http_ssl_module \
--with-http_v2_module \
--with-http_gzip_static_module \
--with-pcre=../pcre-8.38 \
--with-zlib=../zlib-1.2.8
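The gist description also mentions a bash script for compressing static files, which is not shown in this preview. With `--with-http_gzip_static_module` compiled in and `gzip_static on;` set in the config, nginx can serve a pre-compressed `file.ext.gz` sitting next to `file.ext`. A minimal sketch of such a script follows; the function name `precompress_static`, the extension list, and the example path are assumptions for illustration, not the author's actual script.

```shell
#!/bin/bash
# Hypothetical helper (not from the gist): pre-compress static assets so that
# nginx's gzip_static module can serve "file.ext.gz" in place of "file.ext".
precompress_static() {
  local root="$1"
  # Only text-like assets are compressed; this extension list is an assumption.
  find "$root" -type f \
    \( -name '*.html' -o -name '*.css' -o -name '*.js' -o -name '*.svg' \) -print0 |
    while IFS= read -r -d '' file; do
      # Write the compressed copy next to the original and keep the original.
      gzip -9 -c "$file" > "$file.gz"
    done
}

# Example (path is illustrative): precompress_static /usr/share/nginx/html
```

Writing with `gzip -c` and a redirect keeps the original file in place, so nginx can still serve the uncompressed version to clients that do not accept gzip.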
import Ember from 'ember';
export default Ember.Component.extend({
tagName: 'audio',
attributeBindings: ['src', 'preload'],
src: Ember.computed.alias('url'),
preload: 'none',
didInsertElement() {
let _this = this;
audiojs.events.ready(function() {
selbyk / archlinuxusb.sh
Created March 18, 2016 02:18
archlinuxusb.sh
#!/bin/bash
sudo dd bs=4M if=archlinux-2016.03.01-dual.iso of=/dev/sdd status=progress && sync
selbyk / replace.sh
Created March 17, 2016 16:37
Format MS PowerPoint slide text from an operating systems course into Markdown
#!/bin/bash
# Usage: ./replace.sh
#* Adapted from slides by Reva Freedman (The UNIX Systems), NIU
### Feng Chen
### Tue/Thu 5:00-6:30PM
#1120 Patrick F. Taylor Hall
# https://github.com/selbyk/operating-systems
selbyk / process_pids.sh
Created March 17, 2016 16:34
Iterate through all running Linux processes and print each process ID
#!/bin/bash
# Usage: ./process_pids.sh
for proc in /proc/*
do
  FILENAME=${proc##*/}
  # Process directories under /proc have purely numeric names
  if [[ $FILENAME =~ ^[0-9]+$ ]]
  then
    echo "$FILENAME"
  fi
done
selbyk / find_running_process.sh
Created March 17, 2016 16:13
Accepts a command-line argument for the process name; returns an exit code of 1 if the process is currently running
#!/bin/bash
# Usage: ./find_running_process.sh <process_name>
# DEBUG=1 ./find_running_process.sh <process_name>
# Function to help with debug messages
debug_message () {
  # Treat an unset DEBUG as 0 so the numeric test does not error out
  if [ "${DEBUG:-0}" -eq 1 ]
  then
    echo "$1"
  fi
}
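The preview of this script is cut off before the actual lookup. As a rough sketch of how such a check might work on Linux, one option is to compare the argument against each `/proc/<pid>/comm` entry. The function name `find_running_process` and the use of `comm` are assumptions, not the gist's confirmed implementation; the exit code follows the description (1 when the process is running).

```shell
#!/bin/bash
# Hypothetical sketch, not the gist's actual implementation.
# Returns 1 if a process whose command name equals $1 is running, 0 otherwise
# (this inverted convention follows the gist description).
find_running_process() {
  local name="$1" comm_file
  for comm_file in /proc/[0-9]*/comm; do
    # A process may exit between the glob and the read, so ignore read errors.
    if [ "$(cat "$comm_file" 2>/dev/null)" = "$name" ]; then
      return 1
    fi
  done
  return 0
}

# Example: find_running_process nginx; echo "exit code: $?"
```

Note that the kernel truncates `comm` to 15 characters, so long process names would need to be matched against `/proc/<pid>/cmdline` instead.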
Scraper/Content Extraction Training
Goal: Fetch relevant information sources, extract only the appropriate content, and save it as documents that serve as training data and are usable by Watson.
Method:
1. Fetch a few pages from various data sources using PhantomJS, then parse each page's HTML and save it as JSON.
2. Iterate over the text elements and extract features such as size, position, text, and CSS properties.
3. Run the DBSCAN clustering algorithm over each document's extracted feature data. Similar elements such as titles, headers, and article content should be grouped into the same clusters.
4. Manually tag a portion of the documents to use as training data.
5. Train a support vector machine (SVM) with a linear kernel, evaluated with 4-fold cross-validation; it should be capable of detecting the main content of a scraped page.
selbyk / movie.sh
Last active December 5, 2015 20:02
#!/bin/bash
# Set to number of cores your computer has or you want to use
export MAGICK_THREAD_LIMIT=8
# Delete old files
rm *.jpg
rm *.png
rm *.gif
rm *.mp4
#!/bin/bash
rm *.jpg
rm *.png
rm *.gif
rm *.mp4
find ../ -mindepth 1 -maxdepth 1 -mtime -7 -name "*.jpg" -exec cp -t . {} +
export MAGICK_THREAD_LIMIT=4
# This is an example of the kind of things you can do in a configuration file.
# All flags used by the client can be configured here. Run Let's Encrypt with
# "--help" to learn more about the available options.
# Use a 4096 bit RSA key instead of 2048
rsa-key-size = 4096
# Use the production ACME v1 server (the staging/testing server is
# https://acme-staging.api.letsencrypt.org/directory)
server = https://acme-v01.api.letsencrypt.org/directory