@selbyk
selbyk / compile-nginx.sh
Last active April 14, 2016 19:41
Compile nginx for HTTP/2, bash script for compressing static files, and a config
#!/bin/bash
./configure \
--sbin-path=/usr/sbin/nginx \
--conf-path=/etc/nginx/nginx.conf \
--pid-path=/var/run/nginx.pid \
--with-http_ssl_module \
--with-http_v2_module \
--with-http_gzip_static_module \
--with-pcre=../pcre-8.38 \
--with-zlib=../zlib-1.2.8
import Ember from 'ember';
export default Ember.Component.extend({
tagName: 'audio',
attributeBindings: ['src', 'preload'],
src: Ember.computed.alias('url'),
preload: 'none',
didInsertElement() {
let _this = this;
audiojs.events.ready(function() {
selbyk / archlinuxusb.sh
Created March 18, 2016 02:18
archlinuxusb.sh
#!/bin/bash
sudo dd bs=4M if=archlinux-2016.03.01-dual.iso of=/dev/sdd status=progress && sync
selbyk / replace.sh
Created March 17, 2016 16:37
Format MS PowerPoint slide text from an operating systems course into Markdown
#!/bin/bash
# Usage: ./replace.sh
#* Adapted from slides by Reva Freedman (The UNIX Systems), NIU
### Feng Chen
### Tue/Thu 5:00-6:30PM
#1120 Patrick F. Taylor Hall
# https://github.com/selbyk/operating-systems
selbyk / process_pids.sh
Created March 17, 2016 16:34
iterate through all running Linux processes and just print the process ID
#!/bin/bash
# Usage: ./process_pids.sh
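for proc in /proc/*
do
  # Strip the leading path so only the directory name remains
  FILENAME=${proc##*/}
  # Process directories under /proc have purely numeric names (the PID)
  if [[ $FILENAME =~ ^[0-9]+$ ]]
  then
    echo "$FILENAME"
  fi
done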
selbyk / find_running_process.sh
Created March 17, 2016 16:13
accepts a command line argument for the process name, returns an exit code of 1 if the process is currently running
#!/bin/bash
# Usage: ./find_running_process.sh <process_name>
# DEBUG=1 ./find_running_process.sh <process_name>
# Function to help with debug messages
debug_message () {
  # Treat an unset DEBUG as 0 so the numeric test does not error out
  if [ "${DEBUG:-0}" -eq 1 ]
  then
    echo "$1"
  fi
}
Scraper/Content Extraction Training
Goal: Fetch relevant information sources, extract only the appropriate content, and save it as documents that serve as training data and are usable by Watson
Method:
Fetch a few pages from various data sources using PhantomJS, then parse and save each website’s HTML as JSON
Iterate over the text elements and extract features such as size, position, text, CSS properties, etc.
Run the DBSCAN clustering algorithm over each document’s extracted feature data. Similar elements such as titles, headers, and article content should be grouped into the same clusters
Manually tag a portion of the documents to use as training data
Train a support vector machine (SVM) with a linear kernel, evaluated with 4-fold cross-validation; it should be capable of detecting the main content of a scraped page
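The clustering step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's code: each text element is reduced to a hypothetical two-number feature vector `(font_size, y_position)`, the sample data and the `eps` / `min_pts` values are made up, and a small hand-written DBSCAN stands in for whatever library the project actually used. The SVM step is not shown.

```python
import math

NOISE = -1  # label for elements that belong to no dense cluster


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i], including i."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned while expanding an earlier cluster
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep growing
    return labels


# Hypothetical features: (font size in px, vertical position in px)
ELEMENTS = [
    (24.0, 40.0), (24.0, 60.0),                   # two heading lines
    (14.0, 100.0), (14.0, 120.0), (14.0, 140.0),  # body paragraphs
    (10.0, 900.0),                                # isolated footer text
]

if __name__ == "__main__":
    print(dbscan(ELEMENTS, eps=25.0, min_pts=2))
```

With real pages the features would need scaling before clustering, since raw pixel positions would otherwise dominate smaller-valued features such as font size.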
#include <vector>
#include <fstream>
#include <iostream>
#include <ctype.h>
#include "fann.h"
#include "fann_cpp.h"
using namespace std;
void error(const char* p, const char* p2 = ""){
selbyk / php.ini
Created April 24, 2013 00:31
PHP5 Apache php.ini
[PHP]
;;;;;;;;;;;;;;;;;;;
; About php.ini ;
;;;;;;;;;;;;;;;;;;;
; PHP's initialization file, generally called php.ini, is responsible for
; configuring many of the aspects of PHP's behavior.
; PHP attempts to find and load this configuration from a number of locations.
; The following is a summary of its search order:
selbyk / threaded_vectors.clj
Created April 18, 2013 20:38
10 threads manipulating one shared data structure, which consists of 100 vectors each one containing 10 (initially sequential) unique numbers. Each thread then repeatedly selects two random positions in two random vectors and swaps them. All changes to the vectors occur in transactions by making use of Clojure's software transactional memory sys…
(defn run [nvecs nitems nthreads niters]
(let [vec-refs (vec (map (comp ref vec)
(partition nitems (range (* nvecs nitems)))))
swap #(let [v1 (rand-int nvecs)
v2 (rand-int nvecs)
i1 (rand-int nitems)
i2 (rand-int nitems)]
(dosync
(let [temp (nth @(vec-refs v1) i1)]
(alter (vec-refs v1) assoc i1 (nth @(vec-refs v2) i2))