illy / reordering2.r
Last active December 15, 2015 14:59
library(ggplot2)

# Read the tab-separated sample data (expects columns Type and Word).
sample <- read.table("~/Dropbox/sample.txt", header=TRUE, sep="\t")
p <- ggplot(sample)
# Notched boxplots of Word per Type, outliers hidden, x axis in a fixed order.
p <- p + geom_boxplot(aes(x=factor(Type), y=Word, fill=factor(Type)), notch=TRUE, outlier.shape=NA) +
  theme(axis.text.x=element_text(angle=15, hjust=0.8, vjust=1, size=12),
        axis.text.y=element_text(size=12)) +
  guides(fill="none") + scale_fill_grey() +
  scale_x_discrete(limits=c("NSR", "stock-related", "NTR", "ticker-related", "NEG", "NEU", "POS")) +
  scale_y_continuous(limits=c(0, 30))
print(p)
illy / reordering1.r
Last active December 15, 2015 14:59
Reordering in ggplot
library(ggplot2)

# Read the tab-separated sample data (expects columns Type and Word).
sample <- read.table("~/Dropbox/sample.txt", header=TRUE, sep="\t")
p <- ggplot(sample)
# Notched boxplots of Word per Type, outliers hidden, x axis in default order.
p <- p + geom_boxplot(aes(x=factor(Type), y=Word, fill=factor(Type)), notch=TRUE, outlier.shape=NA) +
  theme(axis.text.x=element_text(angle=15, hjust=0.8, vjust=1, size=12),
        axis.text.y=element_text(size=12)) +
  guides(fill="none") + scale_fill_grey() +
  scale_y_continuous(limits=c(0, 30))
print(p)
illy / Useful_R_Commands.md
Last active December 14, 2015 08:28
Some useful R commands.

##1 Data manipulation

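  1. If a column contains non-numeric entries (for example, a missing-value marker that R does not recognize as NA), R reads it as a factor (or as character in R 4.0 and later), not numeric. Convert it via character: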

     DATA$COLUMN <- as.numeric(as.character(DATA$COLUMN))
    
  2. Rename the column:

     names(DATA)[2] <- "NEW_NAME"
    
illy / crawl.sh
Last active December 11, 2015 03:58
script for crawling and preprocessing tweet data
#!/usr/bin/env bash
## This script crawls tweets from the addresses listed in a given file.
DIR="PARENT_DIR/$(date "+%d-%m-%y-%H:%M")" # name the download directory after the download date and time
mkdir -p "$DIR" # create that directory
wget -i EXTERNAL_ADDRESS_LIST -np -r -N -l1 -P "$DIR"
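The directory-naming step of crawl.sh can be tried offline. The following is a minimal sketch under assumptions: the function name `make_download_dir` and the temporary parent created with `mktemp` are illustrative and not part of the original script, and the `wget` step is left out.

```shell
#!/usr/bin/env bash
# make_download_dir: create a directory named after the current date and time
# under the given parent, then print its path.
# The function name is hypothetical; the date format is the one used in crawl.sh.
make_download_dir() {
  local parent="$1"
  local dir
  dir="$parent/$(date '+%d-%m-%y-%H:%M')"
  mkdir -p "$dir" && printf '%s\n' "$dir"
}

parent="$(mktemp -d)"                      # throwaway parent so nothing real is touched
created="$(make_download_dir "$parent")"
echo "created: $created"
rm -rf "$parent"                           # clean up the throwaway directory
```

Quoting the variable matters once the parent path may contain spaces; the timestamp itself contains a colon, which is fine on Linux but may not be on every filesystem.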
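library(reshape2); library(ggplot2) # melt() and plotting
# Cross-correlations of the four price columns, melted to long format.
m.geQuote <- as.matrix(geQuote[,2:5])
acf.geQuote <- acf(m.geQuote, lag.max=5, plot=FALSE, na.action=na.contiguous)
m.acf.geQuote <- melt(acf.geQuote$acf)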
str(acf.geQuote)
#> List of 6
#>  $ acf   : num [1:5, 1:4, 1:4] 1 -0.1917 -0.478 0.1049 0.0648 ...
#>  $ type  : chr "correlation"
#>  $ n.used: int 5
#>  $ lag   : num [1:5, 1:4, 1:4] 0 1 2 3 4 0 -1 -2 -3 -4 ...
#>  $ series: chr "m.geQuote"
#>  $ snames: chr [1:4] "Open" "Close" "Low" "High"
#>  - attr(*, "class")= chr "acf"
p <- ggplot(m.acf.geQuote)
# One heat-map panel per Var3 level; fill encodes the correlation value.
p <- p + geom_raster(aes(x=Var1, y=Var2, fill=value)) +
  facet_wrap(~Var3, nrow=4) +
  ggtitle("Cross-correlation of 4 different prices of GE ticker") +
  theme(legend.position="none") +
  labs(fill="Correlation") +
  xlab("") + ylab("")
print(p)
m.acf.geQuote$Var1[m.acf.geQuote$Var1 == 1] <- "Day0"
m.acf.geQuote$Var1[m.acf.geQuote$Var1 == 2] <- "Day1"
m.acf.geQuote$Var1[m.acf.geQuote$Var1 == 3] <- "Day2"
m.acf.geQuote$Var1[m.acf.geQuote$Var1 == 4] <- "Day3"
m.acf.geQuote$Var1[m.acf.geQuote$Var1 == 5] <- "Day4"
m.acf.geQuote$Var1 <- factor(m.acf.geQuote$Var1,
                             levels=unique(m.acf.geQuote$Var1), ordered=TRUE)
m.acf.geQuote$Var2[m.acf.geQuote$Var2 == 1] <- "Open"
m.acf.geQuote$Var2[m.acf.geQuote$Var2 == 2] <- "Close"
illy / ggplot2_heat_map.r
Last active December 10, 2015 22:59
Sample script that uses ggplot to plot ACF matrix data.
library(reshape2) # melt()
m.geQuote <- as.matrix(geQuote[,2:5])
acf.geQuote <- acf(m.geQuote, lag.max=5, plot=FALSE, na.action=na.contiguous)
m.acf.geQuote <- melt(acf.geQuote$acf)
m.acf.geQuote$Var1[m.acf.geQuote$Var1 == 1] <- "Day0"
m.acf.geQuote$Var1[m.acf.geQuote$Var1 == 2] <- "Day1"
m.acf.geQuote$Var1[m.acf.geQuote$Var1 == 3] <- "Day2"
m.acf.geQuote$Var1[m.acf.geQuote$Var1 == 4] <- "Day3"
m.acf.geQuote$Var1[m.acf.geQuote$Var1 == 5] <- "Day4"
illy / sed and awk notes.md
Created July 31, 2012 22:15
sed and awk notes

##AWK notes##

  1. selective printing

     awk '$2 ~ /regex/ { $1=""; print $0 }'

If $2 matches regex, print the whole line with $1 blanked out.
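
A minimal sketch of the selective-printing pattern, wrapped in a shell function so the regex can be passed as an argument. The function name `strip_first_field` and the three-column sample input are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# strip_first_field: print lines whose 2nd field matches a regex, with field 1 blanked.
# The function name and the sample data below are illustrative assumptions.
strip_first_field() {
  local pattern="$1"
  # -v passes the shell value into awk; "$2 ~ re" treats it as a dynamic regex.
  awk -v re="$pattern" '$2 ~ re { $1 = ""; print $0 }'
}

# Keep only the rows whose second column starts with "b".
printf 'id1 apple red\nid2 banana yellow\nid3 blueberry blue\n' | strip_first_field '^b'
```

Assigning to `$1` makes awk rebuild the line with the output field separator, so each printed line keeps a leading space where the first field used to be.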

  1. convert a single line to multiple lines