Last active
December 9, 2015 21:08
Gist peccu/4328957
Given a data frame describing network-like relations, convert it into something like an adjacency matrix and return it.
The input is meant to represent which attributes are active in each record.
The sample input is meant to represent the data
{
  {am,loc1,act1},
  {pm,loc2,act2},
  {am,loc3,act2},
  {pm,loc3,act2}
}
where each inner set is one record.
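One way to turn the set notation above into the 0/1 table that the functions expect is sketched below. The helper name `recordsToIndicator` and the `records` list are hypothetical, introduced only for illustration; the original code reads the table from a CSV file instead.

```r
## Hypothetical helper (not part of the original code):
## list of attribute sets -> 0/1 indicator data frame
recordsToIndicator <- function(records, attrs = unique(unlist(records))){
  ## One row per record, one column per attribute; 1 where the attribute occurs
  m <- do.call(rbind, lapply(records, function(rec) as.integer(attrs %in% rec)))
  colnames(m) <- attrs
  as.data.frame(m)
}

records <- list(
  c("am", "loc1", "act1"),
  c("pm", "loc2", "act2"),
  c("am", "loc3", "act2"),
  c("pm", "loc3", "act2")
)
## Fix the column order so that it matches the sample table
attrs <- c("am", "pm", "loc1", "loc2", "loc3", "act1", "act2")
n <- recordsToIndicator(records, attrs)
print(n)
```

Passing `attrs` explicitly is a design choice: it keeps the column order stable even when the first record does not mention every attribute.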
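The output is a conversion into something like an adjacency matrix.
Both the rows and the columns are the attributes (am, pm, loc1, loc2, loc3, act1, act2), and each element is the number of records in which the corresponding pair occurs together.
(am,am)=2 is the number of occurrences of am itself,
(am,loc1)=1 and (am,act1)=1 come from the relation {am,loc1,act1},
(am,loc3)=1 and (am,act2)=1 come from the relation {am,loc3,act2},
(pm,act2)=2 comes from the relations {pm,loc2,act2} and {pm,loc3,act2}, and so on.
Entries with no relation are 0.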
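The entries quoted above can be checked directly by counting records. This is a minimal sketch that assumes the sample table is typed in by hand as the data frame `n`; `cooccur` is a hypothetical helper name, not part of the original code.

```r
## Sample table, one row per record and one 0/1 column per attribute
n <- data.frame(
  am   = c(1, 0, 1, 0),
  pm   = c(0, 1, 0, 1),
  loc1 = c(1, 0, 0, 0),
  loc2 = c(0, 1, 0, 0),
  loc3 = c(0, 0, 1, 1),
  act1 = c(1, 0, 0, 0),
  act2 = c(0, 1, 1, 1)
)

## Hypothetical helper: number of records in which both i and j are 1
cooccur <- function(n, i, j) sum(n[[i]] == 1 & n[[j]] == 1)

cooccur(n, "am", "am")    # 2: occurrences of am itself
cooccur(n, "am", "loc1")  # 1: from {am,loc1,act1}
cooccur(n, "pm", "act2")  # 2: from {pm,loc2,act2} and {pm,loc3,act2}
cooccur(n, "pm", "loc1")  # 0: no relation
```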
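adjacency(): simply (i,j) = number of records in which both i and j occur.
adjacencyNormalize(): (i,j) = (number of records with both i and j) / (number of records with i), so the diagonal entries should all be 1.
adjacencyBinominal(): (i,j) = 1 if i and j occur together at least once, however often, and 0 otherwise.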
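For a 0/1 table, the three definitions above can also be written without an explicit loop, because the co-occurrence counts are the matrix product t(X) %*% X. The sketch below is an alternative formulation, not the original implementation. The variable names `counts`, `normalized`, and `binary` are made up for illustration, and it assumes every attribute occurs at least once (otherwise the division yields NaN).

```r
## Sample table, one row per record and one 0/1 column per attribute
n <- data.frame(
  am   = c(1, 0, 1, 0),
  pm   = c(0, 1, 0, 1),
  loc1 = c(1, 0, 0, 0),
  loc2 = c(0, 1, 0, 0),
  loc3 = c(0, 0, 1, 1),
  act1 = c(1, 0, 0, 0),
  act2 = c(0, 1, 1, 1)
)
X <- as.matrix(n)

## (i,j) = number of records containing both i and j; same as t(X) %*% X
counts <- crossprod(X)

## Divide row i by the number of occurrences of i (the diagonal of counts).
## The vector is recycled down the columns, so element (i,j) is divided by
## the i-th diagonal entry.
normalized <- counts / diag(counts)

## 1 where a relation exists at all, 0 otherwise
binary <- (counts > 0) * 1

print(counts)
print(normalized)
print(binary)
```

Note that `counts` and `binary` are symmetric, while `normalized` generally is not, since each row is scaled by a different attribute count.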
adjacency <- function(n){
  ## Initialize the output as a k x k zero matrix (k = number of attributes)
  mydata <- diag(0, ncol(n))
  ## Name the rows and columns after the attributes
  colnames(mydata) <- colnames(n)
  rownames(mydata) <- colnames(n)
  ## r is a column (attribute) index
  for(r in seq_len(ncol(n))){
    ## Row r gets the column sums taken over
    mydata[r,] <- colSums(
      ## only the records in which column r is 1
      ## (drop = FALSE keeps a single matching row two-dimensional)
      n[n[,r] == 1, , drop = FALSE]
    )
  }
  mydata
}
## Store the column sums divided by the occurrence count of the row attribute.
## The edge value for (am, loc1) is (count of am and loc1) / (count of am).
## Roughly: mydata["am","loc1"] = sum(n$am == 1 & n$loc1 == 1) / sum(n$am)
adjacencyNormalize <- function(n){
  ## Initialize the output as a k x k zero matrix (k = number of attributes)
  mydata <- diag(0, ncol(n))
  ## Name the rows and columns after the attributes
  colnames(mydata) <- colnames(n)
  rownames(mydata) <- colnames(n)
  ## r is a column (attribute) index
  for(r in seq_len(ncol(n))){
    ## Row r gets the column sums divided by the total of column r
    mydata[r,] <- colSums(
      ## taken over the records in which column r is 1
      n[n[,r] == 1, , drop = FALSE]
      ## divided by the total of column r (NaN if attribute r never occurs)
    ) / sum(n[,r])
  }
  mydata
}
## 1 if the relation exists, 0 otherwise, regardless of how often it occurs
adjacencyBinominal <- function(n){
  ## Initialize the output as a k x k zero matrix (k = number of attributes)
  mydata <- diag(0, ncol(n))
  ## Name the rows and columns after the attributes
  colnames(mydata) <- colnames(n)
  rownames(mydata) <- colnames(n)
  ## r is a column (attribute) index
  for(r in seq_len(ncol(n))){
    ## Row r gets the co-occurrence counts; every nonzero count is then set to 1
    mydata[r,] <- colSums(n[n[,r] == 1, , drop = FALSE])
    mydata[r, mydata[r,] != 0] <- 1
  }
  mydata
}
## Read a file and convert data frame -> adjacency matrix
adjacencyFromFile <- function(file = "input.csv"){
  ## Read comma-separated data whose first row holds the column names
  n <- read.table(file, header = TRUE, sep = ",")
  ## colnames(n)
  adjacency(n)
}
## adjacencyFromFile()
am | pm | loc1 | loc2 | loc3 | act1 | act2
---|---|---|---|---|---|---
1 | 0 | 1 | 0 | 0 | 1 | 0
0 | 1 | 0 | 1 | 0 | 0 | 1
1 | 0 | 0 | 0 | 1 | 0 | 1
0 | 1 | 0 | 0 | 1 | 0 | 1
> (n <- read.table("input.csv", header=TRUE,sep = ","))
  am pm loc1 loc2 loc3 act1 act2
1  1  0    1    0    0    1    0
2  0  1    0    1    0    0    1
3  1  0    0    0    1    0    1
4  0  1    0    0    1    0    1
> adjacencyBinominal(n)
     am pm loc1 loc2 loc3 act1 act2
am    1  0    1    0    1    1    1
pm    0  1    0    1    1    0    1
loc1  1  0    1    0    0    1    0
loc2  0  1    0    1    0    0    1
loc3  1  1    0    0    1    0    1
act1  1  0    1    0    0    1    0
act2  1  1    0    1    1    0    1
> adjacencyNormalize(n)
            am        pm loc1      loc2      loc3 act1 act2
am   1.0000000 0.0000000  0.5 0.0000000 0.5000000  0.5  0.5
pm   0.0000000 1.0000000  0.0 0.5000000 0.5000000  0.0  1.0
loc1 1.0000000 0.0000000  1.0 0.0000000 0.0000000  1.0  0.0
loc2 0.0000000 1.0000000  0.0 1.0000000 0.0000000  0.0  1.0
loc3 0.5000000 0.5000000  0.0 0.0000000 1.0000000  0.0  1.0
act1 1.0000000 0.0000000  1.0 0.0000000 0.0000000  1.0  0.0
act2 0.3333333 0.6666667  0.0 0.3333333 0.6666667  0.0  1.0
> adjacency(n)
     am pm loc1 loc2 loc3 act1 act2
am    2  0    1    0    1    1    1
pm    0  2    0    1    1    0    2
loc1  1  0    1    0    0    1    0
loc2  0  1    0    1    0    0    1
loc3  1  1    0    0    2    0    2
act1  1  0    1    0    0    1    0
act2  1  2    0    1    2    0    3