Skip to content

Instantly share code, notes, and snippets.

@krishnanraman
Last active August 29, 2015 14:04
Show Gist options
  • Save krishnanraman/c17397667ea9090a5bc1 to your computer and use it in GitHub Desktop.
Save krishnanraman/c17397667ea9090a5bc1 to your computer and use it in GitHub Desktop.
Alfred Brauer's Ovals of Cassini, using Scalding Typed API
Idea:
The chief intuition behind Gershgorin: Simply examine matrix rows, one at a time.
But what if you look at matrix rows, two at a time ?
That is the chief intuition behind the Cassini Oval !
Details:
To obtain ballpark estimates of eigen values of matrices (especially diagonally-dominant ones)
one usually resorts to Gerschgorin disks.
The idea is that eigens lie in the union of Gershgorin disks.
But what if the eigens were to lie in a smaller oval, which is nested well within the larger Gershgorin disk ?!
Suffice it to say, you should then use the oval for the better approximation of the eigen.
These ovals, termed the Ovals of Cassini, were first described by Alfred Brauer.
Cassini ovals are not as trivial to compute as the Gershgorin circles.
Morover, an nxn matrix has nC2 ovals!
So a 10x10 matrix would have 10 Gershgorin circles, but 45 Ovals (10 Choose 2 = 10*9/2 = 45)
You'd have to union these 45 ovals...so mathematicians generally don't bother with these Ovals,
because they are more numerous & computing them is tedious.
But in these days of big data, approximations like the Count Min Sketch, HyperLogLog & QTree are the norm,
given the very large size of the problem and the inability to compute exact solutions.
Suddenly, good approximations are all the rage,
and Cassini Ovals are a valuable approximation of eigens for (very) large, especially diagonally dominant matrices.
Given an nxn matrix,
the Gershgorin is
|z-a(i)(i)| <= R(i)
ie. take the diagonal = a(i)(i) = that's the center of the disk
take the sum of the nondiagonal norms = R(i) = the radius of the disk
Do this once per row, you get n Gershgorin circles which bound n eigens.
The Cassini Oval is
|z-a(i)(i)|*|z-a(j)(j)| <= R(i)*R(j), for i != j
Choosing 2 rows given an nxn marix can be done in nC2 ways = hence the nC2 ovals.
References:
http://buzzard.ups.edu/courses/2007spring/projects/brakkenthal-paper.pdf
http://www.math.kent.edu/~varga/pub/paper_232.pdf
https://github.com/twitter/algebird/blob/develop/algebird-core/src/main/scala/com/twitter/algebird/CountMinSketch.scala
https://github.com/twitter/algebird/blob/develop/algebird-core/src/main/scala/com/twitter/algebird/HyperLogLog.scala
https://github.com/twitter/algebird/blob/develop/algebird-core/src/main/scala/com/twitter/algebird/QTree.scala
/*
TL:DR;
-------
Consider the 4x4 symmetric matrix
7 1 2 4
1 8 3 2
2 3 9 1
4 2 1 11
True L2 Norm: 15.58
Approximate L2 Norm per Cassini Oval: 16.557
Approximate L2 Norm per Gershgorin: 18
------
*/
$ git clone https://github.com/twitter/scalding.git
cd scalding
$ ./sbt scalding-repl/console
/* A 4x4 symmetric matrix
7 1 2 4
1 8 3 2
2 3 9 1
4 2 1 11
*/
scala> val symmetric = Seq(Seq(7,1,2,4), Seq(1,8,3,2), Seq(2,3,9,1), Seq(4,2,1,11))
symmetric: Seq[Seq[Int]] = List(List(7, 1, 2, 4), List(1, 8, 3, 2), List(2, 3, 9, 1), List(4, 2, 1, 11))
scala> val pipe = TypedPipe.from(symmetric.zipWithIndex)
pipe: com.twitter.scalding.typed.TypedPipe[(Seq[Int], Int)] = IterablePipe(List((List(7, 1, 2, 4),0), (List(1, 8, 3, 2),1), (List(2, 3, 9, 1),2), (List(4, 2, 1, 11),3)),cascading.flow.FlowDef@7a2fc0ff,Local(false))
scala> val diagonals = pipe.map { rowIndex =>
val (row, index) = rowIndex
(index, row(index))
}
diagonals: com.twitter.scalding.typed.TypedPipe[(Int, Int)] = IterablePipe(List((0,7), (1,8), (2,9), (3,11)),cascading.flow.FlowDef@7a2fc0ff,Local(false))
scala> val nonDiagonals = pipe.map { rowIndex =>
val (row, index) = rowIndex
(index, (0 until index).map{row} ++ (index+1 until row.size).map{row})
}
nonDiagonals: com.twitter.scalding.typed.TypedPipe[(Int, scala.collection.immutable.IndexedSeq[Int])] = IterablePipe(List((0,Vector(1, 2, 4)), (1,Vector(1, 3, 2)), (2,Vector(2, 3, 1)), (3,Vector(4, 2, 1))),cascading.flow.FlowDef@7a2fc0ff,Local(false))
scala> val nonDiagonalNormSum = nonDiagonals.map{ idxRow => val (idx, row) = idxRow; (idx, row.sum) }
nonDiagonalNormSum: com.twitter.scalding.typed.TypedPipe[(Int, Int)] = IterablePipe(List((0,7), (1,6), (2,6), (3,7)),cascading.flow.FlowDef@6a87832d,Local(false))
scala> val diagNonDiagPipe = diagonals.join(nonDiagonalNormSum)
diagNonDiagPipe: com.twitter.scalding.typed.CoGrouped[Int,(Int, Int)] = com.twitter.scalding.typed.CoGroupable$$anon$3@4833240b
scala> val nC2combinations = diagNonDiagPipe.cross(diagNonDiagPipe).filter {
ab =>
val (a,b) = ab
a._1 < b._1
}
nC2combinations: com.twitter.scalding.typed.TypedPipe[((Int, (Int, Int)), (Int, (Int, Int)))] = TypedPipeInst(Each(_pipe_12*_pipe_4*_pipe_5)[Identity[decl:'key', 'value']],'key', 'value',<function1>)
scala> nC2combinations.dump
14/07/19 22:07:22 INFO flow.Flow: [ScaldingShell] starting
14/07/19 22:07:22 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.2594611917490499"]
14/07/19 22:07:22 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.7722493355962736"]
14/07/19 22:07:22 INFO flow.Flow: [ScaldingShell] sink: MemoryTap["NullScheme"]["0.747368470074863"]
14/07/19 22:07:22 INFO flow.Flow: [ScaldingShell] parallel execution is enabled: true
14/07/19 22:07:22 INFO flow.Flow: [ScaldingShell] starting jobs: 1
14/07/19 22:07:22 INFO flow.Flow: [ScaldingShell] allocating threads: 1
14/07/19 22:07:22 INFO flow.FlowStep: [ScaldingShell] starting step: local
((0,(7,7)),(1,(8,6)))
((0,(7,7)),(2,(9,6)))
((0,(7,7)),(3,(11,7)))
((1,(8,6)),(2,(9,6)))
((1,(8,6)),(3,(11,7)))
((2,(9,6)),(3,(11,7)))
scala> val bcPipe = nC2combinations.map { xy =>
val (x,y) = xy
val (aIdx,aRow) = x
val (bIdx, bRow) = y
val (aDiag, aRadius) = aRow
val (bDiag, bRadius) = bRow
val b = -(aDiag + bDiag)
val c = aDiag*bDiag - aRadius*bRadius
(b,c)
}
bcPipe: com.twitter.scalding.typed.TypedPipe[(Int, Int)] = TypedPipeInst(Each(_pipe_15*_pipe_4*_pipe_5)[Identity[decl:'key', 'value']],'key', 'value',<function1>)
scala> bcPipe.dump
14/07/19 22:13:05 INFO flow.Flow: [ScaldingShell] starting
14/07/19 22:13:05 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.2594611917490499"]
14/07/19 22:13:05 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.7722493355962736"]
14/07/19 22:13:05 INFO flow.Flow: [ScaldingShell] sink: MemoryTap["NullScheme"]["0.14451103388434228"]
14/07/19 22:13:05 INFO flow.Flow: [ScaldingShell] parallel execution is enabled: true
14/07/19 22:13:05 INFO flow.Flow: [ScaldingShell] starting jobs: 1
14/07/19 22:13:05 INFO flow.Flow: [ScaldingShell] allocating threads: 1
14/07/19 22:13:05 INFO flow.FlowStep: [ScaldingShell] starting step: local
(-15,14)
(-16,21)
(-18,28)
(-17,36)
(-19,46)
(-20,57)
scala> val quadraticSolver = bcPipe.map{ bc =>
val (b,c) = bc
val root1 = (-b + math.sqrt(b*b-4*c))/2
val root2 = (-b - math.sqrt(b*b-4*c))/2
if (root1 < root2) (root1,root2) else (root2,root1)
}
quadraticSolver: com.twitter.scalding.typed.TypedPipe[(Double, Double)] = TypedPipeInst(Each(_pipe_15*_pipe_4*_pipe_5)[Identity[decl:'key', 'value']],'key', 'value',<function1>)
scala> quadraticSolver.dump14/07/19 22:17:51 INFO flow.Flow: [ScaldingShell] starting
14/07/19 22:17:51 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.2594611917490499"]
14/07/19 22:17:51 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.7722493355962736"]
14/07/19 22:17:51 INFO flow.Flow: [ScaldingShell] sink: MemoryTap["NullScheme"]["0.18617943262924586"]
14/07/19 22:17:51 INFO flow.Flow: [ScaldingShell] parallel execution is enabled: true
14/07/19 22:17:51 INFO flow.Flow: [ScaldingShell] starting jobs: 1
14/07/19 22:17:51 INFO flow.Flow: [ScaldingShell] allocating threads: 1
14/07/19 22:17:51 INFO flow.FlowStep: [ScaldingShell] starting step: local
(1.0,14.0)
(1.4425614756979996,14.557438524302)
(1.719890110719482,16.280109889280517)
(2.479202710603852,14.520797289396148)
(2.847932652174965,16.152067347825035)
(3.4425614756979996,16.557438524302)
scala> val min = quadraticSolver.groupAll.minBy{ x=> x._1 }.values.map{ x=> x._1}
min: com.twitter.scalding.typed.TypedPipe[Double] = TypedPipeInst(Every(_pipe_15*_pipe_4*_pipe_5)[TypedBufferOp[decl:'value']],'key', 'value',<function1>)
scala> val max = quadraticSolver.groupAll.maxBy{ x=> x._2 }.values.map{ x=> x._2}
max: com.twitter.scalding.typed.TypedPipe[Double] = TypedPipeInst(Every(_pipe_15*_pipe_4*_pipe_5)[TypedBufferOp[decl:'value']],'key', 'value',<function1>)
scala> val unionOfOvals = min.cross(max)
unionOfOvals: com.twitter.scalding.typed.TypedPipe[(Double, Double)] = TypedPipeInst(Each(_pipe_19*_pipe_15*_pipe_4*_pipe_5)[Identity[decl:'key', 'value']],'key', 'value',<function1>)
scala> unionOfOvals.dump
14/07/20 13:43:46 INFO flow.Flow: [ScaldingShell] starting
14/07/20 13:43:46 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.2594611917490499"]
14/07/20 13:43:46 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.7722493355962736"]
14/07/20 13:43:46 INFO flow.Flow: [ScaldingShell] sink: MemoryTap["NullScheme"]["0.2263624362034401"]
14/07/20 13:43:46 INFO flow.Flow: [ScaldingShell] parallel execution is enabled: true
14/07/20 13:43:46 INFO flow.Flow: [ScaldingShell] starting jobs: 1
14/07/20 13:43:46 INFO flow.Flow: [ScaldingShell] allocating threads: 1
14/07/20 13:43:46 INFO flow.FlowStep: [ScaldingShell] starting step: local
(1.0,16.557438524302)
// All four eigens are contained in the [1,16.5] interval
scala> val largestEigen = unionOfOvals.map{ x=> x._2}.dump
14/07/20 12:39:17 INFO flow.Flow: [ScaldingShell] starting
14/07/20 12:39:17 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.2594611917490499"]
14/07/20 12:39:17 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.7722493355962736"]
14/07/20 12:39:17 INFO flow.Flow: [ScaldingShell] sink: MemoryTap["NullScheme"]["0.469332024541651"]
14/07/20 12:39:17 INFO flow.Flow: [ScaldingShell] parallel execution is enabled: true
14/07/20 12:39:17 INFO flow.Flow: [ScaldingShell] starting jobs: 1
14/07/20 12:39:17 INFO flow.Flow: [ScaldingShell] allocating threads: 1
14/07/20 12:39:17 INFO flow.FlowStep: [ScaldingShell] starting step: local
16.557438524302
// Lets verify the same with Wolfram Alpha
eigenvalues {{7,1,2,4},{1,8,3,2},{2,3,9,1},{4,2,1,11}}
{15.5845, 9.57363, 5.9301, 3.91173}
Notice that all 4 eigen values are contained in the [1,16.5] interval
Also note that the L2 Norm of this symmetric matrix = largest eigen = 15.58
15.58 is well approximated by 16.557, the approximation obtained from the Cassini Oval.
For the record, the Gershgorin approximation for this matrix is 18.
-----
To summarize:
Consider the 4x4 symmetric matrix
7 1 2 4
1 8 3 2
2 3 9 1
4 2 1 11
True L2 Norm: 15.58
Approximate L2 Norm per Cassini Oval: 16.557
Approximate L2 Norm per Gershgorin: 18
------
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment