Last active
August 29, 2015 14:04
-
-
Save krishnanraman/c17397667ea9090a5bc1 to your computer and use it in GitHub Desktop.
Alfred Brauer's Ovals of Cassini, using Scalding Typed API
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Idea: | |
The chief intuition behind Gershgorin: Simply examine matrix rows, one at a time. | |
But what if you look at matrix rows, two at a time ? | |
That is the chief intuition behind the Cassini Oval ! | |
Details: | |
To obtain ballpark estimates of eigen values of matrices (especially diagonally-dominant ones) | |
one usually resorts to Gerschgorin disks. | |
The idea is that eigens lie in the union of Gershgorin disks. | |
But what if the eigens were to lie in a smaller oval, which is nested well within the larger Gershgorin disk ?! | |
Suffice it to say, you should then use the oval for the better approximation of the eigen. | |
These ovals, termed the Ovals of Cassini, were first described by Alfred Brauer. | |
Cassini ovals are not as trivial to compute as the Gershgorin circles. | |
Morover, an nxn matrix has nC2 ovals! | |
So a 10x10 matrix would have 10 Gershgorin circles, but 45 Ovals (10 Choose 2 = 10*9/2 = 45) | |
You'd have to union these 45 ovals...so mathematicians generally don't bother with these Ovals, | |
because they are more numerous & computing them is tedious. | |
But in these days of big data, approximations like the Count Min Sketch, HyperLogLog & QTree are the norm, | |
given the very large size of the problem and the inability to compute exact solutions. | |
Suddenly, good approximations are all the rage, | |
and Cassini Ovals are a valuable approximation of eigens for (very) large, especially diagonally dominant matrices. | |
Given an nxn matrix, | |
the Gershgorin is | |
|z-a(i)(i)| <= R(i) | |
ie. take the diagonal = a(i)(i) = that's the center of the disk | |
take the sum of the nondiagonal norms = R(i) = the radius of the disk | |
Do this once per row, you get n Gershgorin circles which bound n eigens. | |
The Cassini Oval is | |
|z-a(i)(i)|*|z-a(j)(j)| <= R(i)*R(j), for i != j | |
Choosing 2 rows given an nxn marix can be done in nC2 ways = hence the nC2 ovals. | |
References: | |
http://buzzard.ups.edu/courses/2007spring/projects/brakkenthal-paper.pdf | |
http://www.math.kent.edu/~varga/pub/paper_232.pdf | |
https://github.com/twitter/algebird/blob/develop/algebird-core/src/main/scala/com/twitter/algebird/CountMinSketch.scala | |
https://github.com/twitter/algebird/blob/develop/algebird-core/src/main/scala/com/twitter/algebird/HyperLogLog.scala | |
https://github.com/twitter/algebird/blob/develop/algebird-core/src/main/scala/com/twitter/algebird/QTree.scala |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
/* | |
TL:DR; | |
------- | |
Consider the 4x4 symmetric matrix | |
7 1 2 4 | |
1 8 3 2 | |
2 3 9 1 | |
4 2 1 11 | |
True L2 Norm: 15.58 | |
Approximate L2 Norm per Cassini Oval: 16.557 | |
Approximate L2 Norm per Gershgorin: 18 | |
------ | |
*/ | |
$ git clone https://github.com/twitter/scalding.git | |
cd scalding | |
$ ./sbt scalding-repl/console | |
/* A 4x4 symmetric matrix | |
7 1 2 4 | |
1 8 3 2 | |
2 3 9 1 | |
4 2 1 11 | |
*/ | |
scala> val symmetric = Seq(Seq(7,1,2,4), Seq(1,8,3,2), Seq(2,3,9,1), Seq(4,2,1,11)) | |
symmetric: Seq[Seq[Int]] = List(List(7, 1, 2, 4), List(1, 8, 3, 2), List(2, 3, 9, 1), List(4, 2, 1, 11)) | |
scala> val pipe = TypedPipe.from(symmetric.zipWithIndex) | |
pipe: com.twitter.scalding.typed.TypedPipe[(Seq[Int], Int)] = IterablePipe(List((List(7, 1, 2, 4),0), (List(1, 8, 3, 2),1), (List(2, 3, 9, 1),2), (List(4, 2, 1, 11),3)),cascading.flow.FlowDef@7a2fc0ff,Local(false)) | |
scala> val diagonals = pipe.map { rowIndex => | |
val (row, index) = rowIndex | |
(index, row(index)) | |
} | |
diagonals: com.twitter.scalding.typed.TypedPipe[(Int, Int)] = IterablePipe(List((0,7), (1,8), (2,9), (3,11)),cascading.flow.FlowDef@7a2fc0ff,Local(false)) | |
scala> val nonDiagonals = pipe.map { rowIndex => | |
val (row, index) = rowIndex | |
(index, (0 until index).map{row} ++ (index+1 until row.size).map{row}) | |
} | |
nonDiagonals: com.twitter.scalding.typed.TypedPipe[(Int, scala.collection.immutable.IndexedSeq[Int])] = IterablePipe(List((0,Vector(1, 2, 4)), (1,Vector(1, 3, 2)), (2,Vector(2, 3, 1)), (3,Vector(4, 2, 1))),cascading.flow.FlowDef@7a2fc0ff,Local(false)) | |
scala> val nonDiagonalNormSum = nonDiagonals.map{ idxRow => val (idx, row) = idxRow; (idx, row.sum) } | |
nonDiagonalNormSum: com.twitter.scalding.typed.TypedPipe[(Int, Int)] = IterablePipe(List((0,7), (1,6), (2,6), (3,7)),cascading.flow.FlowDef@6a87832d,Local(false)) | |
scala> val diagNonDiagPipe = diagonals.join(nonDiagonalNormSum) | |
diagNonDiagPipe: com.twitter.scalding.typed.CoGrouped[Int,(Int, Int)] = com.twitter.scalding.typed.CoGroupable$$anon$3@4833240b | |
scala> val nC2combinations = diagNonDiagPipe.cross(diagNonDiagPipe).filter { | |
ab => | |
val (a,b) = ab | |
a._1 < b._1 | |
} | |
nC2combinations: com.twitter.scalding.typed.TypedPipe[((Int, (Int, Int)), (Int, (Int, Int)))] = TypedPipeInst(Each(_pipe_12*_pipe_4*_pipe_5)[Identity[decl:'key', 'value']],'key', 'value',<function1>) | |
scala> nC2combinations.dump | |
14/07/19 22:07:22 INFO flow.Flow: [ScaldingShell] starting | |
14/07/19 22:07:22 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.2594611917490499"] | |
14/07/19 22:07:22 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.7722493355962736"] | |
14/07/19 22:07:22 INFO flow.Flow: [ScaldingShell] sink: MemoryTap["NullScheme"]["0.747368470074863"] | |
14/07/19 22:07:22 INFO flow.Flow: [ScaldingShell] parallel execution is enabled: true | |
14/07/19 22:07:22 INFO flow.Flow: [ScaldingShell] starting jobs: 1 | |
14/07/19 22:07:22 INFO flow.Flow: [ScaldingShell] allocating threads: 1 | |
14/07/19 22:07:22 INFO flow.FlowStep: [ScaldingShell] starting step: local | |
((0,(7,7)),(1,(8,6))) | |
((0,(7,7)),(2,(9,6))) | |
((0,(7,7)),(3,(11,7))) | |
((1,(8,6)),(2,(9,6))) | |
((1,(8,6)),(3,(11,7))) | |
((2,(9,6)),(3,(11,7))) | |
scala> val bcPipe = nC2combinations.map { xy => | |
val (x,y) = xy | |
val (aIdx,aRow) = x | |
val (bIdx, bRow) = y | |
val (aDiag, aRadius) = aRow | |
val (bDiag, bRadius) = bRow | |
val b = -(aDiag + bDiag) | |
val c = aDiag*bDiag - aRadius*bRadius | |
(b,c) | |
} | |
bcPipe: com.twitter.scalding.typed.TypedPipe[(Int, Int)] = TypedPipeInst(Each(_pipe_15*_pipe_4*_pipe_5)[Identity[decl:'key', 'value']],'key', 'value',<function1>) | |
scala> bcPipe.dump | |
14/07/19 22:13:05 INFO flow.Flow: [ScaldingShell] starting | |
14/07/19 22:13:05 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.2594611917490499"] | |
14/07/19 22:13:05 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.7722493355962736"] | |
14/07/19 22:13:05 INFO flow.Flow: [ScaldingShell] sink: MemoryTap["NullScheme"]["0.14451103388434228"] | |
14/07/19 22:13:05 INFO flow.Flow: [ScaldingShell] parallel execution is enabled: true | |
14/07/19 22:13:05 INFO flow.Flow: [ScaldingShell] starting jobs: 1 | |
14/07/19 22:13:05 INFO flow.Flow: [ScaldingShell] allocating threads: 1 | |
14/07/19 22:13:05 INFO flow.FlowStep: [ScaldingShell] starting step: local | |
(-15,14) | |
(-16,21) | |
(-18,28) | |
(-17,36) | |
(-19,46) | |
(-20,57) | |
scala> val quadraticSolver = bcPipe.map{ bc => | |
val (b,c) = bc | |
val root1 = (-b + math.sqrt(b*b-4*c))/2 | |
val root2 = (-b - math.sqrt(b*b-4*c))/2 | |
if (root1 < root2) (root1,root2) else (root2,root1) | |
} | |
quadraticSolver: com.twitter.scalding.typed.TypedPipe[(Double, Double)] = TypedPipeInst(Each(_pipe_15*_pipe_4*_pipe_5)[Identity[decl:'key', 'value']],'key', 'value',<function1>) | |
scala> quadraticSolver.dump14/07/19 22:17:51 INFO flow.Flow: [ScaldingShell] starting | |
14/07/19 22:17:51 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.2594611917490499"] | |
14/07/19 22:17:51 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.7722493355962736"] | |
14/07/19 22:17:51 INFO flow.Flow: [ScaldingShell] sink: MemoryTap["NullScheme"]["0.18617943262924586"] | |
14/07/19 22:17:51 INFO flow.Flow: [ScaldingShell] parallel execution is enabled: true | |
14/07/19 22:17:51 INFO flow.Flow: [ScaldingShell] starting jobs: 1 | |
14/07/19 22:17:51 INFO flow.Flow: [ScaldingShell] allocating threads: 1 | |
14/07/19 22:17:51 INFO flow.FlowStep: [ScaldingShell] starting step: local | |
(1.0,14.0) | |
(1.4425614756979996,14.557438524302) | |
(1.719890110719482,16.280109889280517) | |
(2.479202710603852,14.520797289396148) | |
(2.847932652174965,16.152067347825035) | |
(3.4425614756979996,16.557438524302) | |
scala> val min = quadraticSolver.groupAll.minBy{ x=> x._1 }.values.map{ x=> x._1} | |
min: com.twitter.scalding.typed.TypedPipe[Double] = TypedPipeInst(Every(_pipe_15*_pipe_4*_pipe_5)[TypedBufferOp[decl:'value']],'key', 'value',<function1>) | |
scala> val max = quadraticSolver.groupAll.maxBy{ x=> x._2 }.values.map{ x=> x._2} | |
max: com.twitter.scalding.typed.TypedPipe[Double] = TypedPipeInst(Every(_pipe_15*_pipe_4*_pipe_5)[TypedBufferOp[decl:'value']],'key', 'value',<function1>) | |
scala> val unionOfOvals = min.cross(max) | |
unionOfOvals: com.twitter.scalding.typed.TypedPipe[(Double, Double)] = TypedPipeInst(Each(_pipe_19*_pipe_15*_pipe_4*_pipe_5)[Identity[decl:'key', 'value']],'key', 'value',<function1>) | |
scala> unionOfOvals.dump | |
14/07/20 13:43:46 INFO flow.Flow: [ScaldingShell] starting | |
14/07/20 13:43:46 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.2594611917490499"] | |
14/07/20 13:43:46 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.7722493355962736"] | |
14/07/20 13:43:46 INFO flow.Flow: [ScaldingShell] sink: MemoryTap["NullScheme"]["0.2263624362034401"] | |
14/07/20 13:43:46 INFO flow.Flow: [ScaldingShell] parallel execution is enabled: true | |
14/07/20 13:43:46 INFO flow.Flow: [ScaldingShell] starting jobs: 1 | |
14/07/20 13:43:46 INFO flow.Flow: [ScaldingShell] allocating threads: 1 | |
14/07/20 13:43:46 INFO flow.FlowStep: [ScaldingShell] starting step: local | |
(1.0,16.557438524302) | |
// All four eigens are contained in the [1,16.5] interval | |
scala> val largestEigen = unionOfOvals.map{ x=> x._2}.dump | |
14/07/20 12:39:17 INFO flow.Flow: [ScaldingShell] starting | |
14/07/20 12:39:17 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.2594611917490499"] | |
14/07/20 12:39:17 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.7722493355962736"] | |
14/07/20 12:39:17 INFO flow.Flow: [ScaldingShell] sink: MemoryTap["NullScheme"]["0.469332024541651"] | |
14/07/20 12:39:17 INFO flow.Flow: [ScaldingShell] parallel execution is enabled: true | |
14/07/20 12:39:17 INFO flow.Flow: [ScaldingShell] starting jobs: 1 | |
14/07/20 12:39:17 INFO flow.Flow: [ScaldingShell] allocating threads: 1 | |
14/07/20 12:39:17 INFO flow.FlowStep: [ScaldingShell] starting step: local | |
16.557438524302 | |
// Lets verify the same with Wolfram Alpha | |
eigenvalues {{7,1,2,4},{1,8,3,2},{2,3,9,1},{4,2,1,11}} | |
{15.5845, 9.57363, 5.9301, 3.91173} | |
Notice that all 4 eigen values are contained in the [1,16.5] interval | |
Also note that the L2 Norm of this symmetric matrix = largest eigen = 15.58 | |
15.58 is well approximated by 16.557, the approximation obtained from the Cassini Oval. | |
For the record, the Gershgorin approximation for this matrix is 18. | |
----- | |
To summarize: | |
Consider the 4x4 symmetric matrix | |
7 1 2 4 | |
1 8 3 2 | |
2 3 9 1 | |
4 2 1 11 | |
True L2 Norm: 15.58 | |
Approximate L2 Norm per Cassini Oval: 16.557 | |
Approximate L2 Norm per Gershgorin: 18 | |
------ |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment