Skip to content

Instantly share code, notes, and snippets.

@krishnanraman
Last active August 29, 2015 14:04
Show Gist options
  • Save krishnanraman/0693b397291001f97280 to your computer and use it in GitHub Desktop.
Save krishnanraman/0693b397291001f97280 to your computer and use it in GitHub Desktop.
Gershgorin using the Scalding Typed API !!!
Given a symmetric matrix, one can compute the approximate L2 Norm using a dated 1970s technique nobody uses anymore.
(Its a very good approximation, btw. )
For geezers like myself, who learnt math in the 1970s using abacus & log tables & such,
the standard technique to computing approximate eigen values in your head
was to mentally draw out the Gershgorin circles.
The eigen values would lie in them circles.
The max eigen value is the spectral radius.
L2 Norms of symmetric matrices are simply their spectral radius.
If you don't know what L2 Norm or an eigen value is or why all this matters, you should, like, go, like, back like, to, like, middle school, and like, demand your tuition, like, back and stuff.
Given a nxn square matrix :
there exist n Gershgorin Circles = 1 circle per row
For each row,
center of circle = the diagonal cell.
radius of circle = sum the off-diagonal cell norms
Theorem: Eigen Values of a matrix must lie within its Gershgorin.
Each real eigen value is bounded by (center - radius, center + radius)
The Spectral Radius is the maximum eigen value.
For Symmetric Matrices:
The L2 norm of a symmetric marix is simply the spectral radius.
If the matrix ain't symmeric, all bets are off.
$ git clone https://github.com/twitter/scalding.git
cd scalding
$ ./sbt scalding-repl/console
/* this is a symmetric matrix
7 1 2
1 8 3
2 3 9
*/
scala> val symmetric = Seq(Seq(7,1,2.0), Seq(1,8,3.0), Seq(2,3,9.0))
symmetric: Seq[Seq[Double]] = List(List(7.0, 1.0, 2.0), List(1.0, 8.0, 3.0), List(2.0, 3.0, 9.0))
// now its sitting in a Scalding Typed Pipe
// a Pipe is a "description of a computation" ( Posco)
// that means you can compute eigen values & suchlike from this pipe.
scala> val pipe = TypedPipe.from(symmetric.zipWithIndex)
pipe: com.twitter.scalding.typed.TypedPipe[(Seq[Double], Int)] = IterablePipe(List((List(7.0, 1.0, 2.0),0), (List(1.0, 8.0, 3.0),1), (List(2.0, 3.0, 9.0),2)),cascading.flow.FlowDef@6d639189,Local(false))
// lets first gather the cells along the diagonal
scala> val diagonals = pipe.map { rowIndex => val (row, index) = rowIndex; (index, row(index)) }
diagonals: com.twitter.scalding.typed.TypedPipe[(Int, Double)] = IterablePipe(List((0,7.0), (1,8.0), (2,9.0)),cascading.flow.FlowDef@6d639189,Local(false))
// now lets scoop up the non diagonal entries
scala> val nonDiagonals = pipe.map { rowIndex => val (row, index) = rowIndex; (index, (0 until index).map{row} ++ (index+1 until row.size).map{row})}
nonDiagonals: com.twitter.scalding.typed.TypedPipe[(Int, scala.collection.immutable.IndexedSeq[Double])] = IterablePipe(List((0,Vector(1.0, 2.0)), (1,Vector(1.0, 3.0)), (2,Vector(2.0, 3.0))),cascading.flow.FlowDef@6d639189,Local(false))
// lets take the non-diagonals & sum them per row, after getting the norm of each entry
// ( norm of a scalar is simply the absolute value)
scala> val nonDiagonalNormSum = nonDiagonals.map { indexRow => val (index, row) = indexRow; (index, row.map{math.abs}.sum)}
nonDiagonalNormSum: com.twitter.scalding.typed.TypedPipe[(Int, Double)] = IterablePipe(List((0,3.0), (1,4.0), (2,5.0)),cascading.flow.FlowDef@6d639189,Local(false))
// now lets join the diagonals with their nondiagonal norms
// that, my friends, is the gershgorin!
scala> val gershgorin = diagonals.join(nonDiagonalNormSum).values
gershgorin: com.twitter.scalding.typed.TypedPipe[(Double, Double)] = TypedPipeInst(Each(_pipe_10*_pipe_11)[Identity[decl:'key', 'value']],'key', 'value',<function1>)
// if you want to see what that looks like, dump it
scala> gershgorin.dump
14/07/18 15:15:15 INFO flow.Flow: [ScaldingShell] starting
14/07/18 15:15:15 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.32696373550075297"]
14/07/18 15:15:15 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.5669129290001957"]
14/07/18 15:15:15 INFO flow.Flow: [ScaldingShell] sink: MemoryTap["NullScheme"]["0.40624991664526333"]
14/07/18 15:15:15 INFO flow.Flow: [ScaldingShell] parallel execution is enabled: true
14/07/18 15:15:15 INFO flow.Flow: [ScaldingShell] starting jobs: 1
14/07/18 15:15:15 INFO flow.Flow: [ScaldingShell] allocating threads: 1
14/07/18 15:15:15 INFO flow.FlowStep: [ScaldingShell] starting step: local
(7.0,3.0)
(8.0,4.0)
(9.0,5.0)
// So the first gershgorin circle sits at 7, with a radius of 3 ie. it goes from [4, 10] along the real line
// So the second gershgorin circle sits at 8, with a radius of 4 ie. it goes from [4, 12] along the real line
// So the third gershgorin circle sits at 9, with a radius of 5 ie. it goes from [4, 14] along the real line
// these are precisely the bounds of the eigen values as well !
scala> val eigenValues = gershgorin.map { centerRadius => val (center, radius) = centerRadius; (center - radius, center + radius)}
eigenValues: com.twitter.scalding.typed.TypedPipe[(Double, Double)] = TypedPipeInst(Each(_pipe_10*_pipe_11)[Identity[decl:'key', 'value']],'key', 'value',<function1>)
scala> eigenValues.dump
14/07/18 14:55:08 INFO flow.Flow: [ScaldingShell] starting
14/07/18 14:55:08 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.32696373550075297"]
14/07/18 14:55:08 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.5669129290001957"]
14/07/18 14:55:08 INFO flow.Flow: [ScaldingShell] sink: MemoryTap["NullScheme"]["0.37809438925390637"]
14/07/18 14:55:08 INFO flow.Flow: [ScaldingShell] parallel execution is enabled: true
14/07/18 14:55:08 INFO flow.Flow: [ScaldingShell] starting jobs: 1
14/07/18 14:55:08 INFO flow.Flow: [ScaldingShell] allocating threads: 1
14/07/18 14:55:08 INFO flow.FlowStep: [ScaldingShell] starting step: local
(4.0,10.0)
(4.0,12.0)
(4.0,14.0)
// now that we have the eigens, just sort them & take the max
// the biggest eigen is the spectral radius
scala> val spectralRadius = eigenValues.map { minmax => val (min,max) = minmax; max }.groupAll.sorted.reverse.head.values
spectralRadius: com.twitter.scalding.typed.TypedPipe[Double] = TypedPipeInst(Every(_pipe_10*_pipe_11)[TypedBufferOp[decl:'value']],'key', 'value',<function1>)
scala> spectralRadius.dump
14/07/18 14:59:52 INFO flow.Flow: [ScaldingShell] starting
14/07/18 14:59:52 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.32696373550075297"]
14/07/18 14:59:52 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.5669129290001957"]
14/07/18 14:59:52 INFO flow.Flow: [ScaldingShell] sink: MemoryTap["NullScheme"]["0.606555052652465"]
14/07/18 14:59:52 INFO flow.Flow: [ScaldingShell] parallel execution is enabled: true
14/07/18 14:59:52 INFO flow.Flow: [ScaldingShell] starting jobs: 1
14/07/18 14:59:52 INFO flow.Flow: [ScaldingShell] allocating threads: 1
14/07/18 14:59:52 INFO flow.FlowStep: [ScaldingShell] starting step: local
14.0
// Voila! That is the L2 Norm of this Symmetric Matrix.
// Lets now confirm our intiutions using Wolfram's $$$$ WolframAlpha
From WolframAlpha, the eigens are:
{12.4188, 6.38677, 5.1944}
Notice that
12.4 lies between [4,14]
6.4 lies between [4,12]
5.2 lies between [4,10]
and true spectral radius = max eigen = 12.4 is well approximated by 14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment