Evan Zamir EvanZ

## PySpark1.6.1-Jupyter4.md

      
rocket-ron / PySpark1.6.1-Jupyter4.md (1 file, 1 fork, 0 comments, 1 star; last active July 6, 2016 20:22)

Configuring Spark 1.6.1 to work with Jupyter 4.x Notebooks on Mac OS X with Homebrew

I've looked around in a number of places and found several blog entries on setting up IPython notebooks to work with Spark. However, since most of those posts were written, both IPython and Spark have been updated. Today, IPython has been transformed into Jupyter, and Spark is near release 1.6.2. Most of the information needed to get things working is out there, but I thought I'd capture this point in time with a working configuration and how I set it up.

I rely completely on Homebrew to manage packages on my Mac, so Spark, Jupyter, Python, jEnv and other things are installed via Homebrew. You should be able to achieve the same thing with Anaconda, but I don't know that package manager.

Install Java

Make sure your Java installation is up to date. I use jEnv to manage Java installations on my Mac, which adds another layer that has to be set up correctly. You can download/update Java from Oracle, have Homebrew

  
## springer-free-maths-books.md

      
bishboria / springer-free-maths-books.md (1 file, 474 forks, 248 comments, 2243 stars; last active June 8, 2024 06:39)

Springer made a bunch of books available for free, these were the direct links

These links no longer work. Springer have pulled the free plug.

Graduate texts in mathematics

duplicates = multiple editions
A Classical Introduction to Modern Number Theory, Kenneth Ireland Michael Rosen
A Classical Introduction to Modern Number Theory, Kenneth Ireland Michael Rosen

  
## Spark_OnlineLDA_wikipedia_example.scala
import org.apache.spark.ml.feature.{CountVectorizer, RegexTokenizer, StopWordsRemover}
import org.apache.spark.mllib.clustering.{LDA, OnlineLDAOptimizer}
import org.apache.spark.mllib.linalg.Vector

import sqlContext.implicits._

val numTopics: Int = 100
val maxIterations: Int = 100
val vocabSize: Int = 10000

## introrx.md

      
staltz / introrx.md (7 files, 2514 forks, 472 comments, 21946 stars; last active July 22, 2024 09:31)

The introduction to Reactive Programming you've been missing

(by @andrestaltz)

This tutorial as a series of videos

If you prefer to watch video tutorials with live-coding, then check out this series I recorded with the same contents as in this article: Egghead.io - Introduction to Reactive Programming.


## gist:8655893
I had a small problem today. I wanted to migrate one of my small Django apps to Postgres, just to make everything easier to manage. SQLite3 is perfectly fine for the amount of load, but I am much faster at administering Postgres than SQLite3, so I decided to migrate the data over.

I tried a few approaches, but what ultimately worked best and fastest for my particular problem was the following.

Use original SQLITE3 connection in settings.py
1. python manage.py dumpdata > dump.json

(I read some things here about options you can pass, but in the end what just worked was the following.)

2. Change DB connection string in settings.py to POSTGRES
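
The settings.py switch in steps 1 and 2 might look roughly like the sketch below. This is only an illustration under stated assumptions: the helper `database_settings`, the database name `myapp`, and the `PG*` environment variable defaults are hypothetical and not from the original post. The engine string `django.db.backends.postgresql` applies to Django 1.9 and later; older releases used `django.db.backends.postgresql_psycopg2`.

```python
import os


def database_settings(use_postgres):
    """Return a Django DATABASES dict for SQLite3 or Postgres.

    Hypothetical helper for illustration; the names and defaults are assumptions.
    """
    if use_postgres:
        return {
            "default": {
                # Django 1.9+; older versions: django.db.backends.postgresql_psycopg2
                "ENGINE": "django.db.backends.postgresql",
                "NAME": os.environ.get("PGDATABASE", "myapp"),  # assumed name
                "USER": os.environ.get("PGUSER", "myapp"),      # assumed user
                "PASSWORD": os.environ.get("PGPASSWORD", ""),
                "HOST": os.environ.get("PGHOST", "localhost"),
                "PORT": os.environ.get("PGPORT", "5432"),
            }
        }
    # Original connection, used while running `python manage.py dumpdata`
    return {
        "default": {
            "ENGINE": "django.db.backends.sqlite3",
            "NAME": "db.sqlite3",  # assumed file name
        }
    }


# Step 1: keep SQLite3 while dumping; step 2: flip the flag to point at Postgres.
DATABASES = database_settings(use_postgres=False)
```

Keeping both configurations behind one flag makes it easy to go back to the SQLite3 connection if the dump has to be regenerated.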

## global.R
# load required libraries
library(shiny)
library(plyr)
library(ggplot2)
library(googleVis)
library(reshape2)


####creation of example data on local directory for uploading####

## lda_gibbs.py
"""
(C) Mathieu Blondel - 2010
License: BSD 3 clause

Implementation of the collapsed Gibbs sampler for
Latent Dirichlet Allocation, as described in

Finding scientific topics (Griffiths and Steyvers)
"""