Makoto YUI myui

@myui
myui / sklearn-denselr-spark-hdfs.py
Last active December 20, 2015 15:29 — forked from MLnick/sklearn-lr-spark.py
Forked to deal with a large, dense dataset on HDFS.
import sys
from pyspark.context import SparkContext
from numpy import array, random as np_random
from sklearn import linear_model as lm
import copy  # standard-library module; previously imported indirectly via sklearn.base
ITERATIONS = 5
np_random.seed(seed=42)
@myui
myui / sklearn-sparselr-spark-hdfs.py
Last active December 20, 2015 15:19 — forked from MLnick/sklearn-lr-spark.py
Forked to deal with a large, sparse dataset on HDFS.
import sys
from pyspark.context import SparkContext
from numpy import array, random as np_random
from sklearn import linear_model as lm
import copy  # standard-library module; previously imported indirectly via sklearn.base
from scipy import sparse as sp
# MAX_FEATURES = 1000
MAX_FEATURES = 16777216  # 2**24
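The preview stops before any use of `sp` or `MAX_FEATURES`, so the following is only a minimal sketch of how a single sparse example might be represented at that width. The helper `to_sparse_row` and the `(index, value)` input format are hypothetical and are not taken from the gist.

```python
from scipy import sparse as sp

MAX_FEATURES = 16777216  # 2**24, matching the constant in the preview


def to_sparse_row(pairs, n_features=MAX_FEATURES):
    """Hypothetical helper: build a 1 x n_features CSR row from (index, value) pairs."""
    indices = [i for i, _ in pairs]
    values = [v for _, v in pairs]
    rows = [0] * len(pairs)  # every entry belongs to the single row 0
    return sp.csr_matrix((values, (rows, indices)), shape=(1, n_features))


# Only the two non-zero entries are stored, not all 2**24 columns.
row = to_sparse_row([(3, 1.0), (MAX_FEATURES - 1, 0.5)])
```

A CSR row of this kind stores only its non-zero entries, which is what makes a 2**24-wide feature space practical in memory.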