@lakshay-arora
Last active October 22, 2019 12:03
from pyspark.mllib.linalg import Vectors

# Dense vector: every entry is stored explicitly, zeros included.
print(repr(Vectors.dense([1, 2, 3, 4, 5, 6, 0])))
# >> DenseVector([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 0.0])

# Sparse vector: only the non-zero entries are stored.
# Vectors.sparse(size, indices_of_non_zero_values, non_zero_values)
# Indices must be strictly increasing.
sparse_vec = Vectors.sparse(10, [0, 1, 2, 4, 5], [1.0, 5.0, 3.0, 5.0, 7])
print(repr(sparse_vec))
# >> SparseVector(10, {0: 1.0, 1: 5.0, 2: 3.0, 4: 5.0, 5: 7.0})

# toArray() expands the sparse vector into a dense NumPy array.
print(repr(sparse_vec.toArray()))
# >> array([1., 5., 3., 0., 5., 7., 0., 0., 0., 0.])
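To make the sparse layout concrete, here is a minimal pure-Python sketch of the expansion that `toArray()` performs conceptually. The helper `sparse_to_dense` is hypothetical (it is not part of PySpark and is not how PySpark implements the conversion); it only illustrates how the `(size, indices, values)` triple maps to a full-length vector, and it treats the "strictly increasing indices" rule as an explicit check.

```python
def sparse_to_dense(size, indices, values):
    """Expand a (size, indices, values) sparse triple into a dense list.

    Hypothetical helper for illustration only; not a PySpark API.
    """
    if len(indices) != len(values):
        raise ValueError("indices and values must have the same length")

    dense = [0.0] * size  # every position not listed in `indices` stays 0.0
    previous = -1
    for index, value in zip(indices, values):
        if not 0 <= index < size:
            raise ValueError("index %d is out of range for size %d" % (index, size))
        if index <= previous:
            raise ValueError("indices must be strictly increasing")
        dense[index] = float(value)
        previous = index
    return dense


print(sparse_to_dense(10, [0, 1, 2, 4, 5], [1.0, 5.0, 3.0, 5.0, 7]))
# >> [1.0, 5.0, 3.0, 0.0, 5.0, 7.0, 0.0, 0.0, 0.0, 0.0]
```

The sparse form above stores 5 indices and 5 values instead of 10 entries; the saving grows as the share of zeros in the vector increases.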