Hdooster

  • Ghent, Belgium
View The Spark Machine Learning Library (MLlib).py
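import numpy as np
x = np.array([[1,2],[3,4]])  # 2x2 NumPy array
x[0]     # first row: array([1, 2])
x[:,1]   # second column: array([2, 4])
from pyspark.mllib.linalg import Vectors
x = Vectors.dense([1,2,3,4])  # MLlib dense vector (values stored as floats)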
x = [Vectors.dense([1,2,3,4]),
View Analytics With Spark and Hive.md
customers_df = sqlCtx.sql("SELECT * FROM customers")
customers_df.show()
sqlCtx.sql("""SELECT c.category_name, count(order_item_quantity) as count 
FROM order_items oi
INNER JOIN products p
ON oi.order_item_product_id = p.product_id
Hdooster / Big Data Analytics - logs_df.md
Last active Feb 26, 2016
Spark Analytics with Dataframes on HTTP Server Logs - Code on slides
View Big Data Analytics - logs_df.md
# Import logs -> this will give errors
logs_df = sqlCtx.load(
    source="com.databricks.spark.csv",
    header='true',
    inferSchema='true',
    path='file:///usr/lib/hue/apps/search/examples/collections/solr_configs_log_analytics_demo/index_data.csv')
View data.csv
Respondent;Age;Gender;OrgType;OrgSize;Position;Role-Marketing;Role-Sales;Role-Finance;Role-Production;RoleRD;RoleIT;Support;Industry;DegreeLevel;DegreeField-ScienceMathematics;DegreeField-Engineering;DegreeField-ICT;DegreeField-Commerce;DegreeField-Social;DegreeField-Law;DegreeField-Other;DataScienceExperience;Techniques-Algo;Techniques-Backend;Techniques-Bayes;Techniques-BigData;Techniques-Business;Techniques-Stats;Techniques-DataMan;Techniques-FrontEnd;Techniques-Graph;Techniques-ML;Techniques-Math;Techniques-Optimization;Techniques-ProdDev;Techniques-Science;Techniques-Simul;Techniques-Spatial;Techniques-StructData;Techniques-SurvMark;Techniques-SysAdmin;Techniques-Temp;Techniques-UnstructData;Techniques-Viz;DataScientistType-Developer;DataScientistType-Engineer;DataScientistType-Researcher;DataScientistType-Scientist;DataScientistType-Statistician;DataScientistType-JackOfAllTrades;DataScientistType-Artist;DataScientistType-Hacker;DataScientistType-Leader;DataScientistType-BizPerson;DataScientistType-Entre
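The header of data.csv separates its fields with semicolons rather than commas, so a parser needs to be told the delimiter explicitly. A minimal sketch using Python's standard `csv` module follows; the `sample` string reuses only the first four header fields from the file, and the single data row and the helper name `read_semicolon_csv` are made up for illustration.

```python
import csv
import io

# Hypothetical sample: the first four header fields of data.csv
# followed by one invented row (not taken from the real file).
sample = "Respondent;Age;Gender;OrgType\n1;34;F;Private\n"


def read_semicolon_csv(text):
    """Parse semicolon-delimited text into a list of dicts keyed by header."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return list(reader)


rows = read_semicolon_csv(sample)
print(rows[0]["Age"])
```

For the real file, the same `delimiter=";"` argument can be passed when wrapping an opened file object instead of the in-memory string.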