Jiawei Li iewaij

iewaij / spark.py
Last active November 27, 2020 14:51
from pyspark.sql import *
import matplotlib.pyplot as pyplot
import seaborn as sns
import pandas as pd
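# Start (or reuse) a local Spark session that uses all available cores.
spark = SparkSession.builder.master("local[*]").appName("MADS 2020").getOrCreate()
# Load the semicolon-separated machine log; the first row holds the column names.
data = spark.read.csv("data/machine_log.csv", inferSchema=True, header=True, sep=";")
# Draw a roughly 10% random sample with a fixed seed.
data_sample = data.sample(fraction=0.1, seed=42)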
# Some compounds have fewer produced units