@afranzi
Last active October 23, 2018 22:06
Apply an MLflow model to a Spark DataFrame
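import mlflow.pyfunc
from pyspark.sql import SparkSession

# Reuse the active Spark session, or create one when running as a script.
spark = SparkSession.builder.getOrCreate()
model_path = 's3://<bucket>/mlflow/artifacts/1/0f8691808e914d1087cf097a08730f17/artifacts/model'
wine_path = '/Users/afranzi/Projects/data/winequality-red.csv'
# Wrap the logged MLflow model as a Spark UDF.
wine_udf = mlflow.pyfunc.spark_udf(spark, model_path)
# The file is ';'-delimited; inferSchema reads the features as numbers rather than strings.
df = (spark.read.format('csv').option('header', 'true').option('inferSchema', 'true')
      .option('delimiter', ';').load(wine_path))
# Feature columns passed to the model.
columns = ['fixed acidity', 'volatile acidity', 'citric acid',
           'residual sugar', 'chlorides', 'free sulfur dioxide',
           'total sulfur dioxide', 'density', 'pH',
           'sulphates', 'alcohol']
# Append the predictions and show the first 100 rows without truncating values.
df.withColumn('prediction', wine_udf(*columns)).show(100, False)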
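A missing or misspelled feature column only surfaces once Spark resolves the UDF arguments, so it can help to check the CSV header first. The following is a minimal sketch, not part of the original gist: `missing_columns` and `FEATURE_COLUMNS` are hypothetical names, and it assumes the first line of the file is a `;`-delimited header, as the read options above imply.

```python
import csv
import io

# Feature columns the snippet above passes to the UDF.
FEATURE_COLUMNS = [
    "fixed acidity", "volatile acidity", "citric acid",
    "residual sugar", "chlorides", "free sulfur dioxide",
    "total sulfur dioxide", "density", "pH",
    "sulphates", "alcohol",
]


def missing_columns(header_line, expected, delimiter=";"):
    """Return the expected column names that are absent from a delimited header line."""
    # csv.reader strips surrounding double quotes and handles the trailing newline.
    header = next(csv.reader(io.StringIO(header_line), delimiter=delimiter))
    present = {name.strip() for name in header}
    return [name for name in expected if name not in present]
```

With a local file, this could be called as `missing_columns(open(wine_path).readline(), FEATURE_COLUMNS)` before building the DataFrame; an empty list means every feature column is present.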