@mkaranasou
Last active October 13, 2019 11:09
An example feature class
from pyspark.sql import functions as F


class FeatureAToBRatio(object):
    feature_name = 'a_to_b_ratio'
    default_value = 0.

    def calculate(self, df):
        """
        Given a dataframe that contains columns a and b,
        calculate the a to b ratio. If b is not greater than 0
        (zero, negative, or null), or if a is null, the result is
        the feature's default value.
        """
        df = df.withColumn(
            self.feature_name,
            F.when(
                # Divide only when b is strictly positive.
                F.col('b') > 0.,
                (
                    F.col('a').cast('float') / F.col('b').cast('float')
                )
            ).otherwise(
                self.default_value
            )
        # A null a yields a null ratio; replace it with the default.
        ).fillna({self.feature_name: self.default_value})
        return df
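
The per-row rule that the column expression encodes can be written out in plain Python, which may help when choosing expected values for a unit test. This is only a sketch: `expected_a_to_b_ratio` is a hypothetical helper, not part of the gist, and it ignores the 32-bit rounding that `cast('float')` applies in Spark.

```python
def expected_a_to_b_ratio(a, b, default_value=0.0):
    """Hypothetical plain-Python mirror of the a_to_b_ratio rule.

    Assumption: None stands in for a Spark null. Spark's cast('float')
    rounds to 32 bits first; that rounding is not reproduced here.
    """
    # Mirrors F.when(F.col('b') > 0., ...): a null or non-positive b
    # falls through to the otherwise() branch.
    if b is None or not b > 0:
        return default_value
    # Mirrors fillna(): a null a gives a null ratio, which is then
    # replaced by the default value.
    if a is None:
        return default_value
    return float(a) / float(b)


if __name__ == "__main__":
    for a, b in [(1, 2), (3, 0), (3, -1), (None, 2), (1, None)]:
        print(a, b, expected_a_to_b_ratio(a, b))
```

Only a strictly positive b produces a real ratio; every other combination collapses to the default.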