@jovianlin
Created March 30, 2017 09:17
Custom Concat Columns for PySpark
from pyspark.sql.functions import col, concat, lit
# Columns to join, with a literal '|' separator between each pair
custom_concat = [col('appName'), lit('|'), col('platform'), lit('|'),
                 col('carrier'), lit('|'), col('connectionType'), lit('|'),
                 col('country'), lit('|'), col('city'), lit('|'),
                 col('userAgent')]
# Add a new column named "custom_col" holding the pipe-delimited values
union_df = union_df.withColumn('custom_col', concat(*custom_concat))
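
The hand-written list above repeats `lit('|')` between every column. A minimal sketch of building the same list from plain column names follows; `interleave` and `build_custom_concat` are hypothetical helper names introduced here for illustration, not part of the original gist or of PySpark.

```python
from itertools import chain, repeat


def interleave(items, sep):
    """Return items with sep placed between consecutive elements."""
    items = list(items)
    if not items:
        return []
    # Pair every item with a separator, flatten, then drop the trailing separator.
    return list(chain.from_iterable(zip(items, repeat(sep))))[:-1]


def build_custom_concat(column_names, sep='|'):
    """Build the argument list for concat() from plain column names (hypothetical helper)."""
    # Imported here so interleave() stays usable without PySpark installed.
    from pyspark.sql.functions import col, lit
    return interleave([col(name) for name in column_names], lit(sep))
```

Assuming `union_df` has the columns used above, the new column could then be added with `union_df.withColumn('custom_col', concat(*build_custom_concat(['appName', 'platform', 'carrier', 'connectionType', 'country', 'city', 'userAgent'])))`. Note that `concat` returns null when any input column is null; if that is not wanted, `concat_ws('|', ...)` from `pyspark.sql.functions` takes the separator once and skips null values.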