@mostafam
Last active June 3, 2020 01:09
Take2
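from pyspark.sql.functions import udf, col
from uszipcode import SearchEngine  # assumed source of SearchEngine; the original omits this import

@udf('string')
def get_zip_udf2(latitude, longitude):
    # Build the search engine inside the UDF so it is created on the executor.
    search = SearchEngine()
    try:
        zipcode = search.by_coordinates(latitude, longitude, returns=1)[0].to_dict()["zipcode"]
    except Exception:
        # Any failure (no match, lookup error) is reported as 'bad'.
        zipcode = 'bad'
    return zipcode

df.withColumn('zip', get_zip_udf2(col("latitude"), col("longitude"))).show()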
# +---+---------+-----------+---+
# | id| latitude|  longitude|zip|
# +---+---------+-----------+---+
# |  1|33.704045| -90.754334|bad|
# |  2|45.019704|-123.014528|bad|
# |  3|26.306754| -80.259649|bad|
# +---+---------+-----------+---+
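
Every row above comes back as 'bad', and the catch-all handler hides the reason. One way to investigate, sketched here under assumptions, is to run the same lookup outside Spark and return the exception text instead of a fixed marker. The names `lookup_zip_verbose` and `EmptySearch` are hypothetical and not part of the gist; `EmptySearch` is only a stand-in so the sketch runs without `uszipcode` installed, and a real `SearchEngine()` instance could be passed in its place.

```python
def lookup_zip_verbose(search, latitude, longitude):
    """Same lookup as the UDF body, but report why it failed."""
    try:
        results = search.by_coordinates(latitude, longitude, returns=1)
        return results[0].to_dict()["zipcode"]
    except Exception as exc:
        # Surface the exception type and message instead of a bare 'bad'.
        return "error: {}: {}".format(type(exc).__name__, exc)


class EmptySearch:
    """Hypothetical stand-in for SearchEngine that never finds a match."""

    def by_coordinates(self, latitude, longitude, returns=1):
        return []


# With no match, indexing the empty result list raises IndexError.
print(lookup_zip_verbose(EmptySearch(), 33.704045, -90.754334))
```

If the real engine reports a different exception type here, that would point at the cause of the 'bad' values rather than at the coordinates themselves.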