@yvan
Last active February 22, 2023 23:34
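# pandas
import pandas as pd
# Unparseable values become NaT instead of raising an error.
df[col5] = pd.to_datetime(df[col5], errors='coerce')
# pyspark
from pyspark.sql import functions as F
# Date-like strings (optional time part), or the empty string.
data_regex = r"\d{2,4}(\.|\-|\/|\\)+\d{2,4}(\.|\-|\/|\\)+\d{2,4}(\s)*(\d{2}\:\d{2}\:\d{2})?(\.\d{3})?|^$"
# Parse only values the regex consumes entirely; everything else becomes null.
df = df.withColumn(col5, F.when(F.regexp_replace(F.col(col5), data_regex, '') == '',
                                F.to_timestamp(F.col(col5), 'yyyy/MM/dd')).otherwise(None))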
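To see which strings the date pattern accepts, it can help to try it outside Spark. The sketch below uses Python's standard `re` module on the same pattern and treats a value as date-like when the pattern consumes the whole string. The helper name `is_date_like` is hypothetical, and the sketch assumes Python's regex engine behaves like Spark's Java engine for this pattern.

```python
import re

# Same pattern as data_regex in the snippet above.
DATA_REGEX = re.compile(
    r"\d{2,4}(\.|\-|\/|\\)+\d{2,4}(\.|\-|\/|\\)+\d{2,4}(\s)*(\d{2}\:\d{2}\:\d{2})?(\.\d{3})?|^$"
)

def is_date_like(value):
    """Hypothetical helper: True when the pattern consumes the whole string."""
    if value is None:
        return False
    # Remove every match; nothing left over means the value is date-like.
    return DATA_REGEX.sub("", value) == ""

for sample in ["2023/02/22", "2023-02-22 23:34:00", "", "22.02.2023 oops", "not a date"]:
    print(repr(sample), is_date_like(sample))
```

Note that the pattern accepts several separators and field orders, while the format string `'yyyy/MM/dd'` parses only one of them, so a value can pass this check and still come back null from `to_timestamp`.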