@kristoff-it
Created January 30, 2019 15:11
>>> full_df = spark.read.csv("pantheon.tsv", sep="\t", quote="", header=True, inferSchema=True)
>>> full_df.dtypes
[('en_curid', 'int'), ('name', 'string'), ('numlangs', 'int'), ('birthcity', 'string'), ('birthstate', 'string'), ('countryName', 'string'), ('countryCode', 'string'), ('countryCode3', 'string'), ('LAT', 'double'), ('LON', 'double'), ('continentName', 'string'), ('birthyear', 'string'), ('gender', 'string'), ('occupation', 'string'), ('industry', 'string'), ('domain', 'string'), ('TotalPageViews', 'int'), ('L_star', 'double'), ('StdDevPageViews', 'double'), ('PageViewsEnglish', 'int'), ('PageViewsNonEnglish', 'int'), ('AverageViews', 'double'), ('HPI', 'double')]
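
In the `dtypes` output above, `birthyear` comes back as `string` while the other numeric-looking columns are `int` or `double`. One plausible explanation (an assumption, since the gist does not say) is that the column holds at least one cell that does not parse as a number, which forces the whole column to fall back to `string`. The sketch below is a simplified pure-Python model of that fallback, not Spark's actual inference code; `infer_type` and the sample values are hypothetical.

```python
def infer_type(values):
    """Guess 'int', 'double', or 'string' for a column of raw text cells.

    Simplified illustration only: empty cells are skipped (treated as
    missing), and the narrowest type that parses every remaining cell wins.
    """
    cells = [v for v in values if v != ""]

    def all_parse(cast):
        # True only if every non-empty cell can be converted with `cast`.
        try:
            for cell in cells:
                cast(cell)
            return True
        except ValueError:
            return False

    if all_parse(int):
        return "int"
    if all_parse(float):
        return "double"
    return "string"


# Hypothetical sample columns, not taken from pantheon.tsv.
print(infer_type(["1809", "-356", "1452"]))     # every cell is an integer
print(infer_type(["40.7", "51", ""]))           # mixed ints and decimals
print(infer_type(["1809", "Unknown", "1452"]))  # one non-numeric cell
```

If that is what is happening here, casting the column after loading (for example with `full_df["birthyear"].cast("int")`) would turn the unparseable cells into nulls instead of keeping the column as text.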