@karol-blaszczyk
Created November 4, 2019 16:10
%pyspark
import sys
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.transforms import *
# Create a Glue context
glueContext = GlueContext(SparkContext.getOrCreate())
# Create a DynamicFrame from the 'performance_reports' table
# in the 'reports' Data Catalog database
performance_DyF = glueContext.create_dynamic_frame.from_catalog(
    database="reports",
    table_name="performance_reports",
)
# Print the record count and the inferred schema
# (DynamicFrame.count() avoids an explicit toDF() conversion)
print("Records Count:", performance_DyF.count())
performance_DyF.printSchema()