Skip to content

Instantly share code, notes, and snippets.

@tshaiman
Created February 24, 2020 14:54
Show Gist options
  • Save tshaiman/cd2707f34206e852db784ab01885023e to your computer and use it in GitHub Desktop.
Save tshaiman/cd2707f34206e852db784ab01885023e to your computer and use it in GitHub Desktop.
S3 Sink Job from Avro to Parquet.
{
"name" :"s3-parquet-connector",
"config":
{
"connector.class": "io.confluent.connect.s3.S3SinkConnector",
"storage.class": "io.confluent.connect.s3.storage.S3Storage",
"s3.region": "us-east-1",
"s3.bucket.name": "novarize-data",
"topics.dir": "parquet-demo-ts",
"flush.size": "100",
"rotate.schedule.interval.ms": "20000",
"auto.register.schemas": "false",
"tasks.max": "1",
"s3.part.size": "5242880",
"timezone": "UTC",
"parquet.codec": "snappy",
"topics": "ratings_avro",
"schema.registry.url": "http://schema-registry:8081",
"s3.credentials.provider.class": "com.amazonaws.auth.DefaultAWSCredentialsProviderChain",
"format.class": "io.confluent.connect.s3.format.parquet.ParquetFormat",
"value.converter": "io.confluent.connect.avro.AvroConverter",
"key.converter": "org.apache.kafka.connect.storage.StringConverter",
"partitioner.class": "io.confluent.connect.storage.partitioner.FieldPartitioner",
"partition.field.name": "USER_ID",
"value.converter.schema.registry.url": "http://schema-registry:8081"
}
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment