Merrin Kurian mkurian

<!DOCTYPE html>
<meta charset="utf-8">
Hello, world!
More changes ....
mkurian / .gistup
Created April 7, 2020 21:47
hello world
gistup
mkurian / gist:91d7549514728b0fccd3399604dc8a1d
Created December 6, 2019 00:19
cassandra tombstone config
Tombstones are purged only after the period set by the table's gc_grace_seconds option (10 days by default) has elapsed. The delay ensures that any node that was down at the time of the deletion picks up the change once it recovers. These sources discuss this in detail: the blog post from thelastpickle (recommended), 1, 2, and the DSE documentation or Cassandra documentation.
You can lower gc_grace_seconds on an individual table to remove deleted data faster, but do this only for tables with TTLed data. You may also need to tune the tombstone_threshold and tombstone_compaction_interval compaction options so that tombstone compactions run sooner. See this document or this document for a description of these options.
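The three settings mentioned above can be changed with a single ALTER TABLE statement. The following is a minimal sketch, not a recommendation: the function name, the keyspace and table name, and all three values are hypothetical, and it assumes the table uses SizeTieredCompactionStrategy. The compaction class is restated because setting the compaction map replaces the existing compaction options.

```shell
# Print an ALTER TABLE statement that shortens the tombstone grace period and
# makes tombstone compactions more eager. $1 is the keyspace-qualified table.
# All values are illustrative assumptions; tune them for your own workload.
tombstone_tuning_cql() {
  cat <<CQL
ALTER TABLE $1
  WITH gc_grace_seconds = 86400
  AND compaction = {
    'class': 'SizeTieredCompactionStrategy',
    'tombstone_threshold': '0.1',
    'tombstone_compaction_interval': '3600'
  };
CQL
}

# Print the statement for review (hypothetical table name).
tombstone_tuning_cql my_keyspace.events_by_day
```

After reviewing the output, the statement could be applied with cqlsh -e "$(tombstone_tuning_cql my_keyspace.events_by_day)".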
for (i in 1..9) {
    val workerId = "workerId-$i"
    val kclConfig = KinesisClientLibConfiguration("fooWorker",
            streamConfig.streamArn, awsAuth.credentialsProvider(), workerId)
        .withMaxRecords(streamConfig.maxRecords)
        .withInitialPositionInStream(InitialPositionInStream.valueOf(streamConfig.streamPosition))
        .withRegionName(ddbStreamConfigProperties.region)
        .withDynamoDBEndpoint(ddbStreamConfigProperties.dynamoDBEndpoint)
fun kclConfig(streamConfig: StreamConfigProperties): KinesisClientLibConfiguration {
    return KinesisClientLibConfiguration(streamConfig.applicationName,
            streamConfig.streamArn, awsAuth.credentialsProvider(), streamConfig.workerId)
        .withMaxRecords(streamConfig.batchSize)
        .withIdleTimeBetweenReadsInMillis(streamConfig.pollingFrequency)
        .withInitialPositionInStream(InitialPositionInStream.valueOf(streamConfig.streamPosition))
        .withRegionName(streamConfig.region)
        .withDynamoDBEndpoint(streamConfig.dynamoDBEndpoint)
        .withCallProcessRecordsEvenForEmptyRecordList(true)
}
mkurian / KCLWorker.kt
Last active November 10, 2019 20:14
KCL worker for DynamoDB
val fooStreamWorker = Worker.Builder()
    .recordProcessorFactory(fooStreamRecordProcessorFactory)
    .config(kclConfig(fooStreamConfig))
    .kinesisClient(dynamoDBStreamsAdapterClient())
    .build()
val fooWorker = Thread(fooStreamWorker)
fooWorker.start()
fun dynamoDBStreamsAdapterClient(): AmazonDynamoDBStreamsAdapterClient {
    return AmazonDynamoDBStreamsAdapterClient(awsAuth.credentialsProvider())
}
CREATE TABLE Invoice (
    business_id bigint,
    customer_id timeuuid,
    invoice_id timeuuid,
    created timestamp,
    ……
    PRIMARY KEY (business_id, customer_id, invoice_id)
) WITH CLUSTERING ORDER BY (customer_id ASC, invoice_id DESC);
{
  "AttributeDefinitions": [
    {
      "AttributeName": "business_id",
      "AttributeType": "N"
    },
    {
      "AttributeName": "invoice_id",
      "AttributeType": "S"
    }
{
  "TableName": "Invoice",
  "AttributeDefinitions": [
    {
      "AttributeName": "business_id",
      "AttributeType": "N"
    },
    {
      "AttributeName": "customer_id",
      "AttributeType": "S"
mkurian / docker-cmds.md
Created May 8, 2019 23:20
Docker commands

Show a container's logs, including startup output

docker logs CONTAINER

List all containers, including stopped ones

docker ps -a

Remove all images matching a pattern to free disk space

docker images -a | grep "pattern" | awk '{print $3}' | xargs docker rmi -f
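Because the last stage force-removes images, it can help to preview which image IDs the grep and awk stages select before adding xargs docker rmi -f. The following is a minimal sketch, assuming the default docker images column layout in which the image ID is the third field; the function name and the sample rows are hypothetical.

```shell
# Read `docker images -a` style output on stdin and print the IMAGE ID
# (third whitespace-separated field) of every row matching the pattern in $1.
image_ids_matching() {
  grep -- "$1" | awk '{print $3}'
}

# Hypothetical sample rows standing in for real `docker images -a` output.
sample_rows='REPOSITORY     TAG      IMAGE ID       CREATED        SIZE
myapp/api      latest   1a2b3c4d5e6f   2 days ago     120MB
myapp/worker   v2       6f5e4d3c2b1a   3 weeks ago    98MB
postgres       11       aabbccddeeff   2 months ago   312MB'

# Prints the IDs of the two rows that contain "myapp".
printf '%s\n' "$sample_rows" | image_ids_matching "myapp"
```

With a real Docker installation, docker images -a | image_ids_matching "pattern" lists the candidates. Note that grep matches the pattern against the whole row, not only the repository name.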