Skip to content

Instantly share code, notes, and snippets.

@shanthoosh
shanthoosh / Test-deserializing-serialized-null-values-in-coordinator-stream
Last active April 26, 2019 16:54
Logs to demonstrate that ApplicatoinMaster ignores configurations with serialized null values from the coordinator stream.
Published the samza reserved configurations with serialized null values into coordinator stream:
```java
╰─λ./deploy/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic __samza_coordinator_pageview-profile-table-joiner_1 --property print.key=true --property key.separator="-" --from-beginning | grep "set-config"
[1,"set-config","streams.pageview-profile-table-joiner-1-partition_by-join.samza.system"]-{"host":"172.21.224.112","source":"job-runner","values":{"value":"kafka"},"username":"svenkata","timestamp":1556154205737}
@shanthoosh
shanthoosh / testing-change-log-manager.md
Last active April 9, 2019 23:56
Test logs for migrating the change-log manager and config to metadata store in samza

Samza-Yarn ApplicationMaster

Before the fix, there were four redundant metadata fetches of coordinator stream in samza-yarn ApplicationMaster:

[svenkata@svenkata-ld2 usercache]$ sudo grep "Fetching system stream metadata" ./appcache/application_1554757366064_0143/container_e43_1554757365064_0142_01_000001/logs/samza-job-coordinator.log
2019-04-09 14:21:16.275 [main] KafkaSystemAdmin [INFO] Fetching system stream metadata for [__samza_coordinator_samza-count-by-device_i001] from system samzametadatasystem
2019-04-09 14:21:16.275 [main] KafkaSystemAdmin [INFO] Fetching system stream metadata for [__samza_coordinator_samza-count-by-device_i001] from system samzametadatasystem
2019-04-09 14:21:16.275 [main] KafkaSystemAdmin [INFO] Fetching system stream metadata for [__samza_coordinator_samza-count-by-device_i001] from system samzametadatasystem
@shanthoosh
shanthoosh / application-master-logs.md
Created April 8, 2019 17:40
Testing the redundant coordinator stream reads fix by samza ApplicationMaster

Before the fix, coordinator stream was read redundantly 5 times by the metadata store as illustrated by the following logs:

xiliu-mn4:logs xiliu$ cat job-activity-aggregation-runner.out |grep "CoordinatorStreamStore" |grep "Registering system stream partition"

2019-03-31 08:55:09.655 [main] CoordinatorStreamStore [INFO] Registering system stream partition: SystemStreamPartition [samzametadatasystem, __samza_coordinator_job-activity-aggregation_i001, 0] with offset: 0.

2019-03-31 08:55:18.733 [main] CoordinatorStreamStore [INFO] Registering system stream partition: SystemStreamPartition [samzametadatasystem, __samza_coordinator_job-activity-aggregation_i001, 0] with offset: 0.

2019-03-31 08:55:26.661 [main] CoordinatorStreamStore [INFO] Registering system stream partition: SystemStreamPartition [samzametadatasystem, __samza_coordinator_job-activity-aggregation_i001, 0] with offset: 0.
@shanthoosh
shanthoosh / Testing input stream expansion of stateful samza jobs
Created January 11, 2019 03:20
Testing the input stream expansion for stateful samza jobs
Samza job name: StreamTableJoinExample
Samza application implementation: https://github.com/apache/samza-hello-samza/blob/master/src/main/java/samza/examples/cookbook/StreamTableJoinExample.java
Task to SSP assignment before the partition expansion:
```
╭─svenkata at svenkata-ld2 in /home/svenkata/code/samza/samza-hello-samza/samza-hello-samza (latest ✚5)
╰─λ grep -i 'grouped' /home/svenkata/code/samza/samza-hello-samza/samza-hello-samza/deploy/yarn/logs/userlogs/application_1547094655534_0006/container_1547094655534_0006_01_000001/samza-job-coordinator.log 0 < 19:13:09
2019-01-10 18:53:57.567 [main] JobModelManager$ [INFO] SystemStreamPartitionGrouper org.apache.samza.container.grouper.stream.GroupByPartition@58326051 has grouped the SystemStreamPartitions into 1 tasks with the following taskNames: {Partition 0=[SystemStreamPartition [kafka, pageview-profile-table-joiner-1-partition_by-join, 0], SystemStreamPartition [kafka, pageview-join-i
--- Shutdown of current leader. Leader joins the group as follower.
1074 00:26:10.274 [DEBUG] [TestEventLogger] 122723 [ZkClient-EventThread-585-127.0.0.1:54727] WARN org.apache.samza.zk.ZkJobCoordinator - Got Disconnected event for processor=2. Scheduling a coordinator stop.
1075 00:26:10.275 [DEBUG] [TestEventLogger] 122724 [ZkClient-EventThread-585-127.0.0.1:54727] INFO org.apache.samza.zk.ScheduleAfterDebounceTime - Scheduled action: ZK_SESSION_ERROR to run after: 500 milliseconds.
1076 00:26:10.275 [DEBUG] [TestEventLogger] 122724 [ZkClient-EventThread-585-127.0.0.1:54727] WARN org.apache.samza.zk.ZkJobCoordinator - Got Expired event for processor=2. Stopping the container and unregister the processor node.
1077 00:26:10.276 [DEBUG] [TestEventLogger] 122724 [ZkClient-EventThread-585-127.0.0.1:54727] INFO org.apache.samza.processor.StreamProcessor - Shutting down container in onJobModelExpired for processor:2
1078 00:26:10.276 [DEBUG] [TestEventLogger] 122724 [p-2-container-thread-0]