We need to migrate Kafka from one machine to the other, ideally with 0 downtime.
- Spin up the new kafka environment
- Publish changes from production to both environments in parallel
- Confirm new environment is working
- Flip a pointer for the feeds that run off kafka
- Stop publishing to old kafka environment
- Remove old kafka entirely
- The `ChangeFeedPillow` and `user_groups_db_kafka_pillow` either need to be duplicated or updated to publish to multiple kafkas.
  - This likely means that `get_kafka_client_or_none` (and `get_kafka_client`) need to support passing in a client ID or something, similar to the distinction between `get_db()` and `get_db_for_doc_type()`.
  - You can leave the pillows ~untouched if you just figure out a way to get the list of kafkas into the processor and then update this line to send changes to all kafkas instead of just one (see the sketch below).
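As a rough illustration of both pieces (not the existing helpers): a producer lookup keyed by name plus a processor-side loop could look something like the sketch below. It assumes the `KAFKAS` settings dict proposed further down, kafka-python's `KafkaProducer`, and hypothetical helper names (`get_kafka_producer`, `publish_change_to_all_kafkas`); the real client/producer wiring in the codebase may differ.

```python
from django.conf import settings
from kafka import KafkaProducer  # kafka-python; the actual client wiring may differ

_producers = {}


def get_kafka_producer(name='default'):
    """Return a cached producer for the named cluster (hypothetical helper)."""
    if name not in _producers:
        _producers[name] = KafkaProducer(bootstrap_servers=settings.KAFKAS[name])
    return _producers[name]


def publish_change_to_all_kafkas(topic, payload):
    """The processor-side change: send each serialized change (bytes) to every
    configured kafka instead of only the default one."""
    for name in settings.KAFKAS:
        get_kafka_producer(name).send(topic, payload)
```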
- For the exact purposes of this change, a `get_all_kafkas()` utility function could be the easiest way to transition, and (for now) we just treat them as peers.
  - It would be good if this is configurable via some kind of settings dict, since we don't want to publish to multiple kafkas on all environments.
Something like the following could work, where any time someone asks for a kafka that isn't in the settings it just returns the default:
```python
KAFKAS = {
    'default': 'kafka0.internal.commcarehq.org:9092',
    'migration': 'kafka1.internal.commcarehq.org:9092',
}
```
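A minimal sketch of the lookup helpers under that assumption (Django-style settings; `get_all_kafkas` is the utility suggested above, `get_kafka_broker` is a made-up name):

```python
from django.conf import settings


def get_kafka_broker(name='default'):
    """Return the broker for `name`, falling back to the default when that
    name isn't configured (e.g. on environments with only one kafka)."""
    kafkas = getattr(settings, 'KAFKAS', {})
    return kafkas.get(name, kafkas.get('default'))


def get_all_kafkas():
    """Return every configured broker so publishers can treat them as peers."""
    return list(getattr(settings, 'KAFKAS', {}).values())
```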
- The `KafkaChangeFeed` class will need to be updated so that `_get_consumer` isn't hard-coded to pull from the main kafka.
- The pillows that use `KafkaChangeFeed` (e.g. the UCR pillow) will need to be updated to properly initialize the change feed with enough information to fetch the right kafka.
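A simplified sketch of what that could look like, assuming kafka-python's `KafkaConsumer` and the `KAFKAS` settings dict above; the `kafka_name` constructor argument and attribute names are assumptions, not the class's current interface:

```python
from django.conf import settings
from kafka import KafkaConsumer  # kafka-python


class KafkaChangeFeed(object):

    def __init__(self, topics, group_id, kafka_name='default'):
        self._topics = topics
        self._group_id = group_id
        # Which configured kafka this feed reads from; defaults to the main one.
        self._kafka_name = kafka_name

    def _get_consumer(self):
        # Resolve the broker from settings instead of hard-coding the main kafka.
        kafkas = settings.KAFKAS
        broker = kafkas.get(self._kafka_name, kafkas['default'])
        return KafkaConsumer(
            *self._topics,
            group_id=self._group_id,
            bootstrap_servers=broker,
        )
```

The UCR pillow (or any other consumer of `KafkaChangeFeed`) would then just pass the appropriate `kafka_name` when constructing its feed.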