This is a set of instructions for use with the blog article Streaming data from Oracle using Oracle GoldenGate and Kafka Connect.
@rmoff / September 15, 2016
This code is intended to be run in Sense.
- The DELETE is just there so that you can re-run these statements, since Elasticsearch won't update an existing template.
- This is based on a single-node Elasticsearch instance, so the number of replicas is set to zero and the number of shards to one. In a multi-node Production cluster you'd want to set these differently. If you leave replicas as the default (1) on a single node, your Elasticsearch cluster will remain in "YELLOW" health status, because the replica shards can never be assigned.
- Any index whose name begins with soe will be matched against this template.
- The non_analysed_string_template template matches any string field and creates two instances of it: one analyzed and one not. The analyzed instance is tokenized, which is useful for full-text searching; the non-analyzed instance is necessary for aggregations against the full field value. For example, "New York" would otherwise aggregate as "New" and a separate "York".
- The dates template matches any field with a _ts suffix and sets it to a Date type. The inbound data must match the format shown. For details of the date format specifics, see the JodaTime documentation.
- Assuming that this is being used with the Elasticsearch Kafka Connect connector, set this in the connector's configuration properties:
  topic.schema.ignore=myTopic
  Otherwise the connector will take the schema and force it on Elasticsearch, ignoring the dynamic mapping.
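
For orientation, the notes above describe a template of roughly the following shape. This is only a sketch for Sense, not the original statements, and it assumes the Elasticsearch 2.x mapping syntax that was current when the article was written. The template name soe_template, the sub-field name raw, and the date format string are illustrative placeholders: substitute the names you actually use and the format your inbound data really has.

```json
# Remove any earlier copy so the statements can be re-run (soe_template is a placeholder name)
DELETE _template/soe_template

PUT _template/soe_template
{
  "template": "soe*",
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "dates": {
            "match": "*_ts",
            "mapping": {
              "type": "date",
              "format": "yyyy-MM-dd HH:mm:ss.SSSSSS"
            }
          }
        },
        {
          "non_analysed_string_template": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "fields": {
                "raw": { "type": "string", "index": "not_analyzed" }
              }
            }
          }
        }
      ]
    }
  }
}
```

In this sketch the dates entry is listed first on purpose. Dynamic templates are evaluated in order and the first match wins, so a catch-all string template placed ahead of it would claim the _ts fields before they could be mapped as dates. With the placeholder sub-field name used here, aggregations would target the non-analyzed copy of a field (for example city.raw, where city is a hypothetical field name), while full-text queries would use the field itself.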