@rmoff
Last active July 27, 2017 14:01

This is a set of instructions for use with the blog article Streaming data from Oracle using Oracle GoldenGate and Kafka Connect.

@rmoff / September 15, 2016


This code is intended to be run in Sense.

  1. The DELETE is just there so that you can re-run these statements, since Elasticsearch won't update an existing template.

  2. This assumes a single-node Elasticsearch instance, so the number of replicas is set to zero and the number of shards to one. In a multi-node production cluster you'd want to set these differently. If you leave replicas at the default (1) on a single node, your Elasticsearch cluster will remain in "YELLOW" health status, because the replica shards can never be assigned.

  3. Any index beginning with soe will be matched against this template.

  4. The non_analysed_string_template template matches any string field and maps it as not_analyzed. An analyzed field is tokenized, which is useful for full-text searching; a non-analyzed field keeps its value whole, which is necessary for aggregations against the full field value. For example, "New York" would otherwise aggregate as 'New' and a separate 'York'.

  5. The dates template matches any field whose name ends in _ts and maps it as a date type. The inbound data must match the format shown. For details of the date format, see the Joda-Time documentation.

  6. If this is being used with the Elasticsearch Kafka Connect connector, set the following in the connector's configuration properties:

     topic.schema.ignore=myTopic

     Otherwise the connector will take the schema and force it on Elasticsearch, ignoring the dynamic mapping.
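
To make the aggregation point in item 4 concrete, here is a minimal Python sketch. It does not talk to Elasticsearch at all: the `whitespace_tokens` helper is a hypothetical, very rough stand-in for an analyzer (real analyzers also lowercase, strip punctuation and so on), and the city values are invented for illustration.

```python
from collections import Counter

# Hypothetical values of a string field across three documents.
cities = ["New York", "New York", "New Orleans"]


def whitespace_tokens(value):
    """Rough stand-in for an analyzer: split a value into separate terms."""
    return value.split()


def aggregate_analyzed(values):
    """Count individual terms, roughly what an analyzed field exposes."""
    return Counter(token for value in values for token in whitespace_tokens(value))


def aggregate_not_analyzed(values):
    """Count whole values, as with a not_analyzed field."""
    return Counter(values)


print(aggregate_analyzed(cities))
print(aggregate_not_analyzed(cities))
```

The first count splits "New York" into 'New' and 'York', while the second keeps each city as a single bucket, which is the behaviour the template is after.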

DELETE /_template/kafkaconnect/
PUT /_template/kafkaconnect/
{
  "template": "soe*",
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "dates": {
            "match": "*_ts",
            "mapping": {
              "type": "date",
              "format": "YYYY-MM-dd HH:mm:ss.SSSSSS"
            }
          }
        },
        {
          "non_analysed_string_template": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      ]
    }
  }
}
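
Whatever produces the `*_ts` values has to emit them in the format given in the dates template. As a rough illustration, assuming a Python producer (which is not part of the original pipeline), the `strftime` pattern below gives the same layout; `%f` is zero-padded microseconds, lining up with the six `S` characters. The `format_ts` and `parse_ts` names and the sample timestamp are hypothetical.

```python
from datetime import datetime

# strftime/strptime equivalent of the template's "YYYY-MM-dd HH:mm:ss.SSSSSS".
TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def format_ts(moment):
    """Render a datetime in the layout the dates template expects."""
    return moment.strftime(TS_FORMAT)


def parse_ts(text):
    """Parse a string in that layout back into a datetime."""
    return datetime.strptime(text, TS_FORMAT)


sample = datetime(2016, 9, 15, 12, 34, 56, 789)
print(format_ts(sample))  # 2016-09-15 12:34:56.000789
```

Round-tripping a value through `parse_ts(format_ts(...))` is a cheap way to check a producer's output before it reaches the index.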