## README.md

      
This is a set of instructions for use with the blog article "Streaming data from Oracle using Oracle GoldenGate and Kafka Connect".
@rmoff / September 15, 2016

This code is intended to be run in Sense.


The DELETE is just there so that you can re-run these statements, since Elasticsearch won't update an existing template.


This is based on a single-node Elasticsearch instance, so the number of replicas is set to zero and the number of shards to one. In a multi-node production cluster you'd want to set these differently. If you leave replicas at the default (1) on a single node, your Elasticsearch cluster will remain in "YELLOW" health status, because the replica shards can never be assigned.
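
To see the effect of the replica setting, you can check cluster health and shard allocation from Sense. This is a minimal sketch; it assumes the standard cluster health and cat APIs are available on your Elasticsearch version.

```json
# Overall cluster status: look at "status" and "unassigned_shards" in the response
GET /_cluster/health

# One row per shard; replicas that cannot be placed show up as UNASSIGNED
GET /_cat/shards?v
```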


Any index beginning with soe will be matched against this template.
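
One way to confirm the match, sketched under the assumption that the template has already been loaded, is to create a throwaway index and inspect its settings. The index name soe_template_test is hypothetical and used only for illustration.

```json
# Confirm the template is registered
GET /_template/kafkaconnect

# Create an empty index whose name starts with "soe" (hypothetical name)
PUT /soe_template_test

# The settings should show one shard and zero replicas, picked up from the template
GET /soe_template_test/_settings

# Clean up the test index
DELETE /soe_template_test
```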


The non_analysed_string_template template matches any string field and maps it as not_analyzed. An analyzed field is tokenized, which is useful for full-text searching; a non-analyzed field is kept as a single value, which is necessary for aggregations against the full field value. For example, "New York" would otherwise aggregate as 'New' and a separate instance 'York'.
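
The template listed in this gist maps strings as not_analyzed only. If you also need the analyzed form for full-text search, a multi-field mapping is one option. The following is a sketch of such a variant, not part of the original template: the dynamic template name strings_with_raw and the sub-field name raw are assumptions, and the string type and not_analyzed setting apply to the same Elasticsearch versions as the template below.

```json
# Sketch only: variant of the template that keeps both forms of each string field.
# "strings_with_raw" and the "raw" sub-field name are assumptions for illustration.
DELETE /_template/kafkaconnect/
PUT /_template/kafkaconnect/
{
  "template": "soe*",
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "dates": {
            "match": "*_ts",
            "mapping": {
              "type": "date",
              "format": "YYYY-MM-dd HH:mm:ss.SSSSSS"
            }
          }
        },
        {
          "strings_with_raw": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "analyzed",
              "fields": {
                "raw": {
                  "type": "string",
                  "index": "not_analyzed"
                }
              }
            }
          }
        }
      ]
    }
  }
}
```

With this variant, aggregations would target the raw sub-field (for example a hypothetical CITY.raw), while full-text queries use the main field.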


The dates template matches any field with a _ts suffix and maps it as a date type. The inbound data must match the format shown in the template. For details of the date format specifics, see the Joda-Time documentation.
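
To check the date handling, you could index a test document and look at the resulting mapping. This is a sketch that assumes the template is already loaded; the index name soe_template_test, the type demo and the field created_ts are all hypothetical.

```json
# Index a test document; index, type and field names are hypothetical
PUT /soe_template_test/demo/1
{
  "created_ts": "2016-09-15 10:30:00.123456"
}

# created_ts should be reported with "type": "date" and the format from the template
GET /soe_template_test/_mapping

# Clean up the test index
DELETE /soe_template_test
```

A value that does not follow the format, such as an ISO 8601 timestamp with a T separator, should be rejected when the document is indexed.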


If you are using this with the Elasticsearch Kafka Connect connector, set the following in the connector's configuration properties:

    topic.schema.ignore=myTopic

Otherwise the connector will take the schema and force it on Elasticsearch, ignoring the dynamic mapping.

  
## kafkaconnect-elasticsearch-index-template.json
DELETE /_template/kafkaconnect/
PUT /_template/kafkaconnect/
{
  "template": "soe*",
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "dates": {
            "match": "*_ts",
            "mapping": {
              "type": "date",
              "format": "YYYY-MM-dd HH:mm:ss.SSSSSS"
            }
          }
        },
        {
          "non_analysed_string_template": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      ]
    }
  }
}