@jessejlt
Created May 14, 2012 19:02
ElasticSearch zen unicast discovery

Goal

I currently have a single machine running an ElasticSearch instance that already contains data and some index configurations. I have brought a new node online (a Linux VM) and would like to create a cluster between the master and said new node.

Issue

The new Linux node can't establish a connection to my master. Its log says:

[2012-05-14 11:46:59,891][WARN ][discovery.zen.ping.unicast] [media-node1] failed to send ping to [[#zen_unicast_1#][inet[/153.32.228.250:9300]]]
org.elasticsearch.transport.ReceiveTimeoutTransportException: [][inet[/153.32.228.250:9300]][discovery/zen/unicast] request_id [0] timed out after [3752ms]
	at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:347)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
	at java.lang.Thread.run(Thread.java:679)

Not reachable?

The log suggests that the master is not reachable from the new node, or at least not on port 9300. So, from the same Linux node, I queried the master's HTTP API (port 9200) for its cluster nodes info:

curl -XGET http://153.32.228.250:9200/_cluster/nodes | python -m json.tool
{
    "cluster_name": "elasticsearch_media", 
    "nodes": {
        "qUxSLpXNTNyO9jlq9OBf4w": {
            "attributes": {
                "master": "true"
            }, 
            "http": {
                "bound_address": "inet[/0.0.0.0:9200]", 
                "publish_address": "inet[/153.32.228.250:9200]"
            }, 
            "http_address": "inet[/153.32.228.250:9200]", 
            "jvm": {
                "mem": {
                    "heap_init": "256mb", 
                    "heap_init_in_bytes": 268435456, 
                    "heap_max": "1011.2mb", 
                    "heap_max_in_bytes": 1060372480, 
                    "non_heap_init": "23.1mb", 
                    "non_heap_init_in_bytes": 24317952, 
                    "non_heap_max": "130mb", 
                    "non_heap_max_in_bytes": 136314880
                }, 
                "pid": 83978, 
                "start_time": 1337019720794, 
                "version": "1.6.0_31", 
                "vm_name": "Java HotSpot(TM) 64-Bit Server VM", 
                "vm_vendor": "Apple Inc.", 
                "vm_version": "20.6-b01-415"
            }, 
            "name": "media-dev", 
            "network": {
                "refresh_interval": 5000
            }, 
            "os": {
                "refresh_interval": 1000
            }, 
            "process": {
                "id": 83978, 
                "max_file_descriptors": 150000, 
                "refresh_interval": 1000
            }, 
            "transport": {
                "bound_address": "inet[/0.0.0.0:9300]", 
                "publish_address": "inet[/153.32.228.250:9300]"
            }, 
            "transport_address": "inet[/153.32.228.250:9300]"
        }
    }
}

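So the master node is at least reachable from the new node over HTTP (port 9200), and it reports that its transport layer is bound to port 9300. This does not by itself confirm that port 9300 is reachable from the new node, since the request above only used port 9200.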
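
The curl request above goes to the HTTP port (9200), while the failed ping in the log targets the transport port (9300). A minimal sketch for checking raw TCP reachability of a port from the new node is shown here. The function name `can_connect` and the 3 second default timeout are assumptions made for illustration; a successful TCP connect only shows that something accepts connections on that port, not that discovery will succeed.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: it only tests that the port accepts TCP
    connections, not that ElasticSearch answers discovery pings on it.
    """
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Run on the new node, `can_connect("153.32.228.250", 9300)` returning False while `can_connect("153.32.228.250", 9200)` returns True would point towards a firewall or routing rule that affects only the transport port.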

Configuration

existing master

cluster.name: elasticsearch_media
node.name: "media-dev"
node.master: true
node.data: true
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["153.32.228.250[9300-9400]", "10.122.234.19[9300-9400]"]

new node

cluster.name: elasticsearch_media
node.name: "media-slave1"
node.master: true
node.data: true
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["153.32.228.250[9300-9400]", "10.122.234.19[9300-9400]"]

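
The `discovery.zen.ping.unicast.hosts` entries above use a `host[start-end]` form to give a port range. As a rough sketch, such an entry can be split into a host and a port range so that each port can be checked individually. The helper `parse_unicast_host` is hypothetical, and falling back to port 9300 for an entry without a range is an assumption of this sketch, not documented ElasticSearch behaviour.

```python
import re

# Matches entries such as "153.32.228.250[9300-9400]" or a bare "10.122.234.19".
_HOST_RE = re.compile(r"^(?P<host>[^\[\]:]+)(?:\[(?P<start>\d+)-(?P<end>\d+)\])?$")


def parse_unicast_host(entry, default_port=9300):
    """Split a 'host[start-end]' entry into (host, first_port, last_port).

    default_port is an assumption used only when no range is given.
    """
    match = _HOST_RE.match(entry.strip())
    if match is None:
        raise ValueError("unrecognised host entry: %r" % entry)
    host = match.group("host")
    if match.group("start") is None:
        return host, default_port, default_port
    start, end = int(match.group("start")), int(match.group("end"))
    if start > end:
        raise ValueError("empty port range in %r" % entry)
    return host, start, end
```

For the first entry in the configs, `parse_unicast_host("153.32.228.250[9300-9400]")` gives the host `153.32.228.250` with ports 9300 through 9400.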
@jessejlt
Author

jessejlt commented Jun 4, 2012

Discovery TRACE

existing master

[2012-06-04 13:38:08,992][INFO ][node                     ] [media-dev] {0.18.7}[46263]: initializing ...
[2012-06-04 13:38:08,998][INFO ][plugins                  ] [media-dev] loaded [], sites []
[2012-06-04 13:38:09,776][DEBUG][discovery.zen.ping.unicast] [media-dev] using initial hosts [], with concurrent_connects [10]
[2012-06-04 13:38:09,777][DEBUG][discovery.zen            ] [media-dev] using ping.timeout [3s]
[2012-06-04 13:38:09,781][DEBUG][discovery.zen.elect      ] [media-dev] using minimum_master_nodes [-1]
[2012-06-04 13:38:09,782][DEBUG][discovery.zen.fd         ] [media-dev] [master] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2012-06-04 13:38:09,784][DEBUG][discovery.zen.fd         ] [media-dev] [node  ] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2012-06-04 13:38:10,182][INFO ][node                     ] [media-dev] {0.18.7}[46263]: initialized
[2012-06-04 13:38:10,182][INFO ][node                     ] [media-dev] {0.18.7}[46263]: starting ...
[2012-06-04 13:38:10,264][INFO ][transport                ] [media-dev] bound_address {inet[/0.0.0.0:9300]}, publish_address {inet[/153.32.228.250:9300]}
[2012-06-04 13:38:10,312][TRACE][discovery                ] [media-dev] waiting for 30s for the initial state to be set by the discovery
[2012-06-04 13:38:13,318][DEBUG][discovery.zen            ] [media-dev] ping responses: {none}
[2012-06-04 13:38:13,321][INFO ][cluster.service          ] [media-dev] new_master [media-dev][WCOqnRtxSr-DgXy2UA4Amg][inet[/153.32.228.250:9300]]{master=true}, reason: zen-disco-join (elected_as_master)
[2012-06-04 13:38:13,357][TRACE][discovery                ] [media-dev] initial state set from discovery
[2012-06-04 13:38:13,358][INFO ][discovery                ] [media-dev] elasticsearch_media/WCOqnRtxSr-DgXy2UA4Amg
[2012-06-04 13:38:13,504][INFO ][http                     ] [media-dev] bound_address {inet[/0.0.0.0:9200]}, publish_address {inet[/153.32.228.250:9200]}
[2012-06-04 13:38:13,505][INFO ][node                     ] [media-dev] {0.18.7}[46263]: started
[2012-06-04 13:38:14,369][INFO ][gateway                  ] [media-dev] recovered [1] indices into cluster_state

new node

[2012-06-04 13:41:48,303][INFO ][node                     ] [media-slave1] {0.19.3}[29384]: initializing ...
[2012-06-04 13:41:48,319][INFO ][plugins                  ] [media-slave1] loaded [], sites []
[2012-06-04 13:41:50,104][DEBUG][discovery.zen.ping.unicast] [media-slave1] using initial hosts [], with concurrent_connects [10]
[2012-06-04 13:41:50,108][DEBUG][discovery.zen            ] [media-slave1] using ping.timeout [3s]
[2012-06-04 13:41:50,120][DEBUG][discovery.zen.elect      ] [media-slave1] using minimum_master_nodes [-1]
[2012-06-04 13:41:50,123][DEBUG][discovery.zen.fd         ] [media-slave1] [master] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2012-06-04 13:41:50,130][DEBUG][discovery.zen.fd         ] [media-slave1] [node  ] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2012-06-04 13:41:51,328][INFO ][node                     ] [media-slave1] {0.19.3}[29384]: initialized
[2012-06-04 13:41:51,329][INFO ][node                     ] [media-slave1] {0.19.3}[29384]: starting ...
[2012-06-04 13:41:51,423][INFO ][transport                ] [media-slave1] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.122.234.19:9300]}
[2012-06-04 13:41:51,438][TRACE][discovery                ] [media-slave1] waiting for 30s for the initial state to be set by the discovery
[2012-06-04 13:41:54,446][DEBUG][discovery.zen            ] [media-slave1] ping responses: {none}
[2012-06-04 13:41:54,456][INFO ][cluster.service          ] [media-slave1] new_master [media-slave1][fXMV-7rpRhu0sRWfonzS7Q][inet[/10.122.234.19:9300]]{master=true}, reason: zen-disco-join (elected_as_master)
[2012-06-04 13:41:54,561][TRACE][discovery                ] [media-slave1] initial state set from discovery
[2012-06-04 13:41:54,561][INFO ][discovery                ] [media-slave1] elasticsearch_media/fXMV-7rpRhu0sRWfonzS7Q
[2012-06-04 13:41:54,601][INFO ][http                     ] [media-slave1] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.122.234.19:9200]}
[2012-06-04 13:41:54,603][INFO ][node                     ] [media-slave1] {0.19.3}[29384]: started
[2012-06-04 13:41:54,620][INFO ][gateway                  ] [media-slave1] recovered [0] indices into cluster_state
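
Two details in the traces above may be worth a closer look: both nodes log `using initial hosts []`, and the two nodes report different versions (0.18.7 and 0.19.3). Whether either one explains the failed discovery is not established here. The sketch below pulls those two values out of log text in the format shown above; the regular expressions and the `summarize` function are written for this illustration and assume that exact line layout.

```python
import re

# "[media-dev] {0.18.7}[46263]: initializing ..."
VERSION_RE = re.compile(
    r"\[(?P<node>[^\]]+)\] \{(?P<version>[\d.]+)\}\[\d+\]: initializing"
)
# "[media-dev] using initial hosts [], with concurrent_connects [10]"
HOSTS_RE = re.compile(
    r"\[(?P<node>[^\]]+)\] using initial hosts (?P<hosts>\[.*\]), with concurrent_connects"
)


def summarize(log_text):
    """Return {node_name: {"version": ..., "initial_hosts": ...}} from log text.

    Keys are present only if the matching line was found for that node.
    """
    summary = {}
    for line in log_text.splitlines():
        match = VERSION_RE.search(line)
        if match:
            summary.setdefault(match.group("node"), {})["version"] = match.group("version")
        match = HOSTS_RE.search(line)
        if match:
            summary.setdefault(match.group("node"), {})["initial_hosts"] = match.group("hosts")
    return summary
```

An `initial_hosts` value of `[]` for a node means its log reported no unicast hosts at startup, which can then be compared against the `discovery.zen.ping.unicast.hosts` setting in that node's config file.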

@syllogismos

Were you able to make this work? This gist is from 2012. What did you find, and how did you fix it? 😥
