Last active
August 29, 2015 13:57
nodetool status shows 2 nodes are down
Problem: nodetool status (run from node 6 only) shows 2 nodes as down, but both appear to be running.
* Environment
* Running Cassandra 2.0.1
* 6 nodes in the cluster
* nodetool status - from node 6 (10.82.23.119)
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Owns Host ID Token Rack
UN 10.84.2.135 13.71 GB 30.7% f6eb0cd6-f82d-41c9-91be-3f4e9dc4a59d -9223372036854775808 rack1
DN 10.82.23.116 3.43 GB 14.1% b114fe60-1bb0-4f7b-aaf3-b941f261026e -6614363251442489157 rack1
UN 10.84.13.121 13.86 GB 19.2% 5e4c7c51-447b-4c4a-a14b-b495814fb45e -3074457345618258603 rack1
UN 10.82.22.116 2.86 GB 29.2% 1ee1ec8f-6dc3-4d2c-8a3b-befe082933c1 2317766532957582345 rack1
DN 10.84.14.117 13.57 GB 4.1% 9bf76178-84d7-4c0f-8ad5-dedfd305a269 3074457345618258602 rack1
UN 10.82.23.119 8.03 GB 2.6% a952d445-5b7f-46ee-8238-4451dfaf0e50 3558217197862924070 rack1
* nodetool info - from node 6
Token : 3558217197862924070
ID : a952d445-5b7f-46ee-8238-4451dfaf0e50
Gossip active : true
Thrift active : true
Native Transport active: true
Load : 8.03 GB
Generation No : 1395260457
Uptime (seconds) : 790697
Heap Memory (MB) : 3134.13 / 7987.25
Data Center : datacenter1
Rack : rack1
Exceptions : 0
Key Cache : size 104857540 (bytes), capacity 104857600 (bytes), 54295052 hits, 61109135 requests, 0.000 recent hit rate, 14400 save period in seconds
Row Cache : size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds
* nodetool status - from node 2 (10.84.14.117)
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Owns Host ID Token Rack
UN 10.84.2.135 13.71 GB 30.7% f6eb0cd6-f82d-41c9-91be-3f4e9dc4a59d -9223372036854775808 rack1
UN 10.82.23.116 3.43 GB 14.1% b114fe60-1bb0-4f7b-aaf3-b941f261026e -6614363251442489157 rack1
UN 10.84.13.121 13.86 GB 19.2% 5e4c7c51-447b-4c4a-a14b-b495814fb45e -3074457345618258603 rack1
UN 10.82.22.116 2.86 GB 29.2% 1ee1ec8f-6dc3-4d2c-8a3b-befe082933c1 2317766532957582345 rack1
UN 10.84.14.117 13.57 GB 4.1% 9bf76178-84d7-4c0f-8ad5-dedfd305a269 3074457345618258602 rack1
UN 10.82.23.119 8.03 GB 2.6% a952d445-5b7f-46ee-8238-4451dfaf0e50 3558217197862924070 rack1
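Comparing the two views by eye is error-prone, so here is a minimal Python sketch that parses two saved `nodetool status` outputs and reports the nodes whose state differs. The function names `parse_status` and `diff_views` are made up for illustration, and the parser assumes the row layout shown above (two-letter state code, then address).

```python
def parse_status(text):
    """Map node address -> two-letter state code (e.g. 'UN', 'DN')."""
    states = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a status letter (U/D) followed by a state letter (N/L/J/M).
        if (len(fields) >= 2 and len(fields[0]) == 2
                and fields[0][0] in "UD" and fields[0][1] in "NLJM"):
            states[fields[1]] = fields[0]
    return states


def diff_views(view_a, view_b):
    """Return {address: (state_in_a, state_in_b)} for nodes where the two views disagree."""
    return {
        addr: (view_a.get(addr), view_b.get(addr))
        for addr in sorted(set(view_a) | set(view_b))
        if view_a.get(addr) != view_b.get(addr)
    }


# Rows copied from the two outputs above.
NODE6_VIEW = """\
-- Address Load Owns Host ID Token Rack
UN 10.84.2.135 13.71 GB 30.7% f6eb0cd6-f82d-41c9-91be-3f4e9dc4a59d -9223372036854775808 rack1
DN 10.82.23.116 3.43 GB 14.1% b114fe60-1bb0-4f7b-aaf3-b941f261026e -6614363251442489157 rack1
UN 10.84.13.121 13.86 GB 19.2% 5e4c7c51-447b-4c4a-a14b-b495814fb45e -3074457345618258603 rack1
UN 10.82.22.116 2.86 GB 29.2% 1ee1ec8f-6dc3-4d2c-8a3b-befe082933c1 2317766532957582345 rack1
DN 10.84.14.117 13.57 GB 4.1% 9bf76178-84d7-4c0f-8ad5-dedfd305a269 3074457345618258602 rack1
UN 10.82.23.119 8.03 GB 2.6% a952d445-5b7f-46ee-8238-4451dfaf0e50 3558217197862924070 rack1
"""
NODE2_VIEW = NODE6_VIEW.replace("DN ", "UN ")  # node 2 reports every node as UN

if __name__ == "__main__":
    for addr, (a, b) in diff_views(parse_status(NODE6_VIEW), parse_status(NODE2_VIEW)).items():
        print(f"{addr}: node 6 says {a}, node 2 says {b}")
```

Running this against the outputs above flags only 10.82.23.116 and 10.84.14.117, which matches what the two tables show.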
* Verified that all the ports are open
* Ran the following queries from node 6 using cqlsh with the consistency level set to ALL
select * from evnts where token(device_guid) > -5614363251442489156 limit 1;
select * from evnts where token(device_guid) < -5614363251442489156 limit 1;
select * from evnts where token(device_guid) > 3074457345618258603 limit 1;
All queries returned results successfully. These should fail if a node is really down, right?
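To check which node a given token lands on, here is a rough Python sketch of the ring lookup. It assumes one token per node (as the status output suggests) and that a node is the primary for the range (previous token, own token], which is my understanding of how the ring works. The replication factor and strategy are not stated in these notes, so this says nothing about which other replicas a read at ALL has to reach. `primary_owner` is a made-up helper name.

```python
from bisect import bisect_left

# (token, address) pairs taken from the nodetool status output above.
RING = sorted([
    (-9223372036854775808, "10.84.2.135"),
    (-6614363251442489157, "10.82.23.116"),
    (-3074457345618258603, "10.84.13.121"),
    (2317766532957582345, "10.82.22.116"),
    (3074457345618258602, "10.84.14.117"),
    (3558217197862924070, "10.82.23.119"),
])
TOKENS = [token for token, _ in RING]


def primary_owner(token):
    """Return the node whose token is the first one >= `token`, wrapping around the ring."""
    index = bisect_left(TOKENS, token)
    return RING[index % len(RING)][1]


if __name__ == "__main__":
    # Tokens just past the boundaries used in the first and third queries.
    for token in (-5614363251442489155, 3074457345618258604):
        print(token, "->", primary_owner(token))
```

Under those assumptions, the token just above -5614363251442489156 maps to 10.84.13.121 and the token just above 3074457345618258603 maps to 10.82.23.119, neither of which is one of the two nodes shown as DN. Whether the DN nodes were involved as replicas depends on the replication settings.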
* Enabled TRACE logging on node 6
log4j.logger.org.apache.cassandra.gms.Gossiper=TRACE
log4j.logger.org.apache.cassandra.gms.FailureDetector=TRACE
Here is the log output:
https://gist.github.com/thedebugger/9845383
Everything looks OK to me. I was expecting a problem similar to this one, http://thelastpickle.com/blog/2011/12/15/Anatomy-of-a-Cassandra-Partition.html, but no.
* In the logs I saw this only once:
WARN [OptionalTasks:1] 2014-03-28 11:40:57,635 HintedHandoffMetrics.java (line 79) /10.84.14.117 has 1 dropped hints, because node is down past configured hint window
* nodetool gossipinfo - from node 6
/10.84.13.121
SEVERITY:0.0
RACK:rack1
RPC_ADDRESS:10.84.13.121
NET_VERSION:7
SCHEMA:8295a71a-1d4f-38f7-a513-9a74c0aa8c5d
HOST_ID:5e4c7c51-447b-4c4a-a14b-b495814fb45e
RELEASE_VERSION:2.0.1
LOAD:1.4882345955E10
STATUS:NORMAL,-3074457345618258603
DC:datacenter1
/10.84.14.117
SEVERITY:0.0
RACK:rack1
RPC_ADDRESS:10.84.14.117
NET_VERSION:7
SCHEMA:8295a71a-1d4f-38f7-a513-9a74c0aa8c5d
HOST_ID:9bf76178-84d7-4c0f-8ad5-dedfd305a269
RELEASE_VERSION:2.0.1
LOAD:1.4574199355E10
STATUS:NORMAL,3074457345618258602
DC:datacenter1
/10.82.23.119
SEVERITY:0.0
RACK:rack1
RPC_ADDRESS:10.82.23.119
NET_VERSION:7
SCHEMA:8295a71a-1d4f-38f7-a513-9a74c0aa8c5d
RELEASE_VERSION:2.0.1
HOST_ID:a952d445-5b7f-46ee-8238-4451dfaf0e50
LOAD:8.624908077E9
DC:datacenter1
STATUS:NORMAL,3558217197862924070
/10.84.2.135
SEVERITY:0.0
RACK:rack1
RPC_ADDRESS:10.84.2.135
NET_VERSION:7
SCHEMA:8295a71a-1d4f-38f7-a513-9a74c0aa8c5d
HOST_ID:f6eb0cd6-f82d-41c9-91be-3f4e9dc4a59d
RELEASE_VERSION:2.0.1
LOAD:1.4718471412E10
STATUS:NORMAL,-9223372036854775808
DC:datacenter1
/10.82.22.116
SEVERITY:0.0
RACK:rack1
RPC_ADDRESS:10.82.22.116
NET_VERSION:7
SCHEMA:8295a71a-1d4f-38f7-a513-9a74c0aa8c5d
HOST_ID:1ee1ec8f-6dc3-4d2c-8a3b-befe082933c1
RELEASE_VERSION:2.0.1
LOAD:3.072407862E9
STATUS:NORMAL,2317766532957582345
DC:datacenter1
/10.82.23.116
SEVERITY:0.0
RACK:rack1
RPC_ADDRESS:10.82.23.116
NET_VERSION:7
SCHEMA:8295a71a-1d4f-38f7-a513-9a74c0aa8c5d
HOST_ID:b114fe60-1bb0-4f7b-aaf3-b941f261026e
RELEASE_VERSION:2.0.1
LOAD:3.685326727E9
STATUS:NORMAL,-6614363251442489157
DC:datacenter1
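The gossipinfo dump is long, so a small Python sketch like the one below can summarize it: it groups the `KEY:value` lines under each `/address` header, then checks whether every node reports the same SCHEMA and a STATUS starting with NORMAL. The names `parse_gossipinfo` and `summarize` are hypothetical, and the parser assumes the layout shown above. The sample only includes two of the six entries to keep it short.

```python
def parse_gossipinfo(text):
    """Parse `nodetool gossipinfo` output into {address: {key: value}}."""
    nodes, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("/"):
            # A line like "/10.84.14.117" starts a new endpoint section.
            current = nodes.setdefault(line[1:], {})
        elif current is not None and ":" in line:
            key, value = line.split(":", 1)
            current[key] = value
    return nodes


def summarize(nodes):
    """Return (set of distinct schema versions, sorted addresses whose STATUS is not NORMAL)."""
    schemas = {state.get("SCHEMA") for state in nodes.values()}
    not_normal = sorted(
        addr for addr, state in nodes.items()
        if not state.get("STATUS", "").startswith("NORMAL")
    )
    return schemas, not_normal


# Two entries copied from the output above (the two nodes node 6 reports as DN).
SAMPLE = """\
/10.84.14.117
SEVERITY:0.0
SCHEMA:8295a71a-1d4f-38f7-a513-9a74c0aa8c5d
HOST_ID:9bf76178-84d7-4c0f-8ad5-dedfd305a269
STATUS:NORMAL,3074457345618258602
/10.82.23.116
SEVERITY:0.0
SCHEMA:8295a71a-1d4f-38f7-a513-9a74c0aa8c5d
HOST_ID:b114fe60-1bb0-4f7b-aaf3-b941f261026e
STATUS:NORMAL,-6614363251442489157
"""

if __name__ == "__main__":
    schemas, not_normal = summarize(parse_gossipinfo(SAMPLE))
    print("distinct schema versions:", len(schemas))
    print("nodes not in NORMAL state:", not_normal)
```

For the entries above this reports a single schema version and no node outside NORMAL, so gossip state on node 6 agrees with the other nodes even though `nodetool status` marks two of them DN.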
* Why? Could there be a bug in nodetool? Or am I missing something?
Did you ever get a resolution to this? I am having the same problem...