Skip to content

Instantly share code, notes, and snippets.

@tejasmanohar

tejasmanohar/1 Secret

Last active October 15, 2016 08:50
Show Gist options
  • Save tejasmanohar/f0e705fb0d9e96f68e05e1ab20c478be to your computer and use it in GitHub Desktop.
Save tejasmanohar/f0e705fb0d9e96f68e05e1ab20c478be to your computer and use it in GitHub Desktop.
# this is mongo-initial.q:27017. it's the initial node I setup & ran the following mongo script on ->
# rs.initiate(); rs.add("1.mongo-subsequent.q"); rs.add("2.mongo-subsequent.q")
# for reference,
# - mongo-initial.q:27017 => 10.0.11.2
# - 1.mongo-subsequent.q => 10.0.155.142
# - 2.mongo-subsequent.q => 10.0.9.118
# /etc/mongod.conf => {"net":{"bindIp":["127.0.0.1","10.0.155.142","10.0.9.118"]},"replication":{"replSetName":"rs0"}}
# $ mongod --config /etc/mongod.conf # see output below
2016-10-15T08:29:40.064+0000 I CONTROL [initandlisten] MongoDB starting : pid=10 port=27017 dbpath=/data/db 64-bit host=b188dd520426
2016-10-15T08:29:40.065+0000 I CONTROL [initandlisten] db version v3.2.10
2016-10-15T08:29:40.065+0000 I CONTROL [initandlisten] git version: 79d9b3ab5ce20f51c272b4411202710a082d0317
2016-10-15T08:29:40.065+0000 I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.0.1t 3 May 2016
2016-10-15T08:29:40.065+0000 I CONTROL [initandlisten] allocator: tcmalloc
2016-10-15T08:29:40.065+0000 I CONTROL [initandlisten] modules: none
2016-10-15T08:29:40.065+0000 I CONTROL [initandlisten] build environment:
2016-10-15T08:29:40.065+0000 I CONTROL [initandlisten] distmod: debian81
2016-10-15T08:29:40.065+0000 I CONTROL [initandlisten] distarch: x86_64
2016-10-15T08:29:40.065+0000 I CONTROL [initandlisten] target_arch: x86_64
2016-10-15T08:29:40.065+0000 I CONTROL [initandlisten] options: { config: "/etc/mongod.conf", net: { bindIp: true }, replication: { replSet: "rs0" } }
2016-10-15T08:29:40.083+0000 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=1G,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2016-10-15T08:29:40.166+0000 I CONTROL [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
2016-10-15T08:29:40.166+0000 I CONTROL [initandlisten]
2016-10-15T08:29:40.167+0000 I CONTROL [initandlisten]
2016-10-15T08:29:40.167+0000 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
2016-10-15T08:29:40.167+0000 I CONTROL [initandlisten] ** We suggest setting it to 'never'
2016-10-15T08:29:40.167+0000 I CONTROL [initandlisten]
2016-10-15T08:29:40.167+0000 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
2016-10-15T08:29:40.167+0000 I CONTROL [initandlisten] ** We suggest setting it to 'never'
2016-10-15T08:29:40.167+0000 I CONTROL [initandlisten]
2016-10-15T08:29:40.182+0000 I REPL [initandlisten] Did not find local voted for document at startup; NoMatchingDocument: Did not find replica set lastVote document in local.replset.election
2016-10-15T08:29:40.182+0000 I REPL [initandlisten] Did not find local replica set configuration document at startup; NoMatchingDocument: Did not find replica set configuration document in local.system.replset
2016-10-15T08:29:40.182+0000 I FTDC [initandlisten] Initializing full-time diagnostic data capture with directory '/data/db/diagnostic.data'
2016-10-15T08:29:40.183+0000 I NETWORK [HostnameCanonicalizationWorker] Starting hostname canonicalization worker
2016-10-15T08:29:40.195+0000 I NETWORK [initandlisten] waiting for connections on port 27017
2016-10-15T08:29:41.021+0000 I NETWORK [initandlisten] connection accepted from 10.0.11.2:39108 #1 (1 connection now open)
2016-10-15T08:29:41.023+0000 I NETWORK [conn1] end connection 10.0.11.2:39108 (0 connections now open)
2016-10-15T08:29:41.088+0000 I NETWORK [initandlisten] connection accepted from 127.0.0.1:35528 #2 (1 connection now open)
2016-10-15T08:29:41.090+0000 I COMMAND [conn2] initiate : no configuration specified. Using a default configuration for the set
2016-10-15T08:29:41.090+0000 I COMMAND [conn2] created this configuration for initiation : { _id: "rs0", version: 1, members: [ { _id: 0, host: "b188dd520426:27017" } ] }
2016-10-15T08:29:41.090+0000 I REPL [conn2] replSetInitiate admin command received from client
2016-10-15T08:29:41.091+0000 I REPL [conn2] replSetInitiate config object with 1 members parses ok
2016-10-15T08:29:41.091+0000 I REPL [conn2] ******
2016-10-15T08:29:41.091+0000 I REPL [conn2] creating replication oplog of size: 1425MB...
2016-10-15T08:29:41.097+0000 I STORAGE [conn2] Starting WiredTigerRecordStoreThread local.oplog.rs
2016-10-15T08:29:41.097+0000 I STORAGE [conn2] The size storer reports that the oplog contains 0 records totaling to 0 bytes
2016-10-15T08:29:41.097+0000 I STORAGE [conn2] Scanning the oplog to determine where to place markers for truncation
2016-10-15T08:29:41.329+0000 I REPL [conn2] ******
2016-10-15T08:29:41.342+0000 I REPL [ReplicationExecutor] New replica set config in use: { _id: "rs0", version: 1, protocolVersion: 1, members: [ { _id: 0, host: "b188dd520426:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('5801e8f587f86fb67ea7b23c') } }
2016-10-15T08:29:41.342+0000 I REPL [ReplicationExecutor] This node is b188dd520426:27017 in the config
2016-10-15T08:29:41.342+0000 I REPL [ReplicationExecutor] transition to STARTUP2
2016-10-15T08:29:41.343+0000 I REPL [conn2] Starting replication applier threads
2016-10-15T08:29:41.343+0000 I COMMAND [conn2] command local.oplog.rs command: replSetInitiate { replSetInitiate: undefined } keyUpdates:0 writeConflicts:0 numYields:0 reslen:130 locks:{ Global: { acquireCount: { r: 5, w: 3, W: 2 }, acquireWaitCount: { W: 1 }, timeAcquiringMicros: { W: 57 } }, Database: { acquireCount: { w: 2, W: 1 } }, Metadata: { acquireCount: { w: 1 } }, oplog: { acquireCount: { w: 2 } } } protocol:op_command 252ms
2016-10-15T08:29:41.343+0000 I REPL [ReplicationExecutor] transition to RECOVERING
2016-10-15T08:29:41.345+0000 I REPL [ReplicationExecutor] transition to SECONDARY
2016-10-15T08:29:41.345+0000 I REPL [ReplicationExecutor] conducting a dry run election to see if we could be elected
2016-10-15T08:29:41.345+0000 I REPL [ReplicationExecutor] dry election run succeeded, running for election
2016-10-15T08:29:41.369+0000 I REPL [ReplicationExecutor] election succeeded, assuming primary role in term 1
2016-10-15T08:29:41.369+0000 I REPL [ReplicationExecutor] transition to PRIMARY
2016-10-15T08:29:41.371+0000 I REPL [conn2] replSetReconfig admin command received from client
2016-10-15T08:29:41.372+0000 I REPL [conn2] replSetReconfig config object with 2 members parses ok
2016-10-15T08:29:41.373+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 1.mongo-subsequent.q:27017
2016-10-15T08:29:41.374+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Successfully connected to 1.mongo-subsequent.q:27017
2016-10-15T08:29:41.375+0000 I REPL [ReplicationExecutor] New replica set config in use: { _id: "rs0", version: 2, protocolVersion: 1, members: [ { _id: 0, host: "b188dd520426:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 1, host: "1.mongo-subsequent.q:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('5801e8f587f86fb67ea7b23c') } }
2016-10-15T08:29:41.375+0000 I REPL [ReplicationExecutor] This node is b188dd520426:27017 in the config
2016-10-15T08:29:41.376+0000 I REPL [ReplicationExecutor] Member 1.mongo-subsequent.q:27017 is now in state STARTUP
2016-10-15T08:29:41.378+0000 I REPL [conn2] replSetReconfig admin command received from client
2016-10-15T08:29:41.380+0000 I REPL [conn2] replSetReconfig config object with 3 members parses ok
2016-10-15T08:29:41.381+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 2.mongo-subsequent.q:27017
2016-10-15T08:29:41.382+0000 I REPL [ReplicationExecutor] New replica set config in use: { _id: "rs0", version: 3, protocolVersion: 1, members: [ { _id: 0, host: "b188dd520426:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 1, host: "1.mongo-subsequent.q:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 2, host: "2.mongo-subsequent.q:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('5801e8f587f86fb67ea7b23c') } }
2016-10-15T08:29:41.382+0000 I REPL [ReplicationExecutor] This node is b188dd520426:27017 in the config
2016-10-15T08:29:41.382+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Successfully connected to 2.mongo-subsequent.q:27017
2016-10-15T08:29:41.382+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 2.mongo-subsequent.q:27017
2016-10-15T08:29:41.383+0000 I REPL [ReplicationExecutor] Member 2.mongo-subsequent.q:27017 is now in state STARTUP
2016-10-15T08:29:41.383+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Successfully connected to 2.mongo-subsequent.q:27017
2016-10-15T08:29:41.385+0000 I NETWORK [conn2] end connection 127.0.0.1:35528 (0 connections now open)
2016-10-15T08:29:42.344+0000 I REPL [rsSync] transition to primary complete; database writes are now permitted
2016-10-15T08:29:51.382+0000 I REPL [ReplicationExecutor] can't see a majority of the set, relinquishing primary
2016-10-15T08:29:51.382+0000 I REPL [ReplicationExecutor] Stepping down from primary in response to heartbeat
2016-10-15T08:29:51.382+0000 I REPL [replExecDBWorker-2] transition to SECONDARY
2016-10-15T08:29:51.387+0000 I REPL [ReplicationExecutor] Member 1.mongo-subsequent.q:27017 is now in state STARTUP
2016-10-15T08:29:51.387+0000 I REPL [ReplicationExecutor] Member 2.mongo-subsequent.q:27017 is now in state STARTUP
2016-10-15T08:30:02.361+0000 I REPL [ReplicationExecutor] Starting an election, since we've seen no PRIMARY in the past 10000ms
2016-10-15T08:30:02.361+0000 I REPL [ReplicationExecutor] conducting a dry run election to see if we could be elected
2016-10-15T08:30:02.361+0000 I REPL [ReplicationExecutor] VoteRequester: Got error processing response with status: BadValue: Unexpected field info in ReplSetRequestVotes, resp:{ info: "run rs.initiate(...) if not yet done for the set", ok: 0.0, errmsg: "no replset config has been received", code: 94 }
2016-10-15T08:30:02.361+0000 I REPL [ReplicationExecutor] VoteRequester: Got no vote from 1.mongo-subsequent.q:27017 because: , resp:{ info: "run rs.initiate(...) if not yet done for the set", ok: 0.0, errmsg: "no replset config has been received", code: 94 }
2016-10-15T08:30:02.361+0000 I REPL [ReplicationExecutor] VoteRequester: Got error processing response with status: BadValue: Unexpected field info in ReplSetRequestVotes, resp:{ info: "run rs.initiate(...) if not yet done for the set", ok: 0.0, errmsg: "no replset config has been received", code: 94 }
2016-10-15T08:30:02.361+0000 I REPL [ReplicationExecutor] VoteRequester: Got no vote from 2.mongo-subsequent.q:27017 because: , resp:{ info: "run rs.initiate(...) if not yet done for the set", ok: 0.0, errmsg: "no replset config has been received", code: 94 }
2016-10-15T08:30:02.362+0000 I REPL [ReplicationExecutor] not running for primary, we received insufficient votes
# and so on...
# this is 1.mongo-subsequent.q. it faces the same problems as 2.mongo-subsequent.q.
# i call them "subsequent" (vs "initial") because they were not the first node i ran `rs.initialize()` etc on.
# (though now i realize the naming is confusing since i actually waited for them to boot up before setting up the "initial"
# node so that `rs.add(...)` wouldn't timeout)
# from 1.mongo-subsequent.q, I can `mongo mongo-initial.q:27017` or `mongo 10.0.11.2` so everything seems to resolve properly
# cat /etc/mongod.conf -> {"net":{"bindIp":["127.0.0.1","10.0.9.118","10.0.11.2"]},"replication":{"replSetName":"rs0"}}
2016-10-15T08:29:38.516+0000 I CONTROL [initandlisten] MongoDB starting : pid=10 port=27017 dbpath=/data/db 64-bit host=e88848632507
2016-10-15T08:29:38.516+0000 I CONTROL [initandlisten] db version v3.2.10
2016-10-15T08:29:38.516+0000 I CONTROL [initandlisten] git version: 79d9b3ab5ce20f51c272b4411202710a082d0317
2016-10-15T08:29:38.516+0000 I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.0.1t 3 May 2016
2016-10-15T08:29:38.516+0000 I CONTROL [initandlisten] allocator: tcmalloc
2016-10-15T08:29:38.516+0000 I CONTROL [initandlisten] modules: none
2016-10-15T08:29:38.516+0000 I CONTROL [initandlisten] build environment:
2016-10-15T08:29:38.516+0000 I CONTROL [initandlisten] distmod: debian81
2016-10-15T08:29:38.516+0000 I CONTROL [initandlisten] distarch: x86_64
2016-10-15T08:29:38.516+0000 I CONTROL [initandlisten] target_arch: x86_64
2016-10-15T08:29:38.516+0000 I CONTROL [initandlisten] options: { config: "/etc/mongod.conf", net: { bindIp: true }, replication: { replSet: "rs0" } }
2016-10-15T08:29:38.534+0000 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=1G,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2016-10-15T08:29:38.618+0000 I CONTROL [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
2016-10-15T08:29:38.618+0000 I CONTROL [initandlisten]
2016-10-15T08:29:38.618+0000 I CONTROL [initandlisten]
2016-10-15T08:29:38.618+0000 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
2016-10-15T08:29:38.618+0000 I CONTROL [initandlisten] ** We suggest setting it to 'never'
2016-10-15T08:29:38.619+0000 I CONTROL [initandlisten]
2016-10-15T08:29:38.619+0000 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
2016-10-15T08:29:38.619+0000 I CONTROL [initandlisten] ** We suggest setting it to 'never'
2016-10-15T08:29:38.619+0000 I CONTROL [initandlisten]
2016-10-15T08:29:38.637+0000 I REPL [initandlisten] Did not find local voted for document at startup; NoMatchingDocument: Did not find replica set lastVote document in local.replset.election
2016-10-15T08:29:38.638+0000 I REPL [initandlisten] Did not find local replica set configuration document at startup; NoMatchingDocument: Did not find replica set configuration document in local.system.replset
2016-10-15T08:29:38.638+0000 I FTDC [initandlisten] Initializing full-time diagnostic data capture with directory '/data/db/diagnostic.data'
2016-10-15T08:29:38.638+0000 I NETWORK [HostnameCanonicalizationWorker] Starting hostname canonicalization worker
2016-10-15T08:29:38.647+0000 I NETWORK [initandlisten] waiting for connections on port 27017
2016-10-15T08:29:40.017+0000 I NETWORK [initandlisten] connection accepted from 10.0.11.2:36742 #1 (1 connection now open)
2016-10-15T08:29:40.019+0000 I NETWORK [conn1] end connection 10.0.11.2:36742 (0 connections now open)
2016-10-15T08:29:41.371+0000 I NETWORK [initandlisten] connection accepted from 10.0.11.2:36752 #2 (1 connection now open)
2016-10-15T08:29:41.372+0000 I NETWORK [conn2] end connection 10.0.11.2:36752 (0 connections now open)
2016-10-15T08:29:41.373+0000 I NETWORK [initandlisten] connection accepted from 10.0.11.2:36754 #3 (1 connection now open)
2016-10-15T08:29:41.374+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to b188dd520426:27017
2016-10-15T08:29:41.378+0000 I NETWORK [initandlisten] connection accepted from 10.0.11.2:36756 #4 (2 connections now open)
2016-10-15T08:29:41.378+0000 I NETWORK [conn4] end connection 10.0.11.2:36756 (1 connection now open)
2016-10-15T08:29:51.374+0000 I REPL [ReplicationExecutor] Error in heartbeat request to b188dd520426:27017; ExceededTimeLimit: Couldn't get a connection within the time limit
2016-10-15T08:29:51.384+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Failed to connect to b188dd520426:27017 - HostUnreachable: HostUnreachable
2016-10-15T08:29:53.374+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to b188dd520426:27017
2016-10-15T08:30:03.374+0000 I REPL [ReplicationExecutor] Error in heartbeat request to b188dd520426:27017; ExceededTimeLimit: Couldn't get a connection within the time limit
2016-10-15T08:30:03.378+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Failed to connect to b188dd520426:27017 - HostUnreachable: HostUnreachable
2016-10-15T08:30:05.374+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to b188dd520426:27017
2016-10-15T08:30:15.374+0000 I REPL [ReplicationExecutor] Error in heartbeat request to b188dd520426:27017; ExceededTimeLimit: Couldn't get a connection within the time limit
2016-10-15T08:30:15.380+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Failed to connect to b188dd520426:27017 - HostUnreachable: HostUnreachable
2016-10-15T08:30:17.374+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to b188dd520426:27017
2016-10-15T08:30:27.374+0000 I REPL [ReplicationExecutor] Error in heartbeat request to b188dd520426:27017; ExceededTimeLimit: Couldn't get a connection within the time limit
2016-10-15T08:30:27.379+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Failed to connect to b188dd520426:27017 - HostUnreachable: HostUnreachable
2016-10-15T08:30:29.374+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to b188dd520426:27017
2016-10-15T08:30:39.374+0000 I REPL [ReplicationExecutor] Error in heartbeat request to b188dd520426:27017; ExceededTimeLimit: Couldn't get a connection within the time limit
2016-10-15T08:30:39.379+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Failed to connect to b188dd520426:27017 - HostUnreachable: HostUnreachable
2016-10-15T08:30:41.374+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to b188dd520426:27017
+0000 I REPL [ReplicationExecutor] Error in heartbeat request to b188dd520426:27017; ExceededTimeLimit: Couldn't get a connection within the time limit
2016-10-15T08:30:51.382+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Failed to connect to b188dd520426:27017 - HostUnreachable: HostUnreachable
2016-10-15T08:30:53.374+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to b188dd520426:27017
2016-10-15T08:31:03.374+0000 I REPL [ReplicationExecutor] Error in heartbeat request to b188dd520426:27017; ExceededTimeLimit: Couldn't get a connection within the time limit
2016-10-15T08:31:03.383+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Failed to connect to b188dd520426:27017 - HostUnreachable: HostUnreachable
2016-10-15T08:31:05.374+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to b188dd520426:27017
2016-10-15T08:31:15.374+0000 I REPL [ReplicationExecutor] Error in heartbeat request to b188dd520426:27017; ExceededTimeLimit: Couldn't get a connection within the time limit
2016-10-15T08:31:15.379+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Failed to connect to b188dd520426:27017 - HostUnreachable: HostUnreachable
2016-10-15T08:31:17.374+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to b188dd520426:27017
2016-10-15T08:31:27.374+0000 I REPL [ReplicationExecutor] Error in heartbeat request to b188dd520426:27017; ExceededTimeLimit: Couldn't get a connection within the time limit
2016-10-15T08:31:27.382+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Failed to connect to b188dd520426:27017 - HostUnreachable: HostUnreachable
# and so on...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment