Chunk migration errors after adding second shard
--------------------------------
SHARD1 LOGS:
Soon after adding the second shard:
Mon Feb 18 16:10:31 [conn68] moveChunk updating self version to: 2|1||000000000000000000000000 through { index: "000007bdc6f22a4f19f36149b08cac55" } -> { index: "01d8d8f3fcac4efa684888aa46d9008f" } for collection 'feature.association'
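(Aside: the cluster's cached notion of the collection version can be inspected at any point; a minimal sketch using the standard getShardVersion admin command, run against mongos:)
mongos> db.adminCommand({ getShardVersion: "feature.association" })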
---
Mon Feb 18 16:11:18 [conn29881] assertion 13388 [feature.association] shard version not ok in Client::Context: version mismatch detected for feature.association, stored major version 2 does not match received 1 ( ns : feature.association, received : 1|84||000000000000000000000000, wanted : 2|0||000000000000000000000000, send ) ( ns : feature.association, received : 1|84||000000000000000000000000, wanted : 2|0||000000000000000000000000, send ) ns:feature.$cmd query:{ findandmodify: "association", query: { index: "14be3fa342209dacb92cfa69073829e0", communityId: ObjectId('4c927585d591d31d7b37097a')}, fields: { doccount: 1, entity1: 1, entity2: 1, verb: 1, db_sync_time: 1, db_sync_doccount: 1 }, update: { $addToSet: { entity1: { $each: [ "Fiserv Inc.", "fiserv, inc./company" ] }, entity2: { $each: [ "that its president and chief executive officer, jeffery yabuki, will present at the baird business solutions conference on february 26, ..." ] }, verb: { $each: [ "announce", "generic relations" ] } }, $inc: { doccount: 1 } }, upsert: true }
...
(note that the index in the findandmodify queries - 14be3fa... - is outside the range of the chunk being migrated, so I'm assuming there's a good chance these errors are a red herring; see the quick check after these excerpts)
Mon Feb 18 16:11:18 [conn29881] stale version detected during query over feature.$cmd : { $err: "[feature.association] shard version not ok in Client::Context: version mismatch detected for feature.association, stored major version 2 does not matc...", code: 13388, ns: "feature.association", vReceived: Timestamp 1000|84, vReceivedEpoch: ObjectId('000000000000000000000000'), vWanted: Timestamp 2000|0, vWantedEpoch: ObjectId('000000000000000000000000') }
......
Mon Feb 18 16:11:18 [conn21390] assertion 13388 [feature.association] shard version not ok in Client::Context: version mismatch detected for feature.association, stored major version 2 does not match received 1 ( ns : feature.association, received : 1|84||000000000000000000000000, wanted : 2|0||000000000000000000000000, send ) ( ns : feature.association, received : 1|84||000000000000000000000000, wanted : 2|0||000000000000000000000000, send ) ns:feature.$cmd query:{ findandmodify: "association", query: { index: "14be3fa342209dacb92cfa69073829e0", communityId: ObjectId('4c927585d591d31d7b37097a')}, fields: { doccount: 1, entity1: 1, entity2: 1, verb: 1, db_sync_time: 1, db_sync_doccount: 1 }, update: { $addToSet: { entity1: { $each: [ "Fiserv Inc.", "fiserv, inc./company" ] }, entity2: { $each: [ "that its president and chief executive officer, jeffery yabuki, will present at the baird business solutions conference on february 26, ..." ] }, verb: { $each: [ "announce", "generic relations" ] } }, $inc: { doccount: 1 } }, upsert: true }
...
Mon Feb 18 16:11:18 [conn21390] stale version detected during query over feature.$cmd : { $err: "[feature.association] shard version not ok in Client::Context: version mismatch detected for feature.association, stored major version 2 does not matc...", code: 13388, ns: "feature.association", vReceived: Timestamp 1000|84, vReceivedEpoch: ObjectId('000000000000000000000000'), vWanted: Timestamp 2000|0, vWantedEpoch: ObjectId('000000000000000000000000') }
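(Quick sanity check on the red-herring theory: the shard key values are hex strings, so plain string comparison in the shell confirms the failing key sorts above the chunk's max and is therefore outside the range being moved:)
mongos> "14be3fa342209dacb92cfa69073829e0" > "01d8d8f3fcac4efa684888aa46d9008f"
true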
---
Then the following pattern of log messages repeated every few seconds:
Mon Feb 18 16:11:29 [conn89] moveChunk data transfer progress: { active: true, ns: "feature.association", from: "replica_set1/10.40.131.234:27018,10.40.213.215:27018", min: { index: "000007bdc6f22a4f19f36149b08cac55" }, max: { index: "01d8d8f3fcac4efa684888aa46d9008f" }, shardKeyPattern: { index: 1 }, state: "clone", counts: { cloned: 0, clonedBytes: 0, catchup: 0, steady: 0 }, ok: 1.0 } my mem used: 0
Then:
Mon Feb 18 16:11:39 [conn89] warning: moveChunk error transferring data caused migration abort: { active: false, ns: "feature.association", from: "replica_set1/10.40.131.234:27018,10.40.213.215:27018", min: { index: "000007bdc6f22a4f19f36149b08cac55" }, max: { index: "01d8d8f3fcac4efa684888aa46d9008f" }, shardKeyPattern: { index: 1 }, state: "fail", errmsg: "Invalid modifier specified: _id", counts: { cloned: 16615, clonedBytes: 8342065, catchup: 0, steady: 0 }, ok: 1.0 }
---------------------
AFTER THIS, EVERY RETRY FAILS THE SAME WAY...
REPLICA 1
Tue Feb 19 09:56:11 [conn30135] moveChunk data transfer progress: { active: false, ns: "feature.association", from: "replica_set1/10.40.131.234:27018,10.40.213.215:27018", min: { index: "000007bdc6f22a4f19f36149b08cac55" }, max: { index: "01d8d8f3fcac4efa684888aa46d9008f" }, shardKeyPattern: { index: 1 }, state: "fail", errmsg: "Invalid modifier specified: _id", counts: { cloned: 16619, clonedBytes: 8341492, catchup: 0, steady: 0 }, ok: 1.0 } my mem used: 0
Tue Feb 19 09:56:11 [conn30135] warning: moveChunk error transferring data caused migration abort: { active: false, ns: "feature.association", from: "replica_set1/10.40.131.234:27018,10.40.213.215:27018", min: { index: "000007bdc6f22a4f19f36149b08cac55" }, max: { index: "01d8d8f3fcac4efa684888aa46d9008f" }, shardKeyPattern: { index: 1 }, state: "fail", errmsg: "Invalid modifier specified: _id", counts: { cloned: 16619, clonedBytes: 8341492, catchup: 0, steady: 0 }, ok: 1.0 }
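(A hedged way to watch these repeated aborts from mongos: the config database's changelog collection records every moveChunk attempt, e.g.:)
mongos> use config
mongos> db.changelog.find({ ns: "feature.association", what: /moveChunk/ }).sort({ time: -1 }).limit(3)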
--------------------------------
SHARD2 LOGS:
Tue Feb 19 16:09:43 [migrateThread] scoped connection to replica_set1/10.40.131.234:27018,10.40.213.215:27018 not being returned to the pool
Tue Feb 19 16:09:43 [migrateThread] about to log metadata event: { _id: "ip-10-226-114-59-2013-02-19T16:09:43-1599", server: "ip-10-226-114-59", clientAddr: ":27017", time: new Date(1361290183556), what: "moveChunk.to", ns: "feature.association", details: { min: { index: "000007bdc6f22a4f19f36149b08cac55" }, max: { index: "01d8d8f3fcac4efa684888aa46d9008f" }, step1 of 5: 2, step2 of 5: 1271, note: "aborted" } }
Tue Feb 19 16:09:43 [migrateThread] ERROR: migrate failed: Invalid modifier specified: _id
Tue Feb 19 16:09:46 [migrateThread] starting receiving-end of migration of chunk { index: "000007bdc6f22a4f19f36149b08cac55" } -> { index: "01d8d8f3fcac4efa684888aa46d9008f" } for collection feature.association from replica_set1/10.40.131.234:27018,10.40.213.215:27018 (1 slaves detected)
Tue Feb 19 16:09:47 [migrateThread] warning: moveChunkCmd deleted data already in chunk # objects: 16619
Counting distinct _ids in the failing chunk's range via mongos (compare with the 16619 documents the migration cloned):
mongos> a=db.runCommand({distinct:"association",key:"_id",query:{ index:{"$gte":"000007bdc6f22a4f19f36149b08cac55","$lt":"01d8d8f3fcac4efa684888aa46d9008f"} }})
mongos> a.values.length
27286
--------------------------------
Some browsing of the data (shard 1):
mongos> db.association.count({ index:{"$gte":"000007bdc6f22a4f19f36149b08cac55","$lt":"01d8d8f3fcac4efa684888aa46d9008f"},_id:{$type:7}})
27286
mongos> db.association.count({ index:{"$gte":"000007bdc6f22a4f19f36149b08cac55","$lt":"01d8d8f3fcac4efa684888aa46d9008f"}})
27286
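(Every document in the range has a plain ObjectId _id ($type 7), so that looks healthy. One speculative extra check - not from the original investigation - given the "Invalid modifier specified" abort: scan the range for any top-level field name starting with "$", which an update-style transfer could mishandle:)
mongos> db.association.find({ index: { "$gte": "000007bdc6f22a4f19f36149b08cac55", "$lt": "01d8d8f3fcac4efa684888aa46d9008f" } }).forEach(function (doc) {
...   for (var k in doc) { if (k.charAt(0) === "$") printjson({ _id: doc._id, badKey: k }); }
... })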
--------------------------------
Some browsing of the data (shard 2):
replica_set2:PRIMARY> db.association.find({},{index:1}).sort({index:1}).limit(1)
{ "_id" : ObjectId("506679b9c3deaab555f58ed9"), "index" : "0000114deb4e57a53f8aa31580a5bd5f" }
replica_set2:PRIMARY> db.association.find({},{index:1}).sort({index:-1}).limit(1)
{ "_id" : ObjectId("4f233052275e7a960a01f6ef"), "index" : "01d8d77c8a321837840e64a80595e1cb" }
replica_set1:PRIMARY> db.association.findOne({ "index" : "0000114deb4e57a53f8aa31580a5bd5f" })
{
	"_id" : ObjectId("506679b9c3deaab555f58ed9"),
	...
}
(i.e. shard 2's lowest document is also still present on shard 1)
replica_set2:PRIMARY> db.association.count()
16619
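(Since the migration never committed, shard 2 should own no chunks at all, making all 16619 documents on it orphans; a hedged check via mongos, assuming the shard is registered under its replica set name:)
mongos> use config
mongos> db.chunks.count({ ns: "feature.association", shard: "replica_set2" })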
replica_set2:PRIMARY> db.association.remove({})
replica_set2:PRIMARY> db.association.count()
0
<a little while later>
replica_set2:PRIMARY> db.association.count()
16619
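(The documents presumably reappear because the balancer keeps retrying the failed migration and re-clones the chunk onto shard 2. A minimal cleanup sketch, assuming a 2.2-era mongos: stop the balancer first so the remove sticks, then re-enable it once the source-side problem is fixed:)
mongos> sh.setBalancerState(false)
mongos> sh.getBalancerState()
false
replica_set2:PRIMARY> db.association.remove({})
replica_set2:PRIMARY> db.association.count()
0
mongos> sh.setBalancerState(true)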