# Note that I see the *EXACT* same results as the original post when running the original code.
# The fix below (switching the cursor to snapshot mode) stops that from happening.
# mongo.jar on my machine is the shipped 2.7.0 build, so it is the exact same Java driver as in the original test
# (I was lead on the 2.7.0 release and did the build/release)
b-mac mongodb/mongo-java-driver ‹master*› » scala -cp mongo.jar Repro.scala
Inserting canary...
Inserting test data...
Paging through records...
Spotted the canary!
Updating canary object...
b-mac mongodb/mongo-java-driver ‹master*› »

// Repro.scala
import com.mongodb._
import java.util.UUID

// Connect to Mongo
val mongo = new Mongo("localhost", 27017)
val db = mongo.getDB("repro_databoxor")
db.dropDatabase() // start clean, so a duplicate can't come from running the test twice and re-inserting
val collection = db.getCollection("repro")
var canarySightings = 0

// Insert our "canary" object.
println("Inserting canary...")
val canary = new BasicDBObject()
canary.put("name", "canary")
canary.put("value", "value")
collection.insert(canary)

// Insert 100,000 other objects.
println("Inserting test data...")
for (i <- 1 to 100000) {
  val doc = new BasicDBObject()
  doc.put("name", UUID.randomUUID.toString)
  doc.put("value", UUID.randomUUID.toString)
  collection.insert(doc)
}

// The function we'll call to operate on records returned from the DB.
def shipOrderToCustomer(doc: DBObject) {
  if (doc.get("name") == "canary") {
    canarySightings += 1
    println("Spotted the canary!")
    if (canarySightings > 1) println("Whoops, shipped the same order multiple times!")
  }
}

// In one thread (or process or machine, etc.), read through records and act on them.
val reader = new Thread(new Runnable {
  def run = {
    println("Paging through records...")
    // Switch cursor to snapshot mode to ensure we don't see duplicates due to update moves
    val cursor = collection.find().snapshot()
    while (cursor.hasNext) shipOrderToCustomer(cursor.next)
  }
})

// In another thread (or process, machine, etc.), update one of the records.
val updater = new Thread(new Runnable {
  def run = {
    println("Updating canary object...")
    val query = new BasicDBObject()
    query.put("name", "canary")
    val newDoc = new BasicDBObject()
    newDoc.put("name", "canary")
    // Grow the value so the updated document no longer fits in its original slot.
    var value = ""
    for (i <- 1 to 1000) value += UUID.randomUUID.toString
    newDoc.put("value", value)
    collection.update(query, newDoc)
  }
})

// Start the reader first, then the updater, and wait for both to finish.
reader.start()
updater.start()
reader.join()
updater.join()

For what it's worth, you'll probably be much happier using Casbah (the Scala driver) instead of the raw Java driver from Scala; proper iterators and support for native Scala types make life much easier.

As for your issue with seeing the same record more than once, it looks likely to be a cursor issue. By default, query results are not snapshotted; your update is likely causing the document to exceed its allocated space in the on-disk file and be relocated to the end, which can cause a previously opened cursor to see the same record twice (because it moved from its original location and the cursor passes over it again).

An 11-character change fixes this problem: append .snapshot() to the find() call (see above).
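
To make the mechanism concrete, here is a toy model in plain Scala with no MongoDB involved. It is only a sketch under stated assumptions: the "file" is a buffer of slots, an update that outgrows its slot frees that slot and appends the record at the end, and the snapshot-like scan walks a fixed list of ids as a stand-in for traversing the _id index. The names Record, CursorMoveDemo, growAndMove, naiveScan and snapshotLikeScan are all made up for illustration and are not part of any driver API.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical record type for the toy model.
final case class Record(id: Int, payload: String)

object CursorMoveDemo {
  // Three records laid out in storage order; None marks a freed slot.
  def buildStore(): ArrayBuffer[Option[Record]] =
    ArrayBuffer(Some(Record(1, "canary")), Some(Record(2, "a")), Some(Record(3, "b")))

  // An update that outgrows the record's slot: free the slot, append at the end.
  def growAndMove(store: ArrayBuffer[Option[Record]], id: Int, newPayload: String): Unit = {
    val slot = store.indexWhere(_.exists(_.id == id))
    if (slot >= 0) {
      store(slot) = None
      store += Some(Record(id, newPayload))
    }
  }

  // Plain scan in storage order. The update lands right after the first
  // record has been read, so the moved record is met again at the end.
  def naiveScan(): List[Int] = {
    val store = buildStore()
    val seen = ArrayBuffer.empty[Int]
    var pos = 0
    while (pos < store.length) {
      store(pos).foreach(r => seen += r.id)
      if (pos == 0) growAndMove(store, 1, "canary" * 100)
      pos += 1
    }
    seen.toList
  }

  // Snapshot-like scan: walk a fixed, ordered list of ids and look each one
  // up, so a record is visited once no matter where it lives in the file.
  def snapshotLikeScan(): List[Int] = {
    val store = buildStore()
    val ids = store.flatten.map(_.id).sorted.toList // stand-in for the _id index
    val seen = ArrayBuffer.empty[Int]
    for ((id, i) <- ids.zipWithIndex) {
      store.flatten.find(_.id == id).foreach(r => seen += r.id)
      if (i == 0) growAndMove(store, 1, "canary" * 100)
    }
    seen.toList
  }
}
```

In this model, CursorMoveDemo.naiveScan() reports record 1 twice (once in its original slot and once after the move), while CursorMoveDemo.snapshotLikeScan() reports each id exactly once. That is the same difference you see between find() and find().snapshot() in the repro.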

