@hinnerk
Created March 7, 2012 12:20
Timestamp collisions demo for this blog post: http://randnotizen.de/post/18898864701/will-openstack-swift-lose-data
import time


def test_time():
    collisions = 0  # number of collisions
    dist = {}  # distinct time differences and how often each occurred
    microsecond = 0.000001
    count = 0
    while count < 1000000:
        count += 1
        t1 = time.time()
        time.sleep(microsecond)
        t2 = time.time()
        # compare the two readings at five decimal places (10 microseconds)
        s1 = "%016.05f" % t1
        s2 = "%016.05f" % t2
        if s1 == s2:
            collisions += 1
        else:
            diff = t2 - t1
            dist[diff] = dist.get(diff, 0) + 1
    percent = collisions / (count / 100.0)
    print("Collisions:\t%s (%s%%)" % (collisions, percent))
    print("Min value:\t%s" % min(dist))
    print("Max value:\t%s" % max(dist))
    print("Dist values:\t%s" % len(dist))


if __name__ == '__main__':
    test_time()
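
The format string `%016.05f` keeps five decimal places, so two readings that round to the same 10-microsecond value produce identical strings. The sketch below shows this with fixed, made-up timestamps, so the result does not depend on clock or sleep behaviour. The helper name `format_timestamp` is hypothetical and not part of the gist.

```python
def format_timestamp(t):
    """Format a float timestamp with the same format string as the demo."""
    return "%016.05f" % t

# Made-up timestamps: a and b are 2 microseconds apart,
# c is 20 microseconds after a.
a = 1331122800.123451
b = 1331122800.123453
c = 1331122800.123471

print(format_timestamp(a))                          # 1331122800.12345
print(format_timestamp(a) == format_timestamp(b))   # True: collision
print(format_timestamp(a) == format_timestamp(c))   # False: distinct
```

Because the strings are zero-padded to a fixed width, they also compare correctly as plain text, which is presumably why the demo compares strings rather than floats.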
@gholt

gholt commented Mar 7, 2012

Cool on the disqus thing, we should probably move over to that. :)

What exactly happens depends somewhat upon the number of replicas Swift is configured for. If we assume the standard 3 replicas, you could theoretically end up with 3 different "answers" to the question: What is the data in this object? But, since all three were at the exact same collision time, there's really no way to know which is most correct anyway.
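
A toy model of how three replicas could end up with three different answers. This is not Swift code. It assumes a made-up rule where a replica overwrites its stored data only when the incoming timestamp is strictly newer, and it assumes the three concurrent writes reach each replica in a different order. The name `apply_writes` is hypothetical.

```python
def apply_writes(writes):
    """Apply (timestamp, data) writes in arrival order; return what is stored.

    Assumed toy rule: an incoming write replaces the stored one only if its
    timestamp is strictly newer, so on a tie the first arrival stays.
    """
    stored = None
    for timestamp, data in writes:
        if stored is None or timestamp > stored[0]:
            stored = (timestamp, data)
    return stored

ts = "1331122800.12345"  # all three writes collide on this timestamp
a, b, c = (ts, "data A"), (ts, "data B"), (ts, "data C")

# Same three writes, different arrival order on each replica.
replica_1 = apply_writes([a, b, c])
replica_2 = apply_writes([b, c, a])
replica_3 = apply_writes([c, a, b])

print(replica_1[1], replica_2[1], replica_3[1])  # data A data B data C
```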

Many systems use a content hash to resolve these conflicts at sync, but early in Swift's development it was decided that highly concurrent single object writes weren't a use case for Swift needing such extra resolution steps.
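
For illustration, a content-hash tie-break can be sketched as follows. This is the kind of step Swift chose not to take, not something Swift does. The function name `pick_winner` and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib

def pick_winner(versions):
    """Pick one version from a list of (timestamp_string, data_bytes) tuples.

    The newest timestamp wins. Ties are broken by the larger SHA-256 hex
    digest of the data, so every node that sees the same set of versions
    picks the same one, regardless of the order it saw them in.
    """
    return max(versions, key=lambda v: (v[0], hashlib.sha256(v[1]).hexdigest()))

versions = [
    ("1331122800.12345", b"written by client A"),
    ("1331122800.12345", b"written by client B"),
    ("1331122800.12345", b"written by client C"),
]

winner = pick_winner(versions)
# The choice does not depend on the order of the input list.
print(pick_winner(list(reversed(versions))) == winner)  # True
```

The winner is arbitrary in the sense that the hash says nothing about which write was "really" last. The only thing it buys is that all replicas converge on the same answer.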

Swift would keep those multiple versions until the hardware holding one of them failed, at which point the replacement would get a copy of one of the other versions, and so on until there was just one version again.
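
The convergence described here can be pictured with a toy model. It is not Swift's replication code: the helper `replace_failed` is hypothetical, and which replica fails and which one serves as the copy source are picked by hand.

```python
def replace_failed(replicas, failed, source):
    """Return a new replica list where slot `failed` was rebuilt from `source`."""
    rebuilt = list(replicas)
    rebuilt[failed] = replicas[source]
    return rebuilt

replicas = ["version A", "version B", "version C"]
print(len(set(replicas)))  # 3 distinct versions

# Hardware under replica 0 fails; it is rebuilt from replica 1.
replicas = replace_failed(replicas, failed=0, source=1)
print(len(set(replicas)))  # 2 distinct versions

# Later, replica 2 fails and is rebuilt from replica 1 as well.
replicas = replace_failed(replicas, failed=2, source=1)
print(len(set(replicas)))  # 1 distinct version
```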
