Kleros Questions
1) How and where do you like to work?
I currently live in São Paulo, Brazil. I am open to moving to Lisbon, but it could take a while, so for now I'd choose to work remotely.
2) What is the most interesting programming problem you’ve worked on?
I was once performing a live production MongoDB migration that could not afford any downtime due to business
constraints (a lot of revenue would be lost). To complicate things further, the architecture consisted of multiple
"microservices", each running in multiple instances (about 200 VMs in total), but all of them shared the
same databases (the MongoDB being migrated and some Redis clusters).
The only viable strategy was to spin up the new MongoDB cluster and keep it in sync with the old one.
We would deploy all applications with both data sources configured, but still pointing to the old one. Once the clusters
were in sync, we would flip the switch and make the applications point to the new one. Once everything was pointing to
the new cluster and writes to the old one had stopped, we would make another deployment containing only the new data
source.
While this might look simple, there was an important problem: because the environment was elastic, there could be
scale-in and scale-out events during the MongoDB cluster sync. If we flipped the switch only once, all the live
instances would correctly write to the new cluster, but newly spun-up instances would still have the default
configuration pointing to the old one, so they would keep writing to it. This could lead to severe
inconsistencies.
What we did was take advantage of the fact that every instance was also connected to a given Redis cluster and use it
as a pub/sub channel. Instead of "flipping the switch" once, we broadcast a message every 200 ms telling all instances
to use the new cluster.
That interval was chosen because it was shorter than the time any application took to become available after starting,
so even instances created by a scale-out event were guaranteed to receive the switch message before they could serve
their first request.
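To illustrate the idea, here is a minimal sketch of that mechanism in TypeScript. This is not the original code: the client libraries (ioredis and the official MongoDB driver), the channel name, and the connection URIs are all assumptions made for the example.

```typescript
// Sketch only: the channel name and URIs below are made up for illustration.
import Redis from "ioredis";
import { MongoClient } from "mongodb";

const SWITCH_CHANNEL = "mongo-cluster-switch"; // hypothetical channel name

// Each instance keeps connections to both clusters, but routes every
// query through `activeClient`, which initially points to the old one.
const oldClient = new MongoClient("mongodb://old-cluster/app");
const newClient = new MongoClient("mongodb://new-cluster/app");
let activeClient = oldClient;

// All data access goes through this accessor, so reassigning
// `activeClient` redirects every subsequent query at once.
export const db = () => activeClient.db();

// Every instance subscribes on startup, before it starts serving traffic,
// so an instance created by a scale-out event hears the broadcast
// before it can accept its first request.
export async function listenForSwitch(): Promise<void> {
  await Promise.all([oldClient.connect(), newClient.connect()]);
  const subscriber = new Redis({ host: "redis-cluster" });
  await subscriber.subscribe(SWITCH_CHANNEL);
  subscriber.on("message", (_channel, message) => {
    if (message === "use-new-cluster") {
      activeClient = newClient; // subsequent queries hit the new cluster
    }
  });
}

// Run once from an operator's machine: repeat the broadcast every 200 ms,
// which is shorter than any application's startup time, so no instance
// can begin serving requests without having heard it.
export function broadcastSwitch(): void {
  const publisher = new Redis({ host: "redis-cluster" });
  setInterval(() => publisher.publish(SWITCH_CHANNEL, "use-new-cluster"), 200);
}
```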
The strategy worked pretty well: the total revenue loss was roughly 60 USD. We were never able to pinpoint the root
cause of that small loss, but it was probably some instance staying stuck on the old cluster for a few seconds because
of either a MongoDB or Redis driver issue (probably Redis, since we were using an outdated client).
3) What would you improve in one of our products?
I know it might be too early for that, but I'm following the Kleros vs Ricky 50 ETH Doge Escrow case
(https://court.kleros.io/cases/92) and find it very interesting. In the future, a "Wall of Fame" of notable cases,
showing the main highlights of each one, could help bring new people on board.
In the shorter term, I noticed that in the Court's Cases tab (https://court.kleros.io/cases) you can only see the cases
in which you are a juror. It would be nice to have a case search tool. It would serve several purposes:
- When I'm a juror in a given case, I could search for precedents to see how similar cases were judged in the past.
- As a newcomer to the platform, I could better grasp what it is about by being able to see what has already
been judged.
From a developer's perspective, while trying to answer question #4, I realized that there is no central repository
holding the contract addresses, which made them pretty difficult to find, because GitHub search is totally broken.
There should probably be a "Developer Center" where that kind of vital information is easy to access.
4) What is the value of ‘maxDrawingTime’ currently set in the KlerosLiquid contract deployed to mainnet that
https://court.kleros.io uses?
The value of `maxDrawingTime` as of June 30th, 2019, at 21:26 (GMT-3) is `7200`.
The hardest part was finding the actual address of KlerosLiquid on the mainnet. I happen to know Matheus, who works at
Kleros, and he pointed me to the right place to find it.
Here's the link to the gist with the code I used to get the answer:
https://gist.github.com/hbarcelos/f03095f2f7bf33d99d6e607b2136f66e
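For completeness, here is a minimal sketch of the general approach (not the exact code in the gist above). Since `maxDrawingTime` is a public state variable, Solidity auto-generates a getter for it, which any Ethereum client library can call. The contract address and RPC endpoint below are placeholders, not the real deployment details.

```typescript
// Sketch using ethers.js; the address and endpoint are placeholders.
import { ethers } from "ethers";

const KLEROS_LIQUID_ADDRESS = "0x0000000000000000000000000000000000000000"; // placeholder
const provider = new ethers.providers.JsonRpcProvider("https://mainnet.example-rpc.io");

// Public state variables get an auto-generated getter with the same name.
const abi = ["function maxDrawingTime() view returns (uint256)"];
const klerosLiquid = new ethers.Contract(KLEROS_LIQUID_ADDRESS, abi, provider);

async function main(): Promise<void> {
  const maxDrawingTime = await klerosLiquid.maxDrawingTime();
  console.log(maxDrawingTime.toString()); // printed 7200 at the time of writing
}

main().catch(console.error);
```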