Infrastructure @ CARTO dev technical test - DB iterator

DB iterator

Databases, especially PostgreSQL, are a core part of CARTO products. Depending on how data is organized internally on a database server, running some tasks on it may be tricky. We constantly work towards improving database performance and operations.

You will develop a small program that iterates through every table of every database in a PostgreSQL server, creating a new simple index on the "name" column of each table.
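As a rough sketch of the iteration described above, the program could list the databases from the `pg_database` catalog, open one connection per database (a PostgreSQL connection is bound to a single database), list the tables that have a "name" column, and build one `CREATE INDEX` statement per table. The helper names and the `<table>_name_idx` naming scheme below are assumptions made for illustration, not part of the exercise. Plain `CREATE INDEX` is used because `IF NOT EXISTS` is not available in every 9.x release.

```python
# Databases that accept connections and are not templates.
LIST_DATABASES_SQL = (
    "SELECT datname FROM pg_database "
    "WHERE datallowconn AND NOT datistemplate"
)

# Ordinary tables with a "name" column, skipping the system schemas.
LIST_TABLES_SQL = (
    "SELECT c.table_schema, c.table_name "
    "FROM information_schema.columns c "
    "JOIN information_schema.tables t "
    "ON t.table_schema = c.table_schema AND t.table_name = c.table_name "
    "WHERE c.column_name = 'name' "
    "AND t.table_type = 'BASE TABLE' "
    "AND c.table_schema NOT IN ('pg_catalog', 'information_schema')"
)


def quote_ident(name):
    """Quote an SQL identifier by doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_index_sql(schema, table):
    """Build the CREATE INDEX statement for one table.

    The index name '<table>_name_idx' is a hypothetical convention.
    """
    index_name = quote_ident(table + "_name_idx")
    target = quote_ident(schema) + "." + quote_ident(table)
    return "CREATE INDEX {} ON {} ({})".format(
        index_name, target, quote_ident("name")
    )
```

The resulting strings can be passed to whichever PostgreSQL driver the chosen language offers.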

At the end, the program will show two statistics: the 90th percentile and the maximum of the index creation times. This output must be easy for a hypothetical metrics system to read and parse.
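One possible way to produce the two statistics is sketched below. It assumes the nearest-rank definition of a percentile (the exercise does not say which definition to use) and prints a single JSON line, since JSON is easy for most metrics collectors to parse. The metric names are hypothetical.

```python
import json
import math


def percentile(values, percent):
    """Nearest-rank percentile: the smallest value with at least
    `percent` % of the data at or below it."""
    if not values:
        raise ValueError("no durations recorded")
    ordered = sorted(values)
    rank = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[rank - 1]


def format_metrics(durations):
    """Return the final statistics as one JSON line (names are assumptions)."""
    return json.dumps(
        {
            "index_creation_p90_seconds": percentile(durations, 90),
            "index_creation_max_seconds": max(durations),
        },
        sort_keys=True,
    )
```

Calling `print(format_metrics(durations))` once, after every index has been created, keeps the output to a single parseable line.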

You must prepopulate the database first. We don't want you to spend time on writing a program to prepopulate the database. We already have a small script that can do that for you. It's compatible with PostgreSQL 9.x and 10. You can grab it from https://gist.github.com/luisbosque/27c33a678e448af16894550c13130b8f.

You don't need to send us a PostgreSQL deployment or anything similar. We will test your program against one of our local PostgreSQL servers, which contains the same schema that the script above creates.

We expect

  • The program to be as fast and efficient as you can. We will test your code with a high number of databases/tables/rows. We are talking about hundreds or thousands of objects. That's why the efficiency of the program matters
  • The program to have at least two inputs: the IP and the username to connect to PostgreSQL
  • The program to handle, at least, simple code exceptions
  • The program that iterates over the databases and shows the final metrics to be written in a general-purpose programming language such as Python, Ruby, Perl, Go, Java or C. Please avoid shell scripting or anything similar
  • Clean, non-repetitive and well-documented code
  • You to tell us exactly which language and version we need in order to test it in a fully compatible environment. It must run in a Linux environment
  • That we will be able to run/test the program by executing a single command. If the program has any extra dependencies, this command must be responsible for installing them. Likewise, if the code needs to be compiled, the main program must be responsible for running the subcommands that compile it before running it. Also, any extra comments about the exercise must be included with the program (in a README file or something like that)
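The input and exception-handling expectations above could be met with something like the following sketch, which uses Python's standard `argparse` module. The flag names, the optional `--port` flag and the `run` callable (standing in for the real iteration logic) are assumptions made for illustration.

```python
import argparse
import sys


def parse_args(argv=None):
    """Parse the two required inputs: server address and username."""
    parser = argparse.ArgumentParser(
        description="Create an index on the 'name' column of every table."
    )
    parser.add_argument("--host", required=True, help="PostgreSQL IP or hostname")
    parser.add_argument("--user", required=True, help="PostgreSQL username")
    parser.add_argument("--port", type=int, default=5432, help="PostgreSQL port")
    return parser.parse_args(argv)


def main(argv=None, run=None):
    """Run the program and turn any failure into a non-zero exit code.

    `run` is a hypothetical callable that receives the parsed arguments
    and performs the actual work.
    """
    args = parse_args(argv)
    try:
        run(args)
    except Exception as exc:  # simple catch-all, reported on stderr
        print("error: {}".format(exc), file=sys.stderr)
        return 1
    return 0
```

A script entry point would then call `sys.exit(main(run=...))` with the real work function.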

Extra balls

  • Write two simple code tests
  • The code will use different resources of the server and probably even of the client. The right balance between the usage of these resources is important to avoid collapsing the instance. The extra ball is about making the program smart enough (or just thinking about it and sending us a brief description) to never collapse the server during execution while being as fast as possible, regardless of how many resources the server has
  • Think about how to make the populate script more efficient and send us a brief description of it
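One simple way to balance speed against server load is to run the index creations in a bounded worker pool and time each one. The sketch below only illustrates that idea: each task is a hypothetical zero-argument callable that creates one index, and `max_workers` is a tunable assumption. A smarter program might derive it from the server's CPU count or connection limit.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed(task):
    """Run one task and return how long it took, in seconds."""
    start = time.perf_counter()
    task()
    return time.perf_counter() - start


def run_bounded(tasks, max_workers=4):
    """Run tasks with at most `max_workers` executing at the same time.

    Returns the per-task durations in the same order as `tasks`.
    An exception raised by any task propagates to the caller.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(timed, tasks))
```

Threads fit here because each worker spends most of its time waiting on the database rather than using the client's CPU. The returned durations can feed the final statistics directly.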