GuidoS/.gitignore

## .gitignore
config.json

## 2021_test.md

      
    20210123 - Carto Test


I have updated the document adding my responses as block quotes to make it clear where I have added to the document. Please let me know if you have any further questions.
-Guido

Warming up with two support tickets:

Ticket #1

Hi folks,
I wanted to add interactivity to my map, but for some reason I can't make my popups work. What am I doing wrong?
Please find my code here, and a live version here. Thanks!
Ticket #1 Response


Hello Client,
I am sorry that you have been having issues with setting up your popup window.
I believe I have found the solution to your issue. After creating the content for the popup object, the content variable is not added to your popup object. Here is an example of code that will resolve this issue:
if (!popup.isOpen()) {
  popup.setContent(content); // add this line to add content to your popup object
  popup.openOn(map);
}
Please let me know if you have any other issues that I can address for you.
Guido Stein

Carto Support Team

Ticket #2

Hello Support,
I'm going through your materials, I've seen this query somewhere, and I'd need some help from your side on understanding what it does, can you give me a hand?
SELECT
    e.name,
    count(*) AS counts,
    sum(p.population) AS population
FROM
    european_countries e
    JOIN populated_places p
    ON  ST_Intersects(p.the_geom, e.the_geom)
GROUP BY e.name
Ticket #2 Response


Hello Client,
Happy to help you with this SQL explanation.
The following is a plain English translation of what the query is doing:

JOINs the european_countries and populated_places tables ON rows whose geometries intersect
GROUPs the joined rows by european_countries name
SELECTs, for each group:

the name from the european_countries table
the count of populated places that intersect the named country
the population sum across populated places that intersect the named country

Because this is an inner JOIN, countries with no intersecting populated places do not appear in the result.


Please let me know if you have any other questions.
Guido Stein

Carto Support Team
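The query semantics described in the response can be illustrated without a database. The following is a minimal sketch, assuming toy data: the countries are axis-aligned boxes, the places are points, and `intersects` is a hypothetical stand-in for `ST_Intersects` (real geometries are arbitrary shapes, and none of these values come from the actual tables).

```python
from collections import defaultdict

# Hypothetical toy data: countries as boxes (xmin, ymin, xmax, ymax),
# places as (x, y, population). Not taken from the real tables.
countries = {
    "A": (0, 0, 10, 10),
    "B": (10, 0, 20, 10),
}
places = [
    (2, 3, 1000),
    (5, 5, 2500),
    (15, 4, 700),
    (30, 30, 999),  # outside every country: dropped by the inner JOIN
]


def intersects(box, point):
    """Stand-in for ST_Intersects: True when the point lies in or on the box."""
    xmin, ymin, xmax, ymax = box
    x, y = point
    return xmin <= x <= xmax and ymin <= y <= ymax


def summarize(countries, places):
    """Return {name: (counts, population)}, mirroring GROUP BY e.name."""
    totals = defaultdict(lambda: [0, 0])
    for name, box in countries.items():        # FROM european_countries e
        for x, y, population in places:        # JOIN populated_places p
            if intersects(box, (x, y)):        # ON ST_Intersects(...)
                totals[name][0] += 1           # count(*)
                totals[name][1] += population  # sum(p.population)
    return {name: tuple(values) for name, values in totals.items()}


print(summarize(countries, places))
```

The nested loop is only for readability; a spatial database would normally use a spatial index rather than comparing every pair.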

Python Support Engineer Test - Sample creation

A customer would like to host and provision 10 row data samples to their clients and, since some source data would be geospatial, has decided to use CARTO for this purpose and would like a CARTO Support Team to help with that.
Goal:
Develop an ETL routine that reads a CSV or geospatial dataset, limits the results to 10 rows and then uploads it to a CARTO account.
The routine can be created in a .py script or Notebook format such as Jupyter Notebook.
Code efficiency, style and explanation will be evaluated.
Materials:
Sample dataset: https://data.cityofnewyork.us/Education/NYC-School-District-Boundaries/p5vh-vm7p
Steps for creating a free CARTO account here
Resources
You can use any documentation available.
TIP: Within our Developer Center CARTOframes page you can find some potentially useful resources: https://carto.com/developers/cartoframes/
In addition, here you can find specific useful links:
https://carto.com/developers/cartoframes/guides/Authentication/
https://carto.com/developers/cartoframes/guides/Data-Management/
https://carto.com/developers/cartoframes/examples/#example-upload-to-carto 👀
https://carto.com/developers/cartoframes/reference/
Bonus Track [100% OPTIONAL]
Create a routine that reads and uploads a CSV dataset into CARTO in chunks of approximately 100 MB.
Additional resources:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
Python Support Engineer Test - Sample creation Response


I have written a CLI app called upload2Carto that takes a CSV or GeoJSON data endpoint URL and picks the reader based on the format named in that URL. I have also created a template config file (template_config.json) for setting the username, api_key, url, and limit for this application. Because the real config.json is listed in .gitignore, private keys stay out of the repository.
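The precedence between the config file and the command line can be sketched as follows. This is a minimal illustration, assuming the `--user`, `--key`, `--url`, and `--limit` flags and the config keys `username`, `api_key`, `url`, and `limit`; the function names `load_config`, `parse_overrides`, and `resolve_settings` are hypothetical and not part of the script.

```python
import argparse
import json


def load_config(path):
    """Read settings from a JSON file; return {} if the file is missing."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


def parse_overrides(argv):
    """Parse CLI flags and keep only the ones the user actually passed."""
    parser = argparse.ArgumentParser(prog="upload2Carto")
    # dest names match the config keys so the two sources merge cleanly
    parser.add_argument("--user", dest="username")
    parser.add_argument("--key", dest="api_key")
    parser.add_argument("--url", dest="url")
    parser.add_argument("--limit", dest="limit", type=int)
    args = vars(parser.parse_args(argv))
    return {k: v for k, v in args.items() if v is not None}


def resolve_settings(path, argv):
    """Config file values first, then CLI values on top."""
    settings = load_config(path)
    settings.update(parse_overrides(argv))
    return settings
```

Calling `resolve_settings("config.json", ["--limit", "5"])` would keep every value from the file except `limit`, which the command line replaces.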
I have written the code to be self-documenting, and the CLI tool includes usage information (run it with --help).
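The optional bonus track (uploading a CSV in chunks of approximately 100 MB) is not covered by upload2Carto. A minimal sketch of the size-based splitting step, using only the standard library, might look like the following. The function name `iter_csv_chunks` is hypothetical, and the sketch assumes that no quoted field contains a line break.

```python
import io


def iter_csv_chunks(lines, max_bytes=100 * 1024 * 1024):
    """Yield CSV text chunks of roughly max_bytes, each starting with the header.

    `lines` is any iterable of text lines, for example an open file.
    Splitting on line boundaries assumes no quoted field contains a newline.
    """
    lines = iter(lines)
    header = next(lines, None)
    if header is None:
        return
    buffer, size = [header], 0
    for line in lines:
        line_size = len(line.encode("utf-8"))
        # Flush before this row would push the chunk over the limit,
        # but always keep at least one data row per chunk.
        if size and size + line_size > max_bytes:
            yield "".join(buffer)
            buffer, size = [header], 0
        buffer.append(line)
        size += line_size
    if size:
        yield "".join(buffer)


# Tiny demonstration with an 8-byte limit instead of 100 MB.
sample = io.StringIO("id,name\n1,a\n2,b\n3,c\n")
for chunk in iter_csv_chunks(sample, max_bytes=8):
    print(repr(chunk))
```

Each chunk could then be parsed with `pandas.read_csv(io.StringIO(chunk))` and, as an assumption about the upload step, sent with `to_carto(..., if_exists='append')` after the first chunk creates the table.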

CARTO On-premise exploration

There is a CARTO on-premise installation that you can connect to using the following command:
ssh -p 22022 -i support_candidate_test_rsa centos@support-engineer-test.carto.solutions
The support-test-rsa.pem key is available here.
For this test, you'll need to log in to the server and find the following information about the installation:
RPM packages related to CARTO that are installed on the system.
Directories that hold CARTO software, data and/or configuration files.
CARTO services/processes running on the system and the ports where they're listening.
Important notes:
(1) the full test can be completed without modifying anything in the system configuration. If you accidentally break something, don't worry, but please get in touch with HR, so we can restore the system.
(2) command history is disabled: take this into account in case you want to save any command
BONUS TRACK [100% OPTIONAL]

The 443 port is closed to the world, but you'll get extra points in the test if you manage to access the web application from your browser :)
Some tips in case you want to try:
The application is configured to work under the "carto.lan" domain name.
User: carto
Password: location

RPM packages related to CARTO that are installed on the system:

I ran the following command to:

list all the installed packages
filter that list to lines that include "carto"
print only the first column

yum list installed | grep carto | awk '{print $1}'

Here is the resulting output
carto-builder.x86_64
carto-dataservices.x86_64
carto-deps.x86_64
carto-fonts.x86_64
carto-maps-api.x86_64
carto-nginx.x86_64
carto-platter.x86_64
carto-postgresql.x86_64
carto-redis.x86_64
carto-sql-api.x86_64
carto-varnish.x86_64
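The same filter can be expressed in Python, which may be convenient when the check is part of a larger script. This is a sketch only: the function name `carto_packages` is hypothetical, and the sample text uses placeholder `VERSION` and `REPO` columns rather than real values from the server.

```python
def carto_packages(yum_output, keyword="carto"):
    """Return the first column of every line that mentions the keyword.

    Mirrors: yum list installed | grep carto | awk '{print $1}'
    """
    names = []
    for line in yum_output.splitlines():
        fields = line.split()
        if fields and keyword in line:
            names.append(fields[0])
    return names


# Placeholder input: VERSION and REPO stand in for the real columns.
sample = (
    "bash.x86_64            VERSION  REPO\n"
    "carto-builder.x86_64   VERSION  REPO\n"
    "carto-nginx.x86_64     VERSION  REPO\n"
)
print(carto_packages(sample))
```

On the server, the input text could come from `subprocess.run(["yum", "list", "installed"], capture_output=True, text=True).stdout`.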


Directories that hold CARTO software, data and/or configuration files.

I ran the following command to:

list all the currently running processes
filter lines to ones that include carto
print column 11 if 11 is not empty
remove duplicates

ps -ax | grep carto | awk '{if ($11) print $11}' | awk '!visited[$0]++'

Here is the resulting output
/opt/carto/builder/embedded/cartodb/config.ru
resque:work
-p
/data/config/varnish/default.vcl

After reviewing the output and investigating the one carto folder that was returned, I believe that the carto applications and associated files are stored in /opt/carto/. The output also references configuration files under /data/config/.
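Reducing the paths in that output to their top-level directories can be sketched in a few lines. The function name `top_level_dirs` and the `depth` parameter are hypothetical; the input tokens are the four values from the output above.

```python
from pathlib import PurePosixPath


def top_level_dirs(tokens, depth=2):
    """Return the distinct leading `depth` directories of absolute paths."""
    roots = set()
    for token in tokens:
        if not token.startswith("/"):
            continue  # skip flags and process titles such as "-p"
        parts = PurePosixPath(token).parts  # ('/', 'opt', 'carto', ...)
        # Only keep paths that reach below the requested depth.
        if len(parts) > depth + 1:
            roots.add(str(PurePosixPath(*parts[: depth + 1])))
    return sorted(roots)


tokens = [
    "/opt/carto/builder/embedded/cartodb/config.ru",
    "resque:work",
    "-p",
    "/data/config/varnish/default.vcl",
]
print(top_level_dirs(tokens))
```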

CARTO services/processes running on the system and the ports where they're listening

I ran the following command to:

list all the currently running processes
filter lines to ones that include carto
print the PID, the command name, and the 8th column when the 8th column is not empty (for the postgres backend lines, that column holds the client address and port of the connection)

ps -ax | grep carto | awk '{ if ($8) print $1, $5, $8 }'

Here is the resulting output
82 unicorn /data/config/builder/unicorn.conf.rb
87 /bin/sh exec
90 /opt/carto/sql-api/embedded/bin/node -c
91 /opt/carto/maps-api/embedded/bin/node -c
163 /opt/carto/dataservices/embedded/bin/postgres --config_file=/data/config/dataservices/postgresql.conf
166 /opt/carto/postgresql/embedded/bin/postgres --config_file=/data/config/postgresql/postgresql.conf
178 nginx: /opt/carto/nginx/embedded/sbin/nginx
261 /opt/carto/varnish/embedded/sbin/varnishd -P
273 unicorn /data/config/builder/unicorn.conf.rb
275 unicorn /data/config/builder/unicorn.conf.rb
278 unicorn /data/config/builder/unicorn.conf.rb
283 unicorn /data/config/builder/unicorn.conf.rb
285 unicorn /data/config/builder/unicorn.conf.rb
288 unicorn /data/config/builder/unicorn.conf.rb
291 unicorn /data/config/builder/unicorn.conf.rb
295 unicorn /data/config/builder/unicorn.conf.rb
304 /opt/carto/varnish/embedded/sbin/varnishd -P
766 postgres: 127.0.0.1(48420)
767 postgres: 127.0.0.1(48422)
768 postgres: 127.0.0.1(48426)
769 postgres: 127.0.0.1(48428)
770 postgres: 127.0.0.1(48430)
771 postgres: 127.0.0.1(48432)
772 postgres: 127.0.0.1(48434)
773 postgres: 127.0.0.1(48436)
778 postgres: 127.0.0.1(48438)
6740 postgres: 127.0.0.1(39434)
6741 postgres: 127.0.0.1(39438)
11063 postgres: 127.0.0.1(46782)
11064 postgres: 127.0.0.1(46790)
11066 postgres: 127.0.0.1(46792)
11068 postgres: 127.0.0.1(46802)
11081 postgres: 127.0.0.1(46874)
13688 postgres: 127.0.0.1(47746)
13689 postgres: 127.0.0.1(47750)
13690 postgres: 127.0.0.1(47762)
13691 postgres: 127.0.0.1(47768)
13693 postgres: 127.0.0.1(47770)
13695 postgres: 127.0.0.1(47836)
32425 postgres: 127.0.0.1(42156)
32426 postgres: 127.0.0.1(42160)
32449 postgres: 127.0.0.1(42194)
32451 postgres: 127.0.0.1(42196)
32462 postgres: 127.0.0.1(42240)
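The ps output does not list listening sockets directly. One way to map listening TCP ports to processes is to parse the output of `ss -tlnp` (root privileges may be needed to see process names for other users). The following is a sketch under that assumption; the function name `listening_ports` is hypothetical, and the sample text is made up for illustration, not captured from the test server.

```python
import re

# Matches the process name and pid inside ss's users:(("name",pid=123,fd=4)) column.
_PROC = re.compile(r'\("([^"]+)",pid=(\d+)')


def listening_ports(ss_output):
    """Return sorted (port, process_name, pid) tuples from `ss -tlnp` text."""
    found = set()
    for line in ss_output.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] != "LISTEN":
            continue  # skip the header and anything that is not a listener
        port = fields[3].rsplit(":", 1)[-1]  # local address is the 4th column
        if not port.isdigit():
            continue
        match = _PROC.search(line)
        name, pid = (match.group(1), int(match.group(2))) if match else ("?", None)
        found.add((int(port), name, pid))
    return sorted(found, key=lambda item: (item[0], item[1]))


# Made-up sample, not captured from the test server.
SAMPLE = """\
State  Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0      128    *:9090             *:*    users:(("otherd",pid=5678,fd=3))
LISTEN 0      128    127.0.0.1:8080     *:*    users:(("exampled",pid=1234,fd=6))
"""
print(listening_ports(SAMPLE))
```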


## template_config.json
{
  "username": "",
  "api_key": "",
  "url": "https://data.cityofnewyork.us/api/geospatial/p5vh-vm7p?method=export&format=GeoJSON",
  "limit":10
}

## upload2Carto.py
import argparse
import json

from pandas import read_csv
from geopandas import GeoDataFrame, read_file
from shapely import wkt
from cartoframes.auth import Credentials
from cartoframes import to_carto

CONFIG = 'config.json'
TABLE_NAME = 'test_data_upload'


def main():
    # start from the config file, then let any cli args override it
    args = get_config(CONFIG)
    args.update(get_cli_args())

    if not args.get('url'):
        raise SystemExit('no data source url given (set it in config.json or pass --url)')

    # build credentials from the merged settings so --user and --key take effect
    creds = Credentials(username=args.get('username'),
                        api_key=args.get('api_key'))

    # get gdf, keep the first `limit` rows, and send it to carto
    gdf = get_gdf(args['url']).head(args.get('limit', 10))
    to_carto(gdf, TABLE_NAME, if_exists='replace', credentials=creds)


def get_config(config_file):
    # a missing or malformed config file is not fatal: cli args may supply everything
    try:
        with open(config_file) as f:
            return json.load(f)
    except (OSError, json.JSONDecodeError):
        return {}


def get_cli_args():
    my_parser = argparse.ArgumentParser(
        prog='upload2Carto', description='App for uploading data to Carto')

    # dest names match the config file keys so cli values override them directly
    my_parser.add_argument('--user', dest='username', help='carto username')
    my_parser.add_argument('--key', dest='api_key', help='carto key')
    my_parser.add_argument('--url', dest='url', help='data source url')
    my_parser.add_argument('--limit', dest='limit',
                           help='limit number of rows', type=int)

    args = my_parser.parse_args()
    # keep only the options that were passed (a limit of 0 is still a value)
    return {k: v for (k, v) in vars(args).items() if v is not None}


def get_gdf(source):
    # pick a reader from the format named in the source url
    fmt = source.lower()

    # Process csv data: the_geom holds WKT strings that must be parsed
    if 'csv' in fmt:
        df = read_csv(source)
        df['the_geom'] = df['the_geom'].apply(wkt.loads)
        return GeoDataFrame(df, geometry='the_geom')

    # Process geojson
    if 'geojson' in fmt:
        return read_file(source)

    raise ValueError('unsupported data source: {}'.format(source))


if __name__ == "__main__":
    # execute only if run as a script
    main()