This is a "quick and simple guide" about how to use Terraform in your projects 🤓
First, install Terraform by following the official [installation guide].
Then check your installation...
By Anthony Vilarim Caliani
This is an experiment using Spark array functions.
In this example I'm using the Fraudulent Transactions Data dataset, so thanks to Chitwan Manchanda for sharing it.
"""Information Script for Apache Spark.
How to use?
> spark-submit info.py
"""
import os
import platform
import sys
from contextlib import contextmanager
import logging as log
import sys
from functools import wraps
from random import choice
from typing import Any, List
FRUITS = ['🍏', '🍎', '🍐', '🍊', '🍋', '🍌', '🍉', '🍇', '🍓', '🍈', '🍒', '🍑', '🥭', '🍍', '🥥', '🥝', '🍅']
SPORTS = ['⚽️', '🏀', '🏈', '⚾️', '🎾', '🏐', '🎱']
FROM python:3.9
ENV JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64"
ENV SPARK_HOME="/opt/spark"
ENV SPARK_VERSION="3.1.2"
ENV HADOOP_VERSION="3.2"
ENV PATH="$SPARK_HOME/bin:$PATH"
ENV PYSPARK_PYTHON=python
import platform
import sys
if __name__ == '__main__':
    py_version = platform.python_version()
    prefix = sys.prefix
    # On Python 3, base_prefix points at the base installation even inside a venv.
    base_prefix = sys.base_prefix if py_version.startswith('3') else sys.exec_prefix
    print('------------< venv checker >------------')
    print('Python Version.: {}'.format(py_version))
    print('Prefix.........: {}'.format(prefix))
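The checker above collects `sys.prefix` and `sys.base_prefix`, which is enough to tell whether the interpreter is running inside a virtual environment. Here is a minimal sketch of that comparison, assuming Python 3.3+ and an environment created with `venv`. The function name `in_virtualenv` is my own and is not part of the original script.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter appears to run inside a virtual environment."""
    # Assumption: Python 3.3+, where venv points sys.prefix at the environment
    # directory while sys.base_prefix keeps pointing at the base installation.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base_prefix


if __name__ == "__main__":
    print("Inside a venv..: {}".format(in_virtualenv()))
```

Outside a virtual environment both prefixes are the same path, so the function returns `False`.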
By Anthony Vilarim Caliani
This is an example of writing a single positional file.
In this example I'm using the Avocado Prices dataset, so thanks to Justin Kiggins for sharing it.
The important thing here is the code, but if you want to execute it there is a run.sh
to help you out.
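If "positional file" is new to you: every field sits at a fixed column offset, so each value is padded (or cut) to a fixed width. The sketch below shows the idea in plain Python, not in Spark. The layout, field names, and sample values are hypothetical and are not taken from the actual code or dataset schema.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (field name, width). Illustrative only.
LAYOUT: List[Tuple[str, int]] = [("region", 12), ("year", 4), ("price", 6)]


def to_positional(record: Dict[str, object], layout=LAYOUT) -> str:
    """Render one record as a fixed-width line, padding or truncating each field."""
    parts = []
    for name, width in layout:
        text = str(record.get(name, ""))
        # Left-justify, pad with spaces, and cut anything beyond the width.
        parts.append(text.ljust(width)[:width])
    return "".join(parts)


def from_positional(line: str, layout=LAYOUT) -> Dict[str, str]:
    """Slice a fixed-width line back into fields using the same layout."""
    record, start = {}, 0
    for name, width in layout:
        record[name] = line[start:start + width].rstrip()
        start += width
    return record


if __name__ == "__main__":
    line = to_positional({"region": "Albany", "year": 2015, "price": 1.33})
    print(repr(line))
    print(from_positional(line))
```

Reading the file back only works if the reader uses the same layout as the writer, which is why the layout lives in one shared constant here.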
By Anthony Vilarim Caliani
This is an example of working with partitioned Parquet. Here you will find how to read and write partitioned Parquet files.
In this example I'm using the Netflix Shows dataset, so thanks to Shivam Bansal for sharing it.
The important thing here is the code, but if you want to execute it there is a run.sh
to help you out.
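When Spark writes partitioned Parquet, it uses a Hive-style directory layout with one `column=value` directory per partition value. The sketch below only shows how such paths are composed and parsed in plain Python. It does not call Spark, it ignores the escaping Spark applies to special characters, and the base path and column names are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict, List, Tuple


def partition_path(base: str, partitions: List[Tuple[str, object]]) -> str:
    """Build a Hive-style partition directory such as base/col1=a/col2=b."""
    path = PurePosixPath(base)
    for column, value in partitions:
        path = path / "{}={}".format(column, value)
    return str(path)


def parse_partitions(path: str) -> Dict[str, str]:
    """Recover partition columns and values from the directory names in a path."""
    found = {}
    for part in PurePosixPath(path).parts:
        if "=" in part:
            column, _, value = part.partition("=")
            found[column] = value
    return found


if __name__ == "__main__":
    # Hypothetical base path and partition columns.
    directory = partition_path("out/shows", [("type", "Movie"), ("release_year", 2019)])
    print(directory)
    print(parse_partitions(directory + "/part-0000.parquet"))
```

Because the partition values live in the directory names, a reader can skip whole directories when a filter only touches partition columns.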