| #!/usr/bin/python | |
| # -*- coding: utf8 -*- | |
| # SAMPLE SUBMISSION TO THE BIG DATA HACKATHON 13-14 April 2013 'Influencers in a Social Network' | |
| # .... more info on Kaggle and links to go here | |
| # | |
| # written by Ferenc Huszár, PeerIndex | |
| from sklearn import linear_model | |
| from sklearn.metrics import roc_auc_score as auc_score  # auc_score was removed from scikit-learn; roc_auc_score replaces it |
| import pandas as pd | |
| import os | |
| import shutil | |
| def extract(path: str = "s3://my_bucket_name/file0.parquet") -> pd.DataFrame: | |
|     df = pd.read_parquet(path) | |
|     return df | |
| import pandas as pd | |
| import numpy as np | |
| import matplotlib | |
| import matplotlib.pyplot as plt | |
| import seaborn as sns | |
| import missingno | |
| import warnings | |
| warnings.filterwarnings("ignore") | |
| %matplotlib inline |
| ## Importing libraries | |
| import numpy as np | |
| import pandas as pd | |
| import datetime as dt | |
| import seaborn as sns | |
| import matplotlib.pyplot as plt | |
| #For inline Chart Display | |
| %matplotlib inline |
| from pyspark.context import SparkContext | |
| from awsglue.context import GlueContext | |
| from pyspark.sql import SQLContext | |
| sc = SparkContext() | |
| glueContext = GlueContext(sc) | |
| spark = glueContext.spark_session | |
| # EXTRACT: Reading parquet data |
| Files for Amazon Redshift SQL queries with AWS Lambda |
| #CODE | |
| #Generate root password | |
| import secrets, string  # secrets draws from the OS CSPRNG; random is not suitable for passwords | |
| password = ''.join(secrets.choice(string.ascii_letters + string.digits) for _ in range(20)) | |
| #Download ngrok | |
| ! wget -q -c -nc https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip | |
| ! unzip -qq -n ngrok-stable-linux-amd64.zip | |
| #Setup sshd |
| COMMAND | FUNCTIONALITY |
|---|---|
| `ls` | List files and directories in the current working directory |
| `ls -R` | List files in subdirectories as well |
| `ls -a` | List hidden files as well |
| `ls -al` | List files and directories, including hidden ones, with detailed information |
| `ls 'path' \| more` | Show the listing one screen at a time |
| `cd` or `cd ~` | Go to the home directory |
| `cd ..` | Move one level up |
| `cd <directory>` | Change to a particular directory |
| `cd /` | Move to the root directory |
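The listing commands above have rough counterparts in Python's standard library. The following is a minimal sketch, not part of the original cheat sheet; `list_dir` is a hypothetical helper name, and it only approximates `ls`, `ls -a`, and `ls -R` (it returns names rather than printing them).

```python
import os


def list_dir(path=".", show_hidden=False, recursive=False):
    """Return sorted entry names under `path` (hypothetical helper, for illustration).

    show_hidden=True roughly mirrors `ls -a`; recursive=True roughly mirrors `ls -R`.
    """
    names = []
    for dirpath, dirnames, filenames in os.walk(path):
        if not show_hidden:
            # Prune dot-prefixed directories in place so os.walk does not descend into them.
            dirnames[:] = [d for d in dirnames if not d.startswith(".")]
            filenames = [f for f in filenames if not f.startswith(".")]
        for name in dirnames + filenames:
            # Report each entry relative to the starting directory.
            names.append(os.path.relpath(os.path.join(dirpath, name), path))
        if not recursive:
            break  # stop after the top level, like plain `ls`
    return sorted(names)
```

For the `cd` rows, `os.chdir(os.path.expanduser("~"))` changes to the home directory and `os.chdir("..")` moves one level up.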
| COMMAND | FUNCTIONALITY |
|---|---|
| `grep -v '^$' filename > new_filename` | Remove blank lines from a file |
| `ls -l \| grep '^-' \| awk '{ if ($5 == 0) print $9 }'` | Display zero-byte files |
| `sed 's/honey/pasta/n' < filename` | Replace the nth occurrence of 'honey' on each line with 'pasta' (substitute a number for `n`, e.g. `2`) |
| `echo 'string' \| tr 'a-z' 'A-Z'` | Convert a string from lower case to upper case |
| `grep -i 'search string' filename` | Search for a given string in a file (case-insensitive search) |
| `cal 03 2022` | Display the calendar for March 2022 |
| `find . -atime n -type f` | List files under the current directory that were last accessed n days ago |
| `find . -mtime n -type f` | List files under the current directory whose contents were last modified n days ago |
| `find . -ctime n -type f` | List files under the current directory whose status (inode metadata) last changed n days ago |
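Two of the rows above (removing blank lines and finding zero-byte files) can also be sketched in Python, the language used by the other snippets here. This is an illustration under assumptions rather than an exact port: `remove_blank_lines` and `zero_byte_files` are hypothetical names, and the second function looks only at the top level of one directory.

```python
import os


def remove_blank_lines(text):
    """Drop empty lines, like `grep -v '^$'`; lines holding only spaces are kept."""
    return "".join(
        line for line in text.splitlines(keepends=True) if line.strip("\r\n")
    )


def zero_byte_files(directory="."):
    """Return sorted names of regular files of size 0 directly inside `directory`."""
    return sorted(
        entry.name
        for entry in os.scandir(directory)
        # Ignore directories and symlinks; keep only regular files that are empty.
        if entry.is_file(follow_symlinks=False)
        and entry.stat(follow_symlinks=False).st_size == 0
    )
```

Unlike the `ls -l | awk` pipeline, this reads file sizes from `stat` directly, so it does not depend on the column layout of `ls` output or break on file names containing spaces.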
| from datetime import timedelta | |
| # The DAG object; we'll need this to instantiate a DAG | |
| from airflow import DAG | |
| # Operators; we need this to operate! | |
| from airflow.operators.bash_operator import BashOperator | |
| from airflow.utils.dates import days_ago | |
| # These args will get passed on to each operator | |
| # You can override them on a per-task basis during operator initialization | |
| default_args = { | |
|     'owner': 'Binh Phan', |