Thabile vatshat

vatshat / hive_optimization.md

Five ways to tune Hive performance

1. Use Tez

set hive.execution.engine=tez;

2. Store tables as ORC

vatshat / put_log_events.py
Last active Jan 29, 2020
Example of how to upload logs to CloudWatch using best practices to avoid throttling
import json
import sys
import boto3
import time
## boto3 client
logs = boto3.client('logs', region_name='eu-west-1')
## Define the log group and log stream. Default log group for Glue logs; the log stream is defined dynamically at job run.
log_group = "my_custom_log_group"
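One common way to reduce throttling is to send fewer, larger PutLogEvents calls by batching events first. The sketch below is not part of the original gist. It assumes the documented PutLogEvents batch limits (at most 10,000 events and 1,048,576 bytes per batch, with 26 bytes of overhead counted per event), and `batch_log_events` is a hypothetical helper name introduced for illustration.

```python
import time

# Documented PutLogEvents limits (an assumption here; check the current API reference).
MAX_BATCH_EVENTS = 10_000
MAX_BATCH_BYTES = 1_048_576
PER_EVENT_OVERHEAD = 26  # bytes counted per event on top of the UTF-8 message size


def batch_log_events(messages, now_ms=None):
    """Yield lists of {'timestamp', 'message'} dicts sized to fit one PutLogEvents call.

    Hypothetical helper: a single message larger than the batch limit is not
    split or rejected here, it simply ends up alone in its own batch.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    batch, batch_bytes = [], 0
    for message in messages:
        size = len(message.encode("utf-8")) + PER_EVENT_OVERHEAD
        # Start a new batch when adding this event would break either limit.
        if batch and (len(batch) >= MAX_BATCH_EVENTS or batch_bytes + size > MAX_BATCH_BYTES):
            yield batch
            batch, batch_bytes = [], 0
        batch.append({"timestamp": now_ms, "message": message})
        batch_bytes += size
    if batch:
        yield batch
```

Each yielded batch could then be passed as the `logEvents` argument of `logs.put_log_events`, together with the log group and log stream names.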
vatshat / cwl_insights_parse_regex.sh
Created Jan 29, 2019
An example of how to use regex in the parse statement of a CloudWatch Insights query
#!/usr/bin/env bash
query_string=$(cat << EndOfMessage
fields @timestamp, @logStream, headers.X-Amzn-Trace-Id, @transId, @message
| parse @message /(transactionId:[ ]?)(?<@transId>[a-zA-Z0-9]+)/
| filter @transId = "a4c475516be5445a87fbb81bb7a4b365"
EndOfMessage
) \
&& \
query_id=`aws logs start-query --log-group-name /aws/lambda/console_log \
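To check what the regex in the `parse` statement captures before running a query, the same pattern can be tried locally. This is a minimal sketch, not part of the original gist: Python's `re` module uses `(?P<name>...)` for named groups and does not allow `@` in group names, so the group is renamed `transId`, and `extract_transaction_id` is a hypothetical helper.

```python
import re

# Same pattern as the parse statement, adapted to Python's named-group syntax.
# The group is called transId because "@transId" is not a valid Python group name.
TRANSACTION_ID = re.compile(r"(transactionId:[ ]?)(?P<transId>[a-zA-Z0-9]+)")


def extract_transaction_id(message):
    """Return the captured transaction ID, or None if the message has none."""
    match = TRANSACTION_ID.search(message)
    return match.group("transId") if match else None
```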
vatshat / GetCostAndUsage.java
Created Jan 25, 2019
Java example of how to use AWS Cost Explorer API
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicSessionCredentials;
import com.amazonaws.services.costexplorer.AWSCostExplorer;
import com.amazonaws.services.costexplorer.AWSCostExplorerClientBuilder;
import com.amazonaws.services.costexplorer.model.*;
import com.amazonaws.services.securitytoken.AWSSecurityTokenService;
import com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClientBuilder;
import com.amazonaws.services.securitytoken.model.AssumeRoleRequest;
import com.amazonaws.services.securitytoken.model.Credentials;
vatshat / CloudwatchExamples.go
Last active Aug 6, 2019
FilterLogEvents and ListMetrics example in Go
package main
import (
"context"
"fmt"
"time"
"github.com/aws/aws-sdk-go/aws"
"github.com/aws/aws-sdk-go/aws/session"
"github.com/aws/aws-sdk-go/aws/client"
vatshat / put_log_events.sh
Last active Sep 18, 2019
Example of how to upload a batch of 10 JSON log events once every second
#!/bin/bash
a=0
while [ $a -lt 20 ]
do
now=$(TZ='UTC' date +%s%3N)
message=$(cat << EndOfMessage
{
vatshat / cloudwatch_alarm_sns_actions.sh
Created Jan 25, 2019
Update multiple CloudWatch alarms with a new SNS topic
#!/usr/bin/env bash
install_error="Please install jq before launching the script - https://stedolan.github.io/jq/download/"
type jq >/dev/null 2>&1 || { echo >&2 "$install_error"; exit 1; }
hash jq 2>/dev/null || { echo >&2 "$install_error"; exit 1; }
if [[ $# -ne 2 ]]; then
echo "Please provide CloudWatch alarm and 1 valid SNS topic"
exit 1
fi
vatshat / cloudwatch_dashboard_body_labels.sh
Created Jan 25, 2019
Script to update CloudWatch dashboard body widgets that have no label with the EC2 Name tag
# Gets instances without labels in the widgets
labelLessInstance=$( cat temp.json |
jq '
. |
(
[
.widgets[] |
select(
(.type == "metric") and (.properties.metrics[0][0] == "AWS/EC2")
) |
vatshat / automatic_cloudwatch_alarms.sh
Created Jan 25, 2019
Create alarm for each metric generated with a dimension
#!/usr/bin/env bash
# metric math alarms
aws cloudwatch put-metric-alarm --alarm-name test2 --evaluation-periods 2 --alarm-actions arn:aws:sns:eu-west-1:037559324442:cloudwatch-sqs --threshold 1 --comparison-operator GreaterThanThreshold --metrics '[{"Id":"e1","Label":"Expression1","ReturnData":true,"Expression":"SUM(METRICS()/2)"},{"Id":"m1","ReturnData":false,"MetricStat":{"Metric":{"MetricName":"Available","Dimensions":[{"Name":"WorkspaceId","Value":"ws-fdw4f2pt8"}],"Namespace":"AWS/WorkSpaces"},"Period":300,"Stat":"Average"}}]'
metrics=$(cat ~/temp/temp2.json |
#temp=$(aws cloudwatch list-metrics --metric-name CPUUtilization --namespace AWS/EC2 --region eu-west-1 --query "Metrics[?Dimensions[0].Name == 'InstanceId']" |
# when using timestamp as an ID - now | tostring | split(".")[1]
vatshat / cloudtrail_jq.sh
Created Jan 25, 2019
Analyzing CloudTrail Logs using jq/bash
############################################################################################################################################################
# Cloudtrail recursively search through all events in different folders relating to a specific log group which generated #
############################################################################################################################################################
find . -name '*.json' -exec cat {} \; | jq '.Records[] | select(.requestParameters.logGroupName=="/mnt/log/communications-delivery-stage")' | jq -s '[ .[] | select(.errorCode=="ResourceAlreadyExistsException") ] | unique_by(.eventName)'
find . -name '*.json' -exec cat {} \; | jq '.Records[] | select(.requestID=="3ddb4d1f-41d2-11e8-8533-1dadb66cbff4")'
# Count the number of exceptions
find . -name '*.json' -exec cat {} \; | jq '.Records[]' | jq -s '[ .[] | select(.errorCode=="ResourceAlreadyExistsException") ] | length'
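For cases where jq is not available, the exception count above can be approximated in Python. This is a sketch under assumptions, not part of the original gist: it assumes each `*.json` file is an uncompressed CloudTrail log with a top-level `Records` array, and `iter_records` and `count_error` are hypothetical helper names.

```python
import json
from pathlib import Path


def iter_records(root="."):
    """Yield every CloudTrail record from *.json files under root, recursively."""
    for path in Path(root).rglob("*.json"):
        with path.open() as handle:
            # Files without a "Records" key contribute nothing instead of failing.
            yield from json.load(handle).get("Records", [])


def count_error(records, error_code):
    """Count the records whose errorCode equals error_code."""
    return sum(1 for record in records if record.get("errorCode") == error_code)
```

For example, `count_error(iter_records("."), "ResourceAlreadyExistsException")` plays the same role as the final `find ... | jq` pipeline.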