@haitaoyao
haitaoyao / get_aws_rds_replica_lag.py
Created July 6, 2017 13:51
Get rds replica lag metric value
import boto3
import pytz


def get_cloudwatch_replica_lag(rds_id, start_time, end_time):
    # Attach the local time zone to the naive datetimes before querying CloudWatch
    local_timezone = pytz.timezone('Asia/Shanghai')
    cloudwatch = boto3.client('cloudwatch')
    start_time_with_tz = local_timezone.localize(start_time)
    end_time_with_tz = local_timezone.localize(end_time)
    values = cloudwatch.get_metric_statistics(Namespace='AWS/RDS',
                                              MetricName='ReplicaLag',
                                              Dimensions=[dict(Name='DBInstanceIdentifier', Value=rds_id)],
                                              StartTime=start_time_with_tz,
                                              EndTime=end_time_with_tz,
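The preview above is cut off in the middle of the `get_metric_statistics` call. What follows is a minimal sketch of how a complete query might look, not the author's actual code. The `Period` of 60 seconds, the `Average` statistic, the `client` parameter, and the function name `get_replica_lag` are all assumptions. A fixed UTC+8 offset stands in for `pytz.timezone('Asia/Shanghai')`, on the assumption that the zone has no daylight saving time for the dates of interest.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Asia/Shanghai is treated as a fixed UTC+8 offset (no DST),
# so the standard library can replace pytz in this sketch.
SHANGHAI = timezone(timedelta(hours=8))


def get_replica_lag(rds_id, start_time, end_time, client=None, period=60):
    """Return (timestamp, average lag in seconds) pairs sorted by time."""
    if client is None:
        import boto3  # imported lazily so a stub client can be passed instead
        client = boto3.client('cloudwatch')
    response = client.get_metric_statistics(
        Namespace='AWS/RDS',
        MetricName='ReplicaLag',
        Dimensions=[{'Name': 'DBInstanceIdentifier', 'Value': rds_id}],
        StartTime=start_time.replace(tzinfo=SHANGHAI),
        EndTime=end_time.replace(tzinfo=SHANGHAI),
        Period=period,            # assumed granularity, in seconds
        Statistics=['Average'],   # assumed statistic
    )
    # CloudWatch does not guarantee the order of datapoints, so sort them.
    points = sorted(response['Datapoints'], key=lambda p: p['Timestamp'])
    return [(p['Timestamp'], p['Average']) for p in points]
```

Passing the client in as a parameter keeps the function easy to exercise without AWS credentials; called without it, the sketch falls back to a default `boto3` CloudWatch client.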
@haitaoyao
haitaoyao / maven_grpc_git_submodule.xml
Created June 29, 2017 14:20
maven + grpc + git submodule configuration
<build>
  <extensions>
    <extension>
      <groupId>kr.motd.maven</groupId>
      <artifactId>os-maven-plugin</artifactId>
      <version>1.4.1.Final</version>
    </extension>
  </extensions>
  <plugins>
    <plugin>
@haitaoyao
haitaoyao / pom.xml
Created March 22, 2017 15:11
hive-exec Maven dependency for SQL parsing only
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-exec</artifactId>
  <version>${hive.version}</version>
  <classifier>core</classifier>
  <exclusions>
    <exclusion>
      <groupId>commons-codec</groupId>
      <artifactId>commons-codec</artifactId>
    </exclusion>
@haitaoyao
haitaoyao / binlog_time.sql
Created November 4, 2016 05:33
Set the AWS RDS binlog retention time
-- http://docs.aws.amazon.com/zh_cn/AmazonRDS/latest/UserGuide/USER_LogAccess.Concepts.MySQL.html
call mysql.rds_set_configuration('binlog retention hours', 24);
@haitaoyao
haitaoyao / get_s3_estimated_size.sh
Last active November 2, 2016 10:06
get the estimated storage size of AWS buckets without listing all the objects
bucket_name=$1
[ -z "$bucket_name" ] && echo "usage: $0 bucket_name" && exit 1
start_time=$(date +'%FT%T' -d '-3 days')
end_time=$(date +'%FT%T')
aws cloudwatch --region cn-north-1 \
  get-metric-statistics \
  --namespace AWS/S3 \
  --metric-name BucketSizeBytes \
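The shell preview above stops before the remaining options. As a rough Python equivalent, the sketch below reads the same `BucketSizeBytes` metric over the same three-day window and returns the newest datapoint. The `StorageType` value `StandardStorage`, the daily `Period`, the `Average` statistic, and the function name `estimated_bucket_size` are assumptions, since the original arguments are not visible.

```python
from datetime import datetime, timedelta, timezone


def estimated_bucket_size(bucket_name, client=None, region='cn-north-1', now=None):
    """Return the most recent BucketSizeBytes average in bytes, or None."""
    if client is None:
        import boto3  # imported lazily so a stub client can be passed instead
        client = boto3.client('cloudwatch', region_name=region)
    end_time = now or datetime.now(timezone.utc)
    response = client.get_metric_statistics(
        Namespace='AWS/S3',
        MetricName='BucketSizeBytes',
        Dimensions=[
            {'Name': 'BucketName', 'Value': bucket_name},
            # Assumption: only the standard storage class is counted.
            {'Name': 'StorageType', 'Value': 'StandardStorage'},
        ],
        StartTime=end_time - timedelta(days=3),
        EndTime=end_time,
        Period=86400,            # assumed: one datapoint per day
        Statistics=['Average'],  # assumed statistic
    )
    points = response['Datapoints']
    if not points:
        return None
    # Datapoints are not guaranteed to be ordered; pick the newest one.
    latest = max(points, key=lambda p: p['Timestamp'])
    return latest['Average']
```

Because the size comes from a CloudWatch storage metric rather than an object listing, the result is an estimate that can lag behind the bucket's current contents.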
@haitaoyao
haitaoyao / svg_sequence_diagram.py
Created October 18, 2016 08:56
generate svg sequence diagram with seqdiag
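import os
import tempfile

from seqdiag import builder, drawer, parser


def generate_svg_sequence_diagram(text):
    """
    :param text: seqdiag source text describing the diagram
    :return:
    """
    tree = parser.parse_string(text)
    fd, filename = tempfile.mkstemp(suffix='.svg')
    os.close(fd)  # mkstemp returns an open descriptor; close it so it is not leaked
    diagram = builder.ScreenNodeBuilder.build(tree)
    draw = drawer.DiagramDraw("SVG", diagram, filename=filename)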
@haitaoyao
haitaoyao / README-Template.md
Created October 13, 2016 09:04 — forked from PurpleBooth/README-Template.md
A template to make good README.md

Project Title

One Paragraph of project description goes here

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites

@haitaoyao
haitaoyao / bootstrap_backtotop.html
Last active September 20, 2019 09:17
Bootstrap back-to-top button
<style>
.back-to-top {
  cursor: pointer;
  position: fixed;
  bottom: 20px;
  right: 20px;
  display: none;
}
</style>
<script language="JavaScript">
@haitaoyao
haitaoyao / gist:0862e1ce6060080afafb
Last active August 29, 2015 14:23
Spark Summit 2015 - Day 1

Spark Summit 2015 Day 1

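Today is the first day of Spark Summit 2015. Overall impression: huge crowds and data everywhere (lots of people, all talking about this data and that data). The full schedule is at https://spark-summit.org/2015/schedule/. I mostly attended the Developer Track. Here are some notes on what I saw and heard.

Morning: ads and more ads, from the organizers and the sponsors

Conferences like this basically open with ads. What interested me most were the cloud product launched by Databricks and Timeful's talk, A Tale of a Data-Driven Culture.

Databricks makes its debut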

@haitaoyao
haitaoyao / gist:654338075c16cd502b13
Last active August 29, 2015 14:23
Data Pipeline Scheduler [0]

Data Pipeline Scheduler [0]

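  • Jet-lagged on the eve of Spark Summit 2015, so more rambling

A Rosy Beginning

The beginning of the story is always rosy. A freshly launched computation job usually looks about this simple:

  1. Fetch log data / DB data
  2. Do some simple ETL
  3. Compute reports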
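The three steps above can be sketched as a tiny in-memory pipeline. This is only an illustration: the sample log lines, the field layout, and the function names `fetch_records`, `transform`, `build_report`, and `run_pipeline` are all hypothetical, and a real job would read from actual log files or a database.

```python
from collections import Counter


def fetch_records():
    """Step 1: stand-in for pulling log/DB rows (hard-coded sample data)."""
    return [
        "2015-06-14 GET /home 200",
        "2015-06-14 GET /home 500",
        "2015-06-15 GET /cart 200",
        "malformed line",
    ]


def transform(lines):
    """Step 2: simple ETL, parse each line and drop the ones that do not fit."""
    rows = []
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip lines that do not have exactly four fields
        day, method, path, status = parts
        rows.append({"day": day, "path": path, "status": int(status)})
    return rows


def build_report(rows):
    """Step 3: compute a report, here the number of requests per day."""
    return dict(Counter(row["day"] for row in rows))


def run_pipeline():
    """Run the three steps in order, each feeding the next."""
    return build_report(transform(fetch_records()))


if __name__ == "__main__":
    print(run_pipeline())
```

At this stage each step depends only on the output of the previous one, so a plain sequence of function calls is enough.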