Haitao Yao haitaoyao

## get_aws_rds_replica_lag.py
import boto3
import pytz


def get_cloudwatch_replica_lag(rds_id, start_time, end_time):
    local_timezone = pytz.timezone('Asia/Shanghai')
    cloudwatch = boto3.client('cloudwatch')
    start_time_with_tz = local_timezone.localize(start_time)
    end_time_with_tz = local_timezone.localize(end_time)
    values = cloudwatch.get_metric_statistics(Namespace='AWS/RDS',
                                              MetricName='ReplicaLag',
                                              Dimensions=[dict(Name='DBInstanceIdentifier', Value=rds_id)],
                                              StartTime=start_time_with_tz,
                                              EndTime=end_time_with_tz,
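
The call above is cut off in this listing, so how the result is consumed is not shown. As a rough sketch only, assuming the response has the usual CloudWatch shape (a `Datapoints` list of dicts, each with a `Timestamp` and a statistic key such as `Maximum`), the newest lag value could be picked out as below. The helper name `latest_datapoint`, the choice of `Maximum`, and the sample data are all hypothetical and not part of the original gist. As far as I know, CloudWatch does not return datapoints in chronological order, which is why the sketch selects by timestamp instead of taking the last element.

```python
from datetime import datetime, timezone


def latest_datapoint(response, statistic="Maximum"):
    """Return (timestamp, value) for the newest datapoint, or None if there are none.

    `statistic` is an assumption: the original call is truncated before the
    requested statistics are visible.
    """
    datapoints = response.get("Datapoints", [])
    if not datapoints:
        return None
    # Select by timestamp rather than by position in the list.
    newest = max(datapoints, key=lambda point: point["Timestamp"])
    return newest["Timestamp"], newest[statistic]


# Hypothetical response, shaped like a get_metric_statistics result.
sample_response = {
    "Label": "ReplicaLag",
    "Datapoints": [
        {"Timestamp": datetime(2016, 1, 1, 0, 10, tzinfo=timezone.utc), "Maximum": 3.0, "Unit": "Seconds"},
        {"Timestamp": datetime(2016, 1, 1, 0, 0, tzinfo=timezone.utc), "Maximum": 1.0, "Unit": "Seconds"},
        {"Timestamp": datetime(2016, 1, 1, 0, 5, tzinfo=timezone.utc), "Maximum": 7.0, "Unit": "Seconds"},
    ],
}

timestamp, lag_seconds = latest_datapoint(sample_response)
print(timestamp.isoformat(), lag_seconds)
```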

## maven_grpc_git_submodule.xml
<build>
    <extensions>
        <extension>
            <groupId>kr.motd.maven</groupId>
            <artifactId>os-maven-plugin</artifactId>
            <version>1.4.1.Final</version>
        </extension>
    </extensions>
    <plugins>
        <plugin>

## pom.xml
      <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-exec</artifactId>
            <version>${hive.version}</version>
            <classifier>core</classifier>
            <exclusions>
                <exclusion>
                    <groupId>commons-codec</groupId>
                    <artifactId>commons-codec</artifactId>
                </exclusion>

## binlog_time.sql

-- http://docs.aws.amazon.com/zh_cn/AmazonRDS/latest/UserGuide/USER_LogAccess.Concepts.MySQL.html
call mysql.rds_set_configuration('binlog retention hours', 24);

## get_s3_estimated_size.sh
bucket_name=$1
[ -z "$bucket_name" ] && echo "usage: $0 bucket_name" && exit 1

start_time=$(date +'%FT%T' -d '-3 days')
end_time=$(date +'%FT%T')

aws cloudwatch --region cn-north-1 \
    get-metric-statistics \
    --namespace AWS/S3 \
    --metric-name BucketSizeBytes \

## svg_sequence_diagram.py
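import os
import tempfile


def generate_svg_sequence_diagram(text):
    """
    Render a sequence diagram description to a temporary SVG file.

    :param text: diagram source text to parse
    :return:
    """
    tree = parser.parse_string(text)
    # mkstemp returns an open file descriptor; close it so it is not leaked
    fd, filename = tempfile.mkstemp(suffix='.svg')
    os.close(fd)
    diagram = builder.ScreenNodeBuilder.build(tree)
    draw = drawer.DiagramDraw("SVG", diagram, filename=filename)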

## README-Template.md

      
Created October 13, 2016 09:04, forked from PurpleBooth/README-Template.md

A template to make good README.md

Project Title

One Paragraph of project description goes here

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites


## bootstrap_backtotop.html
<style>
.back-to-top {
    cursor: pointer;
    position: fixed;
    bottom: 20px;
    right: 20px;
    display:none;
}
</style>
<script language="JavaScript">

## gist:0862e1ce6060080afafb

      
Last active August 29, 2015 14:23

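Spark Summit 2015 Day 1

Today is the first day of Spark Summit 2015. Overall impression: huge crowds and data everywhere (lots of people, all talking about this data and that data). The full schedule is at https://spark-summit.org/2015/schedule/; I mainly attended the Developer Track. Here are some notes on what I saw and heard.

Morning: ads and more ads, from the organizers and the sponsors

Conferences like this basically always open with advertising. What interested me most were the cloud product announced by Databricks and Timeful's talk, A Tale of a Data-Driven Culture.

Databricks makes its debut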


## gist:654338075c16cd502b13

      
Last active August 29, 2015 14:23

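Data Pipeline Scheduler [0]

Jet-lagged on the eve of Spark Summit 2015, so here is some more rambling.

A beautiful beginning

Every story starts out beautifully. A newly launched computation job usually looks about this simple:

- Fetch log data / DB data
- Do some simple ETL
- Compute reports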
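
The three steps above can be sketched as a tiny single-process pipeline. This is only an illustration under stated assumptions: the in-memory CSV `RAW_LOG`, its column names, and the functions `fetch_rows`, `clean`, `report`, and `run_pipeline` are all hypothetical and do not come from the original post. It only shows how simple such a job looks before any scheduling concerns appear.

```python
import csv
import io
from collections import Counter

# Hypothetical sample input standing in for "log data / DB data".
RAW_LOG = """timestamp,user,action
2015-06-15T09:00:00,alice,view
2015-06-15T09:01:00,bob,click
2015-06-15T09:02:00,alice,click
,,
"""


def fetch_rows(raw):
    """Step 1: fetch the data (here, parse an in-memory CSV)."""
    return list(csv.DictReader(io.StringIO(raw)))


def clean(rows):
    """Step 2: simple ETL, dropping rows that have any empty field."""
    return [row for row in rows if all(row.values())]


def report(rows):
    """Step 3: compute a report, counting events per action."""
    return Counter(row["action"] for row in rows)


def run_pipeline(raw):
    """Run the three steps in order, each feeding the next."""
    return report(clean(fetch_rows(raw)))


print(run_pipeline(RAW_LOG))
```

Each step is a plain function that takes the previous step's output, so the whole job is one linear chain with no dependencies to track.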