drocsid

  • Seattle, Washington
@drocsid
drocsid / gist:0ed6d76d9ea5c804e5a7163c993cad98
Created Apr 9, 2018
AWS SDK Java 2 Scala getObjectFile
import software.amazon.awssdk.core.pagination.SdkIterable
import software.amazon.awssdk.core.regions.Region
import software.amazon.awssdk.services.s3.S3Client
import software.amazon.awssdk.services.s3.model.{GetObjectRequest, GetObjectResponse, ListObjectsV2Request, S3Object}
import software.amazon.awssdk.services.s3.paginators.ListObjectsV2Iterable
import software.amazon.awssdk.core.auth.{AwsCredentialsProvider, EnvironmentVariableCredentialsProvider, InstanceProfileCredentialsProvider, ProfileCredentialsProvider}
import software.amazon.awssdk.core.sync.{ResponseInputStream, StreamingResponseHandler}
import java.nio.file.Paths
import java.io.File
View gist:79153b1ae228fa6a0f58b9958f552bbb
create temp table event_shp
as (
select cust_key,
case when delivery_channel <> 'JOIN' and
product_type = 'General' and
sale_dt in
(select event_dt
from user_tbls.events
where anniversary_public_event=1 ) then 1
else 0
drocsid / gist:ee5803d7995631abdfc06125b5e739a4
Created Jan 15, 2018
Elasticsearch SocketTimeoutException
Caused by: UncategorizedExecutionException[Failed execution]; nested: ExecutionException[java.net.SocketTimeoutException]; nested: SocketTimeoutException;
at org.elasticsearch.action.support.AdapterActionFuture.rethrowExecutionException(AdapterActionFuture.java:93)
at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:47)
at org.elasticsearch.action.bulk.Retry.withSyncBackoff(Retry.java:104)
at org.elasticsearch.action.bulk.BulkRequestHandler$SyncBulkRequestHandler.execute(BulkRequestHandler.java:86)
at org.elasticsearch.action.bulk.BulkProcessor.execute(BulkProcessor.java:350)
at org.elasticsearch.action.bulk.BulkProcessor.executeIfNeeded(BulkProcessor.java:341)
at org.elasticsearch.action.bulk.BulkProcessor.internalAdd(BulkProcessor.java:276)
at org.elasticsearch.action.bulk.BulkProcessor.add(BulkProcessor.java:259)
at org.elasticsearch.action.bulk.BulkProcessor.add(BulkProcessor.java:255)
View gist:43180f1227c8f282b2b9b73351c59eff
java.lang.NoSuchMethodError: org.apache.http.conn.ssl.SSLConnectionSocketFactory.<init>(Ljavax/net/ssl/SSLContext;Ljavax/net/ssl/HostnameVerifier;)V
at com.amazonaws.http.conn.ssl.SdkTLSSocketFactory.<init>(SdkTLSSocketFactory.java:56)
at com.amazonaws.http.apache.client.impl.ApacheConnectionManagerFactory.getPreferredSocketFactory(ApacheConnectionManagerFactory.java:87)
at com.amazonaws.http.apache.client.impl.ApacheConnectionManagerFactory.create(ApacheConnectionManagerFactory.java:65)
at com.amazonaws.http.apache.client.impl.ApacheConnectionManagerFactory.create(ApacheConnectionManagerFactory.java:58)
at com.amazonaws.http.apache.client.impl.ApacheHttpClientFactory.create(ApacheHttpClientFactory.java:46)
at com.amazonaws.http.apache.client.impl.ApacheHttpClientFactory.create(ApacheHttpClientFactory.java:37)
at com.amazonaws.http.AmazonHttpClient.<init>(AmazonHttpClient.java:213)
at com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceCl
View gist:b0efa4ff6ff4a7c3c8bb56767d0b6877
import org.apache.commons.net.util.Base64;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.spark.Logging;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
View pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.example</groupId>
<artifactId>ad-export</artifactId>
<version>1.0-SNAPSHOT</version>
<packaging>jar</packaging>
drocsid / gist:b0da92eb313b1bf71912
Last active Jan 17, 2016
Running out of memory locally when launching multiple Spark jobs on YARN with spark-submit from a shell.
View gist:b0da92eb313b1bf71912
I launch around 30-60 of these jobs, each defined like start-job.sh, in the background from a wrapper script. I wait about 30 seconds between launches, then the wrapper monitors YARN to determine when to launch more. There is a limit of around 60 jobs, but even if I set it to 30, I run out of memory on the host submitting the jobs. Why does my approach to using spark-submit cause me to run out of memory? I have about 6 GB free, and I don't think I should be running out of memory just from submitting jobs.
start-job.sh
export HADOOP_CONF_DIR=/etc/hadoop/conf
spark-submit \
--class sap.whcounter.WarehouseCounter \
--master yarn-cluster \
--num-executors 1 \
--driver-memory 1024m \
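
The wrapper described in this gist launches jobs in the background and pauses between launches. Each spark-submit invocation starts its own JVM on the submitting host, so one possible contributor to the memory pressure is the number of submitter processes alive at once; the gist itself does not confirm this. The following is a minimal sketch, not the author's wrapper, of capping how many child processes run concurrently on the local host. The function name `launch_throttled` and the parameters `max_running`, `launch_delay`, and `poll_interval` are hypothetical, and the commands are placeholders rather than real spark-submit calls.

```python
import subprocess
import sys
import time


def launch_throttled(commands, max_running=4, launch_delay=0.0, poll_interval=0.1):
    """Run each command as a child process, keeping at most max_running alive.

    Returns (exit_codes_in_launch_order, peak_number_of_live_children).
    All names here are illustrative assumptions, not part of any Spark tooling.
    """
    running = []   # processes believed to be alive
    launched = []  # every process, in launch order
    peak = 0
    for cmd in commands:
        # Wait for a free slot; poll() also reaps children that have exited.
        while len([p for p in running if p.poll() is None]) >= max_running:
            time.sleep(poll_interval)
        running = [p for p in running if p.poll() is None]
        proc = subprocess.Popen(cmd)
        running.append(proc)
        launched.append(proc)
        peak = max(peak, len(running))
        # Pause between launches (the gist describes waiting about 30 seconds).
        time.sleep(launch_delay)
    # Wait for the remaining children before reporting exit codes.
    for proc in launched:
        proc.wait()
    return [proc.returncode for proc in launched], peak


if __name__ == "__main__":
    # Placeholder workload: five short-lived Python children instead of spark-submit.
    placeholder = [sys.executable, "-c", "import time; time.sleep(0.2)"]
    codes, peak = launch_throttled([placeholder] * 5, max_running=2)
    print(codes, peak)
```

In a real wrapper each command would be something like `["./start-job.sh"]`, and `max_running` would be chosen from the memory available on the submitting host rather than from the YARN queue limit alone.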
drocsid / gist:9741e847ad7dd0c7b16d
Created Oct 15, 2015
etcd2 keeping state, has hostname not defined as an option.
core@coreos003 ~ $ sudo rm -rf /var/lib/etcd/*
core@coreos003 ~ $ sudo rm -rf /var/lib/etcd2/*
core@coreos003 ~ $ sudo systemctl stop etcd2
core@coreos003 ~ $ sudo systemctl disable etcd2
core@coreos003 ~ $ sudo systemctl stop etcd
core@coreos003 ~ $ sudo systemctl disable etcd
etcd2 -name coreos002 -initial-advertise-peer-urls http://10.5.29.211:2380 -listen-peer-urls http://10.5.29.211:2380 -listen-client-urls http://10.5.29.211:2379,http://127.0.0.1:2379 -advertise-client-urls http://10.5.29.211:2379 -initial-cluster-token etcd-core-42 -initial-cluster coreos002=http://10.5.29.211:2380,coreos003=http://10.5.29.218:2380,coreos004=http://10.5.29.220:2380 -initial-cluster-state new
etcd2 -name coreos003 -initial-advertise-peer-urls http://10.5.29.218:2380 -listen-peer-urls http://10.5.29.218:2380 -listen-client-urls http://10.5.29.218:2379,http://127.0.0.1:2379 -advertise-client-urls http://10.5.29.218:2379 -initial-cluster-token etcd-core-42 -initial-cluster coreos002=http://10.5
View gist:04fe63f4bb7a5c5a24bf
{
"persistent": {
"action": {
"destructive_requires_name": "true"
},
"indices": {
"store": {
"throttle": {
"max_bytes_per_sec": "60mb"
}
View gist:1438eead63651112dcdc
coreos-test ~ # ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether d4:ae:52:67:58:4b brd ff:ff:ff:ff:ff:ff
inet 10.51.31.240/22 brd 10.51.31.255 scope global dynamic eno1