abhioncbr / CircleCi.md
Last active November 29, 2018 01:20
Manage CI/CD efficiently with CircleCI

How to manage CI/CD efficiently with CircleCI?

Continuous integration and deployment (CI/CD) is one of the standard practices of modern-day software development. In a fiercely competitive market, businesses rely on frequent feature releases to reach their customers. Recently, I came across CircleCI, an excellent tool for achieving CI/CD efficiently. In my experience, the following CircleCI tagline is entirely apt:

Automate your development process quickly, safely, and at scale.

In this post, I will share how to quickly build and deploy Docker images using CircleCI.

Introduction to a CircleCI config file

First, enable the CircleCI webhook for your public GitHub repository. CircleCI expects a config.yml file in the .circleci sub-folder of the project root directory. The config file should follow the rules specified here. A CircleCI config file consists of three basic definitions:

  • version
  • jobs
  • workflows
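
To make these definitions concrete, below is a minimal sketch of a .circleci/config.yml that builds and pushes a Docker image. It uses CircleCI 2.1 syntax; the job name, image name, and DOCKERHUB_* environment variables are illustrative placeholders, not taken from the original gist.

# .circleci/config.yml (illustrative sketch)
version: 2.1

jobs:
  build-and-push:
    docker:
      - image: cimg/base:stable        # executor image for the job
    steps:
      - checkout
      - setup_remote_docker            # enables docker build inside the job
      - run:
          name: Build image
          command: docker build -t abhioncbr/example-app:$CIRCLE_SHA1 .
      - run:
          name: Push image
          command: |
            echo "$DOCKERHUB_PASS" | docker login -u "$DOCKERHUB_USER" --password-stdin
            docker push abhioncbr/example-app:$CIRCLE_SHA1

workflows:
  build-deploy:
    jobs:
      - build-and-push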
abhioncbr / Apache_Superset.md
Last active November 11, 2023 09:53
Apache Superset in the production environment

Visualising data helps in building a much deeper understanding of the data and speeds up analytics around it. There are several mature paid products available in the market. Recently, I explored an open-source product named Apache Superset, which I found to be a very upbeat product in this space. Some prominent features of Superset are:

  • A rich set of data visualisations
  • An easy-to-use interface for exploring and visualising data
  • Create and share dashboards

After reading about Superset, I wanted to try it. As Superset is a Python-based project, it can easily be installed using pip, but I decided to set it up as a Docker container instead. The Apache-Superset GitHub repo contains code for building and running Superset as a container. Since I wan…

# Pull the prebuilt image (replace <tag> with a released version)
docker pull abhioncbr/docker-superset:<tag>

# Run the Superset server in cluster mode (UI on port 8088)
docker run -p 8088:8088 \
  -v config:/home/superset/config/ \
  abhioncbr/docker-superset:<tag> \
  cluster server <db_url> <redis_url>

# Run a Superset worker in cluster mode
docker run -p 5555:5555 \
  -v config:/home/superset/config/ \
  abhioncbr/docker-superset:<tag> \
  cluster worker <db_url> <redis_url>

# Or bring everything up with docker-compose
cd docker-files/ && SUPERSET_ENV=<local | prod> \
  SUPERSET_VERSION=<tag> docker-compose up -d
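
In the commands above, the server container exposes the Superset UI on port 8088, while the worker container exposes port 5555, the default port of Flower, Celery's monitoring UI. For comparison, the pip-based installation mentioned earlier looks roughly like the sketch below; it assumes a recent Superset release (older releases shipped under the plain superset package name with a fabmanager CLI).

# Illustrative pip-based setup (assumes a recent apache-superset release)
pip install apache-superset
superset db upgrade          # create the metadata database tables
superset fab create-admin    # create an admin user (interactive prompts)
superset init                # load default roles and permissions
superset run -p 8088         # start the development web server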
abhioncbr / Druid.md
Last active March 7, 2019 13:52
Making S3A Hadoop connector workable with Druid

Apache Druid is a high-performance real-time analytics database. Druid is a unique type of database that combines ideas from OLAP/analytic databases, time-series databases, and search systems to enable new use cases in real-time architectures. To build a framework for time-series trend analysis, prediction models, and anomaly detection, I decided to use Druid. As per the requirements, apart from real-time data ingestion, Druid also needed to support batch-based data ingestion. After reading several blogs and articles on production setups of Druid clusters handling petabytes of data, I decided to follow the architecture below:

  • 2
"tuningConfig": {
"type": "hadoop",
"jobProperties": {
"fs.s3a.endpoint": "s3.ca-central-1.amazonaws.com",
"fs.s3.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
"fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
"io.compression.codecs": "org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec"
}
}
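
These jobProperties route both the s3:// and s3a:// schemes through Hadoop's S3AFileSystem, point the connector at the ca-central-1 regional endpoint, and make the common compression codecs available to the indexing job. Even with this configuration in place, the indexing task can still fail with an error like the one below: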
Caused by: java.lang.NoSuchMethodError: com.amazonaws.services.s3.transfer.TransferManager.<init>(Lcom/amazonaws/services/s3/AmazonS3;Ljava/util/concurrent/ThreadPoolExecutor;)V
  at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:287) ~[?:?]
  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669) ~[?:?]
  at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94) ~[?:?]
  at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703) ~[?:?]
  at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685) ~[?:?]
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373) ~[?:?]
  at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295) ~[?:?]
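
This NoSuchMethodError is the classic symptom of a version mismatch between hadoop-aws and the aws-java-sdk jar on the classpath: hadoop-aws 2.7.x was compiled against aws-java-sdk 1.7.4, and the TransferManager constructor it expects no longer exists in newer SDK releases. A sketch of one fix, assuming Druid's standard hadoop-dependencies layout (the path and versions below are illustrative, not from the original gist):

# Put an aws-java-sdk that matches hadoop-aws on the indexing classpath
# (directory layout and versions are illustrative assumptions)
cd ${DRUID_HOME}/hadoop-dependencies/hadoop-client/2.7.3/
curl -O https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.7.3/hadoop-aws-2.7.3.jar
curl -O https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk/1.7.4/aws-java-sdk-1.7.4.jar

With matching jars in place, the S3A connector initialises cleanly, and the ingestion spec's ioConfig can reference s3a:// paths directly: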
"ioConfig" : {
"type" : "hadoop",
"inputSpec" : {
"type" : "static",
"paths" : "s3a://experiment-druid/input_data/wikiticker-2015-09-12-sampled.json.gz"
},
"metadataUpdateSpec" : null,
"segmentOutputPath" : "s3n://experiment-druid/deepstorage"
},
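
Note the mixed schemes here: the input paths use s3a:// so that reads go through the S3A connector configured above, while segmentOutputPath keeps the s3n:// scheme that Druid's S3 deep-storage extension expected at the time.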