
Igor Costa (igorcosta)

@igorcosta
igorcosta / README.md
Created Aug 17, 2022 — forked from osy/README.md
Local caching for GitHub Actions self hosted runner using Squid Proxy

One of the biggest issues with using a self-hosted GitHub runner is that actions that require downloading large amounts of data will bottleneck at the network. [actions/cache][1] does not support locally caching objects, and artifacts stored on GitHub's servers require a lot of bandwidth to fetch on every job. We can, however, set up a content proxy using [Squid][2] with SSL bumping to locally cache requests from jobs.

Patching Squid

A major challenge is that [actions/cache][1] uses Azure storage APIs, which make HTTP range requests. While Squid supports range requests, it is not good at caching them. There is an option, range_offset_limit none, which, according to the [documentation][3]:

A size of 'none' causes Squid to always fetch the object from the beginning so it may cache the result. (2.0 style)

However, after extensive debugging, I discovered that the feature does not work if the client closes the connection. When range_offset_limit is set, Squid will make a full request to the server,
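
As a quick way to check whether the proxy is actually caching ranged objects (this check is not part of the original gist; the proxy address, bump CA path, and artifact URL below are placeholders, and it assumes Squid's default X-Cache response header), you can fetch the same byte range twice through the proxy and compare the cache headers:

# pip install requests
import requests

PROXIES = {"https": "http://localhost:3128"}            # placeholder Squid address
URL = "https://example.com/large-artifact.bin"          # placeholder artifact URL
HEADERS = {"Range": "bytes=0-1048575"}                  # a ranged request, like the Azure storage client makes

for attempt in (1, 2):
    # With SSL bumping the proxy re-signs the upstream certificate, so trust the local bump CA.
    r = requests.get(URL, headers=HEADERS, proxies=PROXIES, verify="/etc/squid/bump_ca.crt")
    print(attempt, r.status_code, r.headers.get("X-Cache", "no X-Cache header"))

On the second pass, an X-Cache value reporting a HIT suggests the object was served from the local cache.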

@igorcosta
igorcosta / dispatch_input.md
Created Dec 20, 2021
Public workflow_dispatch_hard_limit

Limitations in our products

Because the customer uses infrastructure as code (IaC) in their pipelines with AWS CloudFormation, they were used to the YAML format and to the advanced features and functionality available in the AWS schema. Actions YAML was simple to use and teach; however, workflow_dispatch, introduced last year, lets customers create workflows that are manually triggered with the option of input parameters. Our product supports a limit of only 10 inputs and throws an error beyond that, which does not give customers with a multi-cloud environment setup the control they need over those manually triggered workflows.

We introduced a few options for the customer with the goal of continuing to maintain the developer experience, interoperability, compliance, and security, as well as the ability to use the product in a multi-cloud environment.

Using commit message in JSON format
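
As a rough sketch of this option (the JSON field names below are invented for illustration, not the customer's actual schema), a workflow step can read the head commit message and parse a JSON object embedded in it, working around the 10-input limit:

# Hypothetical sketch: recover extra parameters from a JSON blob in the commit message.
import json
import subprocess

# Read the message of the commit that triggered the workflow.
message = subprocess.check_output(["git", "log", "-1", "--pretty=%B"], text=True)

# Assume the message ends with a JSON object, e.g. {"region": "us-east-1", "stack": "web"}.
start = message.find("{")
params = json.loads(message[start:]) if start != -1 else {}

for key, value in params.items():
    print(f"{key}={value}")  # e.g. appended to GITHUB_OUTPUT or exported as env vars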

@igorcosta
igorcosta / gitlab_to_ghec_playbook.md
Last active Jun 20, 2022
Gitlab to GHEC Migration playbook

GitLab to GitHub Enterprise Cloud (GHEC) migration playbook

This playbook is a step-by-step guide to assist you with migrating from GitLab to GitHub Enterprise Cloud (GHEC) on GitHub.com.

Steps & Tasks | Description
Step One | Let's get ready for the migration. This step gives you an overview of what is required to start the migration process.
Step Two | Creating the artefact to be imported on GitHub requires special access to the Enterprise Cloud Import tool. This step will help you understand what is required to get access to the tool.
Step Three | With the file ready to be imported, this step will guide you on how to connect and upload the file to your GitHub Enterprise Cloud instance.
@igorcosta
igorcosta / billion.py
Last active Mar 12, 2020
Running 1 Billion func in Python3 with joblib
#pip3 install joblib first
from joblib import Parallel, delayed
import time
start_time = time.time()
def BillionFuncJob(key):
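
The preview cuts off at the function definition; a minimal, hedged reconstruction of the idea in the title (the trivial function body and the batch size are assumptions, not the gist's actual code) looks like this:

# pip3 install joblib first
from joblib import Parallel, delayed
import time

def BillionFuncJob(key):
    # trivial placeholder workload; the real body is truncated in the preview above
    return key

start_time = time.time()
# A large batch_size keeps scheduling overhead manageable for a billion tiny tasks.
Parallel(n_jobs=-1, batch_size=10000)(delayed(BillionFuncJob)(i) for i in range(1_000_000_000))
print("Elapsed: %.2f seconds" % (time.time() - start_time))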
@igorcosta
igorcosta / MainActivity.java
Created Jul 3, 2018 — forked from mjohnsullivan/MainActivity.java
Android Wear activity that reads and displays sensor data from the device
package com.example.wear;
import android.app.Activity;
import android.hardware.Sensor;
import android.hardware.SensorEvent;
import android.hardware.SensorEventListener;
import android.hardware.SensorManager;
import android.os.Bundle;
import android.support.wearable.view.WatchViewStub;
import android.util.Log;
@igorcosta
igorcosta / gist:835152315d756377a9d297dcdad0f5ff
Created Mar 27, 2018 — forked from kyledrake/gist:d7457a46a03d7408da31
Creating a self-signed SSL certificate, and then verifying it on another Linux machine
# Procedure is for Ubuntu 14.04 LTS.
# Using these guides:
# http://datacenteroverlords.com/2012/03/01/creating-your-own-ssl-certificate-authority/
# https://turboflash.wordpress.com/2009/06/23/curl-adding-installing-trusting-new-self-signed-certificate/
# https://jamielinux.com/articles/2013/08/act-as-your-own-certificate-authority/
# Generate the root (GIVE IT A PASSWORD IF YOU'RE NOT AUTOMATING SIGNING!):
openssl genrsa -aes256 -out ca.key 2048
openssl req -new -x509 -days 7300 -key ca.key -sha256 -extensions v3_ca -out ca.crt
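
The gist's title also mentions verifying the certificate on another machine; as a hedged illustration (not one of the original commands), a short Python check with the cryptography package confirms that ca.crt is genuinely self-signed by verifying its signature against its own RSA public key:

# pip install cryptography
from cryptography import x509
from cryptography.hazmat.primitives.asymmetric import padding

with open("ca.crt", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read())

# A self-signed root is its own issuer, so its own public key must verify its signature.
cert.public_key().verify(
    cert.signature,
    cert.tbs_certificate_bytes,
    padding.PKCS1v15(),
    cert.signature_hash_algorithm,
)
print("ca.crt is a valid self-signed certificate")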
@igorcosta
igorcosta / file_to_sftp.py
Created Feb 14, 2018
SFTP with Python using paramiko
#!/usr/bin/env python
import sys
import time
import paramiko
## How to use it?
##
## You have to install a dependency called paramiko, which is an SSH protocol implementation that helps you connect to SFTP.
## pip install paramiko
## Commands in your terminal:
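
The preview stops at the usage notes; a minimal upload sketch with paramiko (the host, credentials, and paths below are placeholders, not the gist's actual values) looks roughly like this:

import paramiko

host, port = "sftp.example.com", 22                     # placeholder server
username, password = "user", "secret"                   # placeholder credentials

# Open an SSH transport and start an SFTP session over it.
transport = paramiko.Transport((host, port))
transport.connect(username=username, password=password)
sftp = paramiko.SFTPClient.from_transport(transport)

sftp.put("local_file.csv", "/upload/local_file.csv")    # placeholder paths

sftp.close()
transport.close()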
@igorcosta
igorcosta / elastic_bulk_ingest.py
Created Jan 6, 2017 — forked from scotthaleen/elastic_bulk_ingest.py
Bulk Index json to elastic search
from pyspark import SparkContext, SparkConf
import json
import argparse
def fn_to_doc(line):
    try:
        doc = {}
        data = json.loads(line)
        doc['data'] = data
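
For completeness, here is a hedged sketch of the indexing side using the elasticsearch-py bulk helper rather than the Spark/ES-Hadoop connector the original gist presumably continues with; the cluster address, index name, and input file are placeholders:

# pip install elasticsearch
import json
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch(["http://localhost:9200"])           # placeholder cluster address

def actions(lines):
    for line in lines:
        try:
            # Mirrors the doc['data'] wrapping in fn_to_doc above.
            yield {"_index": "myindex", "_source": {"data": json.loads(line)}}
        except ValueError:
            continue                                    # skip malformed lines

with open("data.json") as f:                            # placeholder input file (one JSON object per line)
    bulk(es, actions(f))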
@igorcosta
igorcosta / ElasticSearch.md
Created Jan 6, 2017 — forked from tokhi/ElasticSearch.md
Elastic Search in simple words

Elastic Search

Elasticsearch is a real-time search engine: a change to an index is propagated to the whole cluster within about a second.

An Elasticsearch cluster consists of one or more nodes; the collection of nodes holds all of the data. The default cluster name is elasticsearch.

A node is a single server that is part of a cluster; nodes participate in searching and indexing.

An index is a collection of documents, equivalent to a database in a relational system; index names must be lowercase. A type represents a class of documents, roughly equivalent to a table.
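
To make the cluster/index/document vocabulary concrete, a small hypothetical example with a recent elasticsearch-py client (the index name and fields are placeholders) indexes one document and reads it back:

from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])            # one node of the cluster

# Index (the "database") -> document; note the lowercase index name.
es.index(index="library", id=1, document={"title": "Moby Dick", "author": "Melville"})

print(es.get(index="library", id=1)["_source"])
print(es.cluster.health()["cluster_name"])               # defaults to "elasticsearch"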