Raki mdrakiburrahman
@mdrakiburrahman
mdrakiburrahman / 1_kafka_to_delta_scala.scala
Created February 14, 2022 16:25
Spark Streaming in Synapse from Kafka
// Kafka Topic ➡ Delta tables: `scene` via Spark Streaming
// `SOURCE`: Connect to Kafka topic as a streaming Dataframe: `raw_DF`
import org.apache.spark.sql._
import org.apache.spark.sql.types._
import org.apache.spark.sql.functions._
// Pull from Key Vault for non-sandbox
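// A minimal sketch (not in the preview) of the read/write the imports above set up;
// the broker address and Delta paths are placeholders, and the topic name follows
// the `scene` description.
val raw_DF: DataFrame = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "<kafka-broker>:9092") // placeholder; pulled from Key Vault outside a sandbox
  .option("subscribe", "scene")
  .option("startingOffsets", "earliest")
  .load()

// `SINK`: stream the raw Kafka payload into a Delta table
raw_DF
  .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
  .writeStream
  .format("delta")
  .option("checkpointLocation", "/delta/scene/_checkpoints") // placeholder path
  .start("/delta/scene")                                     // placeholder path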
@mdrakiburrahman
mdrakiburrahman / elastic-cleaner.sh
Last active February 17, 2022 13:09
Cleanup of Elastic indices in Arc
URL="http://localhost:9200"
dataset_date=$(date +%Y-%m-%d) # anchor date; not shown in this preview, assumed to be today
from=20
to=30
for i in $(seq $from $to)
do
  # Delete the index if it is older than i days
  DATE=$(date -d "$dataset_date - $i days" +%Y.%m.%d)
  echo "Deleting day: $DATE"
  curl -XDELETE "$URL/logstash-$DATE" # Comment out to see what range is deleted
done
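# Optional sanity check (an assumption, not in the gist): list the matching
# indices first to confirm what the loop above will delete.
curl -s "$URL/_cat/indices/logstash-*?v&s=index"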
@mdrakiburrahman
mdrakiburrahman / arc-scc.yaml
Last active May 6, 2022 17:04
SCCs for Bootstrapper 1.3.0_2022-01-27
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: arc-data-scc
allowHostDirVolumePlugin: false
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
allowPrivilegeEscalation: true
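# The preview cuts off here; a hedged sketch of how an SCC like this typically
# continues (the gist's actual values may differ):
allowPrivilegedContainer: false
readOnlyRootFilesystem: false
runAsUser:
  type: MustRunAsRange
seLinuxContext:
  type: MustRunAs
supplementalGroups:
  type: RunAsAny
fsGroup:
  type: RunAsAny
volumes:
  - configMap
  - emptyDir
  - persistentVolumeClaim
  - secret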
-- Create DB
CREATE DATABASE raki_pitr_test
GO

-- Switch to the new database before creating objects (assumed intent; the preview omits it)
USE raki_pitr_test
GO

CREATE TABLE table1 (ID int, value nvarchar(10))
GO

INSERT INTO table1 VALUES (1, 'demo1')
INSERT INTO table1 VALUES (2, 'demo2')
@mdrakiburrahman
mdrakiburrahman / create-kubeconfig.sh
Created April 7, 2022 15:54
Create Kubeconfig from Service Account
# the Namespace and ServiceAccount name that are used for the config
namespace=arc
serviceAccount=arcOnboard
######################
# actual script starts
set -o errexit
secretName=$(kubectl --namespace $namespace get serviceAccount $serviceAccount -o jsonpath='{.secrets[0].name}')
ca=$(kubectl --namespace $namespace get secret/$secretName -o jsonpath='{.data.ca\.crt}')
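# A hedged continuation of the preview: pull the token and cluster endpoint,
# then emit a kubeconfig. Variable names follow the lines above; the output
# filename is an assumption.
token=$(kubectl --namespace $namespace get secret/$secretName -o jsonpath='{.data.token}' | base64 --decode)
server=$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')

cat <<EOF > kubeconfig-$serviceAccount.yaml
apiVersion: v1
kind: Config
clusters:
- name: default-cluster
  cluster:
    certificate-authority-data: ${ca}
    server: ${server}
contexts:
- name: default-context
  context:
    cluster: default-cluster
    namespace: ${namespace}
    user: ${serviceAccount}
current-context: default-context
users:
- name: ${serviceAccount}
  user:
    token: ${token}
EOF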
@mdrakiburrahman
mdrakiburrahman / tina-onboarder-rbac.yaml
Last active April 7, 2022 22:35
ClusterRoles needed to deploy Arc Data Services in Indirect Mode
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: arc-data-deployer-cluster-role
rules:
  # CRDs in general
  - apiGroups: ["apiextensions.k8s.io"]
    resources: ["customresourcedefinitions"]
    verbs: ["create", "update", "patch", "delete", "list", "get", "watch"] # Could be scoped down further if needed
  # All Arc Data apiGroups; some of these may be redundant - K8s doesn't give an easy way to list custom apiGroups, so everything is included (see the hedged sketch below)
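  # A hedged sketch of the rule the comment above introduces, using Arc Data
  # apiGroups seen elsewhere in these gists (e.g. tasks.sql.arcdata.microsoft.com);
  # the gist's full list is cut off in the preview.
  - apiGroups: ["arcdata.microsoft.com", "sql.arcdata.microsoft.com", "tasks.sql.arcdata.microsoft.com"]
    resources: ["*"]
    verbs: ["create", "update", "patch", "delete", "list", "get", "watch"]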
@mdrakiburrahman
mdrakiburrahman / arc-april-PITR.sh
Created April 12, 2022 10:57
Testing Arc PITR with incorrect dates
# Correct date
cat <<EOF | kubectl create -f -
apiVersion: tasks.sql.arcdata.microsoft.com/v1
kind: SqlManagedInstanceRestoreTask
metadata:
  name: sql-restore-raki-correct
  namespace: arc
spec:
  source:
    name: sql-gp-1
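    # The preview cuts off here; a hedged completion of the restore spec
    # (database names and timestamp are illustrative, not the gist's values)
    database: raki_pitr_test
  restorePoint: "2022-04-12T10:00:00Z"
  targetDatabase: raki_pitr_test_restored
EOF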
@mdrakiburrahman
mdrakiburrahman / Arc-onboarding-ansible-mimic.sh
Last active April 21, 2022 18:17
Arc Onboarding Script - working version 1.2.3
# Login Into Azure Using Service Principal
az_cli_cmd login --service-principal -u az_client_id -p az_client_secret --tenant az_tenant_id
# Set Azure Subscription
az_cli_cmd account set --subscription az_subscription_id
# Create AZ Resource Group: Arc K8s
az_cli_cmd group create -l az_location -n az_resource_group
# OCP Login
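# A hedged sketch of the OCP login step the comment above introduces, in the
# same placeholder style as the commands above (values substituted at runtime):
oc login ocp_api_url -u ocp_username -p ocp_password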
@mdrakiburrahman
mdrakiburrahman / Synapse-Logical-DWH.sql
Last active May 5, 2022 19:39
Creating a simple external table on top of a Parquet file that exposes a subset of the columns to the end user
CREATE DATABASE Ldw
COLLATE Latin1_General_100_BIN2_UTF8;
USE Ldw;
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'MyRand0mPa33W0rd1!';
CREATE DATABASE SCOPED CREDENTIAL WorkspaceIdentity
WITH IDENTITY = 'Managed Identity';
GO
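-- A hedged continuation (not in the preview): one way to expose a subset of
-- the Parquet columns is a view over OPENROWSET using the credential above;
-- the storage URL, file path, and column names below are placeholders.
CREATE EXTERNAL DATA SOURCE LakeSource
WITH (
    LOCATION = 'https://<storage-account>.dfs.core.windows.net/<container>',
    CREDENTIAL = WorkspaceIdentity
);
GO

CREATE VIEW dbo.vwParquetSubset
AS
SELECT rows.Col1, rows.Col2 -- expose only the columns the end user should see
FROM OPENROWSET(
    BULK '/folder/file.parquet',
    DATA_SOURCE = 'LakeSource',
    FORMAT = 'PARQUET'
) AS rows;
GO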
@mdrakiburrahman
mdrakiburrahman / fluentbit_otlp_json_kafka_export.json
Created May 10, 2022 22:57
A sample JSON export from the OpenTelemetry Kafka Exporter as OTLP_JSON
{
  "resourceLogs": [
    {
      "resource": {},
      "scopeLogs": [
        {
          "scope": {},
          "logRecords": [
            {
              "timeUnixNano": "1652102891470000028",