
Define an array of "tuples" (each element packs a name and a GUID into one space-separated string)

tuples=("p-prodfix-ds-psprxp-svcp 52c2496a-3264-4174-8f98-f6cbb5750e6a" "p-qa-ds-psprxp-svcp 07cf00fd-4d8e-41a5-8f7e-a32202b63d91" "p-prod-ds-psprxp-svcp 90e2872c-5cf6-4fce-a7d3-e4de9c7a917b")

Print the tuples. Note that a bare $tuples expands to only the first element; "${tuples[@]}" expands them all.

echo "${tuples[@]}"
dvu4 / create_databricks_hive_metastore.md
This script creates a Hive metastore table in the Databricks catalog

Create Hive metastore table

# Create database schema
spark.sql("CREATE DATABASE IF NOT EXISTS csgci")


# Create a Hive internal (managed) table
df.write.mode("overwrite").option("overwriteSchema", "true").saveAsTable("csgci.drug_ic")
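As a quick sanity check (a minimal sketch, assuming the same spark session), confirm the table was registered and read it back:

# Confirm the table shows up in the metastore
spark.sql("SHOW TABLES IN csgci").show()

# Read the managed table back and inspect its schema
spark.table("csgci.drug_ic").printSchema()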
Look up each Azure AD service principal by its identifier URI:

for i in 04eb34f2-3778-46e7-a1f5-391291a2bc6c \
         6af03cc1-8262-4e81-9fc0-bd6131b23996 \
         7eecf3f5-67f0-4792-a9c2-c9f60ff5a96c \
         c5a71e40-4636-41bb-8232-b8b00fe470fb \
         026520ce-c449-43e1-ac4f-215db4827af0; do
  az ad sp show --id "api://${i}" --verbose
done

from pyspark.dbutils import DBUtils
import pyspark.sql.functions as F
import pyspark
import pandas as pd
from itertools import chain
from functools import reduce


# mapping Walgreens status to RX-Lighting status and substatus
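The capture ends at the comment, so here is a minimal sketch of one way to express such a mapping (only the status half is sketched; the status codes and the column names wags_status / rx_lighting_status are hypothetical). chain flattens the dict into the alternating key/value sequence that F.create_map expects:

# Hypothetical status codes, for illustration only
status_map = {
    "IN_PROGRESS": "PENDING",
    "COMPLETED": "ACTIVE",
    "REJECTED": "DENIED",
}

# Build a map-typed column literal from the dict
mapping_expr = F.create_map(*[F.lit(v) for v in chain(*status_map.items())])

# Look up each row's Walgreens status in the map (assumes df has a wags_status column)
df = df.withColumn("rx_lighting_status", mapping_expr[F.col("wags_status")])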
dvu4 / save_multiple_dataframes_into_text_file_in_Pyspark.md
This function saves multiple DataFrames with different headers into one pipe-delimited file (.txt)
from pyspark.dbutils import DBUtils
import pyspark.sql.functions as F
import pyspark
import pandas as pd
from itertools import chain
from functools import reduce
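The function body isn't shown in this capture; below is a minimal sketch of the idea, assuming a spark session and that every value can be cast to string. Each DataFrame is collapsed into a single string column of pipe-joined values with its own header row prepended, the pieces are stacked with reduce, and the result is written out as one text file.

def save_dataframes_as_text(dfs, output_path):
    pieces = []
    for df in dfs:
        # One-row DataFrame holding this DataFrame's own header
        header = spark.createDataFrame([("|".join(df.columns),)], ["value"])
        # Collapse every row into a single pipe-joined string
        body = df.select(
            F.concat_ws("|", *[F.col(c).cast("string") for c in df.columns]).alias("value")
        )
        pieces.append(header.union(body))
    # Stack all pieces and write them as a single .txt file
    combined = reduce(lambda a, b: a.union(b), pieces)
    combined.coalesce(1).write.mode("overwrite").text(output_path)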

List all files in the ds-tmartch-output container of storage account prodfixdseus2tmartchsa01

path = "abfss://ds-tmartch-output@prodfixdseus2tmartchsa01.dfs.core.windows.net/acc-activation"
dbutils.fs.ls(path)

The result is a list of FileInfo objects, each carrying the file's path, name, and size.
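For example, to print each entry (a sketch, assuming the listing above):

for f in dbutils.fs.ls(path):
    # FileInfo exposes path, name, and size attributes
    print(f.name, f.size)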

How to list the secret scopes

from pyspark.dbutils import DBUtils
dbutils.secrets.listScopes()

The output:

Out[16]: [SecretScope(name='mleng-secrets')]
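To go one level deeper, list the keys inside a scope and fetch a single value (the key name my-key is hypothetical):

# List the secret keys defined in the scope
dbutils.secrets.list("mleng-secrets")

# Fetch one secret value (redacted when printed in a notebook)
value = dbutils.secrets.get(scope="mleng-secrets", key="my-key")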

!pip install pyyaml

import yaml
from pyspark.dbutils import DBUtils
import pyspark.sql.functions as F
import time



def get_dir_content(path):
    # Recursively yield the full path of every file under `path`
    # (a sketch of the standard recursive dbutils.fs.ls pattern)
    for item in dbutils.fs.ls(path):
        if item.isDir() and item.path != path:
            yield from get_dir_content(item.path)
        else:
            yield item.path
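A usage sketch, reusing the abfss path from the listing example above:

for file_path in get_dir_content(path):
    print(file_path)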

Search for "service" using grep

alias grep="grep --color=auto" 
grep -rn . -e "service"  

Here's what each of the options and arguments means:

  • -r: Recursively search all files in the specified directory and its subdirectories.
  • -n: Prefix each match with its line number.
  • -e "service": The pattern to search for.
  • .: The directory to search, here the current directory.