David Howell davoscollective

@davoscollective
davoscollective / SparkSQLRowCount.scala
Created Apr 12, 2018
Spark SQL generate rowcounts for all tables in a hive database
View SparkSQLRowCount.scala
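import spark.implicits._ // already in scope in spark-shell
val dbname = "datalake"
// Build one "SELECT ... count(*)" statement per table, then join them with UNION ALL
val rowCountSQL = spark.catalog.listTables(dbname)
  .select($"name").as[String]
  .map(name => s"SELECT '$dbname.$name' as table, count(*) as rowcount FROM $dbname.$name")
  .collect()
  .mkString("\nUNION ALL\n")
// Run the combined statement once to get (table, rowcount) pairs
val rowCounts = spark.sql(rowCountSQL).as[(String, Long)]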
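The string-building step in this gist can be tried without a Spark session. The following is a minimal Python sketch of that step only; the function name `build_rowcount_sql` and the table names are hypothetical, chosen for illustration, and are not part of the gist.

```python
def build_rowcount_sql(dbname, table_names):
    """Return one SQL statement that counts the rows of every listed table."""
    selects = [
        f"SELECT '{dbname}.{name}' as table, count(*) as rowcount FROM {dbname}.{name}"
        for name in table_names
    ]
    # One SELECT per table, stitched together so a single query returns all counts.
    return "\nUNION ALL\n".join(selects)


# Hypothetical table names, for illustration only.
sql = build_rowcount_sql("datalake", ["customers", "orders"])
print(sql)
```

In a PySpark session the resulting string could presumably be passed to `spark.sql(...)`, mirroring the last line of the Scala version.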
@davoscollective
davoscollective / AddBCPandMSSQLToolsRedhatCentosAmazon.sh
Last active Feb 16, 2018
Install MSSQL Tools, including the ODBC driver and dependencies, the bulk copy program (BCP), and SQLCMD, on Red Hat 6 and similar distributions, e.g. CentOS and Amazon Linux
View AddBCPandMSSQLToolsRedhatCentosAmazon.sh
#!/bin/bash
sudo su <<HERE
curl https://packages.microsoft.com/keys/microsoft.asc | rpm --import -
curl https://packages.microsoft.com/config/rhel/6/prod.repo > /etc/yum.repos.d/mssql-release.repo
HERE
sudo yum -y remove unixODBC-utf16 unixODBC-utf16-devel
sudo ACCEPT_EULA=Y yum -y install msodbcsql-13.1.4.0-1 mssql-tools-14.0.3.0-1 unixODBC-devel --disableplugin=priorities
echo -e '\nexport PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bash_profile
echo -e '\nexport PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc
@davoscollective
davoscollective / AddBCPandMSSQLToolsUbuntu.sh
Created Feb 15, 2018
Install MSSQL Tools, including the ODBC driver, bulk copy program (BCP), and SQLCMD, on an Ubuntu 16.04 machine
View AddBCPandMSSQLToolsUbuntu.sh
#!/bin/bash
sudo su <<HERE
curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
HERE
sudo apt-get update
# Both Microsoft packages require the EULA to be accepted for an unattended install
sudo ACCEPT_EULA=Y apt-get install --yes msodbcsql
sudo ACCEPT_EULA=Y apt-get install --yes mssql-tools
sudo apt-get install --yes unixodbc-dev
@davoscollective
davoscollective / urlParse.scala
Last active Dec 21, 2017
A Scala function to parse a URL (or URI) into sections. Useful for processing log files to extract a core domain for aggregations and analytics.
View urlParse.scala
import scala.util.matching.Regex
/**
 * Parse a URI / URL into a core domain or the trailing path,
 * e.g.
 * the core domain of https://www.epicwebsite.com.au/path/to/asset/cute_cat_picture.png
 * is epicwebsite.com.au
 * Returns an Option type, so you might need to use getOrElse(something),
 * e.g. urlParse(url,1).getOrElse(somedefault)
 */
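The preview cuts off before the function body, so the gist's actual implementation is not shown here. As a rough illustration of the idea described in the comment, here is a Python sketch that assumes "core domain" means the host name with a leading `www.` label removed. The name `core_domain` is hypothetical, and returning `None` stands in for the Scala `Option`.

```python
from typing import Optional
from urllib.parse import urlsplit


def core_domain(url: str) -> Optional[str]:
    """Return the host without a leading 'www.' label, or None if no host is found."""
    host = urlsplit(url).hostname
    if not host:
        return None
    # Assumption: only a leading 'www.' is stripped; other subdomains are kept.
    return host[4:] if host.startswith("www.") else host


domain = core_domain("https://www.epicwebsite.com.au/path/to/asset/cute_cat_picture.png")
print(domain)
```

The trailing path mentioned in the comment is available from the same parse result as `urlsplit(url).path`.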
@davoscollective
davoscollective / text_strip_margin.py
Created Sep 19, 2017 — forked from sekimura/text_strip_margin.py
Text (heredoc) strip margin in Python
View text_strip_margin.py
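import re

def strip_margin(text):
    # Drop leading whitespace up to and including the '|' margin marker on each line.
    return re.sub(r'\n[ \t]*\|', '\n', text)

def strip_heredoc(text):
    # Shortest run of newline + indentation that precedes a non-blank character.
    indent = len(min(re.findall(r'\n[ \t]*(?=\S)', text) or [''], key=len))
    # indent counts the newline itself, so the whitespace width is indent - 1.
    pattern = r'\n[ \t]{%d}' % max(indent - 1, 0)
    return re.sub(pattern, '\n', text)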
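A short usage sketch may help show what `strip_margin` does to an indented triple-quoted string. The function is repeated so the block runs on its own; the SQL text is just sample input.

```python
import re


def strip_margin(text):
    # Remove leading whitespace and the '|' marker at the start of each line.
    return re.sub(r'\n[ \t]*\|', '\n', text)


query = strip_margin("""
    |SELECT name
    |  FROM users
    |""")
print(repr(query))
```

Whitespace after the `|` marker is preserved, so relative indentation inside the block survives.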
@davoscollective
davoscollective / mssql_to_csv.py
Created May 9, 2017 — forked from tinybike/mssql_to_csv.py
simple mssql -> csv file example using pymssql
View mssql_to_csv.py
#!/usr/bin/env python
"""
simple mssql -> csv file example using pymssql
@author jack@tinybike.net
"""
import csv
import datetime
import pymssql
from decimal import Decimal
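The preview ends at the imports, so the conversion logic of the original script is not visible. The imports suggest that `Decimal` and `datetime` values get special handling before they are written out. A minimal sketch of that step, assuming rows arrive as tuples from a database cursor, could look like this. The helper names and the sample row are hypothetical, and no database connection is made.

```python
import csv
import datetime
import io
from decimal import Decimal


def to_csv_value(value):
    """Convert driver-specific types into plain values that serialize cleanly."""
    if isinstance(value, Decimal):
        return float(value)
    if isinstance(value, (datetime.datetime, datetime.date)):
        return value.isoformat()
    return value


def write_rows(rows, header, out):
    """Write a header line followed by the converted rows to a file-like object."""
    writer = csv.writer(out)
    writer.writerow(header)
    for row in rows:
        writer.writerow([to_csv_value(v) for v in row])


# Hypothetical row standing in for the results of a query.
rows = [(1, Decimal("9.50"), datetime.date(2017, 5, 9))]
buf = io.StringIO()
write_rows(rows, ["id", "price", "created"], buf)
print(buf.getvalue())
```

Writing to a real file would mean replacing the `StringIO` buffer with `open(path, "w", newline="")`.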
@davoscollective
davoscollective / SQL Server Get distinct values for each column in a table.sql
Created Nov 18, 2015
SQL Server 2008+ Get distinct values for each column in a table
View SQL Server Get distinct values for each column in a table.sql
--==================================
-- Get a list of distinct values for every column in a table
-- Change the @Table parameter to the one you are interested in.
-- Change the VALUES in @ExcludedColumns if you want to exclude certain columns
--==================================
-- Set the @Table you are interested in
DECLARE @Table Varchar(max) = ''
--Optionally exclude some columns
@davoscollective
davoscollective / SQLPackageDeploy.ps1
Last active Oct 17, 2018
Script to publish SQL Server database dacpac using PowerShell and SQLPackage.exe
View SQLPackageDeploy.ps1
#=================================================================================
# Designed to deploy a database from a dacpac
#
# Usage:
# .\sqlPackageDeploymentCMD.ps1 -targetServer "LOCALHOST" -targetDB "IamADatabase" -sourceFile "C:\ProjectDirectory\bin\Debug\IamADatabase.dacpac" -SQLCMDVariable1 "IamASQLCMDVariableValue"
#
# So, why would you do this when you could just call the sqlpackage.exe directly?
# Because PowerShell provides a higher level of orchestration; I plan to call this script from another script that
# first calls a script to build the dacpac that is then used in this script.
#=================================================================================