David Howell davoscollective

davoscollective /
Last active Apr 18, 2018
List versus generator for a code interview question - find largest sum of digits for some range of exponential expressions
lds - large digit sum
from itertools import product
from timeit import timeit
import tracemalloc
import matplotlib.pyplot as plt
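The preview above shows only the imports. As a rough sketch of the list-versus-generator comparison the description mentions, the two variants might look like the following. The helper names (`digit_sum`, `max_digit_sum_list`, `max_digit_sum_gen`, `peak_bytes`) and the range `1 <= a, b < limit` are assumptions for illustration, not taken from the gist.

```python
from itertools import product
import tracemalloc


def digit_sum(n):
    """Sum of the decimal digits of a non-negative integer."""
    return sum(int(d) for d in str(n))


def max_digit_sum_list(limit):
    """Build the full list of digit sums first, then take the maximum."""
    sums = [digit_sum(a ** b) for a, b in product(range(1, limit), repeat=2)]
    return max(sums)


def max_digit_sum_gen(limit):
    """Feed max() from a generator, without materializing a list of sums."""
    return max(digit_sum(a ** b) for a, b in product(range(1, limit), repeat=2))


def peak_bytes(func, limit):
    """Peak traced allocation (in bytes) while func(limit) runs."""
    tracemalloc.start()
    try:
        func(limit)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


if __name__ == "__main__":
    limit = 100  # assumed range: 1 <= a, b < 100
    print("list     :", max_digit_sum_list(limit),
          peak_bytes(max_digit_sum_list, limit), "bytes")
    print("generator:", max_digit_sum_gen(limit),
          peak_bytes(max_digit_sum_gen, limit), "bytes")
```

Both functions return the same value; the difference is only in how many intermediate results are held in memory at once, which is what `tracemalloc` is used to observe here.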
davoscollective / SparkSQLRowCount.scala
Created Apr 12, 2018
Spark SQL generate rowcounts for all tables in a hive database
val dbname = "datalake"
// Build one SELECT per table, then join them into a single UNION ALL query.
val rowCountSQL = spark.catalog.listTables(dbname).collect()
  .map(t => s"SELECT '$dbname.${t.name}' as table, count(*) as rowcount FROM $dbname.${t.name}")
  .mkString("\nUNION ALL\n")
val rowCounts = spark.sql(rowCountSQL).as[(String, Long)]
davoscollective /
Last active Feb 16, 2018
Install MSSQL Tools including ODBC driver & dependencies, bulk copy program (BCP), SQLCMD on Red Hat 6 and similar, e.g. CentOS, Amazon Linux
sudo su <<HERE
curl | rpm --import -
curl > /etc/yum.repos.d/mssql-release.repo
sudo yum -y remove unixODBC-utf16 unixODBC-utf16-devel
sudo ACCEPT_EULA=Y yum -y install msodbcsql- mssql-tools- unixODBC-devel --disableplugin=priorities
echo -e '\nexport PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bash_profile
echo -e '\nexport PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc
davoscollective /
Created Feb 15, 2018
Install MSSQL Tools including ODBC driver, bulk copy program (BCP) and SQLCMD on an Ubuntu 16.04 machine
sudo su <<HERE
curl | sudo apt-key add -
curl > /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get install --yes msodbcsql
sudo ACCEPT_EULA=Y apt-get install --yes unixodbc-dev
sudo apt-get install --yes mssql-tools unixodbc-dev
davoscollective / urlParse.scala
Last active Dec 21, 2017
A scala function to parse a URL (parse a URI) into sections. Useful for processing log files to extract a core domain for aggregations and analytics.
import scala.util.matching.Regex
* parse a URI / URL into a core domain or the trailing path
* e.g.
* the core domain of
* is
* returns Option type so you might need to use getOrElse(something)
* e.g. urlParse(url,1).getOrElse(somedefault)
davoscollective /
Created Sep 19, 2017 — forked from sekimura/
Text (heredoc) strip margin in Python
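import re
def strip_margin(text):
    # Drop each line's leading whitespace up to and including the '|' margin marker.
    return re.sub(r'\n[ \t]*\|', '\n', text)
def strip_heredoc(text):
    # Shortest "newline + indentation" run that precedes a non-blank line.
    indent = len(min(re.findall(r'\n[ \t]*(?=\S)', text) or ['']))
    # indent counts the newline as well, hence the - 1.
    pattern = r'\n[ \t]{%d}' % (indent - 1)
    return re.sub(pattern, '\n', text)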
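A short usage sketch may help show what the two helpers do. The functions are repeated so the block runs on its own; the sample strings are made up for illustration. Note that both helpers only touch indentation that follows a newline, so a leading newline in the input survives.

```python
import re


def strip_margin(text):
    # Remove indentation plus the '|' marker at the start of each line.
    return re.sub(r'\n[ \t]*\|', '\n', text)


def strip_heredoc(text):
    # Remove the smallest common indentation found on non-blank lines.
    indent = len(min(re.findall(r'\n[ \t]*(?=\S)', text) or ['']))
    pattern = r'\n[ \t]{%d}' % (indent - 1)
    return re.sub(pattern, '\n', text)


# Hypothetical inputs, written the way an indented triple-quoted string would look.
margin_text = "\n    |SELECT *\n    |FROM t"
heredoc_text = "\n    line one\n      line two\n    line three\n"

if __name__ == "__main__":
    print(repr(strip_margin(margin_text)))
    print(repr(strip_heredoc(heredoc_text)))
```

`strip_heredoc` keeps relative indentation: a line indented two spaces deeper than the others stays two spaces deeper after stripping.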
davoscollective /
Created May 9, 2017 — forked from tinybike/
simple mssql -> csv file example using pymssql
#!/usr/bin/env python
"""simple mssql -> csv file example using pymssql"""
import csv
import datetime
import pymssql
from decimal import Decimal
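The preview stops at the imports. Given that `datetime` and `Decimal` are imported, one plausible shape for the export step is to normalize those types before handing rows to `csv.writer`. The sketch below assumes exactly that; `normalize` and `rows_to_csv` are hypothetical names, and the database connection is left out so the block runs without a server.

```python
import csv
import datetime
import io
from decimal import Decimal


def normalize(value):
    """Turn values a DB driver may return into CSV-friendly ones (hypothetical helper)."""
    if value is None:
        return ""
    if isinstance(value, (datetime.datetime, datetime.date)):
        return value.isoformat()
    if isinstance(value, Decimal):
        return str(value)
    return value


def rows_to_csv(header, rows, fileobj):
    """Write a header and an iterable of row tuples to an open text file object."""
    writer = csv.writer(fileobj, lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        writer.writerow([normalize(v) for v in row])


# Made-up rows standing in for a query result.
sample_rows = [
    (1, Decimal("9.50"), datetime.date(2017, 5, 9)),
    (2, None, datetime.datetime(2017, 5, 9, 12, 30)),
]

if __name__ == "__main__":
    buf = io.StringIO()
    rows_to_csv(["id", "amount", "created"], sample_rows, buf)
    print(buf.getvalue())
```

With a real connection, the rows would come from a DB-API cursor after `cursor.execute(...)`, and the header could be taken from the first element of each entry in `cursor.description`; the server, credentials, and query are deployment-specific and not shown in the gist preview.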
davoscollective / SQL Server Get distinct values for each column in a table.sql
Created Nov 18, 2015
SQL Server 2008+ Get distinct values for each column in a table
-- Get a list of distinct values for every column in a table
-- Change the @Table parameter to the one you are interested in.
-- Change the VALUES in @ExcludedColumns if you want to exclude certain columns
-- Set the @Table you are interested in
DECLARE @Table Varchar(max) = ''
--Optionally exclude some columns
davoscollective / SQLPackageDeploy.ps1
Last active Oct 17, 2018
Script to publish SQL Server database dacpac using PowerShell and SQLPackage.exe
# Designed to deploy a database from a dacpac
# Usage:
# .\sqlPackageDeploymentCMD.ps1 -targetServer "LOCALHOST" -targetDB "IamADatabase" -sourceFile "C:\ProjectDirectory\bin\Debug\IamADatabase.dacpac" -SQLCMDVariable1 "IamASQLCMDVariableValue"
# So, why would you do this when you could just call the sqlpackage.exe directly?
# Because Powershell provides a higher level of orchestration; I plan to call this script from another script that
# first calls a script to build the dacpac that is then used in this script.