Greg Rahn gregrahn

gregrahn / demo_rdbms.mk
Created August 20, 2012 21:46
demo_rdbms.mk from Oracle 11.2
#
# Example for building demo OCI programs:
#
# 1. All OCI demos (including extdemo2, extdemo4 and extdemo5):
#
# make -f demo_rdbms.mk demos
#
# 2. A single OCI demo:
#
# make -f demo_rdbms.mk build EXE=demo OBJS="demo.o ..."
gregrahn / gist:3877498
Created October 12, 2012 05:34
ethtool eth0
$ sudo ethtool eth0
Settings for eth0:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
gregrahn / create_tables.sql
Created December 2, 2012 06:05
Experiment to see whether the SQL query can be rewritten to avoid the join and compute the count(distinct) in a single pass over the data.
drop table t2;
create table t2
(
day_id int not null,
time_id date not null,
begin_time date not null,
end_time date not null
);
begin
gregrahn / ASH_plots.R
Created December 20, 2011 18:15
Example of how to read ASH data from Oracle into R and draw scatter plots
library(RJDBC)
#
# set up the JDBC connection
# configure this for your env
#
drv <- JDBC("oracle.jdbc.driver.OracleDriver", "/Users/grahn/code/jdbc/ojdbc6.jar")
conn <- dbConnect(drv, "jdbc:oracle:thin:@zulu.us.oracle.com:1521:orcl", "grahn", "grahn")
#
# import the data into a data.frame
gregrahn / values.sql
Created June 20, 2013 21:20
Example of PostgreSQL-style VALUES() functionality in Cloudera Impala. References: https://issues.cloudera.org/browse/IMPALA-68 and http://www.postgresql.org/docs/9.2/static/sql-values.html
[impala1:21000] >
select *
from (values ('2013-06-01' as col1),
('2013-06-02'),
('2013-06-02'),
('2013-06-03'),
('2013-06-04'),
('2013-06-05')
) x;
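The same inline-table idea can be tried without an Impala cluster. The sketch below is only an illustration under a loud assumption: it uses Python's built-in sqlite3 module as a stand-in engine, not Impala or PostgreSQL. SQLite does not accept the `as col1` alias inside the first row, so the alias is dropped and the column gets SQLite's default name.

```python
import sqlite3

# Assumption: SQLite stands in for Impala here. The query mirrors the one
# above, minus the Impala-specific "as col1" alias inside the first row.
QUERY = """
select *
from (values ('2013-06-01'),
             ('2013-06-02'),
             ('2013-06-02'),
             ('2013-06-03'),
             ('2013-06-04'),
             ('2013-06-05')
     ) x
order by 1
"""

conn = sqlite3.connect(":memory:")
rows = conn.execute(QUERY).fetchall()  # list of 1-tuples, one per VALUES row
conn.close()

for (day,) in rows:
    print(day)
```

The duplicate '2013-06-02' row is kept on purpose: VALUES builds a row set, not a set of distinct values.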
gregrahn / bart-2012-salaries.tsv
Last active December 19, 2015 06:59
2012 BART salaries, extracted from the Public Employee Salaries Database. http://www.mercurynews.com/salaries/bay-area?Entity=Bay%20Area%20Rapid%20Transit
Entity	Name	Title	Base	Overtime	Other (vacation, sick, bonus, etc)	Medical/Dental/Visual	Employer Contribution to Pension	Employee Contribution to Pension Paid By Employer	Employer Contribution to Deferred Compensation (401k)	Misc	Total Cost of Employment
Bay Area Rapid Transit	Dugger, Dorothy	General Mgr	298700	0	34500	14951	39521	23324	1869	6796	419661
Bay Area Rapid Transit	Crunican, Grace	General Mgr	312461	0	3846	19141	37513	17500	1869	7591	399921
Bay Area Rapid Transit	Tietz, Forrest	Police Sergeant	136746	111902	33921	18200	60630	156	0	1107	362662
Bay Area Rapid Transit	Pangilinan, Edgardo	Asst Controller	107785	0	214322	10903	13017	7650	1869	5734	361279
Bay Area Rapid Transit	Lucarelli, Frank	Police Lieutenant	173811	46280	33422	23364	76427	233	0	5019	358556
Bay Area Rapid Transit	Collier, Roberta	Asst Treasurer	33971	0	289534	1797	4072	2378	1869	4897	338518
Bay Area Rapid Transit	Parker, Thomas	Exec Mgr Transit System Compl	136544	0	145633	19139	16863	9923	1869	5584	335554
Bay Area Rapid Transit Ra
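In the rows shown, the last column appears to be the sum of the eight cost columns before it. The sketch below checks that for two rows copied from the listing. It assumes a tab-delimited layout, and the shortened column names and the `check_totals` helper are hypothetical, introduced only for this illustration.

```python
import csv
import io

# Two rows copied from the listing above. The short header names are
# hypothetical stand-ins for the full column titles.
SAMPLE_TSV = "\n".join([
    "Name\tBase\tOvertime\tOther\tMedical\tEmployerPension\t"
    "EmployeePensionPaid\tDeferredComp\tMisc\tTotal",
    "Dugger, Dorothy\t298700\t0\t34500\t14951\t39521\t23324\t1869\t6796\t419661",
    "Crunican, Grace\t312461\t0\t3846\t19141\t37513\t17500\t1869\t7591\t399921",
])


def check_totals(tsv_text):
    """Return (name, computed_total, reported_total) for each data row."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    results = []
    for row in reader:
        # Sum every numeric column except the reported total itself.
        components = [int(v) for k, v in row.items() if k not in ("Name", "Total")]
        results.append((row["Name"], sum(components), int(row["Total"])))
    return results


for name, computed, reported in check_totals(SAMPLE_TSV):
    print(name, computed, reported, computed == reported)
```

Reading with `delimiter="\t"` keeps names such as "Dugger, Dorothy" intact, since the comma is not treated as a field separator.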
/* Instructions on compilation and execution
* =========================================
*
* Compile this program with pthreads:
*
* g++ -Wall -pthread -o graphdb-simulator graphdb-simulator.cpp
*
* Before you run this program, you need to create the following
* directories:
*

Keybase proof

I hereby claim:

  • I am gregrahn on github.
  • I am gregrahn (https://keybase.io/gregrahn) on keybase.
  • I have a public key whose fingerprint is 9C32 D323 4E55 8113 FE4B CFEB FA4D 0C79 A267 C6C4

To claim this, I am signing this object:

gregrahn / 1-tpcds-query92.sql
Last active April 17, 2017 02:04
When switching from the legacy Oracle top-n syntax using rownum to the ANSI/ISO SQL:2008 fetch first syntax, the WinMagic SQL transformation is lost.
-- Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
-- Using 1GB TPC-DS
-- Table DDL: https://raw.githubusercontent.com/gregrahn/tpcds-kit/master/tools/tpcds.sql
-- WinMagic paper: "WinMagic: subquery elimination using window aggregation"
-- https://pdfs.semanticscholar.org/0bfa/e505ad588d00d4b204acf8ba4b5646eac244.pdf
alter session set nls_date_format = 'YYYY-MM-DD';
-- start query 1 in stream 0 using template query92.tpl
Three comparison points:
Presto + RCFile vs Impala + RCFile vs Impala + Parquet
Metrics noted: query time, CPU utilization, disk read throughput (KBRead)
Impala v1.1.1
Presto v0.52
================================================================================================================================
Presto + RCFile:
select ss_sold_date_sk, count(*) from store_sales_rcfile group by 1 order by 1 limit 2000;