Stefan Urbanek Stiivi
@Stiivi
Stiivi / crosstable_example.py
Created June 8, 2012 22:41
Cubes OLAP - Drill-down Cross Table
import sqlalchemy
import cubes
import cubes.tutorial.sql as tutorial
DATA = "../examples/hello_world/data.csv"
MODEL = "../examples/hello_world/model.json"
engine = sqlalchemy.create_engine('sqlite:///:memory:')
tutorial.create_table_from_csv(engine,
                               DATA,
@Stiivi
Stiivi / sqlalchemy.py
Last active May 4, 2022 14:45
Cubes SQLAlchemy imports
"""Aliases for SQL/SQLAlchemy objects that are assured to be correctly
type-checked."""
import sqlalchemy
# Engine
# ======
Engine = sqlalchemy.engine.base.Engine
Connection = sqlalchemy.engine.base.Connection
@Stiivi
Stiivi / filestosql.zsh
Created October 9, 2019 10:54
Create/update a SQLite3 database file with all files in a given path.
#!/bin/zsh
#
# Create/update a SQLite3 database file with all files in a given path.
#
# Usage:
#
# filestosql SQLITE_DATABASE [PATH]
#
DB_FILE=${1}
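
The script preview above is cut off after its first variable. As a loose illustration of the same idea in Python rather than zsh (not the gist's code), the sketch below walks a directory and upserts one row per file into SQLite. The `files` table, its columns, and the `index_files` name are assumptions made up for this example.

```python
import os
import sqlite3


def index_files(db_file, root="."):
    """Insert or update one row per file under `root`.

    Assumption: a hypothetical `files(path, size, mtime)` table; the
    original script's schema is not visible in the preview.
    """
    con = sqlite3.connect(db_file)
    con.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "path TEXT PRIMARY KEY, size INTEGER, mtime REAL)"
    )
    rows = []
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            # lstat() does not follow symlinks, so broken links do not raise
            st = os.lstat(path)
            rows.append((path, st.st_size, st.st_mtime))
    # INSERT OR REPLACE lets a second run update rows for existing paths
    con.executemany("INSERT OR REPLACE INTO files VALUES (?, ?, ?)", rows)
    con.commit()
    return con
```

Calling `index_files("files.sqlite", "/some/path")` twice should leave one row per path, because `path` is the primary key and later runs replace earlier rows.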
@Stiivi
Stiivi / apriori_toy.py
Created February 7, 2013 10:58
Data toy: Apriori algorithm in Python
from collections import Counter
from itertools import combinations
def distinct_items(transactions, support=None):
    """Returns counted set of distinct items in transactions"""
    counter = Counter()
    for trans in transactions:
        counter.update(trans)
    if support is not None:
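
The preview stops before the body of the `support` branch. As a rough sketch only (not the gist's actual code), one way to finish the idea is to keep the items whose count reaches a minimum support. Here `support` is assumed to be an absolute count rather than a fraction, and `frequent_items` is a hypothetical name.

```python
from collections import Counter


def frequent_items(transactions, support=None):
    """Count distinct items; optionally keep those seen at least `support` times.

    Assumption: `support` is an absolute count, not a fraction of transactions.
    """
    counter = Counter()
    for trans in transactions:
        counter.update(trans)
    if support is not None:
        # Drop items that occur fewer than `support` times
        return {item: n for item, n in counter.items() if n >= support}
    return dict(counter)


baskets = [["milk", "bread"], ["milk", "eggs"], ["bread", "milk", "jam"]]
frequent = frequent_items(baskets, support=2)
```

With these three baskets, only `milk` and `bread` reach a support of 2; in an Apriori pass these frequent single items would then seed the candidate pairs.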
DEF TAG polymerase
DEF SLOT complement
# Template site
DEF SLOT t_site
# Complementary site
DEF SLOT c_site
DEF TAG nucleotide
DEF TAG promoter
DEF TAG free
DEF TAG taken
@Stiivi
Stiivi / brewery_example-aggregate.py
Created April 3, 2012 20:32
Data Brewery - Aggregate a Remote CSV File
"""
Data Brewery Example
Aggregate a remote CSV file.
"""
import brewery
main = brewery.create_builder()
main.csv_source("https://raw.github.com/Stiivi/cubes/master/examples/hello_world/data.csv")
@Stiivi
Stiivi / dallas_data_brewery-answers.markdown
Last active June 23, 2018 04:15
Dallas Data Brewery meetup group answers

What tools do you use?

  • Proprietary Software, R, Python, SQL, Gephi
  • Tableau; Excel; Access
  • SPSS in the application of psych statistics and research methods
  • Tableau, SQL, SPSS, R and other statistical tools.
  • SSMS, R, SSAS
  • Python, Matplotlib, Disco...
  • Proprietary
  • R, SPSS, SAS, Relational DB
@Stiivi
Stiivi / bubbles_pipeline_join.py
Last active January 19, 2018 12:00
Demonstration of new graph-based pipeline, graph execution (see debug logs) and joins on the pipeline. Also demonstrates dynamic dispatch when the pipeline is redirected to a SQL table. Works with bubbles commit 5e108ad3a3f46580ebfe16168c58308bc914cf30 from Jul 2 2013.
import bubbles
stores = { "target": bubbles.open_store("sql", "sqlite:///") }
p = bubbles.Pipeline(stores=stores)
p.source_object("csv_source", resource="data.csv", infer_fields=True)
# Uncomment this and see the difference in logs - SQL will be used
# p.create("target", "data")
@Stiivi
Stiivi / cubes2.0-goals.md
Last active August 5, 2017 00:51
Cubes 2.0 Goals

Cubes 2.0

Hi there. After almost two years of little or no activity due to my life and career situation, I’m committing myself back to the Cubes project. It will take some time to ramp up, but we will eventually get there. I apologize for not meeting expectations lately and for letting the framework, mailing list and discussions go stale.

I got quite a lot of useful feedback and recommendations from users and people in the domain, and that revived my motivation to spend more of my spare time making Cubes a better, more modern OLAP toolkit.

Now, let’s move forward. Before any improvements or changes can be made, Cubes needs quite a lot of housekeeping, and the whole 2.0 release addresses that. Only when we have a consistent, well-defined interface, and when we have goals and, equally importantly, non-goals set, can we start growing Cubes again.

Links:

@Stiivi
Stiivi / interfaces.pyi
Created March 19, 2017 21:38
interfaces.pyi
from typing import Any, Iterator, List, Mapping, Optional
# Note: The value type `Any` should be a DB API 2 value type once defined
# TODO: See #1037
class RowProxy(Mapping[str, Any]): ...
class ResultProxy(Iterator[RowProxy]):
    def keys(self) -> List[str]: ...
    def close(self) -> None: ...
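
To make the shape of these stubs concrete, here is a minimal in-memory sketch of classes with the same interface. `DictRow` and `ListResult` are hypothetical names invented for this illustration; they are not part of Cubes or SQLAlchemy, and a real result object would wrap a database cursor instead of a list.

```python
from typing import Any, Dict, Iterator, List, Mapping, Tuple


class DictRow(Mapping[str, Any]):
    """Read-only row: column name -> value (stands in for RowProxy)."""

    def __init__(self, data: Dict[str, Any]) -> None:
        self._data = data

    def __getitem__(self, key: str) -> Any:
        return self._data[key]

    def __iter__(self) -> Iterator[str]:
        return iter(self._data)

    def __len__(self) -> int:
        return len(self._data)


class ListResult(Iterator[DictRow]):
    """Iterator over rows with keys() and close() (stands in for ResultProxy)."""

    def __init__(self, keys: List[str], rows: List[Tuple[Any, ...]]) -> None:
        self._keys = keys
        self._rows = iter(rows)

    def __next__(self) -> DictRow:
        # StopIteration from the inner iterator ends iteration naturally
        return DictRow(dict(zip(self._keys, next(self._rows))))

    def keys(self) -> List[str]:
        return list(self._keys)

    def close(self) -> None:
        # After close() the result yields no further rows
        self._rows = iter(())


result = ListResult(["id", "name"], [(1, "a"), (2, "b")])
first = next(result)
```

Code written against the stub only needs iteration, `keys()` and `close()`, so an in-memory stand-in like this can be handy in tests.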