Eddie Pantridge erp12

@erp12
erp12 / upush.clj
Last active August 30, 2023 01:53
Untyped Push Interpreter Prototype
(ns upush
  (:require [clojure.math :refer [log]]
            [clojure.string :as str]
            [clojure.math.combinatorics :refer [selections]]))

(def instructions
  {'+ {:fn +
       :arity 2
       :invariant (fn [a b]
@erp12
erp12 / cbgp_tiny_compile.py
Last active October 13, 2021 03:40
Tiny Code Building GP Program Compilation
import ast
import operator as op
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Callable, Dict, Any, Optional, Tuple
import astor
from pyrsistent import pvector, PVector
@erp12
erp12 / README.md
Last active September 24, 2021 23:33
Spark Serialization ClassCastException

This gist documents an issue I have had when performing Spark interop from Clojure. When higher order functions are used, a serialization error is thrown that I can't make sense of.

  • not_working.clj has the minimal Clojure to reproduce the issue.
  • working.scala has a direct translation of the Clojure code into Scala. It does not throw the exception.
  • logs_and_exception.log has the Spark logs and exception trace that are produced when running not_working.clj.

Below is additional information about when the exception does and does not occur.

  • The exception is not raised (and -main behaves correctly) when:

    • not_working.clj is compiled into an uberjar.
@erp12
erp12 / build.clj
Created August 5, 2021 20:06
tools.build with multiple basis to control classes that end up in uberjar
(ns build
  (:require [clojure.tools.build.api :as b]))

(def lib 'com.nortia-solutions/ppi-core)
(def version "0.0.1")
(def class-dir "target/classes")
(def uber-file (format "target/%s-%s-standalone.jar" (name lib) version))
@erp12
erp12 / README.md
Created October 4, 2020 17:10
Clojure fixture for clearing Spark managed tables

Background on Spark Testing

Spark has two stateful locations that are used to manage tables when running locally.

  1. The spark_warehouse is a directory specified by the session setting spark.sql.warehouse.dir. This is the location where every table's data files are stored.

  2. The metastore is an in-memory database that stores metadata about each table (its location, etc.)
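The first of these two locations can be reset with ordinary file operations. The sketch below is an illustration in Python rather than the gist's own Clojure fixture: `clean_warehouse` is a hypothetical helper, and a temporary directory stands in for the path configured by spark.sql.warehouse.dir. It does not touch the metastore, which would have to be cleared separately through the Spark session.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def clean_warehouse(warehouse_dir):
    """Empty warehouse_dir before the test body and remove it afterwards.

    Hypothetical helper: it only handles table data files on disk,
    not the metadata held by the metastore.
    """
    path = Path(warehouse_dir)
    shutil.rmtree(path, ignore_errors=True)  # drop files left by earlier runs
    path.mkdir(parents=True, exist_ok=True)
    try:
        yield path
    finally:
        shutil.rmtree(path, ignore_errors=True)  # leave nothing for the next test


# Assumed setup: a temp directory plays the role of the warehouse location.
_root = Path(tempfile.mkdtemp())
demo_dir = _root / "spark_warehouse"

with clean_warehouse(demo_dir) as wh:
    (wh / "my_table").mkdir()  # pretend a test wrote table data here
    existed_during = (wh / "my_table").exists()

exists_after = demo_dir.exists()
shutil.rmtree(_root, ignore_errors=True)
```

Wrapping each test in a helper like this keeps data files from one test from leaking into the next, which is the part of the problem that lives on disk.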

Ideal Testing Setup

@erp12
erp12 / README.md
Created April 6, 2018 23:48
Occupancy Detection - COMPSCI590V HW4 - Edward Pantridge

The Occupancy Detection Dataset

Experimental data used for binary classification (room occupancy) from Temperature, Humidity, Light and CO2. Ground-truth occupancy was obtained from time-stamped pictures that were taken every minute.

The dataset contains 20,560 records.

Source: https://archive.ics.uci.edu/ml/datasets/Occupancy+Detection+

Explanation of dataset

The dataset is a GeoJSON file that specifies the shape of every county in the United States. I got shapefiles from the census website and converted them into GeoJSON using the QGIS software.

The color attribute is mapped to the length of the name of the county. The buttons at the top allow for zooming. The sliders along the left and bottom of the map allow for panning.
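One way such a mapping from name length to color could work is a linear scale between two end colors. This is a minimal sketch under assumptions: the white-to-blue color ramp, the function names, and the sample county names are illustrative choices, not taken from the original visualization.

```python
def lerp_color(start, end, t):
    """Linearly interpolate between two RGB tuples; t is clamped to [0, 1]."""
    t = max(0.0, min(1.0, t))
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))


def name_length_colors(names, start=(255, 255, 255), end=(0, 0, 255)):
    """Map each name to a hex color based on the length of the name.

    The shortest name gets `start`, the longest gets `end` (assumed ramp).
    """
    lengths = [len(n) for n in names]
    lo, hi = min(lengths), max(lengths)
    span = (hi - lo) or 1  # avoid division by zero when all lengths match
    colors = {}
    for name, length in zip(names, lengths):
        r, g, b = lerp_color(start, end, (length - lo) / span)
        colors[name] = f"#{r:02x}{g:02x}{b:02x}"
    return colors


# Illustrative county names only.
colors = name_length_colors(["Lee", "Hampshire", "Prince of Wales-Hyder"])
```

With this ramp, short names render close to white and long names close to saturated blue, so name length can be read directly off the map's shading.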

@erp12
erp12 / README.md
Last active February 19, 2018 04:51
COMPSCI590V HW2

1. Familiarize yourself with scatterplots:

  • Summary of what they are and how they are created.

    • Scatter plots are a type of visualization where each record in the dataset is shown by a shape placed at a certain pos
  • How they are used.

    • The primary use of scatter plots is typically to compare two continuous features. This is done by using one continuous feature value as the x coordinate of each point and the other continuous feature value as the y coordinate.
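
The mapping from feature values to point positions can be sketched without any plotting library. In the example below, the function names, the 400 by 300 pixel canvas, and the sample records are all assumptions made for illustration.

```python
def scale(value, domain, pixel_range):
    """Linearly map value from domain (min, max) to pixel_range (start, end)."""
    d0, d1 = domain
    r0, r1 = pixel_range
    if d1 == d0:
        return (r0 + r1) / 2  # degenerate domain: center the point
    return r0 + (value - d0) / (d1 - d0) * (r1 - r0)


def scatter_points(records, x_key, y_key, width=400, height=300):
    """Turn each record into an (x, y) pixel position for a scatter plot."""
    xs = [r[x_key] for r in records]
    ys = [r[y_key] for r in records]
    x_dom = (min(xs), max(xs))
    y_dom = (min(ys), max(ys))
    # Screen y grows downward, so the y pixel range is flipped.
    return [
        (scale(x, x_dom, (0, width)), scale(y, y_dom, (height, 0)))
        for x, y in zip(xs, ys)
    ]


# Made-up records with two continuous features.
records = [
    {"temperature": 20.0, "humidity": 30.0},
    {"temperature": 22.0, "humidity": 25.0},
    {"temperature": 24.0, "humidity": 35.0},
]
points = scatter_points(records, "temperature", "humidity")
```

Each record becomes one point: the first feature fixes its horizontal position and the second its vertical position, so any relationship between the two features shows up as a pattern in the point cloud.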