@hgavert
hgavert / Bloomfilter.scala
Last active January 2, 2016 22:49
Scalding and Monoid introduction talk at Helsinki Data Science meetup on 2014-01-09. Added the BloomFilter and HyperLogLog examples later on.
import com.twitter.algebird._
import com.twitter.algebird.Operators._
// generate 2 lists
val A = (1 to 300).toList
val B = (201 to 400).toList
// Generate a Bloomfilter
val NUM_HASHES = 6
val WIDTH = 6000 // bits
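
The preview above stops before the filter is built. As a rough illustration of what the two parameters control, here is a minimal, self-contained Bloom filter sketch. It does not use Algebird; `SimpleBloom`, `add` and `mightContain` are hypothetical names chosen for this example, and seeding the standard library's `MurmurHash3` with the hash index is an assumption, not the scheme Algebird uses.

```scala
import scala.collection.mutable
import scala.util.hashing.MurmurHash3

// Hypothetical stand-in for a Bloom filter; not the Algebird implementation.
final class SimpleBloom(numHashes: Int, width: Int) {
  // One bit per slot; `width` is the filter size in bits.
  private val bits = new mutable.BitSet(width)

  // Derive `numHashes` bit positions by hashing the item with different seeds.
  private def positions(item: String): Seq[Int] =
    (0 until numHashes).map { seed =>
      val h = MurmurHash3.stringHash(item, seed)
      ((h % width) + width) % width // keep the index non-negative
    }

  // Set every bit position derived from the item.
  def add(item: String): Unit = positions(item).foreach(bits += _)

  // True if all positions are set: "possibly present". False means "definitely absent".
  def mightContain(item: String): Boolean = positions(item).forall(bits.contains)
}
```

Adding every element of `A` and then querying with the elements of `B` would show the usual trade-off: members of `A` are always reported as present, while non-members may occasionally be reported as present too, at a rate that depends on `NUM_HASHES`, `WIDTH` and the number of inserted items.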
hgavert / Mac for data science.md
Created May 28, 2016 09:39
Setting up OS X for Data Science


I had to reinstall my laptop, and at the same time a new team member was joining the team. So I started to write this as a tutorial, or checklist, on how to set up a new MacBook Pro running OS X for typical data science development. It is geared towards Scala-based development and Spark, as that's what we do at the moment. However, I'll start slightly more generally and will add some other things too. Let's start from the basics...

OS X

OS X is great for data science. However, out of the box it is missing some configuration and apps that you will need. Let's get started.

We need a good package manager, a text editor, GitHub source control, code editors and so on. But first we will look at the command line: Terminal.

Terminal

Open up Terminal. If you don't know where to find it, open Spotlight search and type Terminal into it. Now, right-click on its icon in the Dock and select Options > Keep in Dock. This way, it's always there when you need it. And you'll need it.

How to map GPS locations to Grid Database from Statistics Finland

Keywords: Tilastokeskus, Ruututietokanta, GPS, WGS84, EUREF-FIN, ETRS89-TM35FIN

The grid of the DB

The Grid database contains 250 m x 250 m squares. Each square is defined by the x- and y-coordinates of its lower left corner. In the database, these coordinates are whole meters without decimals.
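
As a small illustration of the corner rule, the sketch below snaps a projected point down to the lower left corner of its square. It assumes the point is already expressed in meters in the grid's projected coordinate system (ETRS89-TM35FIN, per the keywords) and that square corners fall on multiples of 250 m; the second assumption should be checked against the actual database. `GridSquare`, `snap` and `lowerLeft` are hypothetical names used only here.

```scala
object GridSquare {
  // Side length of one grid square in meters, as described in the text.
  val CellSize = 250L

  // Round one projected coordinate (meters) down to a multiple of CellSize.
  // Assumption: square corners are aligned to multiples of 250 m.
  def snap(coord: Double): Long =
    math.floor(coord / CellSize).toLong * CellSize

  // Lower left corner (x, y) of the square that contains the projected point.
  def lowerLeft(x: Double, y: Double): (Long, Long) = (snap(x), snap(y))
}
```

For example, `GridSquare.lowerLeft(385789.4, 6672345.9)` gives `(385750, 6672250)`, which could then be used as the lookup key for the square.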

Matching GPS point to Grid DB

This example uses Scala and the Sanoma-CDA / maxmind-geoip2-scala library. To use that library, you should download it and publish it either locally or to your Artifactory.

hgavert / RuuviTag_scanning.js
Created March 26, 2021 11:27
Node-RED RuuviTag scanning sub-flow
[{"id":"6480da.d93d7f28","type":"subflow","name":"RuuviTag scanning","info":"","category":"","in":[],"out":[{"x":1480,"y":180,"wires":[{"id":"c9ba8525.266648","port":0}]}],"env":[],"color":"#DDAA99"},{"id":"9a6ad327.1492f8","type":"function","z":"6480da.d93d7f28","name":"Stop scanning","func":"msg.payload.scan=false;\nreturn msg;","outputs":1,"noerr":0,"x":400,"y":100,"wires":[["ddf1ec6f.5dcf"]]},{"id":"43844f9d.dc98c","type":"inject","z":"6480da.d93d7f28","name":"Start scanning (interval)","topic":"","payload":"{\"scan\": true}","payloadType":"json","repeat":"60","crontab":"","once":true,"onceDelay":"1.0","x":150,"y":200,"wires":[["9a6ad327.1492f8","ca0944a3.c6315","b3655771.475a28"]]},{"id":"ddf1ec6f.5dcf","type":"delay","z":"6480da.d93d7f28","name":"","pauseType":"delay","timeout":"15","timeoutUnits":"seconds","rate":"1","nbRateUnits":"1","rateUnits":"second","randomFirst":"1","randomLast":"5","randomUnits":"seconds","drop":false,"x":580,"y":100,"wires":[["ca0944a3.c6315","b3655771.475a28"]]},{"id":"ca0944