H. Kemal İlter hkilter

hkilter / Mac for data science.md
Created April 23, 2024 15:24 — forked from hgavert/Mac for data science.md
Setting up OS X for Data Science


I had to reinstall my laptop just as a new member was joining the team, so I started writing this as a tutorial, or checklist, on how to set up a new MacBook Pro running OS X for typical data science development. It is geared towards Scala-based development and Spark, as that's what we do at the moment. However, I'll start more generally and add some other things too. Let's start with the basics...

OS X

OS X is great for data science, but out of the box it lacks some configuration and apps that you'll need. Let's get started.

We need a good package manager, a text editor, GitHub source control, code editors and so on. But first we'll look at the command line: Terminal.

Terminal

Open up Terminal. If you don't know where to find it, open Spotlight search and type Terminal into it. Now right-click its icon in the Dock and select Options > Keep in Dock. This way, it's always there when you need it. And you'll need it.

hkilter / veri.csv
Last active October 25, 2020 19:09
iso_code,continent,location,date,total_cases,new_cases,new_cases_smoothed,total_deaths,new_deaths,new_deaths_smoothed,total_cases_per_million,new_cases_per_million,new_cases_smoothed_per_million,total_deaths_per_million,new_deaths_per_million,new_deaths_smoothed_per_million,total_tests,new_tests,total_tests_per_thousand,new_tests_per_thousand,new_tests_smoothed,new_tests_smoothed_per_thousand,tests_per_case,positive_rate,tests_units,stringency_index,population,population_density,median_age,aged_65_older,aged_70_older,gdp_per_capita,extreme_poverty,cardiovasc_death_rate,diabetes_prevalence,female_smokers,male_smokers,handwashing_facilities,hospital_beds_per_thousand,life_expectancy,human_development_index
OWID_WRL,,World,2019-12-31,,,,,0.0,,,,,,0.0,,,,,,,,,,,,7794798729.0,58.045,30.9,8.696,5.355,15469.207,10.0,233.07,8.51,6.434,34.635,60.13,2.705,72.58,
OWID_WRL,,World,2020-01-01,,,,,0.0,,,,,,0.0,,,,,,,,,,,,7794798729.0,58.045,30.9,8.696,5.355,15469.207,10.0,233.07,8.51,6.434,34.635,60.13,2.705,72.58,
OWID_WRL
hkilter / gist:b43bce0b5968dac42c9243c15124ccea
Created August 22, 2019 01:55
An A-Z Index of the Apple macOS command line (OS X bash)
# https://ss64.com/osx/
# 21.08.2019
An A-Z Index of the Apple macOS command line (OS X bash)
afconvert Audio File Convert
afinfo Audio File Info
afplay Audio File Play
airport Manage Apple AirPort
alias Create an alias •
hkilter / README-Template.md
Created December 12, 2018 06:37 — forked from PurpleBooth/README-Template.md
A template to make good README.md

Project Title

One Paragraph of project description goes here

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites

hkilter / datatool-01.tex
Created November 17, 2017 17:19
How to put content from multiple databases in one table using datatool?
%% https://tex.stackexchange.com/questions/117747/how-to-put-content-from-multiple-databases-in-one-table-using-datatool
\documentclass{article}
\usepackage{datatool}
% generate first test database
\begin{filecontents}{first.csv}
Year,Number
2001,10
hkilter / gist:b0bf77cc31c1cbd47f141a218e9ce3f8
Created October 6, 2017 12:53
Simple Google Apps Script to export a single sheet to PDF and email it to a contact list
// Simple function to send Weekly Status Sheets to contacts listed on the "Contacts" sheet in the MPD.
// Load a menu item called "Project Admin" with a submenu item called "Send Status"
// Running this, sends the currently open sheet, as a PDF attachment
function onOpen() {
var submenu = [{name:"Send Status", functionName:"exportSomeSheets"}];
SpreadsheetApp.getActiveSpreadsheet().addMenu('Project Admin', submenu);
}
function exportSomeSheets() {
hkilter / citation-analysis-sketch.R
Created September 14, 2016 15:05 — forked from benmarwick/citation-analysis-sketch.R
sketch of citation analysis
# sources:
# http://www.jgoodwin.net/?p=1223
# http://orgtheory.wordpress.com/2012/05/16/the-fragile-network-of-econ-soc-readings/
# http://nealcaren.web.unc.edu/a-sociology-citation-network/
# http://kieranhealy.org/blog/archives/2014/11/15/top-ten-by-decade/
# http://www.jgoodwin.net/lit-cites.png
###########################################################################
# This first section scrapes content from the Web of Science webpage. It takes

AsciiDoc Writer’s Guide

hkilter / falsepositivesversion2.R
Created September 27, 2015 11:02 — forked from vasishth/falsepositivesversion2.R
False positives in a lifetime [revised 23 Nov 2014; comments and corrections welcome]
## Our simulated scientist will declare
## significance only if he/she gets
## 2 replications with p<0.05:
stringent <- TRUE
## Set the above to FALSE if you want to
## have the scientist publish a single
## expt. as soon as it's significant:
#stringent <- FALSE
## num of scientists to simulate:
hkilter / weasel.py
Last active August 29, 2015 14:23
Weasel program by R. Dawkins
# code - http://rosettacode.org/wiki/Evolutionary_algorithm
from string import ascii_letters as letters  # Python 3: string.letters no longer exists
from random import choice, random
target = list("METHINKS IT IS LIKE A WEASEL")
charset = letters + ' '
parent = [choice(charset) for _ in range(len(target))]
minmutaterate = .09
C = range(100)