Fabian Schreiber fabsta

@fabsta
fabsta / pandas_columns_to_dict.py
Created January 27, 2020 14:28
[pandas columns to dict] converts two columns to a dictionary #python #pandas
# lakes.count is the DataFrame.count method, so the column needs bracket access
area_dict = dict(zip(lakes['area'], lakes['count']))
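A minimal sketch of the same idea on a small frame. The contents of `lakes` are hypothetical, since the gist does not show the data, and the `set_index(...).to_dict()` variant is offered only as one possible alternative.

```python
import pandas as pd

# Hypothetical data; the original gist does not show the contents of `lakes`.
lakes = pd.DataFrame({"area": ["north", "south", "east"], "count": [12, 7, 3]})

# Attribute access (lakes.count) would return the bound DataFrame.count method,
# so the column is selected with brackets instead.
area_dict = dict(zip(lakes["area"], lakes["count"]))

# Alternative: index by one column, then convert the other column to a dict.
area_dict_alt = lakes.set_index("area")["count"].to_dict()

print(area_dict)
```

If the key column contains duplicates, later rows overwrite earlier ones in both variants.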
@fabsta
fabsta / pandas_read_csv.py
Created January 27, 2020 14:25
[pandas read csv] read in a csv file #python
import pandas as pd

# often works
df = pd.read_csv('file.csv')
# explicit header row, index column, quote character, separator and missing-value markers
df = pd.read_csv('file.csv', header=0, index_col=0, quotechar='"', sep=':', na_values=['na', '-', '.', ''])
# specifying "." and "NA" as missing values in the Last Name column and "." as missing values in Pre-Test Score column
sentinels = {'Last Name': ['.', 'NA'], 'Pre-Test Score': ['.']}
df = pd.read_csv('../data/example.csv', na_values=sentinels)
# skipping the top 3 rows
df = pd.read_csv('../data/example.csv', na_values=sentinels, skiprows=3)
# interpreting "," in strings around numbers as thousands separators
df = pd.read_csv('../data/example.csv', thousands=',')
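To see what the per-column `na_values` and `thousands` options do without needing a file on disk, here is a small sketch that reads from an in-memory string. The CSV content and column values are made up for illustration and stand in for `../data/example.csv`.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for '../data/example.csv'.
csv_text = (
    "First Name,Last Name,Pre-Test Score,Income\n"
    'Ann,.,4,"1,200"\n'
    'Bob,NA,.,"35,000"\n'
    'Cy,Lee,7,"900"\n'
)

# Per-column missing-value markers, as in the snippet above.
sentinels = {"Last Name": [".", "NA"], "Pre-Test Score": ["."]}

df = pd.read_csv(io.StringIO(csv_text), na_values=sentinels, thousands=",")

print(df)
print(df.isna().sum())
```

The quoted `"1,200"` style values are parsed as integers because of `thousands=","`, while the `.` markers become `NaN` only in the columns listed in `sentinels`.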
@fabsta
fabsta / python_for_list.py
Last active January 27, 2020 14:28
[python iterate list] Iterate through a python list #python #pandas
numbers = [1, 3, 5, 7, 9]  # not named `list`, which would shadow the built-in
# Using a for loop
for i in numbers:
    print(i)
# 1
# 3
# Iterating by index
for i in range(len(numbers)):
    print(numbers[i])
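When both the position and the value are needed, `enumerate` is usually the more idiomatic choice than indexing through `range(len(...))`. A short sketch, with `numbers` and `pairs` as illustrative names:

```python
numbers = [1, 3, 5, 7, 9]

# enumerate yields (index, value) pairs, so no manual indexing is needed.
pairs = []
for index, value in enumerate(numbers):
    pairs.append((index, value))
    print(index, value)
```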
@fabsta
fabsta / reading_data.md
Last active December 5, 2017 17:40
Reading data #deeplearning #InputOutput

MNIST

import numpy as np
from keras.datasets import mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()
(X_train.shape, y_train.shape, X_test.shape, y_test.shape)
>> ((60000, 28, 28), (60000,), (10000, 28, 28), (10000,))

# add a channel axis at position 1: (N, 28, 28) -> (N, 1, 28, 28)
X_test = np.expand_dims(X_test, 1)
X_train = np.expand_dims(X_train, 1)
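The effect of `np.expand_dims` on the shapes can be checked without downloading the dataset. The sketch below uses a zero-filled stand-in array with the same shape as the MNIST test images. The variable names `channels_first` and `channels_last` are illustrative.

```python
import numpy as np

# Zero-filled stand-in with the same shape as the MNIST test images,
# so the example runs without downloading the dataset.
X_test = np.zeros((10000, 28, 28), dtype=np.uint8)

# Insert a length-1 channel axis at position 1 ("channels first" layout).
channels_first = np.expand_dims(X_test, 1)

# Insert it at the end instead ("channels last" layout).
channels_last = np.expand_dims(X_test, -1)

print(channels_first.shape)  # (10000, 1, 28, 28)
print(channels_last.shape)   # (10000, 28, 28, 1)
```

Which axis position is right depends on the dimension ordering the model expects.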
@fabsta
fabsta / adding_data_augmentation.md
Last active November 26, 2017 17:23
[Reduce Overfitting] #deeplearning

About data augmentation

Keras comes with very convenient features for automating data augmentation. You simply define what types and maximum amounts of augmentation you want, and Keras ensures that every item of every batch is randomly changed according to these settings. Here's how to define a generator that includes data augmentation:

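dim_ordering='tf' uses TensorFlow dimension ordering, which is the same order that matplotlib uses for display, so it is the more convenient choice when the images are only being displayed. (In Keras 2 and later this argument is named data_format, and the equivalent value is 'channels_last'.)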

gen = image.ImageDataGenerator(rotation_range=10, width_shift_range=0.1, 
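To illustrate what "every item of every batch is randomly changed" means, here is a simplified NumPy sketch of a random width shift. It is not the Keras implementation: the function name `random_width_shift` is hypothetical, and pixels pushed past the edge wrap around here rather than being filled in.

```python
import numpy as np

def random_width_shift(batch, max_fraction, rng):
    """Shift each image in `batch` (N, H, W) left or right by a random amount.

    Simplified stand-in, not the Keras implementation: pixels that move past
    the edge wrap around to the other side.
    """
    width = batch.shape[2]
    max_shift = int(round(max_fraction * width))
    out = np.empty_like(batch)
    for i, img in enumerate(batch):
        # A fresh offset per image, so every item is altered independently.
        shift = int(rng.integers(-max_shift, max_shift + 1))
        out[i] = np.roll(img, shift, axis=1)
    return out

rng = np.random.default_rng(0)
batch = rng.random((4, 28, 28))
augmented = random_width_shift(batch, max_fraction=0.1, rng=rng)
print(augmented.shape)  # (4, 28, 28)
```

The `max_fraction=0.1` argument mirrors the role of `width_shift_range=0.1`: it caps the shift at a fraction of the image width.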
@fabsta
fabsta / examples.md
Last active November 25, 2017 13:36
Visualization #datascience

Observing Model Predictions

source: https://www.cs.utah.edu/~cmertin/dogs+cats+redux.html

First, we need to calculate the predictions on the validation set, since we know those labels, rather than looking at the test set.

vgg.model.load_weights(latest_weights_filename)

@fabsta
fabsta / clipping_predictions.md
Last active November 30, 2017 13:21
[Kaggle tips] useful kaggle tips collected along the way #deeplearning

Input

array([[  1.9247e-01,   7.2496e-04,   3.7586e-05,   2.4820e-05,   8.0483e-01,   1.4839e-03,
          3.4440e-06,   4.3349e-04],
       [  7.4949e-02,   2.5567e-04,   9.0141e-05,   2.7097e-04,   3.8967e-01,   8.0172e-04,
          4.2277e-04,   5.3354e-01],
       [  7.3892e-02,   8.5835e-04,   4.3923e-05,   8.5646e-04,   4.6396e-01,   4.9485e-05,
          1.5451e-03,   4.5879e-01],
       [  8.8657e-01,   2.1959e-03,   9.6101e-05,   3.6997e-04,   6.2324e-02,   1.6894e-05,
          3.1924e-05,   4.8398e-02]], dtype=float32)
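The gist's filename suggests the tip is about clipping predicted probabilities, which is commonly done when a competition is scored with log loss, because that metric penalises confident wrong predictions heavily. A minimal sketch, assuming clip bounds of 0.05 and 0.95 (values chosen here for illustration, not taken from the original) and using the first two rows of the array above:

```python
import numpy as np

# First two rows of the array above, as printed.
preds = np.array([
    [1.9247e-01, 7.2496e-04, 3.7586e-05, 2.4820e-05, 8.0483e-01, 1.4839e-03, 3.4440e-06, 4.3349e-04],
    [7.4949e-02, 2.5567e-04, 9.0141e-05, 2.7097e-04, 3.8967e-01, 8.0172e-04, 4.2277e-04, 5.3354e-01],
], dtype=np.float32)

# Assumed bounds; in practice they would be tuned on a validation set.
lower, upper = 0.05, 0.95
clipped = np.clip(preds, lower, upper)

# Optionally rescale so each row sums to 1 again.
renormalised = clipped / clipped.sum(axis=1, keepdims=True)

print(clipped.min(), clipped.max())
print(renormalised.sum(axis=1))
```

Whether renormalising is needed depends on how the competition scores submissions.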
@fabsta
fabsta / Filelink.md
Last active December 15, 2017 08:40
Jupyter useful stuff #jupyter