Keiichi Kuroyanagi Keiku

@Keiku
Keiku / read_column_containing_list.py
Created June 28, 2020 06:57
Read a column containing a list as a list
# Reference: python - Reading csv containing a list in Pandas - Stack Overflow https://stackoverflow.com/questions/20799593/reading-csv-containing-a-list-in-pandas
# Copy this text.
"""
HK,"[u'5328.1', u'5329.3', '2013-12-27 13:58:57.973614']"
HK,"[u'5328.1', u'5329.3', '2013-12-27 13:58:59.237387']"
HK,"[u'5328.1', u'5329.3', '2013-12-27 13:59:00.346325']"
"""
import ast
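The preview stops at `import ast`. As a minimal sketch of where it is probably heading, the list column can be parsed with `ast.literal_eval`, the approach discussed in the referenced Stack Overflow question. This version uses the standard-library `csv` module instead of pandas to stay dependency-free, and `parse_rows` is a hypothetical helper name, not part of the gist.

```python
import ast
import csv
from io import StringIO

# The sample text from the gist, without the surrounding blank lines
text = """HK,"[u'5328.1', u'5329.3', '2013-12-27 13:58:57.973614']"
HK,"[u'5328.1', u'5329.3', '2013-12-27 13:58:59.237387']"
HK,"[u'5328.1', u'5329.3', '2013-12-27 13:59:00.346325']"
"""


def parse_rows(csv_text):
    """Return (label, list) pairs, parsing the second column as a Python literal."""
    rows = []
    for label, raw_list in csv.reader(StringIO(csv_text)):
        # literal_eval only accepts literals, so it is safer than eval()
        rows.append((label, ast.literal_eval(raw_list)))
    return rows


rows = parse_rows(text)
print(rows[0])
```

With pandas, the same idea should work by passing `ast.literal_eval` through the `converters` argument of `pd.read_csv`, for example `converters={1: ast.literal_eval}` together with `header=None`.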
@Keiku
Keiku / concat_strings_in_all_combinations.py
Created June 28, 2020 06:49
Concatenate strings in all combinations
import itertools
['%s%s' % (x, y) for x, y in itertools.product(['a', 'b'], ['1', '2'])]
# ['a1', 'a2', 'b1', 'b2']
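The same pattern extends to any number of lists. The sketch below wraps it in a hypothetical helper, `concat_combinations`, which is not part of the gist, and uses `"".join` so that the format string does not have to change with the number of inputs.

```python
import itertools


def concat_combinations(*groups):
    """Concatenate one string from each group, in every combination."""
    return ["".join(parts) for parts in itertools.product(*groups)]


pairs = concat_combinations(["a", "b"], ["1", "2"])
triples = concat_combinations(["a", "b"], ["1", "2"], ["x", "y"])
print(pairs)
print(len(triples))
```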
@Keiku
Keiku / remove_s3_files_before_specified_last_update_time.sh
Last active June 26, 2020 12:00
Remove S3 files before the specified last update time
# Remove S3 files before the specified last update time
# Reference: amazon s3 - aws cli s3 bucket remove object with date condition - Stack Overflow https://stackoverflow.com/questions/51375531/aws-cli-s3-bucket-remove-object-with-date-condition
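# Compare date and time together ($1 " " $2); comparing $1 alone would also match every object modified on the cutoff date
aws s3 ls --recursive s3://path/to/ | awk '($1 " " $2) < "2020-06-25 12:00:00" {print $4}' | xargs -n1 -t -I 'KEY' aws s3 rm s3://path/to/'KEY'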
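Because the command deletes objects, it can help to preview the selection first. The following is a rough Python sketch of the same date filter, applied to text in the `date time size key` layout that `aws s3 ls --recursive` prints. The function name `keys_older_than` and the sample listing are made up for illustration; nothing here calls AWS.

```python
from datetime import datetime


def keys_older_than(listing, cutoff):
    """Yield keys from `aws s3 ls --recursive` style lines modified before cutoff."""
    for line in listing.splitlines():
        # date, time, size, key; maxsplit=3 keeps keys that contain spaces intact
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue
        modified = datetime.strptime(parts[0] + " " + parts[1], "%Y-%m-%d %H:%M:%S")
        if modified < cutoff:
            yield parts[3]


# Made-up sample listing, for illustration only
sample = """2020-06-24 09:15:00       1024 to/old.csv
2020-06-25 11:59:59       2048 to/just before.csv
2020-06-25 12:00:01        512 to/new.csv
"""

cutoff = datetime(2020, 6, 25, 12, 0, 0)
old_keys = list(keys_older_than(sample, cutoff))
print(old_keys)
```

Unlike `{print $4}` in awk, splitting with a maximum of three splits keeps a key with spaces in one piece.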
@Keiku
Keiku / get_image_paths.py
Created June 18, 2020 03:17
Get image paths.
import pathlib
# Get a list of image paths under a directory (searched recursively)
image_dir = pathlib.Path('images').resolve()
exts = ['.jpg', '.png']
image_paths = [path for path in image_dir.rglob('*') if path.suffix.lower() in exts]
# Same search, but keep only "parent_directory/file_name" as a POSIX-style string
image_paths = [pathlib.Path(*path.parts[-2:]).as_posix() for path in image_dir.rglob('*') if path.suffix.lower() in exts]
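The two list comprehensions can be wrapped in a small function and tried on a temporary directory. This is a sketch under a few assumptions: `find_images` is a hypothetical name, the `is_file()` check and the sorting are additions that the gist does not have, and the file names are invented.

```python
import pathlib
import tempfile


def find_images(root, exts=(".jpg", ".png")):
    """Return sorted image paths under root, matching suffixes case-insensitively."""
    root = pathlib.Path(root).resolve()
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix.lower() in exts
    )


# Build a throwaway directory tree to try the function on
with tempfile.TemporaryDirectory() as tmp:
    base = pathlib.Path(tmp)
    (base / "cats").mkdir()
    for name in ("cats/a.JPG", "cats/b.png", "cats/notes.txt"):
        (base / name).touch()

    found = find_images(base)
    # Keep only "parent/filename", as in the second comprehension of the gist
    short = [pathlib.Path(*p.parts[-2:]).as_posix() for p in found]

print(short)
```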
@Keiku
Keiku / reset_seaborn_settings.py
Created June 9, 2020 03:50
Reset seaborn settings after they have been applied.
# Reset seaborn settings after they have been applied. This also works partway through a notebook.
# Reference: python seaborn to reset back to the matplotlib - Stack Overflow https://stackoverflow.com/questions/26899310/python-seaborn-to-reset-back-to-the-matplotlib
# Either of the following may be used
# in matplotlib
import matplotlib as mpl
mpl.rcParams.update(mpl.rcParamsDefault)
# in seaborn
@Keiku
Keiku / read_copytext.py
Created January 19, 2018 10:25
Read copy text to pandas DataFrame.
import pandas as pd
from io import StringIO
def read_copytext(text):
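    text1 = StringIO(text)
    df = pd.read_table(text1)
    df.columns = ["col1"]
    # Collapse each run of whitespace into one comma (regex=True is required on pandas >= 2.0)
    df["col1"] = df["col1"].str.replace(r"\s+", ",", regex=True)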
@Keiku
Keiku / split_KFold.py
Last active May 2, 2017 07:10
Split K-fold validation dataset.
import string
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold, StratifiedKFold
X_train = np.random.random((10, 2))
y_train = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])
column = "pred"
n_fold = 5
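The preview ends before the split itself. One plausible continuation, shown here only as a sketch, iterates over `StratifiedKFold.split` to get the train and validation indices for each fold. The `shuffle` and `random_state` arguments and the `folds` list are illustrative choices, not taken from the gist.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

X_train = np.random.random((10, 2))
y_train = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])
n_fold = 5

# shuffle and random_state are assumptions made for reproducibility
skf = StratifiedKFold(n_splits=n_fold, shuffle=True, random_state=0)

folds = []
for fold, (train_idx, valid_idx) in enumerate(skf.split(X_train, y_train)):
    # Each validation fold keeps the class ratio of y_train
    folds.append((train_idx, valid_idx))
    print(fold, train_idx, valid_idx)
```

With five samples per class and five folds, every validation fold holds one sample of each class. Plain `KFold` would not guarantee that.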
@Keiku
Keiku / get_wordnet_synonyms.py
Created April 28, 2017 07:04
Extract the synonyms by using wordnet.
from itertools import chain
from nltk.corpus import wordnet
synonyms = wordnet.synsets('change')
lemmas = set(chain.from_iterable([word.lemma_names() for word in synonyms]))
lemmas
# Out[31]:
# {'alter',
# 'alteration',
# 'change',
@Keiku
Keiku / stack_sparse_matrix.py
Created April 28, 2017 02:18
Stack the sparse matrices.
import numpy as np
import scipy as sp
import scipy.sparse  # load the submodule explicitly so that sp.sparse is available
import pandas as pd
df1 = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df2 = pd.DataFrame({"C": [5, 6]})
X1 = sp.sparse.csr_matrix(df1.values)
X1_dense = X1.todense()
# Out[28]:
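The preview is cut off before any stacking happens. As a sketch of the step the title refers to, `scipy.sparse.hstack` and `scipy.sparse.vstack` combine sparse matrices without converting them to dense arrays. The arrays below reuse the values of `df1` and `df2`; the variable names `X2`, `stacked_cols`, and `stacked_rows` are assumptions.

```python
import numpy as np
import scipy.sparse as sps

X1 = sps.csr_matrix(np.array([[1, 3], [2, 4]]))  # same values as df1
X2 = sps.csr_matrix(np.array([[5], [6]]))        # same values as df2

# Side by side: row counts must match, result has shape (2, 3)
stacked_cols = sps.hstack([X1, X2]).tocsr()

# One on top of the other: column counts must match, result has shape (4, 2)
stacked_rows = sps.vstack([X1, X1]).tocsr()

print(stacked_cols.toarray())
print(stacked_rows.shape)
```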
@Keiku
Keiku / list_operations.py
Created April 18, 2017 07:43
List operations.
import numpy as np
# Python
list(map(lambda x: x + 1, range(1, 6, 1)))
# Out[1]: [2, 3, 4, 5, 6]
# Numpy
list(np.array(range(1, 6, 1)) + 1)
# Out[2]: [2, 3, 4, 5, 6]
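For comparison, a list comprehension gives the same result as the `map` and `lambda` version and also handles filtering. This is an added illustration, not part of the gist.

```python
values = range(1, 6)

# Equivalent to list(map(lambda x: x + 1, values))
plus_one = [x + 1 for x in values]

# A condition replaces a separate filter() call
evens = [x for x in values if x % 2 == 0]

print(plus_one)
print(evens)
```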