Nhu Hoang geniusnhu

@geniusnhu
geniusnhu / chunk_train.py
Created September 1, 2021 03:45
Loading and Training data in chunk
>>> from sklearn.linear_model import SGDRegressor
>>> from sklearn.datasets import make_regression
>>> import numpy as np
>>> import pandas as pd
>>> ### Load original data
>>> original_data = pd.read_csv('sample.csv')
>>> print(f'Shape of original data {original_data.shape}')
Shape of original data (100000, 21)
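The preview above stops after loading the whole file. A minimal sketch of how chunked loading and incremental training could fit together, assuming the CSV holds numeric features plus one target column. The column name `target`, the helper `train_in_chunks`, and the synthetic stand-in for `sample.csv` are illustrative assumptions, not part of the original gist.

```python
import os
import tempfile

import pandas as pd
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor


def train_in_chunks(csv_path, target_col, chunksize=1000):
    """Fit an SGDRegressor incrementally, one CSV chunk at a time (hypothetical helper)."""
    model = SGDRegressor(random_state=0)
    rows_seen = 0
    # With chunksize set, read_csv yields DataFrames lazily instead of
    # loading the whole file into memory at once.
    for chunk in pd.read_csv(csv_path, chunksize=chunksize):
        X = chunk.drop(columns=[target_col]).to_numpy()
        y = chunk[target_col].to_numpy()
        model.partial_fit(X, y)  # update the weights using this chunk only
        rows_seen += len(chunk)
    return model, rows_seen


# Synthetic stand-in for sample.csv: 20 features + 1 target column (assumed layout).
X, y = make_regression(n_samples=5000, n_features=20, noise=0.1, random_state=0)
df = pd.DataFrame(X, columns=[f"f{i}" for i in range(20)])
df["target"] = y

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.csv")
    df.to_csv(path, index=False)
    model, rows_seen = train_in_chunks(path, target_col="target", chunksize=1000)

print(f"Rows seen: {rows_seen}, coefficients: {model.coef_.shape}")
```

Only one chunk is held in memory at a time, so peak memory depends on `chunksize` rather than on the file size. This works only with estimators that expose `partial_fit`.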
geniusnhu / string_concat.py
Created September 1, 2021 03:42
Concatenate strings
### Concatenate string using '+' operation
def add_string_with_plus(iters):
    s = ""
    for i in range(iters):
        s += "abc"
    assert len(s) == 3*iters
### Concatenate strings using join() function
def add_string_with_join(iters):
    l = []
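The preview cuts off inside `add_string_with_join`. A possible complete version of both functions, with a small timing harness, is sketched below. The body of the `join()` variant and the `timeit` comparison are assumptions about where the gist was heading, and the functions return the string instead of asserting on its length so the results can be compared.

```python
import timeit


def add_string_with_plus(iters):
    """Build the string by repeated += concatenation."""
    s = ""
    for _ in range(iters):
        s += "abc"
    return s


def add_string_with_join(iters):
    """Collect the pieces in a list and join them once at the end."""
    parts = []
    for _ in range(iters):
        parts.append("abc")
    return "".join(parts)


iters = 10_000
# Both approaches must produce the same string.
assert add_string_with_plus(iters) == add_string_with_join(iters)

t_plus = timeit.timeit(lambda: add_string_with_plus(iters), number=20)
t_join = timeit.timeit(lambda: add_string_with_join(iters), number=20)
print(f"'+': {t_plus:.4f}s  join(): {t_join:.4f}s")
```

The timings depend on the interpreter and the input size, so it is worth measuring on the target workload rather than assuming one variant always wins.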
geniusnhu / slots.py
Created September 1, 2021 03:41
Comparing class with and without __slots__
import numpy as np
import pandas as pd
import objgraph
### class without __slots__
class PointWithDict():
    def __init__(self, iters):
        self.iters = iters
    def convert(self):
        s = ["xyz"]*self.iters
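The preview shows only the class without `__slots__`. A minimal sketch of the comparison follows, where `PointWithSlots` is a hypothetical counterpart that is not visible in the truncated gist. It checks the structural difference (no per-instance `__dict__`) rather than measuring memory with `objgraph`.

```python
class PointWithDict:
    """Regular class: each instance carries its own __dict__."""

    def __init__(self, iters):
        self.iters = iters


class PointWithSlots:
    """Hypothetical counterpart: attributes are fixed by __slots__."""

    __slots__ = ("iters",)

    def __init__(self, iters):
        self.iters = iters


with_dict = PointWithDict(10)
with_slots = PointWithSlots(10)

print(hasattr(with_dict, "__dict__"))   # True
print(hasattr(with_slots, "__dict__"))  # False

# A regular instance accepts new attributes; a slotted one rejects them.
with_dict.extra = 1
try:
    with_slots.extra = 1
    slots_rejected_new_attr = False
except AttributeError:
    slots_rejected_new_attr = True

print(slots_rejected_new_attr)  # True
```

Dropping the per-instance dictionary is where the memory saving comes from, at the cost of not being able to add attributes dynamically.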
import numpy as np
import itertools
import sys
def append_matrix_with_itertools(X, Y):
    """ Loop matrix using itertools.product()
    """
    MTX = np.zeros((X, Y))
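The loop body is not visible in the preview. As a rough sketch, `itertools.product()` can replace a pair of nested `for` loops over the matrix indices. The function name `fill_matrix_with_product` and the value written to each cell (`i + j`) are placeholders chosen for illustration, not the gist's actual computation.

```python
import itertools

import numpy as np


def fill_matrix_with_product(n_rows, n_cols):
    """Visit every (row, column) index pair with a single flat loop."""
    mtx = np.zeros((n_rows, n_cols))
    # product(range(a), range(b)) yields (0, 0), (0, 1), ..., (a-1, b-1),
    # the same order as two nested for loops.
    for i, j in itertools.product(range(n_rows), range(n_cols)):
        mtx[i, j] = i + j  # placeholder computation
    return mtx


def fill_matrix_nested(n_rows, n_cols):
    """Equivalent nested-loop version, for comparison."""
    mtx = np.zeros((n_rows, n_cols))
    for i in range(n_rows):
        for j in range(n_cols):
            mtx[i, j] = i + j
    return mtx


print(np.array_equal(fill_matrix_with_product(3, 4), fill_matrix_nested(3, 4)))
```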
geniusnhu / tracemalloc_example.py
Last active August 29, 2021 04:28
Example of using tracemalloc in Python
>>> import tracemalloc
>>> import numpy as np
>>> def create_array(x, y):
...     x = x**2
...     y = y**3
...     return np.ones((x, y, 1024, 3), dtype=np.uint8)
...
>>> tracemalloc.start()
>>> ### Run application
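The preview ends before the measurement step. A minimal, standard-library-only sketch of a full `tracemalloc` session is shown below. The function `build_blocks` and its sizes are illustrative stand-ins for the gist's `create_array`, kept small so the example runs quickly.

```python
import tracemalloc


def build_blocks(n):
    """Allocate n separate 1 KiB byte strings (illustrative workload)."""
    return [bytes(1024) for _ in range(n)]


tracemalloc.start()

# Run the code to be measured while tracing is active.
data = build_blocks(1000)

# Current and peak traced memory, in bytes, since start().
current, peak = tracemalloc.get_traced_memory()

# A snapshot records which source lines allocated the memory.
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

print(f"Current: {current / 1024:.1f} KiB, peak: {peak / 1024:.1f} KiB")
for stat in snapshot.statistics("lineno")[:3]:
    print(stat)
```

`take_snapshot()` has to be called before `stop()`, since stopping clears the collected traces.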
geniusnhu / check_memory.py
Last active August 30, 2021 09:33
Check memory usage of an object in Python
>>> import numpy as np
>>> import sys
>>> import objgraph
>>> import psutil
>>> import pandas as pd
>>> ob = np.ones((1024, 1024, 1024, 3), dtype=np.uint8)
### Check object 'ob' size
geniusnhu / list_generator.py
Created August 29, 2021 02:22
List vs Generator
>>> import sys
>>> my_generator_list = (i for i in range(100000))
>>> print(f"My generator is {sys.getsizeof(my_generator_list)} bytes")
My generator is 128 bytes
>>> %timeit my_generator_list  # IPython magic; this times evaluating the name only
10000000 loops, best of 5: 32 ns per loop
>>> my_list = [i for i in range(100000)]
>>> print(f"My list is {sys.getsizeof(my_list)} bytes")
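A short script version of the same comparison, as a sketch that runs outside IPython. It also shows a trade-off the size numbers alone do not reveal: a generator can be consumed only once. Variable names here are illustrative.

```python
import sys

n = 100_000

my_list = [i for i in range(n)]       # all values stored in memory
my_generator = (i for i in range(n))  # values produced on demand

list_size = sys.getsizeof(my_list)
generator_size = sys.getsizeof(my_generator)
print(f"List: {list_size} bytes, generator: {generator_size} bytes")

# The list can be iterated repeatedly; the generator is exhausted after one pass.
first_pass = sum(my_generator)
second_pass = sum(my_generator)
print(first_pass == sum(my_list), second_pass)
```

`sys.getsizeof` reports only the container itself, not the integer objects a list refers to, so the real gap in memory use is larger than the printed figures suggest.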
geniusnhu / data_optimize.py
Created August 28, 2021 12:05
Optimize dtype function in Python
def data_optimize(df, object_option=False):
    """Reduce the size of the input dataframe
    Parameters
    ----------
    df: pd.DataFrame
        input DataFrame
    object_option : bool, default=False
        if true, try to convert object to category
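The function body is cut off in the preview. One way such a dtype optimizer could work is sketched below using `pd.to_numeric(..., downcast=...)`. The name `downcast_dataframe` and the exact rules are assumptions, not the gist's implementation. Treating string-typed columns the same as object columns is also an assumption, made so the sketch behaves the same across pandas versions.

```python
import numpy as np
import pandas as pd
from pandas.api import types as ptypes


def downcast_dataframe(df, object_option=False):
    """Return a copy of df with smaller dtypes where possible (hypothetical sketch)."""
    out = df.copy()
    for col in out.columns:
        if ptypes.is_integer_dtype(out[col]):
            # Smallest signed integer type that holds every value.
            out[col] = pd.to_numeric(out[col], downcast="integer")
        elif ptypes.is_float_dtype(out[col]):
            # float32 at the smallest; this can lose precision.
            out[col] = pd.to_numeric(out[col], downcast="float")
        elif object_option and (
            ptypes.is_object_dtype(out[col]) or ptypes.is_string_dtype(out[col])
        ):
            # Worthwhile mainly when the column has few distinct values.
            out[col] = out[col].astype("category")
    return out


df = pd.DataFrame({
    "a": np.arange(1000, dtype="int64"),
    "b": np.arange(1000, dtype="float64") * 0.5,
    "c": ["x", "y"] * 500,
})
small = downcast_dataframe(df, object_option=True)

before = df.memory_usage(deep=True).sum()
after = small.memory_usage(deep=True).sum()
print(small.dtypes)
print(f"{before} bytes -> {after} bytes")
```

Downcasting floats trades precision for memory, so it is safer to apply it only to columns where float32 accuracy is acceptable.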
geniusnhu / YourFunction.py
Created August 23, 2021 01:12
Example of code annotation for a function and class
## Example of code annotation for a function
def your_function(X):
    """ Explanation of what the function does
    Parameters
    ----------
    X: dtype
        explanation of X
    y: dtype
geniusnhu / main.py
Last active August 23, 2021 01:09
Main code pipeline example
## Example of the main code pipeline in .py format
import sys, os
import pandas as pd
import numpy as np
from your_classes.ClassOne import load_funtion, preprocess_function, training_function, save_result_function
import your_classes.PATH
def main():