# Function to reshape features into (samples, time steps, features).
def gen_sequence(id_df, seq_length, seq_cols):
    """Yield sliding windows of seq_length consecutive rows for one id.

    Only sequences that meet the window length are considered; no padding
    is used. This means that for testing we need to drop ids with fewer rows
    than the window length. An alternative would be to pad sequences so that
    shorter ones can be used too.
    """
    # For one id, put all the rows in a single matrix.
    data_matrix = id_df[seq_cols].values
    num_elements = data_matrix.shape[0]
    # Iterate over two ranges in parallel.
    # For example, id 1 has 192 rows and seq_length is 50, so zip iterates
    # over range(0, 142) and range(50, 192):
    #   0   50  -> slice [0:50],    rows 0 to 49
    #   1   51  -> slice [1:51],    rows 1 to 50
    #   2   52  -> slice [2:52],    rows 2 to 51
    #   ...
    #   141 191 -> slice [141:191], rows 141 to 190
    # Note: the slice end is exclusive, so the last row (191) never appears
    # in any window.
    for start, stop in zip(range(0, num_elements - seq_length),
                           range(seq_length, num_elements)):
        yield data_matrix[start:stop, :]
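
A minimal usage sketch, assuming a toy DataFrame for a single id. The names `toy_df`, `s1`, `s2` and the values are hypothetical and only illustrate how the generator output can be stacked into a (samples, time steps, features) array; the function body is repeated so the example runs on its own.

```python
import numpy as np
import pandas as pd


def gen_sequence(id_df, seq_length, seq_cols):
    # Same logic as the function above, repeated to keep this sketch runnable.
    data_matrix = id_df[seq_cols].values
    num_elements = data_matrix.shape[0]
    for start, stop in zip(range(0, num_elements - seq_length),
                           range(seq_length, num_elements)):
        yield data_matrix[start:stop, :]


# Hypothetical toy data: one id with 6 rows and two feature columns.
toy_df = pd.DataFrame({
    "id": [1] * 6,
    "s1": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    "s2": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
})

seq_length = 3
windows = list(gen_sequence(toy_df, seq_length, ["s1", "s2"]))

# Stack the windows into one array of shape (samples, time steps, features).
seq_array = np.array(windows, dtype=np.float32)
print(seq_array.shape)  # (3, 3, 2)

# An id with fewer rows than seq_length yields no windows at all.
short_windows = list(gen_sequence(toy_df.iloc[:2], seq_length, ["s1", "s2"]))
print(len(short_windows))  # 0
```

With 6 rows and a window of 3, the generator yields the slices [0:3], [1:4] and [2:5], so three windows are produced and the final row is not part of any of them.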