@Sanlap1997
Created October 3, 2020 06:23
# function to reshape features into (samples, time steps, features)
def gen_sequence(id_df, seq_length, seq_cols):
    """Yield sliding windows of seq_length consecutive rows for a single id.

    Only sequences that meet the window length are considered; no padding is used.
    This means that for testing we need to drop those which are shorter than the
    window length. An alternative would be to pad sequences so that shorter ones
    can be used.
    """
    # For one id, put all the rows in a single matrix.
    data_matrix = id_df[seq_cols].values
    num_elements = data_matrix.shape[0]
    # Iterate over two ranges in parallel.
    # For example, id1 has 192 rows and seq_length is 50, so zip pairs
    # range(0, 142) with range(50, 192). Slices exclude the stop row:
    #   0  50 -> rows 0 to 49
    #   1  51 -> rows 1 to 50
    #   2  52 -> rows 2 to 51
    #   ...
    # 141 191 -> rows 141 to 190
    for start, stop in zip(range(0, num_elements - seq_length), range(seq_length, num_elements)):
        yield data_matrix[start:stop, :]
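
The generator yields one 2-D window at a time, so a caller still has to collect the windows across ids to get the (samples, time steps, features) array mentioned in the first comment. The sketch below shows one way to do that. The toy DataFrame, the column names `id`, `s1` and `s2`, and the variable `seq_array` are assumptions made up for illustration and are not part of the gist; `gen_sequence` is repeated only so the block runs on its own.

```python
import numpy as np
import pandas as pd


def gen_sequence(id_df, seq_length, seq_cols):
    # Same windowing logic as the gist's function, repeated so this sketch is self-contained.
    data_matrix = id_df[seq_cols].values
    num_elements = data_matrix.shape[0]
    for start, stop in zip(range(0, num_elements - seq_length), range(seq_length, num_elements)):
        yield data_matrix[start:stop, :]


# Hypothetical toy data: id 1 has 6 rows, id 2 has 4 rows, with two made-up feature columns.
df = pd.DataFrame({
    "id": [1] * 6 + [2] * 4,
    "s1": np.arange(10, dtype=float),
    "s2": np.arange(10, dtype=float) * 10,
})
seq_length = 3
seq_cols = ["s1", "s2"]

# Generate the windows for each id separately, then stack them into one 3-D array.
windows = [
    window
    for _, id_df in df.groupby("id")
    for window in gen_sequence(id_df, seq_length, seq_cols)
]
seq_array = np.stack(windows).astype(np.float32)

print(seq_array.shape)  # (4, 3, 2): 3 windows from id 1, 1 window from id 2
```

Because the start range stops at `num_elements - seq_length`, an id with n rows contributes n - seq_length windows, and its final row never appears in any window. An id with seq_length rows or fewer contributes none.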