@santhalakshminarayana
Last active January 6, 2020 07:37
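import numpy as np

# Assumes quotes, word_to_int, emb, num_dim and vocab_len are defined earlier.
window = 5  # max_seq_length: number of words in each input sequence
sequences, next_words = [], []
for quote in quotes:
    words = quote.split(' ')
    # Slide a window over the quote; quotes shorter than `window` yield nothing.
    for i in range(len(words) - window + 1):
        sequences.append(words[i:i + window])
        if (i + window) < len(words):
            next_words.append(words[i + window])
        else:
            # The window ends at the last word, so ';' stands in as the target.
            next_words.append(';')

tot_seq = len(next_words)
X = np.zeros((tot_seq, window, num_dim))  # embedded input windows
Y = np.zeros((tot_seq, vocab_len))        # one-hot next-word targets
for i, seq in enumerate(sequences):
    for j, word in enumerate(seq):
        num_id = word_to_int[word]
        X[i][j] = np.squeeze(emb[num_id])
    num_id = word_to_int[next_words[i]]
    Y[i][num_id] = 1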
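
The snippet above depends on names defined elsewhere (`quotes`, `word_to_int`, `emb`, `num_dim`, `vocab_len`). The following is a minimal, self-contained sketch of the same windowing logic on toy data, so the resulting shapes can be checked. The function name `build_training_arrays`, the `end_token` parameter, the sample quote, and the random embedding table are all illustrative assumptions and are not part of the original gist.

```python
import numpy as np


def build_training_arrays(quotes, word_to_int, emb, window=5, end_token=';'):
    """Build (X, Y) from sliding windows; hypothetical wrapper around the gist logic."""
    sequences, next_words = [], []
    for quote in quotes:
        words = quote.split(' ')
        # Quotes shorter than `window` produce an empty range and are skipped.
        for i in range(len(words) - window + 1):
            sequences.append(words[i:i + window])
            if i + window < len(words):
                next_words.append(words[i + window])
            else:
                next_words.append(end_token)

    num_dim = emb.shape[-1]       # embedding size, read from the table
    vocab_len = len(word_to_int)  # one output slot per vocabulary entry
    X = np.zeros((len(sequences), window, num_dim))
    Y = np.zeros((len(sequences), vocab_len))
    for i, seq in enumerate(sequences):
        for j, word in enumerate(seq):
            X[i, j] = np.squeeze(emb[word_to_int[word]])
        Y[i, word_to_int[next_words[i]]] = 1
    return X, Y


# Toy data (assumed for illustration only).
quotes = ["the quick brown fox jumps over the lazy dog"]
vocab = sorted(set(quotes[0].split(' ')) | {';'})
word_to_int = {w: k for k, w in enumerate(vocab)}
rng = np.random.default_rng(0)
emb = rng.normal(size=(len(vocab), 1, 4))  # random stand-in for real embeddings

X, Y = build_training_arrays(quotes, word_to_int, emb)
print(X.shape, Y.shape)  # (5, 5, 4) (5, 9)
```

The nine-word quote yields five windows of five words each. The last window ends at the final word, so its target row in `Y` is the one-hot vector for `';'`.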