Rishi Sidhu rishisidhu

🏠
Working from home
  • https://aigraduate.com
@rishisidhu
rishisidhu / tokenizer_padding.py
Created September 1, 2020 03:44
Adding Padding to the tokenization process
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
# Let's add custom sentences
sentences = [
    "Apples are red",
    "Apples are round",
    "Oranges are round",
    "Grapes are sour, oranges are sweet"
]
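The preview stops before the padding step itself. As a rough illustration of what padding does to sequences of different lengths, here is a pure-Python sketch. `pad` is a hypothetical helper written for this note, not Keras's `pad_sequences`, and the integer ids are illustrative (the real values depend on how the tokenizer was fitted). It assumes zero as the fill value and front padding by default.

```python
def pad(sequences, maxlen=None, padding="pre", value=0):
    """Pad (or front-truncate) each sequence to a common length.

    Hypothetical helper for illustration only.
    """
    if maxlen is None:
        maxlen = max(len(seq) for seq in sequences)
    padded = []
    for seq in sequences:
        seq = list(seq)
        if len(seq) > maxlen:
            # Drop items from the front so the last `maxlen` ids survive
            seq = seq[len(seq) - maxlen:]
        fill = [value] * (maxlen - len(seq))
        padded.append(fill + seq if padding == "pre" else seq + fill)
    return padded


# Illustrative ids: three 3-word sentences and one 6-word sentence
sequences = [[2, 1, 5], [2, 1, 3], [4, 1, 3], [6, 1, 7, 4, 1, 8]]

pre = pad(sequences)                    # zeros added at the front
post = pad(sequences, padding="post")   # zeros added at the end
print(pre)
print(post)
```

Every row ends up with six entries, so the batch can be stacked into a rectangular array.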
@rishisidhu
rishisidhu / oov_tokens.py
Created September 1, 2020 03:30
Working with OOV Tokens
from tensorflow.keras.preprocessing.text import Tokenizer
# Let's add custom sentences
sentences = [
    "Apples are red",
    "Apples are round",
    "Oranges are round",
    "Grapes are green"
]
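The preview ends before any OOV handling appears. The idea can be sketched in plain Python: reserve one index for an out-of-vocabulary placeholder and map every unknown word to it. The `<OOV>` string, the `encode` helper, and the frequency-ordered vocabulary list are assumptions made for this sketch, not taken from the gist.

```python
OOV = "<OOV>"  # assumed placeholder; any string outside the vocabulary works

# Vocabulary of the four sentences above, most frequent first
# (ties in order of first appearance). This ordering is an assumption.
vocab = ["are", "apples", "round", "red", "oranges", "grapes", "green"]

# Index 1 is reserved for the OOV token; real words start at 2
word_index = {word: i for i, word in enumerate([OOV] + vocab, start=1)}


def encode(text):
    """Map each word to its id, falling back to the OOV id when unknown."""
    return [word_index.get(w, word_index[OOV]) for w in text.lower().split()]


seen = encode("Grapes are green")
unseen = encode("Grapes are sour but oranges are sweet")
print(seen)
print(unseen)
```

Unknown words such as "sour" and "sweet" keep their position in the sequence, so the sentence length is preserved even though their identity is lost.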
@rishisidhu
rishisidhu / unseen_words.py
Created September 1, 2020 03:14
What happens when the Keras Tokenizer comes across new words
# Unseen Words
# Note: myTokenizer must already be fitted on the training sentences
test_data = [
    'Grapes are sour but oranges are sweet',
]
test_seq = myTokenizer.texts_to_sequences(test_data)
decoded = list(myTokenizer.sequences_to_texts_generator(test_seq))
print("\nTest Sequence = ", test_seq, " => ", decoded)
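When no OOV token is configured, words missing from the word index are simply skipped during encoding. The sketch below imitates that in plain Python. The `word_index` is an assumption (the index the four "Apples/Oranges/Grapes" training sentences would plausibly give), and `encode` and `decode` are hypothetical helpers, not Tokenizer methods.

```python
# Assumed word index for the training sentences, with no OOV token
word_index = {"are": 1, "apples": 2, "round": 3, "red": 4,
              "oranges": 5, "grapes": 6, "green": 7}
index_word = {i: w for w, i in word_index.items()}


def encode(text):
    """Keep only known words; unknown words vanish from the sequence."""
    return [word_index[w] for w in text.lower().split() if w in word_index]


def decode(seq):
    """Turn ids back into a space-separated string."""
    return " ".join(index_word[i] for i in seq)


test_seq = encode("Grapes are sour but oranges are sweet")
round_trip = decode(test_seq)
print(test_seq, "=>", round_trip)
```

The seven-word test sentence shrinks to four ids, and decoding cannot bring back "sour", "but", or "sweet".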
@rishisidhu
rishisidhu / word_index.py
Created September 1, 2020 02:38
Building a word index from training sentences
from tensorflow.keras.preprocessing.text import Tokenizer
# Let's add custom sentences
sentences = [
    "Apples are red",
    "Apples are round",
    "Oranges are round",
    "Grapes are green"
]
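The preview cuts off before the index is built. A frequency-ranked word index can be sketched with the standard library alone: count lowercase words, then number them from 1 in descending order of count. This illustrates the idea and is not the Tokenizer's own code. Breaking ties by first appearance is an assumption about the ordering.

```python
from collections import Counter

sentences = [
    "Apples are red",
    "Apples are round",
    "Oranges are round",
    "Grapes are green",
]

# Count every lowercase word across all sentences
counts = Counter(word for s in sentences for word in s.lower().split())

# most_common() sorts by count; equal counts keep first-seen order.
# Ids start at 1 so that 0 stays free for padding.
word_index = {word: i
              for i, (word, _) in enumerate(counts.most_common(), start=1)}

print(counts)
print(word_index)
```

"are" occurs in every sentence, so it gets index 1. Words that occur once are numbered in the order they were first seen.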
@rishisidhu
rishisidhu / sequence_to_text.py
Created August 25, 2020 06:00
Converting sequences back to text
from tensorflow.keras.preprocessing.text import Tokenizer
# Let's add custom sentences
sentences = [
    "One plus one is two!",
    "Two plus two is four!"
]
for num_w in range(1, 7):
    myTokenizer = Tokenizer(num_words=num_w)
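The loop body is cut off in this preview. The effect it explores can be imitated without Keras: only words whose index is below `num_words` survive encoding, so converting back to text gives a lossy copy of the sentence. The `word_index` below is an assumption (the frequency ranking of the two sentences), and `round_trip` is a hypothetical helper.

```python
# Assumed frequency-ranked index for the two sentences above
word_index = {"two": 1, "one": 2, "plus": 3, "is": 4, "four": 5}
index_word = {i: w for w, i in word_index.items()}


def round_trip(text, num_words):
    """Encode with a num_words cutoff, then decode back to text."""
    words = text.lower().replace("!", "").split()
    seq = [word_index[w] for w in words
           if w in word_index and word_index[w] < num_words]
    return " ".join(index_word[i] for i in seq)


for num_w in range(1, 7):
    print(num_w, repr(round_trip("One plus one is two!", num_w)))
```

Casing and punctuation are never recovered, and with a small `num_words` whole words drop out as well.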
@rishisidhu
rishisidhu / text_sequencing.py
Last active August 25, 2020 05:38
Looking at sequence length based on num_words parameter
from tensorflow.keras.preprocessing.text import Tokenizer
# Let's add custom sentences
sentences = [
    "One plus one is two!",
    "Two plus two is four!"
]
# most frequent words
for num_w in range(1,8):
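The loop above is cut off in this preview. As a stand-in, the following stdlib-only sketch shows how sequence length could vary with a `num_words` cutoff, assuming that only words with an index strictly below `num_words` are kept. The `words` and `encode` helpers are hypothetical, and replacing ASCII punctuation with spaces only roughly approximates the tokenizer's default filtering.

```python
import string
from collections import Counter

sentences = ["One plus one is two!", "Two plus two is four!"]

# Translation table that turns every ASCII punctuation mark into a space
_PUNCT_TO_SPACE = str.maketrans(string.punctuation,
                                " " * len(string.punctuation))


def words(text):
    """Lowercase, replace punctuation with spaces, split on whitespace."""
    return text.lower().translate(_PUNCT_TO_SPACE).split()


counts = Counter(w for s in sentences for w in words(s))
word_index = {w: i for i, (w, _) in enumerate(counts.most_common(), start=1)}


def encode(text, num_words):
    """Keep only words whose index is strictly below num_words."""
    return [word_index[w] for w in words(text) if word_index[w] < num_words]


# Sequence length of each sentence for num_words = 1..7
lengths = {n: [len(encode(s, n)) for s in sentences] for n in range(1, 8)}
for n, per_sentence in lengths.items():
    print(n, per_sentence)
```

With `num_words=1` nothing survives. The sequences only reach their full five-word length once the cutoff exceeds the largest index that each sentence uses.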
@rishisidhu
rishisidhu / word_index.py
Created August 25, 2020 05:09
Getting word index from tokenizer
from tensorflow.keras.preprocessing.text import Tokenizer
# Let's add custom sentences
sentences = [
    "One plus one is two!",
    "Two plus two is four!"
]
# Tokenize the sentences
myTokenizer = Tokenizer(num_words=10)
@rishisidhu
rishisidhu / sets.js
Created July 10, 2020 13:09
Modern JS ES6+ features - Sets
//Create set from an array
var set1 = new Set([1, 2, 2, 3, 3, 3, 4, 4, 4, 4]);
console.log(set1);
//Add a new element to it
set1.add(5);
console.log(set1);
@rishisidhu
rishisidhu / default_parameters.js
Created July 10, 2020 13:04
Modern JS ES6+ features - Default Parameters
// Default parameter z
var sum = function (x, y, z = 10) {
  return x + y + z;
};
console.log(`Sum = ${sum(10, 20)}`);
console.log(`Sum = ${sum(10, 20, 5)}`);
@rishisidhu
rishisidhu / arrow_functions.js
Created July 10, 2020 12:55
Modern JS ES6+ features - Arrow Functions
// The old way of writing functions
var addf = function add(x, y) {
  var sum = x + y;
  return sum;
};
console.log(`The sum is : ${addf(10, 20)}`);
//The new way of writing functions
var newAdd = (x, y) => x + y;
console.log(`The sum is : ${newAdd(10, 20)}`);