Podcasts for Data Science & Stuff

I asked the Twittersphere for data science (and tangentially related) podcast recommendations and got a much bigger response than I expected, with some really superb suggestions, so I created a gist to collect them. They're arranged alphabetically by name below, along with relevant Twitter accounts, links, and the hosts' names (where I could find them).

Shoot me a tweet @bennyjtang if you have more suggestions to add to this list!

Original Twitter thread

Adversarial Learning

import os

import cv2
from tqdm import tqdm


def separate_frames(output_dir, df):
    if not os.path.exists(output_dir):
        os.mkdir(output_dir)
    # Store the frames from the training videos.
    for i in tqdm(range(df.shape[0])):
        count = 0
        videoFile = df['video_name'][i]
        videoPath = os.path.join("Videos", "UCF-101", videoFile)
        cap = cv2.VideoCapture(videoPath)  # capturing the video from the given path
        frameRate = cap.get(cv2.CAP_PROP_FPS)  # frame rate
single_digits = [i for i in range(10)]
def extract_frames(df, output_dir):
    if not os.path.exists(output_dir):
        os.mkdir(output_dir)
    for i in tqdm(range(df.shape[0])):
        count = 0
        videoFile = df['video_name'][i]
        videoPath = os.path.join("Videos", "UCF-101", videoFile)
        # print(videoFile, videoPath)
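Both previews cut off right before the frames are actually read and written out. As a rough sketch (not the original gist code), assuming roughly one frame per second is kept and frames are saved as JPEGs named after the source video, the loop could be completed like this:

import math
import os

import cv2
from tqdm import tqdm


def extract_frames_full(df, output_dir):
    # Hypothetical completion of the truncated loop above.
    if not os.path.exists(output_dir):
        os.mkdir(output_dir)
    for i in tqdm(range(df.shape[0])):
        count = 0
        videoFile = df['video_name'][i]
        videoPath = os.path.join("Videos", "UCF-101", videoFile)
        cap = cv2.VideoCapture(videoPath)
        frameRate = cap.get(cv2.CAP_PROP_FPS)
        while cap.isOpened():
            frameId = cap.get(cv2.CAP_PROP_POS_FRAMES)  # current frame index
            ret, frame = cap.read()
            if not ret:
                break
            # Keep roughly one frame per second of video.
            if frameId % math.floor(frameRate) == 0:
                filename = os.path.join(
                    output_dir,
                    f"{os.path.basename(videoFile)}_frame{count}.jpg",
                )
                cv2.imwrite(filename, frame)
                count += 1
        cap.release()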

This document walks you through the steps to prepare a wget-compatible link for a file located in your Google Drive.

Motivation: When working on deep learning, we often use Google Colab, Kaggle Kernels, or cloud instances to train our models on GPUs. The catch is that we first have to upload all the files needed to get things up and running. This is particularly problematic for large datasets, which often cannot be uploaded or gathered directly (and sometimes scp does not work either). When such a dataset is stored in Google Drive, we can instead create a wget-compatible link to the file (typically the dataset) located there; this document deals only with Google Drive.

Steps:

  • Right click on the file (located in Google Drive) and click on "Share".
  • In the "Link sharing on" section, change your file's permissions to "Anyone with the link can view" and copy the link.
  • Now, the link should resemble `
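Once the link is shareable, you can pull the file straight onto a Colab/Kaggle/cloud machine. The snippet below is a minimal sketch of that step and is not part of the original gist: it assumes a standard Drive share link of the form https://drive.google.com/file/d/<FILE_ID>/view, extracts the file ID, and requests the direct-download URL.

import re

import requests


def download_drive_file(share_link, output_path):
    # Pull the file ID out of a standard Drive share link (assumed format).
    file_id = re.search(r"/d/([A-Za-z0-9_-]+)", share_link).group(1)
    url = f"https://drive.google.com/uc?export=download&id={file_id}"
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open(output_path, "wb") as f:
            for chunk in r.iter_content(chunk_size=8192):
                f.write(chunk)


# Hypothetical link and output name, for illustration only.
download_drive_file(
    "https://drive.google.com/file/d/<FILE_ID>/view?usp=sharing",
    "dataset.zip",
)

Note that very large files trigger Google Drive's virus-scan confirmation page, in which case a dedicated tool such as gdown tends to be more reliable than raw wget or requests.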
from tensorflow.keras.applications import MobileNetV2

# Load the MobileNetV2 model but exclude the classification layers.
EXTRACTOR = MobileNetV2(weights='imagenet', include_top=False,
                        input_shape=(224, 224, 3))
# We are fine-tuning, so keep the base network trainable.
EXTRACTOR.trainable = True
# Construct the head of the model that will be placed on top of
# the base model.
class_head = EXTRACTOR.output
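The distributed-training snippet below calls get_training_model(), whose body is not shown here. The following is only a plausible sketch of how the head above might be finished off and wrapped into a builder; the pooling and dense layers, the dropout rate, and the 101-class output (UCF-101) are assumptions on my part.

import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.applications import MobileNetV2


def get_training_model(num_classes=101):
    # Assumed architecture: MobileNetV2 base + global pooling + small dense head.
    extractor = MobileNetV2(weights='imagenet', include_top=False,
                            input_shape=(224, 224, 3))
    extractor.trainable = True
    class_head = extractor.output
    class_head = layers.GlobalAveragePooling2D()(class_head)
    class_head = layers.Dense(512, activation='relu')(class_head)
    class_head = layers.Dropout(0.5)(class_head)
    class_head = layers.Dense(num_classes, activation='softmax')(class_head)
    model = tf.keras.Model(inputs=extractor.input, outputs=class_head)
    model.compile(loss='sparse_categorical_crossentropy',
                  optimizer='adam', metrics=['accuracy'])
    return model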
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    model = get_training_model()
# Prepare batches and randomly shuffle the training images (this time with prefetch).
train_batches = (train.shuffle(1024).repeat().batch(batch_size)
                 .prefetch(tf.data.experimental.AUTOTUNE))
valid_batches = (valid.repeat().batch(batch_size)
                 .prefetch(tf.data.experimental.AUTOTUNE))