@misho-kr
Last active April 20, 2020 07:04
Summary of "Feature Engineering" from Coursera.Org

How can you improve the accuracy of your machine learning models? How do you find which data columns make the most useful features? This course discusses what separates good features from bad ones and how to preprocess and transform features for optimal use in your machine learning models.

The course includes hands-on practice choosing features and preprocessing them on Google Cloud Platform through interactive labs. Code solutions are made public for your reference as you work on your own future data science projects.

Course Objectives:

  • Describe the major areas of Feature Engineering
  • Turn raw data into features
  • Compare good vs bad features
  • Represent features
  • Get started with preprocessing and feature creation
  • Use Apache Beam and Cloud Dataflow for feature engineering
  • Recognize where feature crosses are a powerful way to help machines learn
  • Implement feature crosses in TensorFlow
  • Incorporate feature creation as part of your ML pipeline
  • Improve the taxifare model using feature crosses
  • Implement feature preprocessing and feature creation using tf.transform
  • Carry out feature processing efficiently, at scale and on streaming data
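
To make the feature-cross objectives above concrete, here is a minimal sketch in plain Python rather than TensorFlow. It is not taken from the course labs: the function names (`cross_index`, `one_hot`) and the choice of day of week and hour of day as the two crossed features are assumptions made for illustration only. The idea is that two categorical features are combined into a single categorical feature with one category per pair of values, so a linear model can learn a separate weight for each combination.

```python
# Hypothetical sketch of a feature cross; names and features are illustrative.
NUM_DAYS = 7    # day of week encoded as 0..6
NUM_HOURS = 24  # hour of day encoded as 0..23


def cross_index(day_of_week, hour_of_day):
    """Map a (day, hour) pair to a unique index in 0..NUM_DAYS*NUM_HOURS-1."""
    if not 0 <= day_of_week < NUM_DAYS:
        raise ValueError("day_of_week must be in 0..6")
    if not 0 <= hour_of_day < NUM_HOURS:
        raise ValueError("hour_of_day must be in 0..23")
    return day_of_week * NUM_HOURS + hour_of_day


def one_hot(index, size):
    """Return a list of length `size` with a single 1 at position `index`."""
    vector = [0] * size
    vector[index] = 1
    return vector


# One-hot encoding of the crossed feature for day 2 at hour 17.
crossed = one_hot(cross_index(2, 17), NUM_DAYS * NUM_HOURS)
```

The crossed vector has 168 entries, one per (day, hour) pair. When the number of combinations is too large to enumerate, a common alternative is to hash the pair into a fixed number of buckets, at the cost of occasional collisions.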

Labs and Demos: Lab: Training Data Analyst

Introduction

Raw Data to Features

Representing Features

Lab: Improve model accuracy with new features

Preprocessing and Feature Creation

Apache Beam and Cloud Dataflow

Lab: Simple Dataflow Pipeline (Python) -- grep.py and grepc.py

Lab: MapReduce in Dataflow (Python) -- is_popular.py

Preprocessing with Cloud Dataprep

Lab: Computing Time-Windowed Features in Cloud Dataprep

Introducing Feature Crosses

Lab: Feature Crosses to create a good classifier

Implementing Feature Crosses

Lab: Improve ML Model with Feature Engineering

TensorFlow Transform

Lab: Exploring tf.transform

Summary
