misho-kr/Feature Engineering.md

## Feature Engineering.md

      
Feature Engineering

How can you improve the accuracy of your machine learning models? How do you find which data columns make the most useful features? This course discusses what separates good features from bad ones and how to preprocess and transform features for optimal use in your machine learning models.
You get hands-on practice choosing features and preprocessing them on Google Cloud Platform through interactive labs. The code solutions will be made public for your reference as you work on your own future data science projects.
Course Objectives:

- Describe the major areas of feature engineering
- Turn raw data into features
- Compare good vs bad features
- Represent features
- Get started with preprocessing and feature creation
- Use Apache Beam and Cloud Dataflow for feature engineering
- Recognize where feature crosses are a powerful way to help machines learn
- Implement feature crosses in TensorFlow
- Incorporate feature creation as part of your ML pipeline
- Improve the taxifare model using feature crosses
- Implement feature preprocessing and feature creation using tf.transform
- Carry out feature processing efficiently, at scale, and on streaming data
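
To make the feature-cross objectives concrete, here is a minimal sketch in plain Python. It is not the TensorFlow API the course uses. It assumes two categorical inputs that a taxi-fare model might plausibly have, day of week (0 to 6) and hour of day (0 to 23), and combines them into a single one-hot feature with 7 × 24 = 168 slots. The function names `cross_index`, `one_hot`, and `crossed_feature` are hypothetical and chosen only for illustration.

```python
DAYS = 7
HOURS = 24


def cross_index(day_of_week, hour_of_day):
    """Map a (day, hour) pair to a unique slot in the crossed feature."""
    if not 0 <= day_of_week < DAYS:
        raise ValueError("day_of_week must be in 0..6")
    if not 0 <= hour_of_day < HOURS:
        raise ValueError("hour_of_day must be in 0..23")
    # Each day owns a contiguous block of 24 slots, one per hour.
    return day_of_week * HOURS + hour_of_day


def one_hot(index, size):
    """Return a list of zeros of length `size` with a single 1 at `index`."""
    vector = [0] * size
    vector[index] = 1
    return vector


def crossed_feature(day_of_week, hour_of_day):
    """One-hot encode the cross of day of week and hour of day."""
    return one_hot(cross_index(day_of_week, hour_of_day), DAYS * HOURS)


# Wednesday (2) at 17:00 lands in slot 2 * 24 + 17.
example = crossed_feature(2, 17)
```

With the cross, a linear model can learn a separate weight for each (day, hour) combination, for example a weekday evening rush hour, which it could not express with the two inputs encoded independently.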

Labs and Demos: Lab: Training Data Analyst

- Introduction
- Raw Data to Features
- Representing Features
  - Lab: Improve model accuracy with new features
- Preprocessing and Feature Creation
- Apache Beam and Cloud Dataflow
  - Lab: Simple Dataflow Pipeline (Python) -- grep.py and grepc.py
  - Lab: MapReduce in Dataflow (Python) -- is_popular.py
- Preprocessing with Cloud Dataprep
  - Lab: Computing Time-Windowed Features in Cloud Dataprep
- Introducing Feature Crosses
  - Lab: Feature Crosses to create a good classifier
- Implementing Feature Crosses
  - Lab: Improve ML Model with Feature Engineering
- TensorFlow Transform
  - Lab: Exploring tf.transform
- Summary
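
As a rough illustration of the idea behind the time-windowed features lab listed above, the sketch below computes a rolling mean over a trailing time window in plain Python. The lab itself uses Cloud Dataprep, so this is not how the lab does it. The function name `rolling_mean`, the `(timestamp_seconds, value)` input layout, and the sample data are all assumptions made for the example.

```python
from collections import deque


def rolling_mean(events, window_seconds):
    """Mean of the values seen in the trailing window, one result per event.

    `events` is a list of (timestamp_seconds, value) pairs sorted by
    timestamp. For an event at time t, the window covers (t - window, t].
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    window = deque()
    total = 0.0
    means = []
    for timestamp, value in events:
        window.append((timestamp, value))
        total += value
        # Evict events that have fallen out of the trailing window.
        while window[0][0] <= timestamp - window_seconds:
            _, old_value = window.popleft()
            total -= old_value
        means.append(total / len(window))
    return means


# Hypothetical fare amounts observed at 0 s, 60 s, 120 s, and 400 s.
fares = [(0, 10.0), (60, 20.0), (120, 30.0), (400, 40.0)]
features = rolling_mean(fares, window_seconds=180)
```

Each output value depends only on events at or before its own timestamp, which is what lets a feature like this be computed on streaming data as well as in batch.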