Machine Learning Utils

About

Machine learning utilities

Table of Contents

About

Instructions for installing CUDA 9.1 and cuDNN 7.1.2 on Ubuntu 16.04

A similar process can be followed for installing other versions (especially CUDA 7.5 onwards). Just download the official CUDA and cuDNN files from NVIDIA.

Instructions

  1. Purge any existing CUDA installations
  2. Download all the required files from NVIDIA: the CUDA 9.1 local repo installer, its cuBLAS performance update, and the cuDNN 7.1.2 runtime and developer .deb packages used in the commands below
  3. Install CUDA
    • sudo dpkg -i cuda-repo-ubuntu1604-9-1-local_9.1.85-1_amd64.deb cuda-repo-ubuntu1604-9-1-local-cublas-performance-update-3_1.0-1_amd64.deb
    • sudo apt-get update
    • sudo apt-get install cuda-9-1
  4. Setup PATH variables in .bashrc
    • export PATH=/usr/local/cuda-9.1/bin${PATH:+:${PATH}}
    • export LD_LIBRARY_PATH=/usr/local/cuda-9.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
  5. Verify the CUDA installation using nvcc. If nvcc is not found, something went wrong (often the PATH exports from the previous step); look for a solution online.
    • nvcc --version
  6. Install cuDNN
    • sudo dpkg -i libcudnn7_7.1.2.21-1+cuda9.1_amd64.deb libcudnn7-dev_7.1.2.21-1+cuda9.1_amd64.deb
    • sudo apt-get update
    • sudo apt-get install libcudnn7 libcudnn7-dev
  7. Restart your computer. I've faced issues where the correct libcudnn was not being recognized until after the restart.
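
Script: OpenCV build with GStreamer support for the NVIDIA Jetson TX2 (adapted from JetsonHacks)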
#!/bin/bash
# License: MIT. See license file in root directory
# Copyright(c) JetsonHacks (2017)
cd $HOME
sudo apt-get install -y \
libglew-dev \
libtiff5-dev \
zlib1g-dev \
libjpeg-dev \
libpng12-dev \
libjasper-dev \
libavcodec-dev \
libavformat-dev \
libavutil-dev \
libpostproc-dev \
libswscale-dev \
libeigen3-dev \
libtbb-dev \
libgtk2.0-dev \
cmake \
pkg-config
# Python 3 (the commented line below installs the Python 2.7 equivalents instead)
sudo apt-get install -y python3-dev python3-numpy python3-py python3-pytest
# sudo apt-get install -y python-dev python-numpy python-py python-pytest -y
# GStreamer support
sudo apt-get install -y libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev
# git clone https://github.com/opencv/opencv.git
# cd opencv
# git checkout -b v3.3.0 3.3.0
# This is for the test data
# cd $HOME
# git clone https://github.com/opencv/opencv_extra.git
# cd opencv_extra
# git checkout -b v3.3.0 3.3.0
cd $HOME/opencv
mkdir build
cd build
# Jetson TX2
cmake \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=/usr \
-DBUILD_PNG=OFF \
-DBUILD_TIFF=OFF \
-DBUILD_TBB=OFF \
-DBUILD_JPEG=OFF \
-DBUILD_JASPER=OFF \
-DBUILD_ZLIB=OFF \
-DBUILD_EXAMPLES=ON \
-DBUILD_opencv_java=OFF \
-DBUILD_opencv_python2=OFF \
-DBUILD_opencv_python3=ON \
-DENABLE_PRECOMPILED_HEADERS=OFF \
-DWITH_OPENCL=OFF \
-DWITH_OPENMP=OFF \
-DWITH_FFMPEG=ON \
-DWITH_GSTREAMER=ON \
-DWITH_GSTREAMER_0_10=OFF \
-DWITH_CUDA=OFF \
-DWITH_GTK=ON \
-DWITH_VTK=OFF \
-DWITH_TBB=ON \
-DWITH_1394=OFF \
-DWITH_OPENEXR=OFF \
-DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-9.0 \
-DCUDA_ARCH_BIN=6.2 \
-DCUDA_ARCH_PTX="" \
-DINSTALL_C_EXAMPLES=ON \
-DINSTALL_TESTS=ON \
-DOPENCV_TEST_DATA_PATH=../opencv_extra/testdata \
../
# Consider using all 6 cores; $ sudo nvpmodel -m 2 or $ sudo nvpmodel -m 0
make -j4
# A second make pass; presumably to surface any errors hidden by the parallel build output
make
sudo make install
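
A quick sanity check after `sudo make install` (a minimal sketch; it assumes the Python 3 `cv2` bindings built above are visible to the default interpreter):

# opencv_check.py -- hypothetical helper script name
import cv2

print("OpenCV version:", cv2.__version__)
# The build information string reports whether FFmpeg/GStreamer support was compiled in
for line in cv2.getBuildInformation().splitlines():
    if "GStreamer" in line or "FFMPEG" in line:
        print(line.strip())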

About

  • In this use case or pattern, we use multiple iterators: one for the training dataset and separate ones for the eval_train and eval_test datasets.
  • This pattern is slightly more complex, but it allows us to use the training and evaluation datasets simultaneously without resetting each other.
  • This pattern is useful if you want to evaluate your model in the middle of a training epoch. For large image datasets, a lot of learning takes place within a single epoch, so it is nice to have access to evaluation loss and accuracy every, say, 100 batches. Separate iterators for the training and evaluation datasets make this possible.

WARNING: The _check_accuracy function is quite useful but it is not documented well enough.

Code Template

def _check_accuracy(sess, correct_prediction, dataset_init_op, merged_summary, file_writer, global_step, is_training, use_dataset, use_dataset_str):
    """Check the accuracy of the model on either train or val (depending on dataset_init_op).

    Source: https://gist.github.com/omoindrot/dedc857cdc0e680dfb1be99762990c9c/

    Args:
        sess (tf.Session): Current session.
        correct_prediction (tf.Tensor): Boolean tensor of correct predictions of one batch.
            It is True at the indexes within a batch where the prediction was correct
            and False at the indexes where the prediction was wrong.
        dataset_init_op: Initializer op of the iterator to evaluate (training or validation).
        merged_summary (tf.Tensor): Merged summary op; its last evaluated value is written
            to `file_writer`.
        file_writer (tf.summary.FileWriter): Summary writer for this dataset split.
        global_step (tf.Tensor): Global training step, used as the summary step value.
        is_training (tf.Tensor): Boolean placeholder, fed as False during evaluation.
        use_dataset (tf.Tensor): String placeholder that selects which iterator feeds the graph.
        use_dataset_str (str): Value fed into `use_dataset`, e.g. "eval_train" or "eval_test".
    """
    # Initialize the correct dataset
    sess.run([dataset_init_op], feed_dict={is_training: False, use_dataset: use_dataset_str})

    num_correct, num_samples = 0, 0
    summary, step = None, None
    while True:
        try:
            correct_pred, summary, step = sess.run(
                [correct_prediction, merged_summary, global_step],
                feed_dict={is_training: False, use_dataset: use_dataset_str})
            num_correct += correct_pred.sum()
            num_samples += correct_pred.shape[0]
        except tf.errors.OutOfRangeError:
            break
    # Write the last summary of the evaluation run (if at least one batch was produced)
    if summary is not None:
        file_writer.add_summary(summary, step)

    # Return the fraction of datapoints that were correctly classified
    acc = float(num_correct) / num_samples
    return acc


def _preprocess_function(image, label, config):
    """Preprocess the dataset item where the inputs are an image and a label."""
    if config['standardize']:
        image = tf.image.per_image_standardization(image)
    return image, label
    
def _parsing_function(filename, label, config):
    """A function that parses one item of a dataset."""
    file_string = tf.read_file(filename)
    image = tf.image.decode_jpeg(contents=file_string)
    image = tf.reshape(image, shape=config['image_shape'])
    image = tf.cast(image, tf.float32)
    return image, label
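
########################################################################
# Hypothetical config stand-in (the real gist reads this from a file)
########################################################################
# The template below assumes `read_config`, `config_filename`, `dataset_config`,
# `logger`, `train_filenames`/`train_labels` and `test_filenames`/`test_labels`
# already exist elsewhere. A minimal, made-up sketch of the config pieces it touches:
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def read_config(config_filename):
    # Hypothetical values; only the keys referenced below are shown
    return {'standardize': True, 'image_shape': [224, 224, 3]}

config_filename = None  # placeholder; the real path is not given in the gist
dataset_config = {'num_parallel_calls': 4, 'batch_size': 32, 'eval_batch_size': 64}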

########################################################################
# Placeholders which help us decide which dataset_iterator to use
########################################################################
is_training = tf.placeholder(tf.bool)
use_dataset = tf.placeholder(tf.string)
use_dataset_train = tf.constant("train", dtype=tf.string)  # For comparison
use_dataset_eval_train = tf.constant("eval_train", dtype=tf.string)  # For comparison
use_dataset_eval_test = tf.constant("eval_test", dtype=tf.string)  # For comparison

########################################################################
# Dataset Functions, partial fill
########################################################################
config = read_config(config_filename)
from functools import partial
parsing_function = partial(_parsing_function, config=config)
preprocess_function = partial(_preprocess_function, config=config)

########################################################################
# Train tf.Dataset
########################################################################
train_dataset = tf.data.Dataset.from_tensor_slices((train_filenames, train_labels))
train_dataset = train_dataset.shuffle(10000)
train_dataset = train_dataset.map(parsing_function,
                                  num_parallel_calls=dataset_config['num_parallel_calls'])
train_dataset = train_dataset.map(preprocess_function,
                                  num_parallel_calls=dataset_config['num_parallel_calls'])
train_dataset = train_dataset.prefetch(buffer_size=2 * dataset_config['batch_size'])
batched_train_dataset = train_dataset.batch(batch_size=dataset_config['batch_size'])
logger.info("batched_train_dataset.shape: {}".format(batched_train_dataset.output_shapes))

########################################################################
# Eval Train tf.Dataset
########################################################################
eval_train_dataset = tf.data.Dataset.from_tensor_slices((train_filenames, train_labels))
eval_train_dataset = eval_train_dataset.map(parsing_function,
                                            num_parallel_calls=dataset_config['num_parallel_calls'])
eval_train_dataset = eval_train_dataset.map(preprocess_function,
                                            num_parallel_calls=dataset_config['num_parallel_calls'])
eval_train_dataset = eval_train_dataset.prefetch(buffer_size=2 * dataset_config['eval_batch_size'])
eval_train_dataset = eval_train_dataset.shuffle(buffer_size=2 * dataset_config['eval_batch_size'])
batched_eval_train_dataset = eval_train_dataset.batch(batch_size=dataset_config['eval_batch_size'])
logger.info("batched_eval_train_dataset.shape: {}".format(batched_eval_train_dataset.output_shapes))

########################################################################
# Eval Test tf.Dataset
########################################################################
eval_test_dataset = tf.data.Dataset.from_tensor_slices((test_filenames, test_labels))
eval_test_dataset = eval_test_dataset.map(parsing_function,
                                          num_parallel_calls=dataset_config['num_parallel_calls'])
eval_test_dataset = eval_test_dataset.map(preprocess_function,
                                          num_parallel_calls=dataset_config['num_parallel_calls'])
eval_test_dataset = eval_test_dataset.prefetch(buffer_size=2 * dataset_config['eval_batch_size'])
eval_test_dataset = eval_test_dataset.shuffle(buffer_size=2 * dataset_config['eval_batch_size'])
batched_eval_test_dataset = eval_test_dataset.batch(batch_size=dataset_config['eval_batch_size'])
logger.info("batched_eval_test_dataset.shape: {}".format(batched_eval_test_dataset.output_shapes))

########################################################################
# Dataset Iterators
########################################################################
iterator_train = tf.data.Iterator.from_structure(output_types=batched_train_dataset.output_types,
                                                 output_shapes=batched_train_dataset.output_shapes)
train_iterator_init_op = iterator_train.make_initializer(batched_train_dataset,
                                                         name='train_iterator')

iterator_eval_train = tf.data.Iterator.from_structure(output_types=batched_train_dataset.output_types,
                                                      output_shapes=batched_train_dataset.output_shapes)
eval_train_iterator_init_op = iterator_eval_train.make_initializer(batched_eval_train_dataset,
                                                                   name='eval_train_iterator')

iterator_eval_test = tf.data.Iterator.from_structure(output_types=batched_train_dataset.output_types,
                                                     output_shapes=batched_train_dataset.output_shapes)
eval_test_iterator_init_op = iterator_eval_test.make_initializer(batched_eval_test_dataset,
                                                                 name='eval_test_iterator')

def f_train(): return iterator_train.get_next()

def f_eval_train(): return iterator_eval_train.get_next()

def f_eval_test(): return iterator_eval_test.get_next()

########################################################################
# Get batches from the iterators
########################################################################
# The value fed into `use_dataset` selects which iterator's get_next() produces the batch
_batch_x, _batch_y = tf.case(pred_fn_pairs={
    tf.equal(use_dataset, use_dataset_train): f_train,
    tf.equal(use_dataset, use_dataset_eval_train): f_eval_train,
    tf.equal(use_dataset, use_dataset_eval_test): f_eval_test
}, exclusive=True)
logger.info("batch_x.shape: {}, batch_y.shape: {}".format(_batch_x.shape, _batch_y.shape))

#
# ... use batch_x and batch_y
# ... your architecture
# ... your loss and optimization operations
#
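
########################################################################
# Hypothetical model section (sketch)
########################################################################
# The training loop below references `_loss`, `_accuracy`, `_y_correct`, `_train`,
# `_global_step`, `_merged_summary`, `train_writer`, `test_writer`, `graph`,
# `admin_config`, `last_epoch` and `time` without defining them. The following is
# a minimal, purely illustrative stand-in (a single dense softmax classifier,
# assuming integer class labels), not the author's original model.
import time

n_classes = 10  # hypothetical number of classes
flat = tf.reshape(_batch_x, [-1, 224 * 224 * 3])  # matches the hypothetical image_shape above
logits = tf.layers.dense(flat, units=n_classes)
_labels = tf.cast(_batch_y, tf.int32)
_y_correct = tf.equal(tf.cast(tf.argmax(logits, axis=1), tf.int32), _labels)
_accuracy = tf.reduce_mean(tf.cast(_y_correct, tf.float32))
_loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=_labels, logits=logits))
_global_step = tf.train.get_or_create_global_step()
_train = tf.train.AdamOptimizer().minimize(_loss, global_step=_global_step)

tf.summary.scalar('loss', _loss)
tf.summary.scalar('accuracy', _accuracy)
_merged_summary = tf.summary.merge_all()

graph = tf.get_default_graph()
train_writer = tf.summary.FileWriter('logs/train', graph)  # hypothetical log directories
test_writer = tf.summary.FileWriter('logs/test')
admin_config = {'train_summary_step': 100}  # hypothetical: summarize every 100 training steps
last_epoch = -1  # hypothetical: would come from a restored checkpoint when resuming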


############################################################################
# Start session
############################################################################
with tf.Session(graph=graph) as sess:
    ########################################################################
    # Learning
    ########################################################################
    n_epochs = 10
    start_epoch = last_epoch + 1
    for epoch in range(n_epochs):
        ####################################################################
        # Training: Run until the training iterator has ended
        ####################################################################
        start_time = time.time()
        # Initialize training iterator
        sess.run(train_iterator_init_op)
        sess.run(eval_train_iterator_init_op)
        sess.run(eval_test_iterator_init_op)

        ####################################################################
        # Training: Run until the training iterator has ended
        ####################################################################
        step = 0
        while True:
            ####################################################################
            # Summary Section, occurs only once in a while (is_training=False)
            ####################################################################
            if step % admin_config['train_summary_step'] == 0:
                # Eval Train
                batch_loss, batch_accuracy, merged_summary, global_step = sess.run(
                    [_loss, _accuracy, _merged_summary, _global_step],
                    feed_dict={is_training: False, use_dataset: 'eval_train'})
                logger.info("Eval-Train: Epoch {}/{} ::: step {:5} ::: global_step {:5} ::: batch_loss {:10.4f} ::: batch_accuracy {:10.2f}"
                            .format(epoch, n_epochs, step, global_step, float(batch_loss), float(batch_accuracy)))

                # Eval Test
                batch_loss, batch_accuracy, merged_summary, global_step = sess.run(
                    [_loss, _accuracy, _merged_summary, _global_step],
                    feed_dict={is_training: False, use_dataset: 'eval_test'})
                logger.info("Eval-Test : Epoch {}/{} ::: step {:5} ::: global_step {:5} ::: batch_loss {:10.4f} ::: batch_accuracy {:10.2f}"
                            .format(epoch, n_epochs, step, global_step, float(batch_loss), float(batch_accuracy)))

                # Re-initialize the eval iterators so the next summary step starts from fresh batches
                sess.run(eval_train_iterator_init_op)
                sess.run(eval_test_iterator_init_op)

            ####################################################################
            # Training Section, occurs every step, even on summary steps (is_training=True)
            ####################################################################
            try:
                sess.run(_train, feed_dict={is_training: True, use_dataset: 'train'})
            except tf.errors.OutOfRangeError:
                end_time = time.time()
                logger.info("Training epoch ended: {:<5d}, Time taken: {:0.2f}"
                            .format(epoch, end_time - start_time))
                break

            step = step + 1

        ####################################################################
        # Validation: Check overall training and test accuracy after every epoch
        ####################################################################
        train_acc = _check_accuracy(sess, _y_correct, eval_train_iterator_init_op,
                                    _merged_summary, train_writer, _global_step, is_training, use_dataset, 'eval_train')
        test_acc = _check_accuracy(sess, _y_correct, eval_test_iterator_init_op,
                                   _merged_summary, test_writer, _global_step, is_training, use_dataset, 'eval_test')
        logger.info("Epoch {}/{} ::: END OF EPOCH ::: train_acc {:0.4f} ::: test_acc {:0.4f}"
                    .format(epoch, n_epochs, train_acc, test_acc))

About

  • In this use case or pattern, we use a single iterator for generating both the training and testing datasets.
  • This pattern is simple to use and easy to understand.
  • One drawback of this pattern is that you cannot use the training and testing datasets simultaneously.
    • This pattern is therefore useful only if you intend to train your model on the entire training dataset (i.e. one full epoch), then run the evaluation pipeline on the entire test dataset, and repeat this process for several epochs.

Code Template

def _check_accuracy(sess, correct_prediction, dataset_init_op):
    """Check the accuracy of the model on either train or val (depending on dataset_init_op).

    Source: https://gist.github.com/omoindrot/dedc857cdc0e680dfb1be99762990c9c/

    Args:
        sess (tf.Session): Current session.
        correct_prediction (tf.Tensor): Boolean tensor of correct predictions of one batch.
            It is True for the indexes within a batch where the prediction was correct
            and False for the indexes where the prediction was wrong.
        dataset_init_op (tf.Dataset.Iterator.Initializer): The training or validation
            iterator initializer.
    """
    # Initialize the correct dataset
    sess.run(dataset_init_op)
    num_correct, num_samples = 0, 0
    while True:
        try:
            correct_pred = sess.run(correct_prediction)
            num_correct += correct_pred.sum()
            num_samples += correct_pred.shape[0]
        except tf.errors.OutOfRangeError:
            break

    # Return the fraction of datapoints that were correctly classified
    acc = float(num_correct) / num_samples
    return acc

def _preprocess_function(image, label, config):
    """Preprocess the dataset item where the inputs are an image and a label."""
    if config['standardize']:
        image = tf.image.per_image_standardization(image)
    return image, label
    
def _parsing_function(filename, label, config):
    """A function that parses one item of a dataset."""
    file_string = tf.read_file(filename)
    image = tf.image.decode_jpeg(contents=file_string)
    image = tf.reshape(image, shape=config['image_shape'])
    image = tf.cast(image, tf.float32)
    return image, label

########################################################################
# Dataset Functions, partial fill
########################################################################
config = read_config(config_filename)
from functools import partial
parsing_function = partial(_parsing_function, config=config)
preprocess_function = partial(_preprocess_function, config=config)

########################################################################
# Train tf.Dataset
########################################################################
train_dataset = tf.data.Dataset.from_tensor_slices((train_filenames, train_labels))
train_dataset = train_dataset.map(parsing_function, num_parallel_calls=config['num_parallel_calls'])
train_dataset = train_dataset.map(preprocess_function, num_parallel_calls=config['num_parallel_calls'])
train_dataset = train_dataset.prefetch(buffer_size=2 * config['batch_size'])
batched_train_dataset = train_dataset.batch(batch_size=config['batch_size'])

########################################################################
# Test tf.Dataset
########################################################################
test_dataset = tf.data.Dataset.from_tensor_slices((test_filenames, test_labels))
test_dataset = test_dataset.map(parsing_function, num_parallel_calls=config['num_parallel_calls'])
test_dataset = test_dataset.map(preprocess_function, num_parallel_calls=config['num_parallel_calls'])
test_dataset = test_dataset.prefetch(buffer_size=2 * config['batch_size'])
batched_test_dataset = test_dataset.batch(batch_size=config['batch_size'])

########################################################################
# Dataset Iterators
########################################################################
iterator = tf.data.Iterator.from_structure(output_types=batched_train_dataset.output_types,
                                           output_shapes=batched_train_dataset.output_shapes)
train_iterator_init_op = iterator.make_initializer(batched_train_dataset, name='train_iterator')
test_iterator_init_op = iterator.make_initializer(batched_test_dataset, name='test_iterator')

########################################################################
# Get batches from the iterators
########################################################################
# Depending on which iterator initializer was last run with `sess.run`, get_next() yields batches from that dataset
batch_x, batch_y = iterator.get_next()
logger.info("batch_x.shape: {}, batch_y.shape: {}".format(batch_x.shape, batch_y.shape))

#
# ... use batch_x and batch_y
# ... your architecture
# ... your loss and optimization operations
#
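
########################################################################
# Hypothetical model section (sketch)
########################################################################
# The loop below references `training_step`, `_y_correct` and `graph` without
# defining them. A minimal, purely illustrative stand-in (a single dense softmax
# classifier, assuming integer class labels in `batch_y`), not the original model.
n_classes = 10  # hypothetical number of classes
flat = tf.reshape(batch_x, [-1, 224 * 224 * 3])  # assumes the image_shape used in the config
logits = tf.layers.dense(flat, units=n_classes)
labels = tf.cast(batch_y, tf.int32)
_y_correct = tf.equal(tf.cast(tf.argmax(logits, axis=1), tf.int32), labels)
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))
training_step = tf.train.AdamOptimizer().minimize(loss)
graph = tf.get_default_graph()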


############################################################################
# Start session
############################################################################
with tf.Session(graph=graph) as sess:
    ########################################################################
    # Learning
    ########################################################################
    n_epochs = 10
    for epoch in range(n_epochs):
        ####################################################################
        # Training: Run until the training iterator has ended
        ####################################################################
        # Initialize training iterator
        sess.run(train_iterator_init_op)
        
        step = 0
        while True:
            try:
                sess.run(training_step)
            except tf.errors.OutOfRangeError:
                logger.info("Training epoch ended: {}".format(epoch + 1))
                break
            step = step + 1

        ####################################################################
        # Validation: Check overall training and test accuracy after every epoch
        ####################################################################
        train_acc = _check_accuracy(sess, _y_correct, train_iterator_init_op)
        test_acc = _check_accuracy(sess, _y_correct, test_iterator_init_op)
        logger.info("Epoch {}/{} ::: END OF EPOCH ::: train_acc {:0.2f} ::: test_acc {:0.2f}"
                    .format(epoch + 1, n_epochs, train_acc, test_acc))

About

  • Not training or optimizing shared variables: In this pattern we comment the shared variables out of the optimization scope, so the shared variables are not trained. Run the code snippet below and you will notice that the values of the shared variables do not change between two training steps. Keep in mind that the output of the shared layer will still change if you change the input, even though the shared variables are not updated, because the layer outputs are just activations and they depend on the input.
  • Training or optimizing shared variables: In this pattern we optimize the shared variables as well, and we notice that their values change between training steps.
  • Copying variables from model one to model two: In this pattern we optimize the variables of model one and later transfer them to model two.
  • Optimizing original variables that have been updated using tf.assign: In this pattern we try to optimize variables whose values have been assigned from another set of variables. tf.assign returns updated references to the original variables, and the optimizer would not accept these updated references. So instead I had to optimize the original variable references themselves, with the hope that this would optimize the updated values. This script shows that this hope was realized: using the original variable references is fine as long as you use control_dependencies to make sure the optimization step happens only after the update (tf.assign) step.

Script: Not training or optimizing shared variables

def print_train(sess_run, label):
    from pprint import pprint, pformat
    x, fc_shared, fc1, fc2, loss, vars_shared, optimize = sess_run
    pprint("###########################################################")
    pprint("{:20s} ######################################".format(label))
    pprint("###########################################################")
    print("--------- x \n{}".format(x))
    print("--------- fc_shared \n{}".format(fc_shared))
    print("--------- fc1 \n{}".format(fc1))
    print("--------- fc2 \n{}".format(fc2))
    print("--------- loss \n{}".format(loss))
    print("--------- vars_shared \n{}".format(pformat(vars_shared)))

def do_stuff():
    from pprint import pprint
    import tensorflow as tf
    import numpy as np
    # Allocate only 20% of GPU memory (remember to add this to the session configProto)
    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.2)
    
    # Creating placeholders for tasks 1 & 2
    x_t1 = tf.placeholder(dtype=tf.float32, shape=[None, 2], name='x_t1')
    y_t1 = tf.placeholder(dtype=tf.int32, shape=[None, 1], name='y_t1')
    
    x_t2 = tf.placeholder(dtype=tf.float32, shape=[None, 2], name='x_t2')
    y_t2 = tf.placeholder(dtype=tf.int32, shape=[None, 1], name='y_t2')
    
    
    # Creating shared variables with different references to the same shared variables
    with tf.variable_scope('shared') as scope:
        fc_shared_t1 = tf.layers.dense(inputs=x_t1, units=2, activation=None, name='fc_shared')
        
    with tf.variable_scope('shared', reuse=True) as scope:
        fc_shared_t2 = tf.layers.dense(inputs=x_t2, units=2, activation=None, name='fc_shared')
        
    vars_shared = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'shared')
    
    # Creating task 1 specific variables, losses and optimizers
    with tf.variable_scope('t1') as scope:
        fc1_t1 = tf.layers.dense(inputs=fc_shared_t1, units=2, activation=None, name='fc1')
        fc2_t1 = tf.layers.dense(inputs=fc1_t1, units=1, name='fc2')
        # y_pred_t1 = tf.to_int32(fc2_t1 > 0.5)
        
        # cross_entropy_t1 = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y_t1, logits=fc2_t1, name='cross_entropy')
        loss_t1 = tf.losses.mean_squared_error(labels=y_t1, predictions=fc2_t1)
        
        vars_t1 = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 't1')
        # vars_t1 += tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'shared')  # UNCOMMENT TO MAKE SHARED VARIABLES TRAINABLE
        pprint("vars_t1-------------------------------------------------")
        pprint(vars_t1)
        
        optimizer_t1 = tf.train.AdamOptimizer()
        optimize_t1 = optimizer_t1.minimize(loss_t1, var_list=vars_t1)
        
    # Creating task 2 specific variables, losses and optimizers
    with tf.variable_scope('t2') as scope:
        fc1_t2 = tf.layers.dense(inputs=fc_shared_t2, units=2, activation=None, name='fc1')
        fc2_t2 = tf.layers.dense(inputs=fc1_t2, units=1, name='fc2')
        # y_pred_t2 = tf.to_int32(fc2_t2 > 0.5)
        
        # cross_entropy_t2 = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y_t2, logits=fc2_t2, name='cross_entropy')
        loss_t2 = tf.losses.mean_squared_error(labels=y_t2, predictions=fc2_t2)
        
        vars_t2 = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 't2')
        # vars_t2 += tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'shared')  # UNCOMMENT TO MAKE SHARED VARIABLES TRAINABLE
        pprint("vars_t2-------------------------------------------------")
        pprint(vars_t2)
        
        optimizer_t2 = tf.train.AdamOptimizer()
        optimize_t2 = optimizer_t2.minimize(loss_t2, var_list=vars_t2)
        
    pprint("global_vars-----------------------------------------------")    
    pprint(tf.global_variables())
    pprint("vars 1-----------------------------------------------")    
    pprint(vars_t1)
    pprint("vars 2-----------------------------------------------")    
    pprint(vars_t2)
        
    
    with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options)) as sess:
        sess.run(tf.local_variables_initializer())
        sess.run(tf.global_variables_initializer())
        
        print_train(sess_run=sess.run([x_t1, fc_shared_t1, fc1_t1, fc2_t1, loss_t1, vars_shared, optimize_t1],
                                       feed_dict={x_t1: np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32),
                                                 y_t1: np.array([[0], [1], [1], [0]], dtype=np.int32)}),
                    label='train 1: task 1')

        
        print_train(sess_run=sess.run([x_t1, fc_shared_t1, fc1_t1, fc2_t1, loss_t1, vars_shared, optimize_t1],
                                       feed_dict={x_t1: np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32),
                                                 y_t1: np.array([[0], [1], [1], [0]], dtype=np.int32)}),
                    label='train 2: task 1')
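
The scripts in this section define do_stuff() but never call it. A hedged usage sketch (run inside a fresh graph so repeated invocations do not collide on existing variable names):

if __name__ == '__main__':
    import tensorflow as tf
    with tf.Graph().as_default():
        do_stuff()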

Script: Training or optimizing shared variables

Note: Update the script above by uncommenting the two lines below; with the shared variables added to the optimizers' var_lists, their values will now change between the two training steps.

# Uncomment
vars_t1 += tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'shared')
# Uncomment
vars_t2 += tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'shared')

Script: Copying variables from model one to model two

def print_train(sess_run, label):
    from pprint import pprint, pformat
    x, fc1_1, fc2_1, loss_1, vars_1, updated_1, fc1_2, fc2_2, loss_2, vars_2, updated_2, optimize = sess_run
    pprint("###########################################################")
    pprint("{:20s} ######################################".format(label))
    pprint("###########################################################")
    print(" --------------------------------------------- x \n{}".format(x))
    print(" --------------------------------------------- fc1_1 \n{}".format(fc1_1))
    print(" --------------------------------------------- fc2_1 \n{}".format(fc2_1))
    print(" --------------------------------------------- loss_1 \n{}".format(loss_1))
    print(" --------------------------------------------- vars_1 \n{}".format(pformat(vars_1)))
    print(" --------------------------------------------- updated_1 \n{}".format(pformat(updated_1)))
    print(" ######################################################")
    print(" --------------------------------------------- fc1_2 \n{}".format(fc1_2))
    print(" --------------------------------------------- fc2_2 \n{}".format(fc2_2))
    print(" --------------------------------------------- loss_2 \n{}".format(loss_2))
    print(" --------------------------------------------- vars_2 \n{}".format(pformat(vars_2)))
    print(" --------------------------------------------- updated_2 \n{}".format(pformat(updated_2)))

def do_stuff():
    from pprint import pprint
    import tensorflow as tf
    import numpy as np
    # Allocate only 20% of GPU memory (remember to add this to the session configProto)
    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.2)
    
    # Creating placeholders for tasks 1 & 2
    x_t1 = tf.placeholder(dtype=tf.float32, shape=[None, 2], name='x_t1')
    y_t1 = tf.placeholder(dtype=tf.int32, shape=[None, 1], name='y_t1')
    
    x_t2 = tf.placeholder(dtype=tf.float32, shape=[None, 2], name='x_t2')
    y_t2 = tf.placeholder(dtype=tf.int32, shape=[None, 1], name='y_t2')
    
    
    # Creating task 1 specific variables, losses and optimizers
    with tf.variable_scope('t1') as scope:
        fc1_t1 = tf.layers.dense(inputs=x_t1, units=2, activation=None, name='fc1')
        fc2_t1 = tf.layers.dense(inputs=fc1_t1, units=1, name='fc2')
        # y_pred_t1 = tf.to_int32(fc2_t1 > 0.5)
        
        # cross_entropy_t1 = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y_t1, logits=fc2_t1, name='cross_entropy')
        loss_t1 = tf.losses.mean_squared_error(labels=y_t1, predictions=fc2_t1)
        
        vars_t1 = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 't1')
        pprint("vars_t1-------------------------------------------------")
        pprint(vars_t1)
        
        optimizer_t1 = tf.train.AdamOptimizer()
        optimize_t1 = optimizer_t1.minimize(loss_t1, var_list=vars_t1)
        
    # Creating task 2 specific variables, losses and optimizers
    with tf.variable_scope('t2') as scope:
        fc1_t2 = tf.layers.dense(inputs=x_t1, units=2, activation=None, name='fc1')  # note: fed with x_t1 (not x_t2), so both models see the same input
        fc2_t2 = tf.layers.dense(inputs=fc1_t2, units=1, name='fc2')
        # y_pred_t2 = tf.to_int32(fc2_t2 > 0.5)
        
        # cross_entropy_t2 = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y_t2, logits=fc2_t2, name='cross_entropy')
        loss_t2 = tf.losses.mean_squared_error(labels=y_t1, predictions=fc2_t2)
        
        vars_t2 = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 't2')
        pprint("vars_t2-------------------------------------------------")
        pprint(vars_t2)
        
        optimizer_t2 = tf.train.AdamOptimizer()
        optimize_t2 = optimizer_t2.minimize(loss_t2, var_list=vars_t2)
    
    updated_v2 = [tf.assign(v2, v1) for v1, v2 in zip(vars_t1, vars_t2)]
    updated_v1 = [tf.assign(v1, v2) for v1, v2 in zip(vars_t1, vars_t2)]
    t1_to_t2_transfer = tf.group(updated_v2, name='t1_t2_transfer') 
    t2_to_t1_transfer = tf.group(updated_v1, name='t2_t1_transfer')
    no_op = tf.no_op()
        
    pprint("global_vars-----------------------------------------------")    
    pprint(tf.global_variables())
    pprint("vars 1-----------------------------------------------")    
    pprint(vars_t1)
    pprint("vars 2-----------------------------------------------")    
    pprint(vars_t2)
        
    
    with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options)) as sess:
        sess.run(tf.local_variables_initializer())
        sess.run(tf.global_variables_initializer())
        
        print_train(sess_run=sess.run([x_t1, fc1_t1, fc2_t1, loss_t1, vars_t1, updated_v1, fc1_t2, fc2_t2, loss_t2, vars_t2, updated_v2, optimize_t1],
                                       feed_dict={x_t1: np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32),
                                                  y_t1: np.array([[0], [1], [1], [0]], dtype=np.int32),
                                                  x_t2: np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32),
                                                  y_t2: np.array([[0], [1], [1], [0]], dtype=np.int32)}),
                    label='train 1: task 1')

        
        print_train(sess_run=sess.run([x_t1, fc1_t1, fc2_t1, loss_t1, vars_t1, updated_v1, fc1_t2, fc2_t2, loss_t2, vars_t2, updated_v2, optimize_t1],
                                       feed_dict={x_t1: np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32),
                                                  y_t1: np.array([[0], [1], [1], [0]], dtype=np.int32),
                                                  x_t2: np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32),
                                                  y_t2: np.array([[0], [1], [1], [0]], dtype=np.int32)}),
                    label='train 2: task 1')
        
        print_train(sess_run=sess.run([x_t1, fc1_t1, fc2_t1, loss_t1, vars_t1, updated_v1, fc1_t2, fc2_t2, loss_t2, vars_t2, updated_v2, t1_to_t2_transfer],
                                       feed_dict={x_t1: np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32),
                                                  y_t1: np.array([[0], [1], [1], [0]], dtype=np.int32),
                                                  x_t2: np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32),
                                                  y_t2: np.array([[0], [1], [1], [0]], dtype=np.int32)}),
                    label='transfer t1 to t2')

Script: Optimizing original variables that have been updated using tf.assign

def print_train(sess_run, label):
    from pprint import pprint, pformat
    x, fc1_1, fc2_1, loss_1, vars_1, updated_1, fc1_2, fc2_2, loss_2, vars_2, updated_2, optimize = sess_run
    pprint("###########################################################")
    pprint("{:20s} ######################################".format(label))
    pprint("###########################################################")
    print(" --------------------------------------------- x \n{}".format(x))
    print(" --------------------------------------------- fc1_1 \n{}".format(fc1_1))
    print(" --------------------------------------------- fc2_1 \n{}".format(fc2_1))
    print(" --------------------------------------------- loss_1 \n{}".format(loss_1))
    print(" --------------------------------------------- vars_1 \n{}".format(pformat(vars_1)))
    print(" --------------------------------------------- updated_1 \n{}".format(pformat(updated_1)))
    print(" ######################################################")
    print(" --------------------------------------------- fc1_2 \n{}".format(fc1_2))
    print(" --------------------------------------------- fc2_2 \n{}".format(fc2_2))
    print(" --------------------------------------------- loss_2 \n{}".format(loss_2))
    print(" --------------------------------------------- vars_2 \n{}".format(pformat(vars_2)))
    print(" --------------------------------------------- updated_2 \n{}".format(pformat(updated_2)))

def do_stuff():
    from pprint import pprint
    import tensorflow as tf
    import numpy as np
    # Allocate only 20% of GPU memory (remember to add this to the session configProto)
    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.2)
    
    # Creating placeholders for tasks 1 & 2
    x_t1 = tf.placeholder(dtype=tf.float32, shape=[None, 2], name='x_t1')
    y_t1 = tf.placeholder(dtype=tf.int32, shape=[None, 1], name='y_t1')
    
    x_t2 = tf.placeholder(dtype=tf.float32, shape=[None, 2], name='x_t2')
    y_t2 = tf.placeholder(dtype=tf.int32, shape=[None, 1], name='y_t2')
    
    
    # Creating task 1 specific variables, losses and optimizers
    with tf.variable_scope('t1') as scope:
        fc1_t1 = tf.layers.dense(inputs=x_t1, units=2, activation=None, name='fc1')
        fc2_t1 = tf.layers.dense(inputs=fc1_t1, units=1, name='fc2')
        # y_pred_t1 = tf.to_int32(fc2_t1 > 0.5)
        
        # cross_entropy_t1 = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y_t1, logits=fc2_t1, name='cross_entropy')
        loss_t1 = tf.losses.mean_squared_error(labels=y_t1, predictions=fc2_t1)
        
        vars_t1 = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 't1')
        pprint("vars_t1-------------------------------------------------")
        pprint(vars_t1)
        
        optimizer_t1 = tf.train.AdamOptimizer()
        optimize_t1 = optimizer_t1.minimize(loss_t1, var_list=vars_t1)
        
    # Creating task 2 specific variables, losses and optimizers
    with tf.variable_scope('t2') as scope:
        fc1_t2 = tf.layers.dense(inputs=x_t1, units=2, activation=None, name='fc1')
        fc2_t2 = tf.layers.dense(inputs=fc1_t2, units=1, name='fc2')
        # y_pred_t2 = tf.to_int32(fc2_t2 > 0.5)
        
        # cross_entropy_t2 = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y_t2, logits=fc2_t2, name='cross_entropy')
        loss_t2 = tf.losses.mean_squared_error(labels=y_t1, predictions=fc2_t2)
        
        vars_t2 = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 't2')
        pprint("vars_t2-------------------------------------------------")
        pprint(vars_t2)
    
    
    updated_v2 = [tf.assign(v2, v1) for v1, v2 in zip(vars_t1, vars_t2)]
    updated_v1 = [tf.assign(v1, v2) for v1, v2 in zip(vars_t1, vars_t2)]
    t1_to_t2_transfer = tf.group(updated_v2, name='t1_t2_transfer') 
    t2_to_t1_transfer = tf.group(updated_v1, name='t2_t1_transfer')
    no_op = tf.no_op()
    
    with tf.control_dependencies([t1_to_t2_transfer]):
        optimizer_t2 = tf.train.AdamOptimizer()
        optimize_t2 = optimizer_t2.minimize(loss_t2, var_list=vars_t2)
        
    pprint("global_vars-----------------------------------------------")    
    pprint(tf.global_variables())
    pprint("vars 1-----------------------------------------------")    
    pprint(vars_t1)
    pprint("vars 2-----------------------------------------------")    
    pprint(vars_t2)
        
    
    with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options)) as sess:
        sess.run(tf.local_variables_initializer())
        sess.run(tf.global_variables_initializer())
        
        print_train(sess_run=sess.run([x_t1, fc1_t1, fc2_t1, loss_t1, vars_t1, no_op, fc1_t2, fc2_t2, loss_t2, vars_t2, no_op, optimize_t1],
                                       feed_dict={x_t1: np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32),
                                                  y_t1: np.array([[0], [1], [1], [0]], dtype=np.int32),
                                                  x_t2: np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32),
                                                  y_t2: np.array([[0], [1], [1], [0]], dtype=np.int32)}),
                    label='optimize_t1')

        
        print_train(sess_run=sess.run([x_t1, fc1_t1, fc2_t1, loss_t1, vars_t1, no_op, fc1_t2, fc2_t2, loss_t2, vars_t2, no_op, optimize_t1],
                                       feed_dict={x_t1: np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32),
                                                  y_t1: np.array([[0], [1], [1], [0]], dtype=np.int32),
                                                  x_t2: np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32),
                                                  y_t2: np.array([[0], [1], [1], [0]], dtype=np.int32)}),
                    label='optimize_t1')
        
        print_train(sess_run=sess.run([x_t1, fc1_t1, fc2_t1, loss_t1, vars_t1, no_op, fc1_t2, fc2_t2, loss_t2, vars_t2, updated_v2, optimize_t2],
                                       feed_dict={x_t1: np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32),
                                                  y_t1: np.array([[0], [1], [1], [0]], dtype=np.int32),
                                                  x_t2: np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32),
                                                  y_t2: np.array([[0], [1], [1], [0]], dtype=np.int32)}),
                    label='optimize_t2')
    
        print_train(sess_run=sess.run([x_t1, fc1_t1, fc2_t1, loss_t1, vars_t1, no_op, fc1_t2, fc2_t2, loss_t2, vars_t2, no_op, no_op],
                                       feed_dict={x_t1: np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32),
                                                  y_t1: np.array([[0], [1], [1], [0]], dtype=np.int32),
                                                  x_t2: np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32),
                                                  y_t2: np.array([[0], [1], [1], [0]], dtype=np.int32)}),
                    label='no_op (values after optimize_t2)')

About

In these patterns we explore how the graph behaves when we ask to run only some portions of it: how dependency nodes run automatically when required, and what the behavior is in corner cases.

  • How variables are updated on update operations: This pattern shows that variables and tensors take on new values only when the update operations are explicitly called, unless there is a natural dependency in the graph or tf.control_dependencies() is used.

  • Is feed_dict required for placeholder-independent parts of the graph? - Apparently not. This pattern shows that we can execute parts of the graph which don't directly rely on placeholders without using feed_dict. It also shows that you can update a variable via an assign operation in one sess.run() using a feed_dict, and in another sess.run() use that updated variable to do other things without providing a feed_dict.

  • Computing gradients in one sess.run and then applying gradients in another sess.run: This pattern shows how to compute gradients in one sess.run and, in the same sess.run, save the gradients into backup variables. In the next sess.run, we use the gradients saved in the backup variables to apply the gradients and optimize the weights. This has the wonderful property of sharing gradients across sess.run calls without needing a feed_dict at the end.

Script: How variables are updated when update operations are called explicitly and implicitly

def do_stuff():
    from pprint import pprint
    import tensorflow as tf
    import numpy as np
    # Allocate only 20% of GPU memory (remember to add this to the session configProto)
    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.2)
   
    x = tf.placeholder(tf.float32, shape=[None, 1])
    y = tf.placeholder(tf.float32, shape=[None, 1])
    
    W = tf.get_variable('W', shape=[1, 1], dtype=tf.float32, initializer=None)
    b = tf.get_variable('b', shape=[1], dtype=tf.float32, initializer=None)
    
    pred = tf.add(tf.matmul(x, W), b)
    loss = tf.reduce_mean(tf.losses.mean_squared_error(y, pred))
    
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.05)
    opt_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'W')
    opt_vars += tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'b')
    optimize = optimizer.minimize(loss=loss, var_list=opt_vars, name='optimize')
    
    copy_W = tf.get_variable('copy_W', shape=[1, 1], dtype=tf.float32, initializer=None)
    copy_b = tf.get_variable('copy_b', shape=[1], dtype=tf.float32, initializer=None)
    
    with tf.control_dependencies([optimize]):
        assign_op_W = copy_W.assign(W)
        assign_op_b = copy_b.assign(b)
        assign_op = tf.group([assign_op_W, assign_op_b], name='assign_op')
        
    copy_pred = tf.add(tf.matmul(x, copy_W), copy_b)
    copy_loss = tf.reduce_mean(tf.losses.mean_squared_error(y, copy_pred))
    
    pprint("global_vars-----------------------------------------------")    
    pprint(tf.global_variables())
    
    with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options)) as sess:
        sess.run(tf.local_variables_initializer())
        sess.run(tf.global_variables_initializer())
        
        
        for i in range(5):
            print("#####################################################")
            print("---------------------------------------------------- no-op")
            # No change in any losses, tf.no_op()
            sess.run([tf.no_op()], feed_dict={x: [[1], [2]], y: [[1], [2]]})
            pprint(sess.run([loss, copy_loss],
                            feed_dict={x: [[1], [2]], y: [[1], [2]]}))
            
            # Change only in loss and not in copy_loss, optimize
            print("---------------------------------------------------- optimize")
            sess.run([optimize], feed_dict={x: [[1], [2]], y: [[1], [2]]})
            pprint(sess.run([loss, copy_loss],
                            feed_dict={x: [[1], [2]], y: [[1], [2]]}))
            
            print("---------------------------------------------------- no-op --> get current values")
            # No change in any losses, tf.no_op()
            sess.run([tf.no_op()], feed_dict={x: [[1], [2]], y: [[1], [2]]})
            pprint(sess.run([loss, copy_loss],
                            feed_dict={x: [[1], [2]], y: [[1], [2]]}))
            
            # Change in both loss and copy_loss, assign_op
            print("---------------------------------------------------- optimize, assign-op")
            sess.run([assign_op], feed_dict={x: [[1], [2]], y: [[1], [2]]})
            pprint(sess.run([loss, copy_loss],
                            feed_dict={x: [[1], [2]], y: [[1], [2]]}))
            
            print("---------------------------------------------------- no-op --> get current values")
            # No change in any losses, tf.no_op()
            sess.run([tf.no_op()], feed_dict={x: [[1], [2]], y: [[1], [2]]})
            pprint(sess.run([loss, copy_loss],
                            feed_dict={x: [[1], [2]], y: [[1], [2]]}))

Script: Is feed_dict required for placeholder independent parts of the graph? - Apparently not

def do_stuff():
    from pprint import pprint
    import tensorflow as tf
    import numpy as np
    # Allocate only 20% of GPU memory (remember to add this to the session configProto)
    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.2)
    
    is_training = tf.placeholder(tf.bool, shape=[1])

    x = tf.placeholder(tf.float32, shape=[None, 1])
    y = tf.placeholder(tf.float32, shape=[None, 1])
    
    W = tf.get_variable('W', shape=[1, 1], dtype=tf.float32, initializer=None)
    b = tf.get_variable('b', shape=[1], dtype=tf.float32, initializer=None)
    
    dependent = tf.add(tf.matmul(x, W), b)
    
    
    x_independent = tf.get_variable('independent', shape=[2, 1], dtype=tf.float32, initializer=None)
    
    x_independent_updated = tf.assign(x_independent, dependent)
    
    independent = tf.add(tf.matmul(x_independent, W), b)
    
    
    pprint("global_vars-----------------------------------------------")    
    pprint(tf.global_variables())
    
    
    with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options)) as sess:
        sess.run(tf.local_variables_initializer())
        sess.run(tf.global_variables_initializer())
        
        
        for i in range(1):
            print("#####################################################")
            print("---------------------------------------------------- no-op with feed_dict")
            pprint(sess.run([tf.no_op()], feed_dict={x: [[1], [2]], y: [[1], [2]]}))
            
            print("---------------------------------------------------- dependent with feed_dict")
            pprint(sess.run([dependent], feed_dict={x: [[1], [2]], y: [[1], [2]]}))
            
            print("---------------------------------------------------- x_independent, independent with no feed_dict")
            pprint(sess.run([x_independent, independent]))
            
            print("---------------------------------------------------- x_independent_updated with feed_dict")
            pprint(sess.run([x_independent_updated], feed_dict={x: [[1], [2]], y: [[1], [2]]}))
            
            print("---------------------------------------------------- x_independent, independent with no feed_dict")
            pprint(sess.run([x_independent, independent]))

Script: Computing gradients in one sess.run and then applying gradients in another sess.run

def do_stuff():
    from pprint import pprint
    import tensorflow as tf
    import numpy as np
    # Allocate only 20% of GPU memory (remember to add this to the session configProto)
    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.2)
    
    # is_training = tf.placeholder(tf.bool, shape=[1])

    x = tf.placeholder(tf.float32, shape=[None, 1])
    y = tf.placeholder(tf.float32, shape=[None, 1])

    W = tf.get_variable('W', shape=[1, 1], dtype=tf.float32, initializer=None)
    b = tf.get_variable('b', shape=[1], dtype=tf.float32, initializer=None)

    pred = tf.add(tf.matmul(x, W), b)
    loss = tf.reduce_mean(tf.losses.mean_squared_error(y, pred))

    optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.05)
    opt_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'W')
    opt_vars += tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'b')
    
    
    # After model variables have been created
    # Create gradient variables
    with tf.control_dependencies(opt_vars):
        grad_vars = []
        print("#######################################################################")
        for i, var in enumerate(opt_vars):
            print('----------------------------------------------------------------------- creating gradient variable from variable')
            print("var.name: {} ::: var.shape: {}".format(var.name, var.shape))
            grad_var = tf.get_variable(name='grad_var_{}'.format(i),
                                       shape=var.shape,
                                       trainable=False,
                                       dtype=var.dtype,
                                       initializer=None)

            grad_vars.append(grad_var)
    
    # After gradient variables have been created.
    # Compute gradients
    with tf.control_dependencies(grad_vars):
        grads_and_vars = optimizer.compute_gradients(loss=loss, var_list=[opt_vars])
        grads = [g_and_v[0] for g_and_v in grads_and_vars]
    

    # If you have gradients & variables after compute_gradients
    # Backup
    with tf.control_dependencies([g_or_v for g_and_v in grads_and_vars for g_or_v in g_and_v]):
        grad_vars_backup = []
        print("#######################################################################")
        for i, ((grad, var), grad_var) in enumerate(zip(grads_and_vars, grad_vars)):
            print('----------------------------------------------------------------------- assigning grad_var the value of grad')
            print("var.name:      {:50s}  ::: var.shape:      {}".format(var.name, var.shape))
            print("grad.name:     {:50s}  ::: grad.shape:     {}".format(grad.name, grad.shape))
            print("grad_var.name: {:50s}  ::: grad_var.shape: {}".format(grad_var.name, grad_var.shape))
            grad_var_backup = tf.assign(grad_var, grad)
            grad_vars_backup.append(grad_var_backup)
        
    # After you have taken a backup of gradients into gradient variables
    # Restore
    with tf.control_dependencies(grad_vars_backup):
        restore_grads_and_vars = []
        for i, (grad_var, var) in enumerate(zip(grad_vars, opt_vars)):
            restore_grads_and_vars.append([grad_var, var])
    
    # After you have restored gradients from gradient variables
    # Optimize
    with tf.control_dependencies([g_or_v for g_and_v in restore_grads_and_vars for g_or_v in g_and_v]):
        optimize = optimizer.apply_gradients(grads_and_vars=restore_grads_and_vars)

        
    print("#######################################################################")
    pprint("global_vars-----------------------------------------------")    
    pprint(tf.global_variables())
    
    
    with tf.Session(config=tf.ConfigProto(gpu_options=gpu_options)) as sess:
        sess.run(tf.local_variables_initializer())
        sess.run(tf.global_variables_initializer())  # also initializes the grad_var_* backup variables
        
        
        for i in range(1):
            print("#######################################################################")
            print("----------------------------------------------------------------------- loss")
            pprint(sess.run([loss], feed_dict={x: [[1], [2]], y: [[1], [2]]}))
            
            print("----------------------------------------------------------------------- loss, NO CHANGE")
            pprint(sess.run([loss], feed_dict={x: [[1], [2]], y: [[1], [2]]}))
            
            print("----------------------------------------------------------------------- grads")
            pprint(sess.run([grads], feed_dict={x: [[1], [2]], y: [[1], [2]]}))
            
            print("----------------------------------------------------------------------- grad_vars, NO FEED_DICT")
            pprint(sess.run([grad_vars], feed_dict={x: [[1], [2]], y: [[1], [2]]}))
            
            print("----------------------------------------------------------------------- grad_vars_backup")
            pprint(sess.run([grad_vars_backup], feed_dict={x: [[1], [2]], y: [[1], [2]]}))
            
            print("----------------------------------------------------------------------- grad_vars, NO FEED_DICT, grad_vars == grad_vars_backup")
            pprint(sess.run([grad_vars]))
            
            print("----------------------------------------------------------------------- restore_grads_and_vars, NO FEED_DICT")
            pprint(sess.run([restore_grads_and_vars]))
            
            print("----------------------------------------------------------------------- optimize, NO FEED_DICT")
            pprint(sess.run([optimize]))
            
            print("----------------------------------------------------------------------- loss, UPDATED")
            pprint(sess.run([loss], feed_dict={x: [[1], [2]], y: [[1], [2]]}))
            
            print("----------------------------------------------------------------------- loss, NO CHANGE")
            pprint(sess.run([loss], feed_dict={x: [[1], [2]], y: [[1], [2]]}))