
A Guide to TensorFlow (Part 2)

Series Introduction

This blog is part of "A Guide to TensorFlow", where we explore the TensorFlow API and use it to build multiple machine learning models for real-life examples. In this blog we will cover TensorFlow Ops, managing multiple graphs, and Sessions in TensorFlow.
Please go through Part 1 of this series before reading this.

TensorFlow Ops

TensorFlow Operations, also known as Ops, are nodes that perform computations on or with Tensor objects. After computation, they return zero or more tensors, which can be used by other Ops later in the graph.
To create an Op, we call its constructor. The constructor takes the Tensor parameters needed for its calculation; these constitute the input to the Op. Along with the input, an Op accepts additional information in the form of attributes. The constructor returns a handle to the Tensor output of the operation; the actual value is computed when we call Session.run.
We have already seen how to use Ops in our graphs; here is another example.
Note that we are using NumPy arrays, which is acceptable as we learned in the previous guide.

import tensorflow as tf
import numpy as np

# Initialize some tensors to use in computation
a = np.array([2, 3], dtype=np.int32)
b = np.array([4, 5], dtype=np.int32)

# Use `tf.add()` to initialize an "add" Operation
# The variable `c` will be a handle to the Tensor output of this Op
c = tf.add(a, b)
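
To actually materialize the value of c, we can run it in a Session (Sessions are covered in detail below); a quick sketch:

sess = tf.Session()
sess.run(c)  # returns array([6, 8], dtype=int32)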

Even though we have used Ops to perform mathematical operations, we need to understand that they are more than just mathematical computations, and can be used for tasks such as initializing state. This means that there can exist an Op that takes zero inputs and returns zero outputs, and also that not all nodes need to be connected to other nodes. One such example is the global variable initializer tf.global_variables_initializer, which initializes the global variables in the graph. (We will cover variables later in this series.)
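
As a quick preview (a minimal sketch; variables themselves are covered later in the series), the initializer is an Op with no Tensor inputs and no Tensor outputs:

import tensorflow as tf

v = tf.Variable(3)  # a stateful node; variables are covered in a later blog

# An Op with zero Tensor inputs and zero Tensor outputs
init = tf.global_variables_initializer()

sess = tf.Session()
sess.run(init)  # returns None; it only initializes state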

Operator Overloading

TensorFlow overloads most common mathematical operators to make multiplication, addition, subtraction, and other common operations more concise. Say we have two inputs x and y: if at least one of them is a tf.Tensor object, the expressions tf.add(x, y) and x + y are equivalent. The main reason you might use tf.add() is to specify an explicit name keyword argument for the created Op, which is not possible with the overloaded operator version. In other words, if we need to pass a name to the Op, we call the TensorFlow Operation directly.
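
For instance, a short sketch (the name "my_sum" is just an illustrative choice):

x = tf.constant(2)
y = tf.constant(3)

z1 = x + y                        # overloaded operator; name is auto-generated
z2 = tf.add(x, y, name="my_sum")  # explicitly named Op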

Here is a list of a few commonly overloaded Ops:

z = -x  # z = tf.negative(x)
z = x + y  # z = tf.add(x, y)
z = x - y  # z = tf.subtract(x, y)
z = x * y  # z = tf.multiply(x, y)
z = x / y  # z = tf.divide(x, y)
z = x // y  # z = tf.floordiv(x, y)
z = x % y  # z = tf.mod(x, y)
z = x ** y  # z = tf.pow(x, y)
z = x @ y  # z = tf.matmul(x, y)
z = x > y  # z = tf.greater(x, y)
z = x >= y  # z = tf.greater_equal(x, y)
z = x < y  # z = tf.less(x, y)
z = x <= y  # z = tf.less_equal(x, y)
z = abs(x)  # z = tf.abs(x)
z = x & y  # z = tf.logical_and(x, y)
z = x | y  # z = tf.logical_or(x, y)
z = x ^ y  # z = tf.logical_xor(x, y)
z = ~x  # z = tf.logical_not(x)

Note that Python doesn't allow overloading "and", "or", and "not" keywords.
TensorFlow also doesn't allow using tensors as booleans, as it may be error prone:

x = tf.constant(1.)
if x:  # This will raise a TypeError
    ...

The equality (==) and inequality (!=) operators are also not supported; they are overloaded in NumPy but not in TensorFlow. Use the function versions tf.equal and tf.not_equal instead.
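
For example:

x = tf.constant([1, 2])
y = tf.constant([1, 3])

eq = tf.equal(x, y)      # element-wise; evaluates to [True, False]
ne = tf.not_equal(x, y)  # element-wise; evaluates to [False, True]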

Creating your own Op

Suppose you want to create an Op not already covered by the TensorFlow library. It is recommended to first try writing the Op in Python as a composition of existing Python ops or functions.
We can use tf.py_func(func, inp, Tout) to create our own Ops. It wraps a Python function func and uses it as a TensorFlow Op, where func takes NumPy arrays as its inputs and returns NumPy arrays as its outputs.
Your Python function needs to have:

  • NumPy arrays as inputs, fed from the graph with the argument inp
  • NumPy arrays as outputs, whose types you need to specify to TensorFlow via the argument Tout

Here is an example:
def my_func(x):
  # x will be a NumPy array with the contents of the placeholder below
  return np.sinh(x)

inp = tf.placeholder(tf.float32)
y = tf.py_func(my_func, [inp], tf.float32)
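
A minimal sketch of running this Op (feeding placeholders is covered in the feed_dict section below):

sess = tf.Session()
result = sess.run(y, feed_dict={inp: [0.0, 1.0]})
# result is approximately [0.0, 1.1752], i.e. sinh applied element-wise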

Inside the function you can do whatever you like: if conditions, for loops, anything that is not possible in TensorFlow. However, there can be challenges with this approach: sometimes it is not easy or even possible to express your operation as a composition of existing ops, and efficiency can suffer when the new Op is built out of existing primitives. In such a scenario we can create a custom C++ Op.

To incorporate your custom op you'll need to:

  1. Register the new op in a C++ file.
  2. Implement the op in C++.
  3. Create a Python wrapper (optional).
  4. Write a function to compute gradients for the op (optional).
  5. Test the op.

This can be an extremely technical process and is out of scope at the moment, so we will skip the actual process of creating our own Op and move to the next segment. You can read more about it in the docs.

More on Graphs

From our previous guide, we know that a Graph contains a set of tf.Operation objects, which represent units of computation, and tf.Tensor objects, which represent the units of data that flow between operations. Thus far, we've only referenced "the graph" as some sort of abstract omnipresence in TensorFlow, and we haven't questioned how Operations are automatically attached to a graph when we start coding. This happens because a default Graph is always registered and accessible by calling tf.get_default_graph(). To add an operation to the default graph, simply call one of the functions that define a new Operation.
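
We can verify this with a small sketch:

import tensorflow as tf

a = tf.constant(5)

# New Operations land in the default graph unless told otherwise
assert a.graph is tf.get_default_graph()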

TensorFlow allows us to create and use multiple graphs in conjunction with one another. Creating multiple graphs can be useful for defining multiple models that do not have interdependencies.

Creating a Graph is pretty straightforward; we simply call its constructor, which doesn't take any arguments:

import tensorflow as tf

# Create a new graph:
g = tf.Graph()

We can use the tf.Graph.as_default() method, which returns a context manager that overrides the current default graph for the lifetime of the context. In conjunction with the with statement, we can use this context manager to let TensorFlow know that we want to add Operations to a specific Graph:

with g.as_default():
    # Create Operations as usual; they will be added to graph `g`
    a = tf.multiply(2, 3)
    ...

Any Operations, tensors, etc. defined outside of a Graph.as_default() context manager will automatically be placed in the default graph:

# Placed in the default graph
in_default_graph = tf.add(1, 2)

# Placed in graph `g`
with g.as_default():
    in_graph_g = tf.multiply(2, 3)

# We are no longer in the `with` block, so this is placed in the default graph
also_in_default_graph = tf.subtract(5, 1)

When defining multiple graphs in one file, one best practice is to not use the default graph:

import tensorflow as tf

g1 = tf.Graph()
g2 = tf.Graph()

with g1.as_default():
    # Define g1 Operations, tensors, etc.
    ...

with g2.as_default():
    # Define g2 Operations, tensors, etc.
    ...

Another good practice is to immediately assign a handle to the default graph:

import tensorflow as tf

g1 = tf.get_default_graph()
g2 = tf.Graph()

with g1.as_default():
    # Define g1 Operations, tensors, etc.
    ...

with g2.as_default():
    # Define g2 Operations, tensors, etc.
    ...

About Sessions

A graph defines the computation, but it doesn't compute anything and doesn't hold any values; it just defines the operations that you specified in your code. A session allows us to execute graphs or parts of graphs. It allocates resources (on one or more machines) for that and holds the actual values of intermediate results and variables. Essentially, Session is a class for running TensorFlow operations. A Session object encapsulates the environment in which Operation objects are executed and Tensor objects are evaluated.

To create a session we use the tf.Session constructor. The constructor takes in three optional parameters:

  • target: (Optional) This specifies the execution engine to connect to. When using sessions in a distributed setting, this parameter is used to connect to tf.train.Server instances. For most applications, this will be left at its default empty string value.
  • graph: (Optional) This specifies the Graph to be launched. The default value is None, which indicates that the current default graph should be used. When using multiple graphs, it is best to explicitly pass in the Graph we'd like to run.
  • config: (Optional) A ConfigProto protocol buffer with configuration options for the session. This allows users to specify options to configure the session, such as limiting the number of CPUs or GPUs to use, setting optimization parameters for graphs, and logging options.

Example:

import tensorflow as tf

# Create Operations, Tensors, etc (using the default graph)
a = tf.add(2, 5)
b = tf.multiply(a, 3)

# Start up a `Session` using the default graph
# sess = tf.Session() is the same as sess = tf.Session(graph=tf.get_default_graph())
sess = tf.Session()
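
As a sketch of the other two parameters (the ConfigProto options shown here are just illustrative choices):

# Explicitly target a specific graph
g = tf.Graph()
with g.as_default():
    c = tf.constant(42)
sess_g = tf.Session(graph=g)

# Configure the session: log where ops are placed, and let TensorFlow
# fall back to a supported device when the requested one is unavailable
config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)
sess_cfg = tf.Session(config=config)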

tf.Session has several methods that are commonly used:

  • run: Runs operations and evaluates tensors.
  • as_default: Returns a context manager that makes this object the default session.
  • close: Closes this session. Calling this method frees all resources associated with the session.
  • reset: A static method that resets resource containers on target and closes all connected sessions.

Session.run()

Once a Session is opened, you can use its primary method, run(), to calculate the value of the desired Tensor output:

import tensorflow as tf

# Create Operations, Tensors, etc (using the default graph)
a = tf.add(2, 5)
b = tf.multiply(a, 3)

# Start up a `Session` using the default graph
# sess = tf.Session() is the same as sess = tf.Session(graph=tf.get_default_graph())
sess = tf.Session()

# Run the session using sess.run()
sess.run(b)  # Returns 21

Session.run() takes in one required parameter, fetches, as well as three optional parameters: feed_dict, options, and run_metadata. For the most part, fetches and feed_dict are all you will need to know about.

fetches

fetches accepts any graph element, which specifies what the user would like to execute. The value returned by run() has the same shape as the fetches argument, where the leaves are replaced by the corresponding values returned by TensorFlow.

The fetches argument may be a single graph element, or an arbitrarily nested list, tuple, namedtuple, dict, or OrderedDict containing graph elements at its leaves.
A graph element can be one of the following types:

  • A tf.Operation. The corresponding fetched value will be None.
  • A tf.Tensor. The corresponding fetched value will be a NumPy n-D array containing the value of that tensor.
  • A tf.SparseTensor. The corresponding fetched value will be a tf.SparseTensorValue containing the value of that sparse tensor. You can read more about SparseTensor in the docs.
  • A get_tensor_handle op. The corresponding fetched value will be a NumPy n-D array containing the handle of that tensor.
  • A string which is the name of a tensor or operation in the graph.

In the previous example, we set fetches to the tensor b (the output of the tf.multiply Operation). This tells TensorFlow that the Session should find all of the nodes necessary to compute the value of b, execute them in order, and output the value of b. We can also pass in a list of graph elements:

sess.run([a, b])  # returns [7, 21]
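
fetches can even be a dictionary, in which case the returned value mirrors its structure:

sess.run({'add': a, 'mul': b})  # returns {'add': 7, 'mul': 21}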

feed_dict

The optional feed_dict argument allows the caller to override the value of tensors in the graph, and it expects a Python dictionary object as input. Each key in feed_dict can be one of the following types:

  • If the key is a tf.Tensor, the value may be a Python scalar, string, list, or numpy n-D array that can be converted to the same dtype as that tensor. Additionally, if the key is a tf.placeholder, the shape of the value will be checked for compatibility with the placeholder.
  • If the key is a tf.SparseTensor, the value should be a tf.SparseTensorValue.
  • If the key is a nested tuple of Tensors or SparseTensors, the value should be a nested tuple with the same structure that maps to their corresponding values as above.

The following example illustrates how we can use feed_dict to overwrite the value of a in the previous graph:
import tensorflow as tf

# Create Operations, Tensors, etc (using the default graph)
a = tf.add(2, 5)
b = tf.multiply(a, 3)

# Start up a `Session` using the default graph
sess = tf.Session()

# Define a dictionary that says to replace the value of `a` with 15
replace_dict = {a: 15}

# Run the session, passing in `replace_dict` as the value to `feed_dict`
sess.run(b, feed_dict=replace_dict)  # returns 45

It is important to note that even though a would normally evaluate to 7, the dictionary we passed into feed_dict replaced that value with 15.
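
feed_dict is most commonly paired with tf.placeholder, a node that holds no value of its own and must be fed at run time (placeholders are covered in detail in the next guide). A minimal sketch:

ph = tf.placeholder(tf.int32, shape=[2])
doubled = ph * 2

sess = tf.Session()
sess.run(doubled, feed_dict={ph: [1, 2]})  # returns array([2, 4], dtype=int32)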

Using Session as a context manager

We can also use a Session as a context manager by using its as_default() method. Similarly to how Graph objects can be used implicitly by certain Operations, we can set a session to be used automatically by certain functions. The most common of such functions are Operation.run() and Tensor.eval(), which act as if you had passed them in to Session.run() directly.

import tensorflow as tf

# Define a simple constant
a = tf.constant(5)

# Open up a Session
sess = tf.Session()

# Use the Session as a default inside of `with` block
with sess.as_default():
    a.eval()

# Have to close Session manually.
sess.close()
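
Alternatively, tf.Session is itself a context manager: using it in a with block both installs it as the default session and closes it automatically on exit:

with tf.Session() as sess:
    print(a.eval())  # prints 5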

That's all for this guide. In the next blog we will learn more about TensorFlow placeholders and variables, and about TensorBoard, one of the best features of TensorFlow.