A Guide to TensorFlow: Linear regression (Part 5)
Series Introduction
This blog is part of "A Guide To TensorFlow", where we explore the TensorFlow API and use it to build machine learning models for real-life examples. Up until now we have explored much of the TensorFlow API; in this guide we will use that knowledge to build a simple machine learning model. This guide is about linear regression.
Check out the other parts of the series: Part 1, Part 2, Part 3 and Part 4
Linear Regression
If you don't know what linear regression exactly is, I would recommend you go through Understanding Gradient Descent; it covers linear regression in detail as well as gradient descent, the algorithm that makes linear regression work. TL;DR: Linear regression is a way to find an equation that models the relationship between a dependent variable \(Y\) and an explanatory variable \(X\).
The overall idea of regression is to examine two things:
 Does a set of explanatory variables do a good job of predicting a dependent variable?
 Which explanatory variables in particular have a significant influence on the value of the dependent variable?
A general equation will look like
$$y(x_1, x_2, \cdots, x_k) = w_1x_1 + w_2x_2 + \cdots + w_kx_k + b$$
This expression can be represented as follows: \(Y = WX + B\)
Here, \(Y\) is the dependent value matrix, \(W\) is the weight matrix, \(X\) is the input matrix and \(B\) is the bias matrix.
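As a quick sanity check on the shapes in \(Y = WX + B\), the matrix form can be sketched in NumPy. The numbers below are hypothetical and only illustrate how the dimensions line up, not actual model outputs:

```python
import numpy as np

# Matrix form of the regression equation, with hypothetical values:
# each row of X is one example (weight, age), W has shape (2, 1).
X = np.array([[84., 46.],
              [73., 20.]])   # input matrix, shape (2, 2)
W = np.array([[3.0],
              [2.0]])        # weight matrix, shape (2, 1)
b = 1.5                      # scalar bias, broadcast over all rows

Y = X @ W + b                # predictions, shape (2, 1)
print(Y.ravel())             # [345.5 260.5]
```

Each output row is the weighted sum of one example's inputs plus the bias, which is exactly what the model below computes with TensorFlow ops.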
So let's try to model this in tensorflow.
The Problem
A health organization has asked us to build a model that predicts ideal blood fat content based on a patient's weight and age. The organization has provided us with data from 25 healthy individuals. (Link to Problem)
Data
The following is a sample from the dataset.
Weight | Age | Blood Fat Content
-------|-----|------------------
84     | 46  | 354
73     | 20  | 190
65     | 52  | 405
70     | 30  | 263
76     | 57  | 451
The first parameter is the person's weight and the second is their age; these constitute our input variables. The third is the blood fat content, which we have to predict.
The Code
The following template gives the gist of the overall code skeleton of our graph.
import tensorflow as tf

... # Declare variables here

def inference(X):
    ... # Return the predicted value for given input `X`

def loss(X, Y):
    ... # Return the loss value for a given prediction

def inputs():
    ... # Define inputs

def train(total_loss):
    ... # Set the learning rate and use an optimizer to minimize `total_loss`

def evaluates(sess, X, Y):
    ... # Evaluate the model for sample values of weight and age

with tf.Session() as sess:
    ... # Initialize variables and run the session
Let's start by importing TensorFlow and declaring the variables we are going to use. The first variable is W, which stores the weights; it is a matrix of shape [2, 1], initialized with all values equal to zero. The next variable is the bias b. We use tf.Variable to create each of them.
import tensorflow as tf
W = tf.Variable(tf.zeros([2, 1]), name="weights")
b = tf.Variable(0., name="bias")
Defining The Input
For this guide we will hardcode the data; however, it is fairly easy to read it from a CSV file instead, which is the approach we will take in future guides.
Weight and age are defined as a 2D matrix and the corresponding output values as a list (a 1D matrix). These are returned after converting the values to the float data type.
def inputs():
    weight_age = [[84, 46], [73, 20], [65, 52], [70, 30], [76, 57], [69, 25], [63, 28], [72, 36], [79, 57], [75, 44], [27, 24], [89, 31], [65, 52], [57, 23], [59, 60], [69, 48], [60, 34], [79, 51], [75, 50], [82, 34], [59, 46], [67, 23], [85, 37], [55, 40], [63, 30]]
    blood_fat_content = [354, 190, 405, 263, 451, 302, 288, 385, 402, 365, 209, 290, 346, 254, 395, 434, 220, 374, 308, 220, 311, 181, 274, 303, 244]
    return tf.to_float(weight_age), tf.to_float(blood_fat_content)
By calling this function, we can access the data whenever required. We can store these values using the following statement: X, Y = inputs()
Inference Method
This method returns the prediction using the general equation we devised above.
def inference(X):
    return tf.matmul(X, W) + b
Computing the Loss
Now we have to define how to compute the loss. For this simple model we will use the squared error, which sums the squared difference between the predicted value and the corresponding expected value for each training example. Algebraically, it is the squared Euclidean distance between the predicted output vector and the expected one. Graphically, in a 2D dataset, it is the length of the vertical line that you can trace from an expected data point to the predicted regression line.
$$Loss = \sum_i ( y_i - y_{predicted,i} )^2$$
Programmatically this can be modeled as follows:
def loss(X, Y):
    Y_predicted = inference(X)
    return tf.reduce_sum(tf.squared_difference(Y, Y_predicted))
Training Function
We will first define the learning rate for the model and then we will use the gradient descent optimizer for optimizing the model parameters.
def train(total_loss):
    learning_rate = 0.0000001
    return tf.train.GradientDescentOptimizer(learning_rate).minimize(total_loss)
Let's break this down to further understand what's happening inside the train function.
learning_rate
The learning rate, in an intuitive sense, is how quickly a model abandons old beliefs for new ones. Deciding upon the learning rate to use is a complex and often iterative process; here, since the dataset is very small, we prefer a very small learning rate.
Note: In the article 'Understanding Gradient Descent' we talk about a parameter \(\alpha\) in an equation that looks like this:
$$w_0 \leftarrow w_0 + \alpha (y - h_w(x))$$
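To make the update rule concrete, here is one step of it in plain Python with made-up numbers. This is only an illustration of the equation, not TensorFlow code:

```python
# One gradient-descent update for the bias w0, using hypothetical values.
alpha = 0.01    # learning rate
w0 = 0.0        # current bias
y = 3.0         # observed target value
h_w_x = 2.0     # model prediction h_w(x)

# Move w0 a small step in the direction that reduces the error (y - h_w(x)).
w0 = w0 + alpha * (y - h_w_x)
print(w0)  # 0.01
```

With a larger alpha the step toward the new observation is bigger, which is the "abandoning old beliefs faster" intuition from above.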
tf.train.GradientDescentOptimizer
This is the optimizer that implements the gradient descent algorithm. You can see its implementation here; it in turn uses optimizer.py.
It is important to note that tf.train.GradientDescentOptimizer is designed to use a constant learning rate for all variables in all steps. TensorFlow also provides out-of-the-box adaptive optimizers, including tf.train.AdagradOptimizer and tf.train.AdamOptimizer, and these can be used as drop-in replacements.
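To see what "adaptive" means, here is a rough pure-Python sketch of the Adagrad idea (not TensorFlow's actual implementation): the effective rate for a parameter shrinks as that parameter's squared gradients accumulate. The gradient values are made up:

```python
import math

# Adagrad-style effective learning rate, sketched per step (illustrative only).
base_rate = 0.1
accum = 0.0        # running sum of squared gradients for one parameter
eps = 1e-8         # small constant to avoid division by zero

effective_rates = []
for grad in [4.0, 3.0, 0.5]:   # hypothetical gradients over three steps
    accum += grad ** 2
    effective_rates.append(base_rate / (math.sqrt(accum) + eps))

# The effective rate shrinks monotonically as gradients accumulate.
print(effective_rates)
```

Parameters that receive large or frequent gradients therefore get smaller steps over time, while a plain GradientDescentOptimizer would keep using base_rate forever.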
Evaluate Function
This function will check our model for a specified weight and age value.
def evaluates(sess, X, Y):
    print(sess.run(inference([[80., 25.]])))
    print(sess.run(inference([[65., 25.]])))
Now, this function requires a session, which brings us to our next step: we will create the session and call evaluates from there.
Session
The first step is to initialize all variables
with tf.Session() as sess:
    tf.global_variables_initializer().run()
In the next step we take the data using our inputs function and then define our loss function for that data in total_loss.
    X, Y = inputs()
    total_loss = loss(X, Y)
Now we define the training op for our model: we use the train function we defined earlier and pass it total_loss.
    train_op = train(total_loss)
In TensorFlow, tf.Session objects are designed to run multithreaded. Here's how it works: multiple threads prepare training examples and push them into a queue, and a training thread then executes a training op that dequeues mini-batches from that queue. This is great because it makes maximum use of the available computational resources. However, TensorFlow queues can't run without proper threading, and threading isn't exactly pleasant in Python. To handle this, the TensorFlow library comes with tf.train.Coordinator and tf.train.QueueRunner. QueueRunner creates a number of threads that cooperate to enqueue tensors in the same queue, while Coordinator helps multiple threads stop together and report exceptions to a program that waits for them to stop.
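The producer/consumer pattern that QueueRunner and Coordinator automate can be sketched with Python's standard threading and queue modules. This is a simplified illustration, not TensorFlow code, and the item values are made up:

```python
import queue
import threading

# Producer threads enqueue "training examples"; a consumer thread dequeues them.
q = queue.Queue(maxsize=8)
results = []

def producer(items):
    for item in items:
        q.put(item)              # enqueue one example

def consumer(n):
    for _ in range(n):
        results.append(q.get())  # dequeue one example for "training"

threads = [
    threading.Thread(target=producer, args=([1, 2, 3],)),
    threading.Thread(target=producer, args=([4, 5, 6],)),
    threading.Thread(target=consumer, args=(6,)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()                     # like coord.join(threads): wait for all to stop

print(sorted(results))  # [1, 2, 3, 4, 5, 6]
```

QueueRunner plays the role of the producer threads here, and Coordinator handles the start/stop/join bookkeeping that this sketch does by hand.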
So we first start off by defining our coordinator:
    coord = tf.train.Coordinator()
Now we start our queue runners using tf.train.start_queue_runners; this function starts all queue runners collected in the graph.
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
The rest of the code is pretty straightforward: we define the number of training steps and run our training op that many times. After this, we call our evaluates function to check the model. Finally, we stop all the threads using coord.request_stop(), join them using coord.join(threads), and close the session.
    training_steps = 1000
    for step in range(training_steps):
        sess.run([train_op])

    evaluates(sess, X, Y)

    coord.request_stop()
    coord.join(threads)
    sess.close()
Problems with Linear Regression
Linear regression works only if certain underlying assumptions hold. In order to actually be usable in practice, the model should conform to the following assumptions of linear regression.
The regression model is linear in parameters
This means that the data more or less conforms to the following general equation
$$Ax + By + C = 0$$
Even if \(x\) and \(y\) appear with higher powers, the model remains a linear regression as long as the equation stays linear in the parameters \(A\) and \(B\).
Average residual/loss is zero
For the line we fit to the data, the mean of the residuals over all data points is zero.
There are many more of these assumptions; you can read about them here.
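The zero-mean-residual property can be checked on a small made-up dataset: an ordinary least-squares fit that includes an intercept always yields residuals that average to zero, up to floating-point error:

```python
import numpy as np

# Fit a least-squares line to hypothetical data and inspect the residuals.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.9])

slope, intercept = np.polyfit(x, y, 1)   # ordinary least-squares fit
residuals = y - (slope * x + intercept)

# With an intercept in the model, the residuals average to (near) zero.
print(abs(residuals.mean()) < 1e-9)  # True
```

This is a property of the least-squares solution itself, so it holds for any dataset fitted this way; what the assumption adds is that the residuals should also look like zero-mean noise, with no systematic pattern.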
The nature of this model makes it extremely sensitive to outliers; data with a high standard deviation does not give good enough results.
Most important of all, most real-world problems are largely nonlinear, which makes linear regression a good fit for only certain kinds of problems. Linear regression still has many applications and real-world use cases, but it is extremely important to find a way to add nonlinearity to the model in order to handle the rest.
This is what we will cover in the next guide.