Computations at scale - Introduction to Tensorflow Part 3



This is part three of a multi part series. If you haven't already, you should read the previous parts first.

  • Part 1 where we discussed the design philosophy of Tensorflow.
  • Part 2 where we discussed how to do basic computations with Tensorflow.

This time, we will look at doing computations at scale, as well as how we can save our results.

This post originally appeared on kasperfred.com where I write more about machine learning.

Choosing devices

You can choose to run some operations on a specific device using the template below:

with tf.device("/gpu:0"):
    # do stuff with GPU

with tf.device("/cpu:0"):
    # do some other stuff with CPU

Here, the strings "/gpu:0" and "/cpu:0" can be replaced with any of the available device names you found when verifying that Tensorflow was correctly installed.
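
If you don't remember the available device names, you can also list them from within Python. Below is a minimal sketch using the device_lib utility; note that it lives in an internal module (tensorflow.python.client), so its exact location may change between versions:

from tensorflow.python.client import device_lib

# print the name of every device Tensorflow can see,
# e.g. "/device:CPU:0" or "/device:GPU:0"
for device in device_lib.list_local_devices():
    print(device.name)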

If you installed the GPU version, Tensorflow will automatically try to run the graph on the GPU without you having to define it explicitly.

If a GPU is available it will be prioritized over the CPU.

When using multiple devices, it's worth considering that switching between devices is rather slow because all the data has to be copied over to the memory of the new device.
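
To see where each operation actually ends up, you can enable device placement logging when creating the session. The sketch below also turns on soft placement, which lets Tensorflow fall back to another device when an operation can't run on the requested one; the constants are just placeholders for your own graph:

import tensorflow as tf

a = tf.constant([1.0, 2.0], name="a")
b = tf.constant([3.0, 4.0], name="b")
c = tf.add(a, b)

# log_device_placement prints the device chosen for each operation;
# allow_soft_placement falls back to another device if an operation
# cannot run on the requested one
config = tf.ConfigProto(log_device_placement=True,
                        allow_soft_placement=True)

with tf.Session(config=config) as sess:
    print(sess.run(c))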

Distributed computing

For when one computer simply isn't enough.

Tensorflow allows for distributed computing. I imagine this won't be relevant for most of us, so feel free to skip this section; however, if you believe you might use multiple computers to work on a problem, it may be of some value to you.

Tensorflow's distributed model can be broken down into two parts:

  • Server
  • Cluster

These are analogous to a client/server model. The server holds the master copy, while a cluster contains a set of jobs, each of which holds a set of tasks; the tasks are the actual computations.
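
A cluster can also contain more than one job. For instance, a common pattern dedicates one job to parameter servers and another to workers; here is a minimal sketch of such a spec, with placeholder host addresses:

import tensorflow as tf

# two jobs: "ps" typically holds the variables,
# "worker" does the heavy computation
# (ps0.ip, worker0.ip and worker1.ip are placeholder addresses)
cluster = tf.train.ClusterSpec({
    "ps": ["ps0.ip:2222"],
    "worker": ["worker0.ip:2222", "worker1.ip:2222"]
})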

A server that manages a cluster with one job and two workers sharing the load between two tasks can be created like so:

import tensorflow as tf

cluster = tf.train.ClusterSpec({"my_job": ["worker1.ip:2222", "worker2.ip:2222"]})
server = tf.train.Server(cluster, job_name="my_job", task_index=1)

a = tf.Variable(5)

# pin the multiplication to task 0
with tf.device("/job:my_job/task:0"):
    b = tf.multiply(a, 10)

# pin the addition to task 1
with tf.device("/job:my_job/task:1"):
    c = tf.add(b, a)

with tf.Session("grpc://localhost:2222") as sess:
    # variables must be initialized before use
    sess.run(tf.global_variables_initializer())
    res = sess.run(c)
    print(res)

A corresponding worker-client can be created like so:

import sys
import tensorflow as tf

# get the task number from the command line
task_number = int(sys.argv[1])

cluster = tf.train.ClusterSpec({"my_job": ["worker1.ip:2222", "worker2.ip:2222"]})
server = tf.train.Server(cluster, job_name="my_job", task_index=task_number)

print("Worker #{}".format(task_number))

server.start()
server.join()

If the client code is saved to a file, you can start the workers by typing into a terminal:

python filename.py 0
python filename.py 1

This will start two workers that listen for task 0 and task 1 of the my_job job.
Once the server is started, it will send the tasks to the workers, which will return the answers to the server.

For a more in-depth look at distributed computing with Tensorflow, please refer to the documentation.

Saving variables (model)

Having to throw out the hard-learned parameters after they have been computed isn't much fun.

Luckily, saving a model in Tensorflow is quite simple using the Saver object, as illustrated in the example below:

import tensorflow as tf

a = tf.Variable(5)
b = tf.Variable(4, name="my_variable")

# set the value of a to 3
op = tf.assign(a, 3)

# create saver object
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    sess.run(op)

    print("a:", sess.run(a))
    print("my_variable:", sess.run(b))

    # use saver object to save variables
    # within the context of the current session
    saver.save(sess, "/tmp/my_model.ckpt")
a: 3
my_variable: 4
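
If you save during training, you can additionally pass a global_step so that each call writes a numbered checkpoint instead of overwriting the previous one; tf.train.latest_checkpoint then finds the newest file. A minimal sketch (the step values are just for illustration):

import tensorflow as tf

a = tf.Variable(5)
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # appending the step number yields files like
    # /tmp/my_model.ckpt-100, /tmp/my_model.ckpt-200, ...
    for step in [100, 200]:
        saver.save(sess, "/tmp/my_model.ckpt", global_step=step)

# returns the path prefix of the newest checkpoint in the directory
print(tf.train.latest_checkpoint("/tmp"))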

Loading variables (model)

As with saving the model, loading a model from a file is also simple.

Note: If you have given a Variable a Tensorflow name, you must use that same name when loading, as it takes priority over the Python name. If you haven't specified one, Tensorflow generates a name automatically, so the variables in the loading script must be created in the same order as in the saving script for the generated names to match.

import tensorflow as tf

# Only necessary if you use IDLE or a jupyter notebook
tf.reset_default_graph()

# make a dummy variable
# the value is arbitrary, here just zero
# but the shape must be the same as in the saved model
a = tf.Variable(0)
c = tf.Variable(0, name="my_variable")

saver = tf.train.Saver()

with tf.Session() as sess:

    # use saver object to load variables from the saved model
    saver.restore(sess, "/tmp/my_model.ckpt")

    print("a:", sess.run(a))
    print("my_variable:", sess.run(c))
INFO:tensorflow:Restoring parameters from /tmp/my_model.ckpt
a: 3
my_variable: 4
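
If the variables in your loading script don't line up with the saved graph, you can also tell the Saver explicitly which checkpoint name maps to which variable. A minimal sketch, reusing the checkpoint from above; the Python-side name here is arbitrary:

import tensorflow as tf

tf.reset_default_graph()

# the Python-side name is arbitrary; the dict below maps the name
# stored in the checkpoint ("my_variable") to this variable
restored = tf.Variable(0, name="some_other_name")

saver = tf.train.Saver({"my_variable": restored})

with tf.Session() as sess:
    saver.restore(sess, "/tmp/my_model.ckpt")
    print("my_variable:", sess.run(restored))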

Come back tomorrow for part 4, where we will look at doing visualizations with Tensorflow.

