A Step-by-Step Guide to Using TensorFlow Effectively


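The most striking difference between TensorFlow and other numerical computation libraries such as NumPy is that operations in TensorFlow are symbolic. This is a powerful concept that allows TensorFlow to do things (for example, automatic differentiation) that are not possible with imperative libraries such as NumPy. It is probably also the reason TensorFlow is more complex. This article walks through TensorFlow step by step and offers guidelines and best practices for using it more effectively.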

Let's start with a simple example: multiplying two random matrices. First, here is how to do it in NumPy:

import numpy as np
x = np.random.normal(size=[10, 10])
y = np.random.normal(size=[10, 10])
z = np.dot(x, y)
print(z)

Now we perform exactly the same computation in TensorFlow:

import tensorflow as tf
x = tf.random_normal([10, 10])
y = tf.random_normal([10, 10])
z = tf.matmul(x, y)
sess = tf.Session()
z_val = sess.run(z)
print(z_val)

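Unlike NumPy, which performs the computation immediately and assigns the result to the output variable z, TensorFlow only gives us a handle of type Tensor that represents the result. If we try printing z directly, we get something like this: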

Tensor("MatMul:0", shape=(10, 10), dtype=float32)

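Since both inputs have a fully defined shape, TensorFlow is able to infer the shape of the result tensor as well as its type. To compute the value of the tensor, we need to create a session and evaluate it using the Session.run method.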
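The idea of building a computation first and evaluating it later can be illustrated with a toy sketch in plain Python. The `Node` class and the `constant`, `add`, and `mul` helpers below are hypothetical names invented for this illustration; they are not part of TensorFlow and say nothing about how TensorFlow is implemented internally.

```python
# Toy illustration only: a deferred-computation node, not TensorFlow's real machinery.
class Node:
    def __init__(self, fn, *parents):
        self.fn = fn            # function applied to the evaluated parents
        self.parents = parents  # upstream nodes this node depends on

    def run(self):
        # Recursively evaluate the inputs, then apply this node's function.
        return self.fn(*(p.run() for p in self.parents))


def constant(value):
    return Node(lambda: value)


def add(a, b):
    return Node(lambda u, v: u + v, a, b)


def mul(a, b):
    return Node(lambda u, v: u * v, a, b)


# Building the graph computes nothing yet; z is only a description of the work.
z = add(mul(constant(2.0), constant(3.0)), constant(1.0))

# Evaluation happens only here, similar in spirit to Session.run.
z_val = z.run()
print(z_val)
```

Printing `z` itself would only show a `Node` object, much as printing a TensorFlow tensor shows a description rather than a value.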

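To understand how powerful symbolic computation can be, let's look at another example. Assume that we have samples from a curve (say f(x) = 5x^2 + 3) and we want to estimate f(x) without knowing its parameters. We define a parametric function g(x, w) = w0 x^2 + w1 x + w2, which is a function of the input x and latent parameters w; our goal is to find the latent parameters such that g(x, w) ≈ f(x). This can be done by minimizing the loss function L(w) = (f(x) - g(x, w))^2. Although there is a simple closed-form solution for this problem, we opt for a more general approach that can be applied to any differentiable function: stochastic gradient descent. We simply compute the average gradient of L(w) with respect to w over a set of sample points and move in the opposite direction.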

Here is how it can be done in TensorFlow:

import numpy as np
import tensorflow as tf
x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)
w = tf.get_variable("w", shape=[3, 1])
f = tf.stack([tf.square(x), x, tf.ones_like(x)], 1)
yhat = tf.squeeze(tf.matmul(f, w), 1)
loss = tf.nn.l2_loss(yhat - y) + 0.1 * tf.nn.l2_loss(w)
train_op = tf.train.AdamOptimizer(0.1).minimize(loss)
def generate_data():
    x_val = np.random.uniform(-10.0, 10.0, size=100)
    y_val = 5 * np.square(x_val) + 3
    return x_val, y_val
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for _ in range(1000):
    x_val, y_val = generate_data()
    _, loss_val = sess.run([train_op, loss], {x: x_val, y: y_val})
    print(loss_val)
print(sess.run([w]))

Running this code should print values close to the following:

[4.9924135, 0.00040895029, 3.4504161]

This is a relatively close approximation of the true parameters.
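For comparison, the closed-form solution mentioned earlier can be sketched in NumPy with ordinary least squares. This is a minimal sketch under stated assumptions: it uses `np.linalg.lstsq`, a fixed random seed, and no regularization term, so it is not equivalent to the regularized loss minimized in the TensorFlow code.

```python
import numpy as np

# Sample points from the same curve f(x) = 5x^2 + 3 (the seed is an arbitrary choice).
rng = np.random.default_rng(0)
x_val = rng.uniform(-10.0, 10.0, size=100)
y_val = 5 * np.square(x_val) + 3

# Same features as the TensorFlow example: [x^2, x, 1].
features = np.stack([np.square(x_val), x_val, np.ones_like(x_val)], axis=1)

# Ordinary least squares, without the 0.1 * l2_loss(w) regularizer used above.
w_closed, *_ = np.linalg.lstsq(features, y_val, rcond=None)
print(w_closed)
```

Because the samples are noise-free, the recovered weights should match the coefficients 5, 0, and 3 up to floating-point error.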

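This is just the tip of the iceberg of what TensorFlow can do. Many problems, such as optimizing large neural networks with millions of parameters, can be implemented efficiently in TensorFlow in just a few lines of code. TensorFlow also scales across multiple devices and threads and supports a variety of platforms.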

Tensors in TensorFlow have a static shape attribute that is determined during graph construction. For example, we can define a tensor of shape [None, 128]:

import tensorflow as tf
a = tf.placeholder(tf.float32, [None, 128])

This means that the first dimension can be of any size and will be determined dynamically during Session.run. TensorFlow has a simple API for querying the static shape:

static_shape = a.get_shape().as_list()  # returns [None, 128]

To get the dynamic shape of the tensor, you can call the tf.shape op, which returns a tensor representing the shape of the given tensor:

dynamic_shape = tf.shape(a)

The static shape of a tensor can be set with the Tensor.set_shape() method:

a.set_shape([32, 128])

In practice, it is safer to use the tf.reshape() op:

a = tf.reshape(a, [32, 128])

It can be convenient to have a function that returns the static shape when it is available and the dynamic shape when it is not:

def get_shape(tensor):
  static_shape = tensor.get_shape().as_list()
  dynamic_shape = tf.unstack(tf.shape(tensor))
  dims = [s[1] if s[0] is None else s[0]
          for s in zip(static_shape, dynamic_shape)]
  return dims

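Now imagine that we want to convert a rank-3 tensor into a rank-2 tensor by collapsing the second and third dimensions into one. We can use our get_shape() function to do that: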

b = tf.placeholder(tf.float32, [None, 10, 32])
shape = get_shape(b)
b = tf.reshape(b, [shape[0], shape[1] * shape[2]])

Note that this works whether or not the shape is statically specified.

In fact, we can write a general-purpose reshape function to collapse any list of dimensions:

import tensorflow as tf
import numpy as np
def reshape(tensor, dims_list):
  shape = get_shape(tensor)
  dims_prod = []
  for dims in dims_list:
    if isinstance(dims, int):
      dims_prod.append(shape[dims])
    elif all([isinstance(shape[d], int) for d in dims]):
      dims_prod.append(np.prod([shape[d] for d in dims]))
    else:
      dims_prod.append(tf.reduce_prod([shape[d] for d in dims]))
  tensor = tf.reshape(tensor, dims_prod)
  return tensor

Collapsing the second and third dimensions then becomes very easy:

b = tf.placeholder(tf.float32, [None, 10, 32])
b = reshape(b, [0, [1, 2]])
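The dims_list convention used by the helper above can be tried out on concrete arrays. The following is a minimal NumPy sketch; `collapse_dims` is a hypothetical name chosen for this illustration, and because NumPy shapes are always fully known it needs no static/dynamic distinction.

```python
import numpy as np


def collapse_dims(array, dims_list):
    # Hypothetical NumPy counterpart of the reshape() helper above.
    shape = array.shape
    new_shape = []
    for dims in dims_list:
        if isinstance(dims, int):
            # A single index keeps that dimension as-is.
            new_shape.append(shape[dims])
        else:
            # A list of indices is collapsed into one dimension.
            new_shape.append(int(np.prod([shape[d] for d in dims])))
    return array.reshape(new_shape)


b_val = np.zeros([4, 10, 32])
collapsed = collapse_dims(b_val, [0, [1, 2]])
print(collapsed.shape)
```

Each entry of dims_list is either one dimension index to keep or a list of indices whose sizes are multiplied together.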

The good and the bad of broadcasting:

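TensorFlow also supports broadcasting. Normally, when you perform operations such as addition and multiplication, you need to make sure that the shapes of the operands match; for example, you cannot add a tensor of shape [3, 2] to a tensor of shape [3, 4]. But there is a special case: when one of the dimensions has size 1. TensorFlow implicitly tiles the tensor along that dimension to match the shape of the other operand. For example: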

import tensorflow as tf
a = tf.constant([[1., 2.], [3., 4.]])
b = tf.constant([[1.], [2.]])
# c = a + tf.tile(b, [1, 2])
c = a + b
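NumPy follows the same rule for size-1 dimensions, so the equivalence between explicit tiling and broadcasting can be checked directly. This is a small sketch with the same values as above; the `_np` suffixes are only there to avoid clashing with the TensorFlow names.

```python
import numpy as np

a_np = np.array([[1., 2.], [3., 4.]])
b_np = np.array([[1.], [2.]])

# Explicit tiling: materialize b with shape (2, 2) before adding.
explicit = a_np + np.tile(b_np, [1, 2])

# Broadcasting: the size-1 dimension of b is expanded implicitly.
implicit = a_np + b_np

print(np.array_equal(explicit, implicit))
```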

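Broadcasting lets us perform implicit tiling, which makes the code shorter and more memory efficient, since we do not need to store the result of the tiling operation. One place where this is useful is combining features of varying length: we commonly tile the input tensors, concatenate the result, and apply a nonlinearity. This is a common pattern across a variety of neural network architectures: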

a = tf.random_uniform([5, 3, 5])
b = tf.random_uniform([5, 1, 6])
# concat a and b and apply nonlinearity
tiled_b = tf.tile(b, [1, 3, 1])
c = tf.concat([a, tiled_b], 2)
d = tf.layers.dense(c, 10, activation=tf.nn.relu)

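This can be done more efficiently with broadcasting. We use the fact that f(m(x + y)) is equal to f(mx + my), so we can apply the linear operations separately and use broadcasting to do the concatenation implicitly: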

pa = tf.layers.dense(a, 10, activation=None)
pb = tf.layers.dense(b, 10, activation=None)
d = tf.nn.relu(pa + pb)

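In fact, this piece of code is quite general and can be applied to tensors of arbitrary shape, as long as broadcasting between the tensors is possible: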

def tile_concat_dense(a, b, units, activation=tf.nn.relu):
    pa = tf.layers.dense(a, units, activation=None)
    pb = tf.layers.dense(b, units, activation=None)
    c = pa + pb
    if activation is not None:
        c = activation(c)
    return c
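The identity behind this trick can be checked numerically. The sketch below is an assumption-laden illustration in NumPy: it uses plain weight matrices `w_a` and `w_b` (hypothetical names) in place of tf.layers.dense and omits the bias terms, and it shows that a linear map applied to the tiled concatenation equals the broadcast sum of two separate linear maps.

```python
import numpy as np

rng = np.random.default_rng(0)
a_np = rng.uniform(size=[5, 3, 5])
b_np = rng.uniform(size=[5, 1, 6])

# Separate weight matrices for each input (biases omitted for simplicity).
w_a = rng.normal(size=[5, 10])
w_b = rng.normal(size=[6, 10])

# Tile-and-concat version: one linear map over the concatenated features.
tiled_b = np.tile(b_np, [1, 3, 1])
concat = np.concatenate([a_np, tiled_b], axis=2)   # shape (5, 3, 11)
w = np.concatenate([w_a, w_b], axis=0)             # shape (11, 10)
dense_concat = concat @ w

# Broadcasting version: project separately, then add with implicit tiling.
dense_broadcast = a_np @ w_a + b_np @ w_b

print(np.allclose(dense_concat, dense_broadcast))
```

The broadcasting version never materializes the tiled copy of b, which is where the memory saving comes from.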

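So far we have discussed the good part of broadcasting. But what is the bad part, you may ask? Implicit assumptions almost always make debugging harder. Consider the following example: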

a = tf.constant([[1.], [2.]])
b = tf.constant([1., 2.])
c = tf.reduce_sum(a + b)

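What do you think the value of c would be? If you guessed 6, that is wrong. It is 12. This is because when the ranks of two tensors do not match, TensorFlow automatically expands the lower-rank tensor before the elementwise operation, so the result of the addition is [[2, 3], [3, 4]], and reducing over all elements gives 12.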

If we specify the dimension we want to reduce across, this mistake becomes much easier to catch:

a = tf.constant([[1.], [2.]])
b = tf.constant([1., 2.])
c = tf.reduce_sum(a + b, 0)

Here the value of c would be [5, 7], and the shape of the result immediately suggests that something is wrong.
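The same pitfall can be reproduced in NumPy, which applies the same rank-expansion rule. This sketch mirrors the values above and makes the intermediate shapes visible:

```python
import numpy as np

a_np = np.array([[1.], [2.]])   # shape (2, 1)
b_np = np.array([1., 2.])       # shape (2,), treated as (1, 2) when broadcasting

summed = a_np + b_np            # broadcasts to shape (2, 2)
total = summed.sum()            # reduces over every element
per_column = summed.sum(axis=0) # reduces over the first dimension only

print(summed.shape, total, per_column)
```

Checking `summed.shape` before reducing is a cheap way to notice that an unintended broadcast has happened.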

Prototyping kernels and advanced visualization with Python ops:

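Operation kernels in TensorFlow are written entirely in C++ for efficiency, but writing a TensorFlow kernel in C++ can be quite painful. With tf.py_func() you can turn any piece of Python code into a TensorFlow operation.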

For example, this is how you can implement a simple ReLU nonlinearity kernel in TensorFlow as a Python op:

import numpy as np
import tensorflow as tf
import uuid
def relu(inputs):
    # Define the op in python
    def _relu(x):
        return np.maximum(x, 0.)
    # Define the op's gradient in python
    def _relu_grad(x):
        return np.float32(x > 0)
    # An adapter that defines a gradient op compatible with Tensorflow
    def _relu_grad_op(op, grad):
        x = op.inputs[0]
        x_grad = grad * tf.py_func(_relu_grad, [x], tf.float32)
        return x_grad
    # Register the gradient with a unique id
    grad_name = "MyReluGrad_" + str(uuid.uuid4())
    tf.RegisterGradient(grad_name)(_relu_grad_op)
    # Override the gradient of the custom op
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": grad_name}):
        output = tf.py_func(_relu, [inputs], tf.float32)
    return output

To verify that the gradients are correct, you can use TensorFlow's gradient checker:

x = tf.random_normal([10])
y = relu(x * x)
with tf.Session():
    diff = tf.test.compute_gradient_error(x, [10], y, [10])
    print(diff)

compute_gradient_error() computes the gradient numerically and returns the difference from the provided gradient. What we want is a very small difference.
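The idea behind a numerical gradient check can be sketched in NumPy with central differences. This is a simplified illustration, not a description of what compute_gradient_error() does internally: it checks the gradient of the summed output of relu(x * x), and the function names, step size, and seed are assumptions made for the example.

```python
import numpy as np


def f(x):
    # Scalar objective: sum of relu(x * x), mirroring y = relu(x * x) above.
    return np.maximum(x * x, 0.0).sum()


def analytic_grad(x):
    # d/dx relu(x^2) = 2x wherever x^2 > 0, and 0 otherwise.
    return 2.0 * x * (x * x > 0)


def numeric_grad(func, x, eps=1e-5):
    # Central differences, one coordinate at a time.
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (func(x + step) - func(x - step)) / (2 * eps)
    return grad


rng = np.random.default_rng(0)
x0 = rng.normal(size=10)
max_diff = np.max(np.abs(numeric_grad(f, x0) - analytic_grad(x0)))
print(max_diff)
```

A large value of `max_diff` would suggest that the hand-written gradient disagrees with the function it is supposed to differentiate.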

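Note that this implementation is quite inefficient and is only useful for prototyping, since the Python code is not parallelizable and will not run on a GPU.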

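In practice, we commonly use Python ops for visualization in TensorBoard. Suppose you are building an image classification model and want to visualize your model's predictions during training. TensorFlow allows visualizing images with the tf.summary.image() function: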

image = tf.placeholder(tf.float32)
tf.summary.image("image", image)

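But this only visualizes the input images. To visualize the predictions, you have to find a way to add annotations to the image, which is almost impossible with existing ops. An easier way is to do the drawing in Python and wrap it in a Python op: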

import io
import matplotlib.pyplot as plt
import numpy as np
import PIL.Image
import tensorflow as tf
def visualize_labeled_images(images, labels, max_outputs=3, name='image'):
    def _visualize_image(image, label):
        # Do the actual drawing in python
        fig = plt.figure(figsize=(3, 3), dpi=80)
        ax = fig.add_subplot(111)
        ax.imshow(image[::-1,...])
        ax.text(0, 0, str(label), 
          horizontalalignment='left', 
          verticalalignment='top')
        fig.canvas.draw()
        # Write the plot as a memory file.
        buf = io.BytesIO()
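        fig.savefig(buf, format='png')
        plt.close(fig)  # release the figure so repeated calls do not leak memory
        buf.seek(0)
        # Read the image and convert to numpy array
        img = PIL.Image.open(buf)
        # PIL reports size as (width, height); the array is (height, width, channels)
        return np.array(img.getdata()).reshape(img.size[1], img.size[0], -1)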
    def _visualize_images(images, labels):
        # Only display the given number of examples in the batch
        outputs = []
        for i in range(max_outputs):
            output = _visualize_image(images[i], labels[i])
            outputs.append(output)
        return np.array(outputs, dtype=np.uint8)
    # Run the python op.
    figs = tf.py_func(_visualize_images, [images, labels], tf.uint8)
    return tf.summary.image(name, figs)

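Note that since summaries are usually only evaluated once in a while (not at every step), this implementation can be used in practice without worrying about efficiency.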

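This article was recommended by @愛可可-愛生活 of BUPT and translated by @阿裏雲雲棲社區 (the Alibaba Cloud Yunqi Community).

Original title: "Effective Tensorflow - Guides and best practices for effective use of Tensorflow"

Author: a software engineer at Google with a PhD in CS, working on machine learning, NLP, and computer vision.

Reviewed by:

This is an abridged translation; for more detail, please see the original article.

Last updated: 2017-08-13 22:34:11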
