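Visualize the hyperparameter tuning process

Author: Haifeng Jin
Date created: 2021/06/25
Last modified: 2021/06/05
Description: Using TensorBoard to visualize the hyperparameter tuning process in KerasTuner.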

!pip install keras-tuner -q

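Introduction

KerasTuner prints logs to the screen, including the hyperparameter values of each trial, so that users can monitor progress. However, reading logs is not an intuitive way to see how the hyperparameters affect the results. We therefore provide a way to visualize the hyperparameter values and the corresponding evaluation results with interactive figures in TensorBoard.

TensorBoard is a tool for visualizing machine learning experiments. It can monitor the losses and metrics during model training and visualize the model architecture. Running KerasTuner with TensorBoard additionally lets you visualize the hyperparameter tuning results with its HParams plugin.

We will use a simple example of tuning a model for the MNIST image classification dataset to show how to use KerasTuner with TensorBoard.

The first step is to download and format the data.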

import numpy as np
import keras_tuner
import keras
from keras import layers

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
# Normalize the pixel values to the range of [0, 1].
x_train = x_train.astype("float32") / 255
x_test = x_test.astype("float32") / 255
# Add the channel dimension to the images.
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)
# Print the shapes of the data.
print(x_train.shape)
print(y_train.shape)
print(x_test.shape)
print(y_test.shape)

(60000, 28, 28, 1)
(60000,)
(10000, 28, 28, 1)
(10000,)

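Then, we write a build_model function that builds the model with the hyperparameters and returns it. The hyperparameters include the type of model to use (multi-layer perceptron or convolutional neural network), the number of layers, the number of units or filters, and whether to use dropout.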

def build_model(hp):
    inputs = keras.Input(shape=(28, 28, 1))
    # Model type can be MLP or CNN.
    model_type = hp.Choice("model_type", ["mlp", "cnn"])
    x = inputs
    if model_type == "mlp":
        x = layers.Flatten()(x)
        # Number of layers of the MLP is a hyperparameter.
        for i in range(hp.Int("mlp_layers", 1, 3)):
            # The number of units in each layer is a separate
            # hyperparameter with its own name.
            x = layers.Dense(
                units=hp.Int(f"units_{i}", 32, 128, step=32),
                activation="relu",
            )(x)
    else:
        # Number of layers of the CNN is also a hyperparameter.
        for i in range(hp.Int("cnn_layers", 1, 3)):
            x = layers.Conv2D(
                hp.Int(f"filters_{i}", 32, 128, step=32),
                kernel_size=(3, 3),
                activation="relu",
            )(x)
            x = layers.MaxPooling2D(pool_size=(2, 2))(x)
        x = layers.Flatten()(x)

    # A hyperparameter for whether to use a dropout layer.
    if hp.Boolean("dropout"):
        x = layers.Dropout(0.5)(x)

    # The last layer contains 10 units,
    # which is the same as the number of classes.
    outputs = layers.Dense(units=10, activation="softmax")(x)
    model = keras.Model(inputs=inputs, outputs=outputs)

    # Compile the model.
    model.compile(
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
        optimizer="adam",
    )
    return model
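To get a feel for how large this search space is, the following minimal sketch counts the distinct architectures that build_model can produce. It assumes that both bounds of hp.Int are inclusive, so step=32 yields the values 32, 64, 96 and 128. The variable names are illustrative only, and the count describes distinct models rather than how KerasTuner represents the space internally.

```python
# Count the distinct architectures that build_model can produce.
# Assumption: hp.Int(name, 32, 128, step=32) yields 32, 64, 96, 128,
# and hp.Int(name, 1, 3) yields 1, 2, 3 (both bounds inclusive).
unit_choices = list(range(32, 128 + 1, 32))
layer_counts = range(1, 3 + 1)

# With n layers, each layer picks its width independently,
# so there are len(unit_choices) ** n combinations.
per_model_type = sum(len(unit_choices) ** n for n in layer_counts)

model_types = 2      # "mlp" or "cnn"
dropout_choices = 2  # with or without the dropout layer
total_configurations = per_model_type * model_types * dropout_choices

print(per_model_type)        # 4 + 16 + 64 = 84
print(total_configurations)  # 84 * 2 * 2 = 336
```

With a few hundred possible models and only a handful of trials, a search can only sample the space, which is one reason a visual overview of the trials is helpful.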

We can do a quick test of the models to check that both the CNN and the MLP build successfully.

# Initialize the `HyperParameters` and set the values.
hp = keras_tuner.HyperParameters()
hp.values["model_type"] = "cnn"
# Build the model using the `HyperParameters`.
model = build_model(hp)
# Test if the model runs with our data.
model(x_train[:100])
# Print a summary of the model.
model.summary()

# Do the same for MLP model.
hp.values["model_type"] = "mlp"
model = build_model(hp)
model(x_train[:100])
model.summary()

Model: "functional_1"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape              ┃    Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ input_layer (InputLayer)        │ (None, 28, 28, 1)         │          0 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ conv2d (Conv2D)                 │ (None, 26, 26, 32)        │        320 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ max_pooling2d (MaxPooling2D)    │ (None, 13, 13, 32)        │          0 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ flatten (Flatten)               │ (None, 5408)              │          0 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ dense (Dense)                   │ (None, 10)                │     54,090 │
└─────────────────────────────────┴───────────────────────────┴────────────┘

 Total params: 54,410 (212.54 KB)

 Trainable params: 54,410 (212.54 KB)

 Non-trainable params: 0 (0.00 B)

Model: "functional_3"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape              ┃    Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ input_layer_1 (InputLayer)      │ (None, 28, 28, 1)         │          0 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ flatten_1 (Flatten)             │ (None, 784)               │          0 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ dense_1 (Dense)                 │ (None, 32)                │     25,120 │
├─────────────────────────────────┼───────────────────────────┼────────────┤
│ dense_2 (Dense)                 │ (None, 10)                │        330 │
└─────────────────────────────────┴───────────────────────────┴────────────┘

 Total params: 25,450 (99.41 KB)

 Trainable params: 25,450 (99.41 KB)

 Non-trainable params: 0 (0.00 B)
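The parameter counts in these summaries can be checked by hand. Because only model_type is set in the test above, the remaining hyperparameters fall back to their defaults, which according to the summaries means one hidden layer with 32 units or filters and no dropout. The sketch below redoes the arithmetic in plain Python. The helper names are hypothetical, and the size check assumes 4-byte float32 weights.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(in_features, units):
    """Weights plus one bias per unit."""
    return in_features * units + units


# CNN: 28x28x1 input -> Conv2D(32, 3x3, no padding) -> MaxPooling2D(2x2) -> Flatten -> Dense(10)
conv_size = 28 - 3 + 1                    # 26
pooled_size = conv_size // 2              # 13
flat_size = pooled_size * pooled_size * 32  # 5408
cnn_total = conv2d_params(3, 3, 1, 32) + dense_params(flat_size, 10)

# MLP: Flatten (784) -> Dense(32) -> Dense(10)
mlp_total = dense_params(28 * 28, 32) + dense_params(32, 10)

print(cnn_total, round(cnn_total * 4 / 1024, 2))  # 54410 212.54
print(mlp_total, round(mlp_total * 4 / 1024, 2))  # 25450 99.41
```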

Initialize the RandomSearch tuner to run 10 trials, using validation accuracy as the metric for selecting models.

tuner = keras_tuner.RandomSearch(
    build_model,
    max_trials=10,
    # Do not resume the previous search in the same directory.
    overwrite=True,
    objective="val_accuracy",
    # Set a directory to store the intermediate results.
    directory="/tmp/tb",
)
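Conceptually, a random search draws each trial's hyperparameter values independently at random from the search space. The sketch below illustrates that idea in plain Python for the space defined in build_model. It is not KerasTuner's implementation; the function sample_configuration and the fixed seed are assumptions made for illustration.

```python
import random


def sample_configuration(rng):
    """Draw one random configuration from the same space as build_model."""
    model_type = rng.choice(["mlp", "cnn"])
    config = {"model_type": model_type}
    # Only the hyperparameters relevant to the chosen model type are drawn.
    layers_key, width_key = (
        ("mlp_layers", "units") if model_type == "mlp" else ("cnn_layers", "filters")
    )
    config[layers_key] = rng.randint(1, 3)  # both bounds inclusive
    for i in range(config[layers_key]):
        config[f"{width_key}_{i}"] = rng.choice([32, 64, 96, 128])
    config["dropout"] = rng.choice([True, False])
    return config


rng = random.Random(0)  # fixed seed so the sketch is repeatable
trials = [sample_configuration(rng) for _ in range(10)]
for trial in trials:
    print(trial)
```

Each printed dictionary corresponds to what one row of a hyperparameter table would contain, before any model has been trained or evaluated.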

Start the search by calling tuner.search(...). To use TensorBoard, we need to pass a keras.callbacks.TensorBoard instance to the callbacks argument.

tuner.search(
    x_train,
    y_train,
    validation_split=0.2,
    epochs=2,
    # Use the TensorBoard callback.
    # The logs will be written to "/tmp/tb_logs".
    callbacks=[keras.callbacks.TensorBoard("/tmp/tb_logs")],
)

Trial 10 Complete [00h 00m 06s]
val_accuracy: 0.9617499709129333

Best val_accuracy So Far: 0.9837499856948853
Total elapsed time: 00h 08m 32s
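Before opening TensorBoard, it can be useful to confirm that the search actually wrote logs. The following sketch walks the log directory and lists TensorBoard event files, recognized by the "tfevents" marker in their file names. The helper find_event_files is hypothetical, and the exact subdirectory layout (for example one folder per trial) is an assumption that may differ between KerasTuner versions.

```python
import os


def find_event_files(logdir):
    """Return sorted paths, relative to logdir, of TensorBoard event files."""
    found = []
    for root, _dirs, files in os.walk(logdir):
        for name in files:
            # TensorBoard event files contain "tfevents" in their names.
            if "tfevents" in name:
                full_path = os.path.join(root, name)
                found.append(os.path.relpath(full_path, logdir))
    return sorted(found)


# Only list files if the log directory from the search above exists.
if os.path.isdir("/tmp/tb_logs"):
    for path in find_event_files("/tmp/tb_logs"):
        print(path)
```

If nothing is printed after a completed search, the TensorBoard callback was probably not passed to tuner.search(...) or it points to a different directory.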

If running in Colab, the following two commands will show TensorBoard inside Colab.

%load_ext tensorboard

%tensorboard --logdir /tmp/tb_logs

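You have access to all the common features of TensorBoard. For example, you can view the loss and metrics curves and visualize the computational graphs of the models in different trials.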

Loss and metrics curves

Computational graphs

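In addition to these features, there is also an HParams tab with three views. In the table view, you can see the 10 different trials in a table listing the hyperparameter values and evaluation metrics.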

Table view

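On the left side, you can specify filters for certain hyperparameters. For example, you can choose to view only the MLP models without a dropout layer and with 1 to 2 dense layers.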

Filtered table view
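The filter described above amounts to a simple predicate over the trial records. As a minimal sketch, the same selection can be written in plain Python. The trial records below are made up for illustration and are not results from the search in this guide.

```python
# Hypothetical trial records; the ids and values are invented for this example.
trials = [
    {"trial": "00", "model_type": "mlp", "mlp_layers": 1, "dropout": False, "val_accuracy": 0.95},
    {"trial": "01", "model_type": "mlp", "mlp_layers": 3, "dropout": False, "val_accuracy": 0.96},
    {"trial": "02", "model_type": "cnn", "cnn_layers": 2, "dropout": False, "val_accuracy": 0.98},
    {"trial": "03", "model_type": "mlp", "mlp_layers": 2, "dropout": True, "val_accuracy": 0.94},
    {"trial": "04", "model_type": "mlp", "mlp_layers": 2, "dropout": False, "val_accuracy": 0.97},
]


def is_selected(trial):
    """MLP models without dropout and with 1 to 2 dense layers."""
    return (
        trial["model_type"] == "mlp"
        and not trial["dropout"]
        and 1 <= trial.get("mlp_layers", 0) <= 2
    )


selected = [trial["trial"] for trial in trials if is_selected(trial)]
print(selected)  # ['00', '04']
```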

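Besides the table view, it also provides two other views: the parallel coordinates view and the scatter plot matrix view. They are just different ways of visualizing the same data. You can still use the left panel to filter the results.

In the parallel coordinates view, each colored line is a trial. The axes are the hyperparameters and evaluation metrics.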

Parallel coordinates view
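One way to picture how a trial becomes a line is to rescale every axis to the range [0, 1] and connect the trial's position on each axis. The sketch below does this with min-max scaling on made-up trial values. It only illustrates the idea; how TensorBoard scales and draws its axes is not specified here, and the names normalize_axis and polylines are hypothetical.

```python
def normalize_axis(values):
    """Min-max scale a list of numbers to [0, 1]; constant axes map to 0.5."""
    low, high = min(values), max(values)
    if high == low:
        return [0.5] * len(values)
    return [(value - low) / (high - low) for value in values]


# Hypothetical trials; booleans are encoded as 0 and 1 so they fit on an axis.
trials = [
    {"units_0": 32, "dropout": 0, "val_accuracy": 0.90},
    {"units_0": 64, "dropout": 1, "val_accuracy": 0.94},
    {"units_0": 96, "dropout": 0, "val_accuracy": 0.98},
]
axes = ["units_0", "dropout", "val_accuracy"]

# Scale each axis separately, then read off one point per axis for each trial.
columns = {axis: normalize_axis([trial[axis] for trial in trials]) for axis in axes}
polylines = [[columns[axis][i] for axis in axes] for i in range(len(trials))]

for line in polylines:
    print([round(position, 2) for position in line])
```

Each inner list holds the heights at which one trial's line crosses the axes, from left to right.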

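In the scatter plot matrix view, each dot is a trial. The plots are projections of the trials onto planes with different hyperparameters and metrics as the axes.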

Scatter plot matrix view
